Regular expressions training

I found Regex Crossword the other day and immediately got addicted to it. So addicted that, in just a couple of days, my ranking has climbed to 286. Most puzzles are square, some are hexagonal, and many are extremely interesting, some even using Unicode characters (e.g., Checkmate, Atheists). They pushed my knowledge of regular expressions to the limit. I have to admit there were several cases where I had to use an engine (I find regex101 pretty cool and easy to work with) to make sure I correctly understood what the regex wanted.

Unicode can be tricky to work with when building a regular expression, mostly because of the seemingly endless number of characters with similar functions: 17 white spaces, 24 dashes, 1984 lowercase letters, etc. Luckily, you can rely on the fact that every single Unicode character is assigned to one of the (currently) 31 categories. You can view this list of categories if you are interested in seeing what they are. Even luckier for us, most languages that support regexes have included these categories in their engines, to some extent, where they work like character classes.
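To get a feel for these categories, here is a minimal sketch in Python that looks up the two-letter category code of a few characters with the standard unicodedata module. The helper name categories and the sample characters are my own picks for illustration, not anything from Regex Crossword.

```python
import unicodedata


def categories(text: str) -> list[str]:
    """Return the Unicode general category code of each character in text."""
    return [unicodedata.category(ch) for ch in text]


# A few look-alike characters: two spaces, two dashes, letters from two scripts.
samples = [" ", "\u00a0", "-", "\u2014", "a", "\u044f", "A", "5"]

for ch in samples:
    # unicodedata.name() gives the official character name ('?' if it has none).
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch, '?')}")
```

The plain space and the no-break space both land in Zs (space separator), the hyphen-minus and the em dash both land in Pd (dash punctuation), and the Latin and Cyrillic lowercase letters both land in Ll.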

If you look at this Word Character Class crossword, you will notice lots of, obviously, character classes. Imagine having to write those regexes out explicitly, for all letters. Computer says no! It’s much easier to both write and understand if you just use \p{Ll} for lowercase letters, whichever script they might be in.
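One caveat if you try this in Python: the built-in re module does not understand \p{Ll} (the third-party regex module does). As a rough stdlib-only sketch, you can build an equivalent character class yourself by scanning every code point. The helper name category_class is hypothetical, and the resulting class depends on the Unicode version bundled with your Python.

```python
import re
import sys
import unicodedata


def category_class(cat: str) -> str:
    """Build a regex character class covering every code point in category cat."""
    ranges = []
    start = prev = None
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) == cat:
            if start is None:
                start = cp  # open a new run of matching code points
            prev = cp
        elif start is not None:
            ranges.append((start, prev))  # close the current run
            start = None
    if start is not None:
        ranges.append((start, prev))

    parts = []
    for lo, hi in ranges:
        # \UXXXXXXXX escapes are understood by re in str patterns.
        if lo == hi:
            parts.append(f"\\U{lo:08x}")
        else:
            parts.append(f"\\U{lo:08x}-\\U{hi:08x}")
    return "[" + "".join(parts) + "]"


# Stand-in for \p{Ll}+ : one or more lowercase letters, in any script.
LOWERCASE_WORD = re.compile(category_class("Ll") + "+")

for word in ["straße", "яблоко", "αβγ", "Word", "abc1"]:
    print(word, bool(LOWERCASE_WORD.fullmatch(word)))
```

The scan walks over a million code points, so it is worth compiling the pattern once and reusing it. Writing that class by hand is exactly the chore \p{Ll} saves you from in engines that support it.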

Happy regexing!