I tweeted about an interesting blog post regarding Regex and email validation.
"Just stop validating email addresses http://t.co/Cmbfw8zIHb" (Giovanni Lodi, @mokagio, November 25, 2014)
The point the post makes is: don't even try to validate an email address, because basically anything with one @ in it is a valid email address.
Nevertheless, there are several Regex constructs in there that I'm not familiar with, so let's dig deeper!
`\A` and `\z` are patterns to match the beginning and end of a string.
`[^@]` is a character set (`[]`) for any character except (`^`) the character `@`.
`+` means: match one or more of the preceding token. So in our case, one or more characters that are not `@`.
`@` simply matches a literal `@`.
Recap: so far we're matching the first part of the email address and its `@`.
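To check my understanding of this first part, here is a minimal sketch in Ruby. I'm assuming Ruby because its Regex flavor supports `\z`, and I'm assuming `\A` is the matching start anchor. The names `LOCAL_PART` and `local_part_match` are made up for this example.

```ruby
# First part of the pattern: start of string, one or more non-@ characters, then an @.
LOCAL_PART = /\A[^@]+@/

# Returns the text matched by the first part of the pattern, or nil if it does not match.
def local_part_match(address)
  match = LOCAL_PART.match(address)
  match && match[0]
end

puts local_part_match("email@example.com").inspect # "email@"
puts local_part_match("@example.com").inspect      # nil, because + needs at least one character before the @
```

The second call returns `nil` because `\A` pins the match to the very start of the string, and `[^@]+` cannot match zero characters.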
`([^@\.]+\.)+` deserves to be split:
`()` is a capturing group: it groups multiple tokens together and creates a capture group for extracting a substring or using a backreference. That's why it can be followed by `+`, which here repeats the whole group.
`[^@\.]+`, as seen before, is a character set for any character except `@` and `\.`. `\.` is the escaped `.`; we need to escape it because `.` is a special Regex character that matches any character except line breaks. Again we have the `+`, which we explained already.
`\.` at the end of the group means that after any run of characters that do not include `@` or `.` we need to have a `.`.
Recap: this second part matches the second half of the email address up to its last `.`. It says: one or more groups of characters that don't include `@` or `.`, each ending with a `.`. In something like mokagio42@email.provider.that.is.nice.com, it matches email.provider.that.is.nice. (trailing dot included).
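The repeated group is the part I found hardest to picture, so here is a small Ruby sketch that runs it on its own against the domain from the example. `DOTTED_GROUPS` and `dotted_groups` are hypothetical names, not something from the original post.

```ruby
# One or more "characters plus a dot" groups, such as "email." or "provider.".
DOTTED_GROUPS = /([^@\.]+\.)+/

# Returns what the repeated group consumed, or nil if there is no such group.
def dotted_groups(domain)
  match = DOTTED_GROUPS.match(domain)
  return nil unless match

  # match[0] is everything the repeated group consumed;
  # match[1] holds only the last repetition of the capturing group.
  { whole: match[0], last_group: match[1] }
end

p dotted_groups("email.provider.that.is.nice.com")
p dotted_groups("localhost") # nil: there is no "." to end a group
```

For the first call, `whole` is `"email.provider.that.is.nice."` and `last_group` is `"nice."`: a capture group under `+` only remembers its final repetition, and the trailing `com` is left for the next part of the pattern.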
`[^@\.]+`: we've already seen all the elements of this part, and its function is to match the “termination” of the email address, that is, whatever follows the last `.`.
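Putting the pieces together gives the sketch below. The full pattern is my reassembly of the parts discussed above, with `\A` assumed as the start anchor, and `EMAIL_REGEX` and `plausible_email?` are names I made up for the example.

```ruby
# Reassembled from the pieces above: start, local part, @, dotted groups, termination, end.
EMAIL_REGEX = /\A[^@]+@([^@\.]+\.)+[^@\.]+\z/

# True when the address has exactly one @ followed by at least two dot-separated parts.
def plausible_email?(address)
  EMAIL_REGEX.match?(address)
end

puts plausible_email?("email@example.com")  # true
puts plausible_email?("email@example")      # false: no "." after the @
puts plausible_email?("email@@example.com") # false: two @ characters
puts plausible_email?("email@example.com.") # false: nothing after the last "."
```

Note how permissive this still is: anything that is not an `@` or a `.` counts as a valid character, which fits the post's point that strict validation isn't worth attempting.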
Wrapping it up
Given an email address like `email@example.com` we have: