Simply put, some sites don't enforce the full set of RFC rules, so people really do have email addresses that are not RFC-compliant yet are valid in practice.
How can you 'compile' a regular expression?
Very simple regular expressions might be decently fast, but as soon as you start pulling out the more complicated ones needed for parsing, they get slower. Even simple repeats can carry a lot of overhead if not used correctly; have a look at "Looking Inside The Regex Engine" at http://www.regular-expressions.info/repeat.html. An equivalent hand-written parser generally doesn't need to do any form of backtracking, and doesn't care about the exact textual layout of its input. For example, I've seen an application use regular expressions for HTML parsing. After spending a while figuring out what they actually did, I found that the source HTML had changed its whitespace but not its DOM structure, and that alone broke the regular expressions.
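To make the whitespace problem concrete, here is a minimal sketch in Python. It is not the application I mentioned; the markup, the pattern and the names (`LINK_RE`, `LinkCollector`, `hrefs_by_regex`, `hrefs_by_parser`) are all made up for illustration. It pulls link targets out of two snippets that describe the same DOM but are laid out differently, once with a precompiled regular expression and once with the standard library's `html.parser`.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup: both strings describe the same DOM,
# only the whitespace inside the tag differs.
ORIGINAL = '<a href="/home">Home</a>'
REFORMATTED = '<a\n   href="/home">Home</a>'

# A layout-sensitive pattern, precompiled once with re.compile.
# It assumes exactly one space between "<a" and "href".
LINK_RE = re.compile(r'<a href="([^"]*)">')


def hrefs_by_regex(html):
    """Return href values found by matching the raw text."""
    return LINK_RE.findall(html)


class LinkCollector(HTMLParser):
    """Collect href attributes from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


def hrefs_by_parser(html):
    """Return href values found by walking the parsed tags."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs
```

With these inputs the regex version finds the link in `ORIGINAL` but returns nothing for `REFORMATTED`, while the parser version returns the same link for both. You could loosen the pattern with `\s+`, but every such tweak is another layout assumption baked into the expression.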
Following my reasoning above, I think a lot of 'abstraction' libraries would be faster if they operated directly on the data instead of just converting everything to regular expressions. The beauty of regular expressions is the speed at which they can be written, not the speed at which they run.