There's only ONE way to validate an email address
The only thing that you can reliably do to validate an email address is to send it an email. YOU SEND IT AN EMAIL! That’s the only way you can do it. I know what you’re thinking,
“I have the best regular expression for this!”
Any sufficiently advanced regular expression is indistinguishable from magic. pic.twitter.com/qKsLtwL9K5
— Changelog (@changelog) March 26, 2017
No, you do not. You think you do, but you don’t. Your regular expression is invalid; it’s not good enough. You know the old adage:
“A developer, when faced with a problem, thought ‘I know. I’ll use regular expressions.’ Now he has two problems.”
That’s what you have: two problems. I’ve known this for years, and yet I was still convinced recently to add a regular expression-based email validation server-side.
(First of all, never trust a client, right? You can validate all you want there, but a client can bypass all your checks. It’s gotta be server-side.)
When you only enforce input validations client-side pic.twitter.com/TjjL9bNVEo
— Changelog (@changelog) February 23, 2017
I put in a regular expression-based email validation and I thought “This one’s pretty good.”
In fact – man, I don’t know what came over me; I was actually even talked into copy-pasting one off of a gist! 😭
It looked pretty good, and it covered most of the bases, and sure enough, last week I got an email from a prospective user saying
“Hey, I’m trying to sign up for Changelog Weekly, but it says my email address isn’t valid, and it obviously is valid, because I’m emailing you with it right now…”
And I thought, “I’m an idiot. Why did I put a regular expression-based email validation on my system?”
So don’t do that. I know you can find one on Stack Overflow… I’ll tell you right now, it’s not good enough. Email addresses are SO complicated. There are so many valid forms…
If you’re going to do it (and I’ll admit that I kept it in there), just check that there’s some stuff, then an @, then some stuff with a dot in it.
~r/^\S+@\S+\.\S+$/
That’s pretty much what you’re gonna be able to do… And that’s just to basically make sure that you don’t get some junk into your database… 🙅‍♀️
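To make the "some stuff, an @, some stuff" check concrete, here is a rough Python port of that pattern. The original is an Elixir regex sigil, so the translation and the function name `looks_like_email` are assumptions made for illustration, not the code that runs on the site.

```python
import re

# Illustrative port of the Elixir pattern ~r/^\S+@\S+\.\S+$/.
# re.fullmatch plays the role of the ^ and $ anchors.
LOOSE_EMAIL = re.compile(r"\S+@\S+\.\S+")


def looks_like_email(value: str) -> bool:
    """Loose sanity check: non-whitespace, an @, non-whitespace, a dot, non-whitespace."""
    return LOOSE_EMAIL.fullmatch(value) is not None
```

With this, `looks_like_email("user@example.com")` passes and `looks_like_email("not an email")` fails, but plenty of junk passes as well: `looks_like_email("@@@.@")` is accepted too. That is the point. It is a junk filter, not a validator.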
But still, all you’ve gotta do is send them an email, and if they click on it, well that’s a valid email address. If they don’t click on it, then who cares…? That’s a hard-learned lesson!
If you want to validate an email address, send it an email. Problem solved.
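Here is a rough sketch of what "send it an email" can look like in code, written in Python purely for illustration. Everything in it is assumed: the function names `start_signup` and `confirm`, the in-memory `pending` store, and the example.com URL are placeholders, and actually delivering the message is left out. The idea is just that an address counts as valid only once someone clicks a link containing a token that was mailed to it.

```python
import secrets

# In-memory stand-ins for database tables (hypothetical; a real app would persist these).
pending = {}        # token -> email address awaiting confirmation
confirmed = set()   # email addresses that clicked their link


def start_signup(email: str) -> str:
    """Create a one-time token and return the link to put in the confirmation email."""
    token = secrets.token_urlsafe(32)  # unguessable, URL-safe random token
    pending[token] = email
    # Sending the email is out of scope here; the URL is a placeholder.
    return f"https://example.com/confirm?token={token}"


def confirm(token: str) -> bool:
    """Mark the address as confirmed if the token matches a pending signup."""
    email = pending.pop(token, None)  # pop so each token works only once
    if email is None:
        return False
    confirmed.add(email)
    return True
```

If nobody ever clicks the link, the address simply stays in `pending` and never counts as valid, which is the "who cares?" case.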
Until bots start clicking on emails. Then we’re gonna have a whole new issue… But so far I don’t think there are bots that will
- create a fake email address
- sign up for your thing, and then
- access that email address and click on the link
When we get there, then we’ll have to come up with something else. But until then, just send it an email.
What you’ve just read is an excerpt from JS Party #39. I fixed up the formatting a bit for readability, but these (almost) exact words were spoken by me during the Pro Tips segment of that episode. In addition to tips like this one, we also discuss news & trends, interview awesome guests, teach each other things like we’re 5, and have lots of fun doing it. You should totally come party with us live on Thursdays or subscribe to the produced version! Take a listen and let us know what you think. 💚
Discussion
Marques Johansson
South Jersey, US
Principal Software Engineer @ Equinix Metal
2019-06-21T10:47:19Z
Wildcard and suffix email addresses make bot-making easier. account+random@gmail.com still goes to account@gmail.com. It’s useful for sorting your mail.
Your regex permits @@.@ :-)
I’ve copied the regex pattern that ships in browsers for their HTML5 input validation. Yes, someone will have an email address that doesn’t validate, but that will be rare.
Email validation by a link embedded in the email has other benefits. The user is verifying that their email filters allow your messages to make it to the correct place.
Jerod Santo
Bennington, Nebraska
Jerod co-hosts The Changelog, crashes JS Party & takes out the trash (his old code) once in a while.
2019-06-21T13:18:06Z
💯 well said.
Natacha
2020-06-12T23:15:59Z
I think there was some sort of ego trip for me in coming up with the “neatest” regexes and feeling smart.
Turns out regexes are not a reliable indicator of intelligence!
They make the code less readable unless you comment them, name them and make sure they’re safe.
It’s a lot of work for a dubious return on investment…