Regular expressions

I was trying to write a regular expression for an email ID, and this is what I came up with:

(\w+)(\_|.)?(\w+)(@)(\w+)(\.)(\w+)

Is this right?
http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address
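Some problems with your regex: the name part of a valid e-mail address can contain any number of dots, and it can contain a + sign. Also, username@mail.example.com is a valid e-mail address, but your pattern allows only one dot in the domain. Note too that the unescaped . in (\_|.) matches any character, not just a dot.
Read the corresponding RFC linked in the Stack Overflow question.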
"username@mail.example.com"

When I try this on RegExr, it does not match; it matches only up to username@mail.example. I don't know what you are talking about.
That is the point: it should match the whole address.
I can have the email u.ser.na+me@mail.example.com, and your regex will keep reporting it as invalid even though it is completely valid.
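
The points raised above (dots and a + sign in the name part, several dot-separated labels in the domain) can be illustrated with a deliberately loose pattern. This is only a sketch, assuming a POSIX system that provides <regex.h>; the helper name loose_email_match and the pattern itself are illustrative assumptions, not the standard-compliant regex from the linked question, and the pattern still rejects many addresses that the RFC allows.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if address matches a deliberately loose
   pattern, 0 if it does not, and -1 if the pattern fails to compile. */
int loose_email_match(const char *address) {
  /* name part: letters, digits, '.', '_', '+', '-'
     domain:    two or more labels separated by dots */
  static const char *pattern =
      "^[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+$";
  regex_t re;
  int rc;

  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) return -1;
  rc = regexec(&re, address, 0, NULL, 0);
  regfree(&re);
  return rc == 0;
}
```

With this sketch, loose_email_match("u.ser.na+me@mail.example.com") returns 1, while an address containing a space returns 0.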

Please look at the standard-compliant regex example in the provided link. Actually, a regex is not the best choice for e-mail checking.
Recipe 3.9: Validating Email Addresses

Problem

Your program accepts an email address as input, and you need to verify that the supplied address is valid.

Solution

Scan the email address supplied by the user, and validate it against the lexical rules set forth in RFC 822.

Discussion

RFC 822 defines the syntax for email addresses. Unfortunately, the syntax is complex, and it supports several address formats that are no longer relevant. The fortunate thing is that if anyone attempts to use one of these no-longer-relevant address formats, you can be reasonably certain they are attempting to do something they are not supposed to do.

You can use the following spc_email_isvalid( ) function to check the format of an email address. It will perform only a syntactical check and will not actually attempt to verify the authenticity of the address by attempting to deliver mail to it or by performing any DNS lookups on the domain name portion of the address.

The function only validates the actual email address and will not accept any associated data. For example, it will fail to validate "Bob Bobson <bob@bobson.com>", but it will successfully validate "bob@bobson.com". If the supplied email address is syntactically valid, spc_email_isvalid( ) will return 1; otherwise, it will return 0.
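
If the input may arrive in the "Bob Bobson <bob@bobson.com>" form, one option is to strip the display name before validating. The following is a minimal sketch, not part of the recipe: extract_addr_spec is a hypothetical helper, and it naively assumes that the address sits between the last '<' and the next '>', so it does not handle quoted display names that themselves contain angle brackets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the last '<' and the
   following '>' into out. If the input has no '<', it is copied as is.
   Returns 0 on success, -1 if '>' is missing or out is too small. */
int extract_addr_spec(const char *input, char *out, size_t outsz) {
  const char *start = strrchr(input, '<');
  const char *end;
  size_t len;

  if (start) {
    start++;                          /* skip the '<' itself */
    end = strchr(start, '>');
    if (!end) return -1;              /* unbalanced angle bracket */
  } else {
    start = input;                    /* bare address, nothing to strip */
    end = input + strlen(input);
  }

  len = (size_t)(end - start);
  if (len + 1 > outsz) return -1;     /* no room for text plus terminator */
  memcpy(out, start, len);
  out[len] = '\0';
  return 0;
}
```

The extracted string could then be passed to spc_email_isvalid( ) for the syntactic check.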

TIP: Keep in mind that almost any character is legal in an email address if it is properly quoted, so if you are passing an email address to something that may be sensitive to certain characters or character sequences (such as a command shell), you must be sure to properly escape those characters.
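
As an illustration of the escaping mentioned in the tip, here is a minimal sketch of quoting a string for a POSIX shell. It is an assumption-laden example rather than part of the recipe: shell_quote is a hypothetical helper, it targets POSIX sh single-quote rules only, and avoiding the shell entirely (for example by passing arguments directly to an exec-style call) is usually the safer design.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wraps in in single quotes for a POSIX shell and
   rewrites each embedded ' as '\'' (close quote, escaped quote, reopen).
   Returns 0 on success, -1 if out is too small. */
int shell_quote(const char *in, char *out, size_t outsz) {
  size_t n = 0;
  const char *p;

  if (outsz < 3) return -1;           /* room for '' plus terminator */
  out[n++] = '\'';
  for (p = in; *p; p++) {
    size_t need = (*p == '\'') ? 4 : 1;
    /* always keep 2 bytes free for the closing quote and terminator */
    if (n + need + 2 > outsz) return -1;
    if (*p == '\'') {
      memcpy(out + n, "'\\''", 4);
      n += 4;
    } else {
      out[n++] = *p;
    }
  }
  out[n++] = '\'';
  out[n] = '\0';
  return 0;
}
```

For example, the input o'brien@example.com is rewritten as 'o'\''brien@example.com', which the shell reads back as the original string.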

#include <string.h>

/* Returns 1 if address is syntactically valid, 0 otherwise. */
int spc_email_isvalid(const char *address) {
  int        count = 0;
  const char *c, *domain;
  static const char *rfc822_specials = "()<>@,;:\\\"[]";

  /* first we validate the name portion (name@domain) */
  for (c = address;  *c;  c++) {
    /* a quoted string may start the name or follow a '.' or a closing quote */
    if (*c == '\"' && (c == address || *(c - 1) == '.' || *(c - 1) == '\"')) {
      while (*++c) {
        if (*c == '\"') break;
        if (*c == '\\' && (*++c == ' ')) continue;
        if (*c <= ' ' || *c >= 127) return 0;
      }
      if (!*c++) return 0;            /* unterminated quoted string */
      if (*c == '@') break;
      if (*c != '.') return 0;
      continue;
    }
    if (*c == '@') break;
    if (*c <= ' ' || *c >= 127) return 0;
    if (strchr(rfc822_specials, *c)) return 0;
  }
  /* reject a missing '@' here, before stepping past the terminator below */
  if (*c != '@' || c == address || *(c - 1) == '.') return 0;

  /* next we validate the domain portion (name@domain) */
  if (!*(domain = ++c)) return 0;
  do {
    if (*c == '.') {
      if (c == domain || *(c - 1) == '.') return 0;
      count++;
    }
    if (*c <= ' ' || *c >= 127) return 0;
    if (strchr(rfc822_specials, *c)) return 0;
  } while (*++c);

  /* the domain must contain at least one dot */
  return (count >= 1);
}


http://www.oreillynet.com/network/excerpt/spcookbook_chap03/index3.html


Excerpted from 'Secure Programming Cookbook for C and C++' by John Viega, Matt Messier
http://shop.oreilly.com/product/9780596003944.do
Thank you.