Regex

Due to some stuff I had to do at work, I finally had to learn regex and holy crap. My mind is blown. I wish I had taken the time to learn regex a long time ago, it is so amazing :O If you have not learned regex, I highly encourage all of you to go do so right this second!
I have a love/hate relationship with regex. I love what it does, but I hate having to use it.

Ever tried to follow someone else's regex? Without any comments? Ugh.
If you have not learned regex, I highly encourage all of you to go do so right this second!


Well that's my plans for the evening drastically and immediately changed...
I have not had to read someone else's regex yet; I imagine that would be terrible.

I had to use it to scan a large amount of information for MAC addresses. Regex made it sooo easy.
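
For anyone curious what that kind of scan can look like, here is a minimal Python sketch. It assumes the common notation of six hex pairs separated by ':' or '-'. The names MAC_RE and find_macs are made up for illustration, and other notations (such as the dotted form) are not covered.

```python
import re

# Five "hex pair + separator" groups followed by a final hex pair.
# The group is non-capturing so findall() returns whole matches.
# Assumption: mixed separators (e.g. 00:1A-2B...) are accepted as well.
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b")


def find_macs(text):
    """Return every substring of text that looks like a MAC address."""
    return MAC_RE.findall(text)


if __name__ == "__main__":
    sample = "eth0 00:1A:2B:3C:4D:5E up, wlan0 aa-bb-cc-dd-ee-ff down"
    for mac in find_macs(sample):
        print(mac)
```

Compiling the pattern once and reusing it keeps things simple when the same scan runs over a lot of text.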
closed account (Dy7SLyTq)
regex is amazing if you can get it to parse right. i was using it to generate lexemes in a lexer and my hair has just started to grow back

@ne: is the answer what the top and side can both generate?
ResidentBiscuit wrote:
I have not had to read someone else's regex yet; I imagine that would be terrible.


Do yourself (and everyone who will read your code in the future) a favor and heavily comment every regex you make. I don't mean like a quick one-liner... I mean like a paragraph.

Example from something I worked on in the past (Python):

        #begin extracting individual tags
        # multiline regex looks for "@tag:value\n@nexttag:"
        #   'tag' is one or more \w identifiers
        #   'value' is any string including whitespace and @, : characters
        #   'nexttag' is the following tag (hence why above dummy was appended)
        #   'tag' and 'value' are put in our dictionary.  whitespace before and
        #     after 'value' is dropped
        while True:
            m = re.search(r"@([A-Za-z0-9_]+):\s*(.+?)\s*\n\s*@[A-Za-z0-9_]+:", src, re.M | re.S)
            ...
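
The loop body is elided above, so the following is only a guess at how such a loop could be completed, not the original code. It is a self-contained sketch that reuses the same pattern; the helper name extract_tags, the dummy tag "@end:", and the way the search position is advanced are all assumptions.

```python
import re

# Same pattern as in the post: "@tag:value" followed by a newline and the next "@tag:".
TAG_RE = re.compile(r"@([A-Za-z0-9_]+):\s*(.+?)\s*\n\s*@[A-Za-z0-9_]+:", re.M | re.S)


def extract_tags(src):
    """Hypothetical helper: collect every @tag:value pair into a dict."""
    # Append a dummy tag so the last real tag has a "next tag" to stop at.
    src = src + "\n@end:"
    tags = {}
    pos = 0
    while True:
        m = TAG_RE.search(src, pos)
        if m is None:
            break
        tags[m.group(1)] = m.group(2)
        # Resume right after the value so the following tag is matched next.
        pos = m.end(2)
    return tags


if __name__ == "__main__":
    print(extract_tags("@title: Hello World\n@tags: a, b\n@author: me@example.com"))
```

Because a new tag is only recognized after a newline, a value such as me@example.com can contain @ without being cut short.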



Another one:

    # regex to look for @tags:[...]publish[...] followed by a ''', """, or @
    #  [...] can be any characters other than @ ' or "
    regex = re.compile( "@tags:[^@\'\"]*publish[^@\'\"]*((\'{3})|(\"{3})|@)", re.IGNORECASE )
    #                     tags   [...]  publish  [...]   '''   or  """  or @ 




regex without comments is hell on Earth.
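
One way to keep the comments attached to the pattern itself in Python is the re.VERBOSE flag, which ignores whitespace and #-comments inside the pattern. Below is a sketch of the @tags regex above rewritten that way; the name PUBLISH_RE is made up, and the two capturing groups around the triple quotes were dropped in favor of one non-capturing group on the assumption that the captures are not needed.

```python
import re

PUBLISH_RE = re.compile(
    r"""
    @tags:            # the literal tag name
    [^@'"]*           # any run of characters other than @, single or double quote
    publish           # the keyword being looked for
    [^@'"]*           # again, any run of characters other than @ or a quote
    (?:'{3}|"{3}|@)   # three single quotes, three double quotes, or an @
    """,
    re.IGNORECASE | re.VERBOSE,
)

if __name__ == "__main__":
    print(bool(PUBLISH_RE.search("@tags: draft, Publish @author: x")))
```

Note that in verbose mode a literal space or # in the pattern has to be escaped or placed inside a character class.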
Another problem with regexes: everyone and their mom has their own regex implementation. Perl regexes are the de facto standard, so you can usually expect them to be available, but some programs (e.g. Visual Studio) use their own syntax. At the same time, not every feature of Perl regexes may be available.