Problem with my lexer

closed account (Dy7SLyTq)
I'm having trouble writing my lexer. For now I just want it to find strings and identifiers, but it keeps aborting with:

terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)


This is the function:
void Lexer::StartLex()
{
    while(!this->Source.empty())
    {
        smatch Match;

        if(regex_search(this->Source, Match, regex("\"[^\"]+\"")))
        {
            this->TokenList.push_back(Token("STRING", this->Source.substr(Match.position(0)), 1, Match.position(0)));
            this->Source = this->Source.substr(Match.position(0), this->Source.size());
        }

        else if(regex_search(this->Source, Match, regex("[a-z|A-Z|_][a-z|A-Z|_|0-9]")))
        {
            this->TokenList.push_back(Token("IDENTIFIER", this->Source.substr(Match.position(0)), 1, Match.position(0)));
            this->Source = this->Source.substr(Match.position(0), this->Source.size());
        }
    }
}
You don't need |s in character classes. "Or" is implied between the characters listed there, and a | inside the brackets is matched as a literal |.

(Retracted) Keep in mind that your first regex will have issues because + is greedy. Change it to +? to make it lazy.
http://www.cplusplus.com/reference/regex/ECMAScript/

Edit: ignore the greedy/lazy note above, I misread the regex.

-Albatross
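
To illustrate the point above about character classes: inside [...] a | is just another literal character, so the pipes in the original identifier pattern make it accept | as part of a name. A minimal sketch follows. The function names are hypothetical, and the trailing * is an addition: as posted, the pattern has no quantifier and matches exactly two characters.

```cpp
#include <regex>
#include <string>

// Returns true when `text` is, in its entirety, an identifier:
// a letter or underscore followed by any number of letters,
// digits or underscores. No '|' appears inside the brackets.
bool is_identifier(const std::string& text)
{
    static const std::regex pattern("[A-Za-z_][A-Za-z_0-9]*");
    return std::regex_match(text, pattern);
}

// Same check with the original bracket contents kept (plus the added '*'),
// to show that '|' inside a character class is treated as an ordinary character.
bool is_identifier_with_pipes(const std::string& text)
{
    static const std::regex pattern("[a-z|A-Z|_][a-z|A-Z|_|0-9]*");
    return std::regex_match(text, pattern);
}
```

With these, is_identifier("a|b") is false while is_identifier_with_pipes("a|b") is true, which is probably not what a lexer wants.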
closed account (Dy7SLyTq)
Thanks!
Edit: it's still giving the same error.
Honestly, I'm not sure what's happening. I put together a program using bits of your code, compiled it using clang, and tried a few different test cases, and I couldn't reproduce your error.

This is what I ran:
#include <iostream>
#include <string>
#include <regex>
int main ()
{
    try {
        std::cerr << "Enter a line to search:\n";
        std::string Source;
        std::getline(std::cin,Source);
        
        std::smatch Match;
        if(regex_search(Source, Match, std::regex("\"[^\"]+\"")))
            std::cerr << "Match STRING: " << Source.substr(Match.position(0)) << "\n";
        else if(regex_search(Source, Match, std::regex("[a-z|A-Z|_][a-z|A-Z|_|0-9]")))
            std::cerr << "Match IDENTIFIER: " << Source.substr(Match.position(0)) << "\n";

        std::cerr << "All is well.\n";
    }
    catch (const std::regex_error& e) {
        std::cerr << "Error code #" << e.code() << ": " << e.what() << "\n";
    }
    return 0;
}


This makes me wonder if the error isn't elsewhere.

-Albatross
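
The thread never establishes what caused the exception. One possibility, which is an assumption and not confirmed here, is a standard library with incomplete <regex> support, such as libstdc++ releases before GCC 4.9; that would fit the clang build not reproducing it.

Separately from the exception, the loop in the original StartLex has a few problems that would show up once the regexes construct: Source.substr(Match.position(0)) takes everything from the match to the end of the string rather than the match itself, Source is reassigned to a substring that still begins with the match, and nothing ends the loop when neither pattern matches. Below is a rough sketch of one way to restructure it, not the original author's design. SketchToken and lex_sketch are hypothetical names (the real Token class is not shown in the thread), and the identifier pattern has the pipes removed and a * added.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token type standing in for the Token class from the question.
struct SketchToken {
    std::string type;
    std::string text;
    std::size_t position; // offset of the lexeme in the source string
};

std::vector<SketchToken> lex_sketch(const std::string& source)
{
    // Compile each pattern once instead of on every loop iteration.
    static const std::regex string_re("\"[^\"]+\"");
    static const std::regex identifier_re("[A-Za-z_][A-Za-z_0-9]*");
    // match_continuous only accepts a match that starts exactly at the cursor.
    const auto at_cursor = std::regex_constants::match_continuous;

    std::vector<SketchToken> tokens;
    auto cursor = source.cbegin();
    const auto end = source.cend();

    while (cursor != end) {
        std::smatch match;
        const auto offset = static_cast<std::size_t>(cursor - source.cbegin());

        if (std::regex_search(cursor, end, match, string_re, at_cursor)) {
            tokens.push_back({"STRING", match.str(0), offset});
            cursor += match.length(0); // consume exactly the matched text
        } else if (std::regex_search(cursor, end, match, identifier_re, at_cursor)) {
            tokens.push_back({"IDENTIFIER", match.str(0), offset});
            cursor += match.length(0);
        } else {
            ++cursor; // skip whitespace or any character no rule recognises
        }
    }
    return tokens;
}
```

Because the cursor advances on every iteration, the loop always terminates, and each token carries only its own text.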
May I ask why you use cerr for just normal outputs instead of for only error outputs?
Old habit: I keep forgetting to use std::clog, which by default on Unixes goes to the same place as std::cerr anyway.

Generally, I use STDERR for diagnostics/errors/messages/indications of the program's status and requests for user input. Why? To keep STDOUT free for the exclusive output of processed data that the user (usually me) might be interested in manipulating further.

I suppose I could have used std::cout here. But meh.

-Albatross
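
The stdout/stderr split described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, and the streams are passed in as parameters so the same code can be tried on in-memory strings.

```cpp
#include <iostream>
#include <regex>
#include <sstream>
#include <string>

// Reads lines from `in`. The first quoted string on each line goes to `data`;
// status messages go to `log`, so the two can be redirected separately.
void extract_quoted(std::istream& in, std::ostream& data, std::ostream& log)
{
    static const std::regex string_re("\"[^\"]+\"");
    std::string line;
    while (std::getline(in, line)) {
        std::smatch match;
        if (std::regex_search(line, match, string_re)) {
            data << match.str(0) << "\n"; // processed data only
            log << "matched a string\n";  // diagnostics kept apart
        } else {
            log << "no string on this line\n";
        }
    }
}

// In-memory variant for experimenting without a terminal:
// returns only what would have been written to the data stream.
std::string extract_quoted_from(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream data;
    std::ostringstream log;
    extract_quoted(in, data, log);
    return data.str();
}
```

In a real program the call would be extract_quoted(std::cin, std::cout, std::clog); redirecting standard output to a file then captures only the extracted strings, while the status messages still reach the terminal.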