Checking strings with specific format

I'm trying to validate a command string that must follow a specific format.
The following are examples of the allowed format:
5n-3S+16ne2w
+4ne1s3n
N, S, E and W are the directions (-3w is simply 3e, and nw/ne/sw/se are also valid), and a +/- sign is optional before or after the digits. Numbers must be placed before the directions and cannot be bigger than 999.
I don't know how to check for the optional +/- symbol, how to check two-character directions such as "se/sw/ne/nw", or how to turn this function into a loop so that it can check multiple commands. All I have now is the code below, which checks for a number of up to three digits followed by a single direction. I also included a main function to test whether the bool function works.
Any help or hints would be appreciated.
#include <iostream>
#include <string>
using namespace std;

bool isWellFormedCommandString(string commands)
{
    if (commands == "")
        return false;
    if (!isdigit(commands[0]) && commands[0] == '0')
        return false;
    if (isdigit(commands[0]) && commands[0] != '0')
    {
        if (commands [1] == 'w' || commands [1] == 'e' || commands[1] == 'n' || commands[1] == 's')
        {
            return true;
        }
        else if (isdigit(commands[1]))
        {
            if (commands [2] == 'e' || commands[2] == 'w'|| commands[2] == 'n' || commands[2] == 's')
            {
                return true;
            }
            if (isdigit(commands[2]))
            {
                if (commands[3] == 'e' || commands[3] == 'w' || commands[3] == 'n' || commands[3] == 's')
                {
                    return true;
                }
                else if (isdigit(commands[3]))
                {
                    return false;
                }
            }
        }
    }
    return false;
}

int main()
{
    string d;
    cout.setf( ios::boolalpha );
    for(;;)
    {
        cout << "Enter commands: ";
        getline(cin, d);
        if (d == "quit")
            break;
        cout << "isWellFormedCommandString returns ";
        cout << isWellFormedCommandString(d) << endl;
    }
}
Line 9: The condition !isdigit(commands[0]) && commands[0] == '0' can never be true. If commands[0] is not a digit, it cannot be '0'.
...

You may want to split your task into two parts, a scanner and a parser.

The scanner takes the input string and returns the next unscanned token, where each token groups a sequence of related characters. For example, a sequence of digits yields a token called NUMBER that carries the scanned number. A direction yields a token called DIRECTION that carries the scanned one- or two-letter direction symbol. The operator symbol '+' yields a token called PLUS and '-' yields MINUS.

enum TokenName
{
    NUMBER,    // Any number
    DIRECTION, // A direction symbol
    PLUS,      // '+'
    MINUS,     // '-'
    END,       // End-of-input
    ERROR      // Invalid symbol detected
};

struct Token
{
    TokenName token;
    string    data;
};

The scanner may be an object like this:
class Scanner
{
private:
    string _command; // The command string to be tokenized
    size_t _current; // Next index into _command to be scanned
    Token _ungetToken; // Token pushed back by ungetToken() (END if there is none)

public:
    Scanner(const string &theCommand); // Initializes all attributes
    Token nextToken(); // Returns the next token and moves _current to the position
                       // immediately following it. If _ungetToken holds anything
                       // other than END, returns that token instead and resets
                       // _ungetToken to END.
    bool ungetToken(const Token &theToken); // Pushes back the last token received from nextToken().
                                            // Returns false if a token is already pushed back,
                                            // leaving it unchanged; otherwise returns true.
};
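
A minimal sketch of how such a scanner could be implemented, not a definitive version. It repeats the token types so that it compiles on its own. It assumes that direction letters are case-insensitive (the question's example contains "3S") and that a two-letter direction is n or s followed by e or w. Both are assumptions read from the question, not rules stated in it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum TokenName { NUMBER, DIRECTION, PLUS, MINUS, END, ERROR };

struct Token
{
    TokenName   token;
    std::string data;
};

class Scanner
{
private:
    std::string _command;    // The command string to be tokenized
    std::size_t _current;    // Next index into _command to be scanned
    Token       _ungetToken; // Pushed-back token, END if there is none

    // Lower-case a character without passing a negative value to tolower()
    static char lower(char c)
    {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

public:
    explicit Scanner(const std::string &theCommand)
        : _command(theCommand), _current(0), _ungetToken{END, ""} {}

    Token nextToken()
    {
        // A pushed-back token takes priority over fresh input
        if (_ungetToken.token != END)
        {
            Token t = _ungetToken;
            _ungetToken = Token{END, ""};
            return t;
        }
        if (_current >= _command.size())
            return Token{END, ""};

        const char c = _command[_current];
        if (c == '+') { ++_current; return Token{PLUS, "+"}; }
        if (c == '-') { ++_current; return Token{MINUS, "-"}; }

        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            // Collect the whole run of digits into one NUMBER token
            std::string digits;
            while (_current < _command.size() &&
                   std::isdigit(static_cast<unsigned char>(_command[_current])))
                digits += _command[_current++];
            return Token{NUMBER, digits};
        }

        const char first = lower(c);
        if (first == 'n' || first == 's' || first == 'e' || first == 'w')
        {
            std::string dir(1, first);
            ++_current;
            // Assumption: only n/s may be followed by e/w (ne, nw, se, sw)
            if ((first == 'n' || first == 's') && _current < _command.size())
            {
                const char second = lower(_command[_current]);
                if (second == 'e' || second == 'w')
                {
                    dir += second;
                    ++_current;
                }
            }
            return Token{DIRECTION, dir};
        }

        // Anything else is invalid; _current stays on the offending character
        return Token{ERROR, std::string(1, c)};
    }

    bool ungetToken(const Token &theToken)
    {
        if (_ungetToken.token != END)
            return false; // A token is already pushed back
        _ungetToken = theToken;
        return true;
    }
};
```

With this sketch, scanning "5n-3S" yields NUMBER "5", DIRECTION "n", MINUS, NUMBER "3", DIRECTION "s" and then END. Direction data is stored in lower case, so the parser does not need to care about case.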

The parser asks the scanner for the next token and checks that the sequence of tokens follows a valid syntax. It may return a syntax tree or process the validated data in any other way. It loops over all tokens until nextToken() returns ERROR or END.
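
The parsing loop could look roughly like the sketch below. To keep it short and compilable on its own, it repeats the token types and reads from a vector of tokens instead of calling the scanner directly. The function name isWellFormedTokenSequence is made up for this illustration. It assumes each command is an optional PLUS or MINUS, then a NUMBER, then a DIRECTION, and that "not bigger than 999" can be checked as "at most three digits". The thread does not say how leading zeros should be handled, so they are not checked here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum TokenName { NUMBER, DIRECTION, PLUS, MINUS, END, ERROR };

struct Token
{
    TokenName   token;
    std::string data;
};

// Hypothetical helper: returns true if the tokens form one or more commands
// of the shape [PLUS | MINUS] NUMBER DIRECTION, followed by END.
bool isWellFormedTokenSequence(const std::vector<Token> &tokens)
{
    std::size_t i = 0;            // Index of the token being examined
    std::size_t commandCount = 0; // Number of complete commands seen

    // Running off the end of the vector is treated like END
    auto peek = [&]() -> TokenName {
        return i < tokens.size() ? tokens[i].token : END;
    };

    while (peek() != END)
    {
        // Optional sign in front of the number (assumed position)
        if (peek() == PLUS || peek() == MINUS)
            ++i;

        // A number is mandatory; ERROR tokens are rejected here as well
        if (peek() != NUMBER)
            return false;
        // Assumption: at most three digits keeps the value at 999 or below
        if (tokens[i].data.empty() || tokens[i].data.size() > 3)
            return false;
        ++i;

        // A direction must follow the number
        if (peek() != DIRECTION)
            return false;
        ++i;

        ++commandCount;
    }
    return commandCount > 0; // An empty command string is not well formed
}
```

The same loop works with a scanner object by replacing the vector lookups with calls to nextToken(). Because every decision here depends only on the current token, ungetToken() is not needed for this particular grammar.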


If you're familiar with scanner and parser generators, you may want to use "flex" and "bison". flex takes regular expressions describing your tokens and generates a scanner, usually in C. bison takes a grammar describing your syntax and generates a parser in C. This parser uses the flex-generated scanner to do its job.