Grammars and parsing

Hey guys,

I'm reading Bjarne's Programming: Principles and Practice and decided to attempt an exercise: write a program that checks whether a sentence is correct according to a basic English grammar:

Sentence:
Noun Verb
Sentence Conjunction Sentence

Conjunction:
"and"
"or"
"but"

Noun:
"birds"
"fish"
"C++"

Verb:
"rules"
"fly"
"swim"

The program only accepts the words above, but I could add more words at a later stage. Some accepted sentences may not make sense, because the program does not check past, present and future tenses (or subject-verb agreement), so "C++ fly and birds rules" will be grammatically correct to the program. I found this one quite interesting and followed the approach Bjarne used to make a calculator that does multiplication and division before addition and subtraction.

If anybody could give me some tips on how I could improve the grammar, or point me to sources they found helpful regarding grammars and parsing, that would be great.

thanks



#include <iostream>
#include <vector>

using namespace std;

vector<string> allowedWords;
vector<string> words;
int pos = 0;

void makeAllowedWords(){

    // nouns
    allowedWords.push_back("C++");
    allowedWords.push_back("birds");
    allowedWords.push_back("fish");

    //verbs
    allowedWords.push_back("rules");
    allowedWords.push_back("fly");
    allowedWords.push_back("swim");

    //conjunction
    allowedWords.push_back("or");
    allowedWords.push_back("and");
    allowedWords.push_back("but");

    // punctuation
    allowedWords.push_back(".");
}

void enterSentence()
{
    string sentence;

    while(true)
    {
        cin >> sentence;

        if(sentence == "*")
            break;

        words.push_back(sentence);
    }
}

bool checkWords(){

    bool wordFound = false;

    for(int i = 0; i < words.size(); i++){

        wordFound = false;

        for(int j = 0; j < allowedWords.size(); j++){

            if(allowedWords.at(j) == words.at(i)){

                wordFound = true;
                break;
            }
        }
        if(!wordFound){

            return false;
        }
    }
    return true;
}

bool correctGrammar = true;

int sentence();
int noun();

int conjunction()
{

    string s = words.at(pos);
    pos++;
    int correct = 0;

    if(correct == -1){

        return -1;
    }

    if(s == "." && s != words.at(1)){
        return -1;
    }


    if(s == "and" || s == "or" || s == "but"){


        if(words.at(pos) == "and" || words.at(pos) == "or" || words.at(pos) == "but"){

            correctGrammar = false;
        }

        correct = noun();

        if(correct == 0){
            pos--;
            correct = sentence();
        }
        if(correct == 1){
            return 1;
        }
        if(correct == -1){
            return -1;
        }
    }
    return 0;
}

int verb()
{
  string s = words.at(pos);
  pos++;
  int correct = 0;

  if(s == "rules" || s == "fly" || s == "swim"){

     correct = conjunction();
     if(correct == 1){

        return 1;
     }
  }
  if(correct == -1){
     return -1;
  }
   pos--;
   return 0;
}

int noun()
{

    string s = words.at(pos);
    pos++;
    int correct = 0;

    if(s == "."){
        return -1;
    }

    if((s.compare("birds") == 0)|| (s.compare("fish") == 0 ) || (s.compare("C++")==0))
    {
        correct = conjunction();
        if(correct == -1){
            return -1;
        }

        if(correct == 0)
        {
            pos--;
            correct = verb();
        }
        if(correct == 0){
            pos--;
        }
        if(correct == 1){
            return 1;
        }
    }
    if(correct == -1){
        return -1;
    }
    return 0;
}

int sentence()
{
    int correct = noun();

    if(correct == 0){
        correctGrammar = false;
    }
    return correct;
}

void promptUser(){

    cout << "enter a sentence" << endl;
    cout << "sentences must end with a space and . press shift * and enter key to enter the sentence" << endl;
}

int main()
{
    makeAllowedWords();

    while(true)
    {
        correctGrammar = true;
        pos = 0;

        promptUser();
        enterSentence();

        if(!checkWords())
        {
            cout << "line contained a word which is not allowed" << endl;
            return 1;
        }

        sentence();
        if(correctGrammar)
        {
            cout << "word is a proper sentence" << endl;
        }
        else
        {
            cout << "may not be grammatically correct" << endl;
        }
    }
}
lex and yacc (or flex and bison) are informative and interesting to learn.
The Lex & Yacc Page | http://dinosaur.compilertools.net/

Compilers: Principles, Techniques, and Tools ("The Dragon Book")
https://www.amazon.ca/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811
Hello adam201,

Some thoughts about your code as I went through it.

For now, and more so in the future, do yourself a favor: when you refer to something in Bjarne's book, give the page number and anything else that helps people know what you are working from. Many people have his book, but do not want to spend time trying to find what you are working on.

You left out the header file "string".

You could define your vector as:
#include <iostream>
#include <string>
#include <vector>

//using namespace std;  // <--- Best not to use.

// <--- These worked fine where they were, but forward declarations are better placed near the top of the file, before the function definitions.
int sentence();
int noun();

const std::vector<std::string> ALLOWEDWORDS
{
	"C++", "birds", "fish", // <--- Nouns.
	"rules", "fly", "swim", // <--- Verbs.
	"or", "and", "but",     // <--- Conjunctions.
	"."                     // <--- Punctuation.
};

// <--- A 2D vector where each row is a different type. Just a thought and some of the code would have to be adjusted.
//const std::vector<std::vector<std::string>> ALLOWEDWORDS
//{
//	{"C++", "birds", "fish"}, // <--- Nouns.
//	{"rules", "fly", "swim"}, // <--- Verbs.
//	{"or", "and", "but"},     // <--- Conjunctions.
//	{"."}                     // <--- Punctuation.
//};

// <--- Try to avoid using global variables like these.
std::vector<std::string> words;
int pos = 0;

The 2D vector is just an idea.

By defining the vector this way you can eliminate the need for the function "makeAllowedWords()" to populate the vector.

In your for loops at lines 52 and 56 (in checkWords()) you have loops like for(int i = 0; i < words.size(); i++). This is likely to generate a warning because "i" and "words.size()" are two different types. If you look up ".size()" for a string or a vector you will find that it returns a "size_type", which for the standard containers is normally the same as "std::size_t", an unsigned integer type (often "unsigned long" rather than "unsigned int" on 64-bit systems). So comparing a signed "int" to an unsigned value is a slight problem, but not enough to stop the compile or keep the program from running.

The last thing I find confusing is:
std::cout << "enter a sentence" << std::endl;
std::cout << "sentences must end with a space and . press shift * and enter key to enter the sentence" << std::endl;

Mostly the second line. I am thinking that the period should be enough to end the while loop, without needing the "*". Just my opinion.

Hope that helps,

Andy
Thanks guys,

Dutch, I will check out those links :)

Andy, thanks for the tips. Yeah, I agree: I think just the . would be sufficient, without having to add a * to denote the end of the sentence.

As for using namespace std, I feel that for trivial programs or while learning the language it isn't too important, but then again you could always argue that you shouldn't get into bad habits.

So far I've never run into any problems with naming conflicts, but that will probably change as I write more complex programs.

The material is on page 191 (chapter 6), and the exercise is question 6 on page 217.

thanks