Searching dynamic memory for a string literal.

I have a project where I am supposed to write a program that generates the object code for the lines of an assembly program (in the form of a .txt file). The program worked until I added the following function:
void assemblytoopcode::machinecode()    //This function only gets called once!
{
    int y = 0;
    cout << "This is what is in the machine-code container: "<<endl;
    //for each symbol in the instructions field
    for (int x = 0; x < varstart; x++)
    {
        //search the array of data structures for it
        //Remember that the first character of a symbol can be a '+'. You should remove it and keep track of it.
        while (opcode[x].compare(str[y].symbol) != 0)
        {
            y++;
        }   //end while
        objectcode.push_back(str[y].value * 64);  //if found, fetch its value and load it, multiplied by 64
        y = 0;  //Reset control variable value
        //check the first character of the operand
        if (operands[x].at(0) == '@')   //if it is a '@'
        {
            objectcode.back() += 32;    //add 32 to fetched value
            //Take substring to be the rest of the operand.
            //Find it.
            //disp is calculated program-relative style
        }   //end if
        else if (operands[x].at(0) == '#')  //else if it is a '#'
        {
            objectcode.back() += 16;    //add 16 to fetched value
            //Take it literally!
            if (operands[x].find_first_not_of("1234567890") == operands[x].npos)  //First check it to see if it just a number
            {
                stringstream(operands[x].substr(1,operands[x].npos))>>disp;
            }   //end if
        }   //end else if
        //else, we assume that it is either index-relative or the operand is just a plain old label reference; this control structure checks
        //if it is so...
        if (operands[x].find_first_not_of("@#1234567890") != operands[x].npos)
        {
            pos = operands[x].find_first_of(',');
            tempstr = operands[x].substr(operands[x].find_first_not_of("@#"),pos);    //take the substring up until the comma (if there is one)
            if (pos != operands[x].npos)    //if there was a comma in the operand field, check the stuff after the comma for 'X'
            {
                //if the other part is a 'X', add 8 to the value
                if ((operands[x].substr(pos+2,2) == " X") || (operands[x].substr(pos+1,1) != "X")) objectcode.back() += 8;
                else    //else report error (for now)
                {
                    cout << "ERROR: Index-relative operands must be in the form of: REFERENCE, X" << endl;
                }   //end else
            }   //end if
        }   //end if
        //search for the corresponding label
        while((tempstr != labels[y]) && (y != numlines))    //loop searching for the label matching its reference (if there is one)...
        {
            y++;
        }   //end while
        if (y < numlines)
        {
            disp = locations[y] - (locations[x] + 3);  //if found, get its address (we'll use this approach for right now)....
            if (disp < 0) disp = 4096 + disp;   //making it 12-bit two's complement format if it is negative
        }   //end if
        if (disp >= 4096)    //If the disp goes beyond the 12 bits allocated for it...
        {
            //if the first character of the symbol isn't '+'
            if (opcode[x].at(0) != '+')
            {
                //find base
                //recalculate disp
                //add 4 to fetched symbol value
            }   //end inner if
            else
            {
                objectcode.back() = (objectcode.back()+1)*1048576+disp; //it is program relative, so multiply by 1048576 and add disp
            }   //end else
        }   //end outer if
        else    //What to do if it is program-relative (and no '+' has been found)
        {
            objectcode.back() = (objectcode.back() + 2) * 4096 + disp;
        }   //end else
        cout << objectcode.back()<<endl;
    }   //end for
}   //end machinecode 
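
A side note on the displacement arithmetic, as a minimal sketch rather than part of the original program: for a negative disp, the 12-bit two's-complement encoding is disp + 4096, which is the same as keeping the low 12 bits. The helper name encode_disp12 is hypothetical, and the range check assumes the usual program-relative range of -2048 to 2047.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original program): encode a signed
// program-relative displacement into a 12-bit two's-complement field.
// Assumption: the caller only passes values in the range -2048..2047.
std::uint32_t encode_disp12(int disp)
{
    assert(disp >= -2048 && disp <= 2047);
    // Converting to unsigned wraps modulo 2^32; the mask keeps the low
    // 12 bits. For a negative disp this equals disp + 4096.
    return static_cast<std::uint32_t>(disp) & 0xFFFu;
}
```

With this, a displacement of -5 would be stored as 4091 (0xFFB), which still fits in the 12-bit field.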

I ran the debugger, and the last line of my own code on the call stack is this one:
while (opcode[x].compare(str[y].symbol) != 0)
I have tried two variations of that line. I think the one I need is:
while (opcode[x].substr(opcode.find_first_not_of("#")) != str[y].symbol)
Note that str[] is an array of structs. (I can post a link to the code if necessary; the program is rather large. Or I can post the function that generates the struct.) I think I should do that (it works fine!):
void assemblytoopcode::fillstruct() //This function is only getting called once, and its purpose is to fill the data structure that is the symboltable. This array of structs will have EVERY symbol and its value in it, to be
//looked up...
{
    int tempx = 0;
    string line;
    size_t pos;
    ifstream myfile ("symbol definitions.txt");

    if (myfile.is_open())
    {
        while (myfile.good())
        {
            getline(myfile, line);
            tempx++;  //Opening the file once to obtain the size of the dynamic array....
        }   //end while
        myfile.close();
        myfile.clear();
        myfile.open("symbol definitions.txt");
        str = new (nothrow) symboltable [tempx];
        for (int y = 0; y < tempx; y++)
        {
            getline(myfile, line);
            pos = line.find_first_of('\t');
            str[y].symbol = line.substr(0,pos+1);
            line.erase(0,pos+1);
            stringstream(line)>>hex>>str[y].value;
        }   //end for
    }   //end if
    else cout << "Uh-oh. Something went wrong!" << endl;
}   //end fillstruct 

Program, why you crash now?
Program, why you crash now?


I think the more likely question is "Programmer, why you make me crash now?"

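In the while loop that begins on line 10 of machinecode() (while (opcode[x].compare(str[y].symbol) != 0)), why is there no cap on the value y can take?

Why is y reused for a different purpose on line 50 (the while loop that searches labels)?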

Why is y not reset to 0 on the next iteration of the for loop?
Good catch. I reset the value, and it still crashes. (How did I forget to reset it again?) I thought I didn't need a cap because I tried this logic in a *much* simpler program, and the loop knew to exit when the index became equal to the length of the array. However, even if I use a different variable for comparing the operand to the labels (and reset that variable!) and give it a limit, it crashes. I wonder if it is something simple again....

The crash is now at this while loop:
while((tempstr != labels[w]) && (w < numlines)) //loop searching for the label matching its reference (if there is one)...
Here is a link to my full code: http://megafileupload.com/en/file/374946/assembler-main-3--cpp.html
It is worth noting that I resolved the logic error in the function that creates the dynamic structure (how could I have been so careless?). Also note that while((tempstr != labels[y]) && (y != numlines)) became the code snippet above.
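
For readers following along, one visible difference between the two table-filling versions posted in this thread is substr(0, pos + 1), which keeps the tab in the stored symbol, versus substr(0, pos), which does not. A symbol that still carries the tab will probably never compare equal to an opcode, so an uncapped search would run off the end of the array. Below is a minimal sketch of a one-pass reader, assuming each line is "SYMBOL<tab>HEXVALUE". The names SymbolEntry and read_symbols are hypothetical, and std::vector is swapped in for the hand-sized dynamic array.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type mirroring the thread's symboltable struct.
struct SymbolEntry
{
    std::string symbol;
    int value;
};

// Reads "SYMBOL<tab>HEXVALUE" lines from any input stream in one pass.
// Looping on getline itself avoids counting a phantom line at end of file.
std::vector<SymbolEntry> read_symbols(std::istream& in)
{
    std::vector<SymbolEntry> table;
    std::string line;
    while (std::getline(in, line))
    {
        std::string::size_type pos = line.find('\t');
        if (pos == std::string::npos) continue;   // blank or malformed line
        SymbolEntry entry;
        entry.symbol = line.substr(0, pos);       // excludes the tab itself
        std::istringstream field(line.substr(pos + 1));
        if (!(field >> std::hex >> entry.value)) continue;  // no hex value
        // The stored symbol must never contain the delimiter.
        assert(entry.symbol.find('\t') == std::string::npos);
        table.push_back(entry);
    }
    return table;
}
```

Passing an std::ifstream opened on "symbol definitions.txt" would work the same way, since std::ifstream is also an std::istream.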
I think I found the error. I tried a much simpler program, and it crashes when I intentionally enter a value that is not in the array of structs. Here is some code:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

struct symboltable
{
    string symbol;
    int value;
};   //end symboltable
symboltable * symtab;
int findvalue(string yourstring)
{
    int y = 0;
    while ((yourstring != symtab[y].symbol) && (y < 59))
    {
        y++;
    }   //end while
    if (y != 59) return symtab[y].value;
    else cout << "An error occurred. Take measures"<<endl;
}   //end findvalue
int main()
{
    int x = 0;
    size_t pos = 0;
    string * str;
    string command;
    string tempstr;

    ifstream myfile ("symbol definitions.txt");
    bool dothis;
    if (myfile.is_open())
    {
        while (myfile.good())
        {
            getline(myfile, tempstr);
            x++;  //Opening the file once to obtain the size of the dynamic array....
        }   //end while
        myfile.close();
        dothis = true;
    }   //end if

    else
    {
        cout << "Uh-oh. Something went wrong!" << endl;
        dothis = false;
    }   //end else
    if (dothis)
    {
        myfile.clear();
        myfile.open("symbol definitions.txt");
        cout << "x == " << x << endl;
        str = new (nothrow) string [x];
        symtab = new (nothrow) symboltable [x];
        cout << "Your data structure looks like this: \n"<<endl;
        if ((str != 0) && (symtab != 0))
        {
            for (int w = 0; w < x; w++)
            {
                getline(myfile, str[w]);
                //Parsing element of string array into the data structure
                pos = str[w].find_first_of('\t');
                symtab[w].symbol = str[w].substr(0, pos);
                str[w].erase(0, pos+1);
                stringstream(str[w])>>hex>>symtab[w].value;
                cout << symtab[w].symbol << '\t' << symtab[w].value<<endl;
            }   //end for
            myfile.close();
        }   //end inner if
        else cout << "ERROR: Cannot allocate memory."<<endl;
        cout << "Please enter an assembler command for which I should fetch the value of (in all caps): ";
        cin >> command;
        cout << "The integer value for what you have entered is: "<<findvalue(command)<<endl;
    }   //end if
	return 0;
}	//end main 

This is the same program that the struct-making code in the big program came from. Why does the program crash if the string is not in the array of data structures? How do I prevent such things from happening? Better yet, how do I search a dynamic array of strings?

The other idea I had was to allocate some more memory to store the names of each register used. I could do this in the simulated first pass. Then, I could just find out if (operands[x].find_first_of(',') == 1), and if it is, compare the operand against the dynamic array holding the register names. This would work because, in the SIC/XE machine, register names are all one character. I just wish C++ could handle the possibility that a string is not equal to any of its like-typed members of a dynamic array, or could first check the sizes of the string to be matched and the one in the array, and then do an integral comparison between the two. I wonder if using operands[x] = operands[x].c_str() and doing this all C-style (which is pretty much what I was getting ready to try...) would work.
Sorry for the spam posts, but I now have some code that does what I was talking about without crashing. Anyone experiencing my dilemma should check out this short program:
#include <iostream>
#include <string>
#include <new>

using namespace std;

int main()
{
    bool issame = false;
    string * a;
    string b = "abcdefghij";
    string c;
    int d = 0;
    int f,g;
    a = new (nothrow) string [10];
    if (a != 0)
    {
        for (int x = 0; x < 10; x++) a[x] = b.at(x);
        cout << "Please enter a letter that is one of the first ten of the alphabet: ";
        getline(cin, c);
        //Here we are attempting a string comparison against members of dynamically-allocated memory. The straightforward way will cause
        //the program to crash, so we need a workaround.
        //The first criteria should be for the SIZE of the string in the dynamic array, and the size of the string the user entered. We
        //can stop any comparison if those two don't match. We also know that integer comparison can work. Maybe if the two strings are
        //of the same size, we could make a dynamic integer array for each of them, and compare their integer members. Worth a shot.

        while ((!(issame)) && (d < 10))
        {
            f = c.length();
            g = a[d].length();
            if (f != g) d++;
            else
            {
                //Do a member-by-member integer comparison of the two same-sized strings.
                for (int y = 0; y < c.length(); y++)
                {
                    if (c.at(y) != a[d].at(y))
                    {
                        d++;
                        break;
                    }   //end if
                    if ((y == c.length()) && (c.at(y) != a[d].at(y))) issame = true;
                }   //end for
            }   //end else
        }   //end while
        cout << "d == "<<d<<endl;
    }   //end if
    else
    {
        cout << "Uh-oh. Memory cannot be allocated."<<endl;
    }   //end else
    if (d < 10)
    {
        d = 10;
        cout << "Can you see me?"<<endl;
    }
    return 0;
}
Here we are attempting a string comparison against members of dynamically-allocated memory. The straightforward way will cause the program to crash, so we need a workaround.


Perhaps you could specify what you think the straightforward way is.

Something like this (in my original code):
while ((tempstr != labels[y]) && (y < numlines))
{
     y++;
} //end while 
Perhaps you could specify it in terms of the code you just posted.
OK.
while ((c != a[d]) && (d < 10))
{
      d++;
} //end while 
I'm gonna go out on a limb and say that doesn't cause a crash in your simplified code.

Could be this indicates your problem is elsewhere.


No. I ran it through the debugger. It crashes at the start of that while loop (if the string is not there at all). Now the problem is, with my workaround, getting the logic right (and not hanging when the string is found). I could use a low-level goto line; to fix what fault exists with it, or think more about how the break; works. That low-level jump statement is one that I would have to be careful with, because it jumps regardless of condition. (I love how learning assembly language helps with further understanding of more powerful languages such as C++!) Oh, and I am using the standard GCC compiler and Code::Blocks IDE.
I'm pretty sure I've mentioned this to you before. The point at which a program crashes is not necessarily the point at which something first goes wrong.

Your workaround isn't solving a problem. At best, it's masking one.

Then, tell me! What is the problem with this:
#include <iostream>
#include <string>
#include <new>

using namespace std;

int main()
{
    bool issame = false;
    string * a;
    string b = "abcdefghij";
    string c;
    int d = 0;
    int f,g;
    a = new (nothrow) string [10];
    if (a != 0)
    {
        for (int x = 0; x < 10; x++) a[x] = b.at(x);
        cout << "Please enter a letter that is one of the first ten of the alphabet: ";
        getline(cin, c);
        while ((c != a[d]) && (d < 10))
        {
            d++;
        }   //end while
        if (d < 10) cout << "Your string was found at a["<<d<<"]."<<endl;
        else cout << "Your string was not found in dynamic array a."<<endl;
    }   //end if
    else
    {
        cout << "Uh-oh. Memory cannot be allocated."<<endl;
    }   //end else
    delete[] a;
    return 0;
}
I'm gonna go out on a limb and say that doesn't cause a crash in your simplified code.

Could be this indicates your problem is elsewhere.


There's nothing wrong with that code.

[edit: Oops. As Cubbi pointed out, you are using d before you check whether it's a valid value. I think my mind is still half on that mess of a workaround.]
int d = 0;
...
a = new (nothrow) string [10];
...
while ((c != a[d]) && (d < 10))
{
        d++;
}   //end while 


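When c is not in the array, d reaches 10 and the left-hand test c != a[d] runs before d < 10 is checked. That's just reading from the 11th string in an array of 10 strings. Incidentally, why aren't you using C++ containers / algorithms? With a container, that loop is really a call to find(a.begin(), a.end(), c); with your dynamic array it is find(a, a + 10, c).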
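
To make that concrete, here is a minimal sketch of the std::find approach applied to a plain dynamic array rather than a container. Raw pointers work as iterators, so no vector is required. The helper name find_index is hypothetical; it returns count when the key is absent, so the caller can test the result against the array size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first element equal to key,
// or count if no element matches.
std::size_t find_index(const std::string* data, std::size_t count,
                       const std::string& key)
{
    assert(data != nullptr || count == 0);
    const std::string* end = data + count;     // one past the last element
    // std::find never dereferences end, so nothing past the array is read.
    const std::string* hit = std::find(data, end, key);
    return static_cast<std::size_t>(hit - data);
}
```

For the ten-letter example above, find_index(a, 10, c) < 10 would mean the string was found at that index.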
I guess that this is true because it checks two conditions even if the second is false. (How did I not pay attention to that?) Oh, and I didn't know that algorithm existed, or if it was exclusively for vectors. (I didn't want to use vectors because the algorithmic complexity of pushing n elements is O(n^2). Also, this is for systems programming class, so I don't think they will let me.) But, since I am using dynamic arrays, I know the size of them, and so this should work just as well. Thanks for that knowledge. This alone has been a good teachable moment.
I have modified the straightforward approach to:
while (d < 10)
{
     if (c != a[d]) d++;
}   //end while 

Now I have two ways to do it! Thanks, guys! idk how far I would have got in this project without your patience and wisdom! (Note to self, think more beautifully...)
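
One caution for anyone copying the loop just posted: d only advances when the strings differ, so once a match is found the condition d < 10 stays true and the loop never exits. A minimal sketch of a hand-written version that checks the bound first and stops on a match is below. The function name index_of is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: index of the first element of a[0..n) equal to c,
// or n if there is no match.
int index_of(const std::string* a, int n, const std::string& c)
{
    assert(n >= 0);
    int d = 0;
    while (d < n)                 // bound is tested before a[d] is read
    {
        if (a[d] == c) break;     // found: leave the loop with d at the match
        ++d;                      // not found yet: try the next element
    }
    return d;
}
```

The same effect comes from writing the original condition with its operands swapped, (d < 10) && (c != a[d]), because && does not evaluate its right-hand side when the left-hand side is false.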
I didn't want to use vectors because the algorithmic complexity of pushing n elements is O(n^2)

The complexity is exactly the same as for an array: push_back is amortized constant time, so pushing n elements is O(n).
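
A small experiment can illustrate this. The sketch below pushes n elements and counts how often the vector's capacity changes, which is a rough proxy for how often it reallocates. The function name count_reallocations is hypothetical, and the exact count depends on the standard library's growth factor, so no particular number is promised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical experiment: push n ints and count capacity changes.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity)   // storage grew on this push
        {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

Because capacity grows geometrically, the count grows roughly with the logarithm of n rather than with n itself. When the final size is known up front, as it is after counting the lines of the file, calling v.reserve(n) before the loop avoids reallocation during the pushes altogether.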