Looking for help with a binary file.

Hello.
My task is to create a program that puts all the reserved C++ words (max length 30) into an ordered table. Then I need to write a function which, using binary search, checks whether an input string (max length 30) is one of these C++ reserved words. The table must be a direct access file. The C++ reserved words must be taken from a text file. Also, I am not allowed to copy the content of the file into operative memory.

So, this is how far I've got.


#include <fstream>
#include <iostream>
#include <string>
using namespace std;


int main ()
{
    fstream fin;
    ofstream fout ("binary.txt", ios::binary); // This is the binary file where I'm going to store the reserved words.
    char c;
    int size = 0; // must start at zero before counting
    fin.open("reserved.txt", ios::in); // This is the file the reserved words are taken from.
    fin.get (c);
    while(fin)
    {
        fout.write (&c, sizeof(char)); // write exactly one byte; sizeof(int) would read past the end of c
        if(c == '\n') cout << endl; // Just prints out the reserved words.
        else cout << c;
        size++; // counts the characters copied so far
        fin.get(c);
    }
    cout <<"Content of the file is inside binary file now."<< endl;
    string word;
    cout <<" \n Enter the word to compare with the binary file."<< endl;
    cin >> word;
    cout << word<< endl;
    fin.close();
    fout.close();
    return 0;
}



Basically, what I need now is to do a bubble sort to sort these words, and then I have to search the binary file for the input word using binary search to find out if it's one of the C++ reserved words.

The problem is, to use binary search or bubble sort there should be something like an array or indexes so that I can do the searching. I was told that a text file with a fixed length can be used as an array. In theory I do understand that, but I've no idea how it's done. So I'm looking for an example of how to tell that this is the first word, this is the 2nd word and so on, so that I can perform the binary search.

Also, the way I wrote it, is "reserved.txt" a direct access file?

I am not looking for ready code, I just need some example of how to work with the binary file as if it was an array.

Thank you.
Bubble sort: http://www.cplusplus.com/faq/sequences/sequencing/sort-algorithms/bubble-sort/

Binary Search: http://www.cplusplus.com/reference/algorithm/binary_search/

I believe reserved.txt in this case is a direct access file.
Table must be direct access file.

I think I need clarification on the statement.

Taking you literally I would have to use system specific calls, or the Boost Filesystem (which wraps these calls.) Either way, it's not what I think of as beginner code.

Edit: noticed another comment:

Do you mean a binary file which is laid out so it can be handled like an array after loading? That's possible, but not necessarily the best way to go.

I was told that a text file with a fixed lenght can be used as an array.

If by fixed length you mean a fixed line length, then it is possible. But it's not something I would do. And I can't think of a clean solution, either.

You could use a char* array to point into the file data, instead. But even then it's not especially clean.

Also i am not allowed to copy content of the file into operative memory.

If you took this literally you wouldn't be able to do anything with your file. You've got to copy at least some of it into "operative memory"!

And is this the text file or the keyword file?

Also the way i wrote it, is the "reserved.txt" a direct access file?

I don't think so. But I'm not sure I get the term in this context.

Andy

PS (ASIDE) Boost Filesystem is expected to become part of C++ in 2014.
http://www.boost.org/doc/libs/1_58_0/libs/filesystem/doc/index.htm

Recent milestones: C++14 DIS, 8 TS's under development
https://isocpp.org/std/status
My task is written in Latvian and I kind of suck at translating.

So here's the task:

"Write a program in C++. If the program works with a file, you should not copy the entire content of the file into operative memory. A file component means a fixed-length record.
H7. Write a program that puts all standard C++ reserved words in an ordered table (as far as I understand, an ordered table means that the words are in alphabetical order). Write a function which, using binary search, checks whether an input string (length 30) is a C++ reserved word or not. The table should be made as a direct access file. The program should read the C++ reserved words from a text file."

I believe that a file component is each word, so they are fixed length.

And yes, I guess I didn't make myself clear about copying to operative memory.
I'm not allowed to copy the entire content.

reserved.txt is the file where all the keywords are. binary.txt is the binary file the program will copy them to.

So here is how I imagined this.

The program takes all the keywords from reserved.txt and copies them to binary.txt. It sorts them into alphabetical order using bubble sort. Then it asks the user for an input string and compares it with the words inside the binary file using binary search.

I can imagine doing this with a simple array. But I've no idea how it's done inside a file.

Basically, I'm not looking for the best way to do this. What I need right now is the easiest-to-understand way to do this.
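The "sort inside the file" step described above can be sketched roughly as follows. This is only one possible reading of the task, not the assigned solution: it assumes the binary file already holds zero-padded 30-byte records, and the names bubble_sort_file and RECORD_SIZE are made up for illustration. Only two records are held in memory at any moment.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

const std::size_t RECORD_SIZE = 30; // assumed fixed record length in bytes

// Bubble-sort the records of a binary file in place.
// Returns false if the file cannot be opened or an I/O call fails.
bool bubble_sort_file(const std::string& filename)
{
    std::fstream file(filename.c_str(),
                      std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;

    // file size in bytes / record size = number of records
    file.seekg(0, std::ios::end);
    const std::streamoff bytes = file.tellg();
    if (bytes < 0)
        return false;
    const std::size_t count = static_cast<std::size_t>(bytes) / RECORD_SIZE;

    char a[RECORD_SIZE];
    char b[RECORD_SIZE];
    bool swapped = true;
    for (std::size_t pass = 0; swapped && pass + 1 < count; ++pass) {
        swapped = false;
        for (std::size_t i = 0; i + 1 < count - pass; ++i) {
            const std::streamoff pos =
                static_cast<std::streamoff>(i * RECORD_SIZE);
            file.seekg(pos, std::ios::beg);   // jump to record i
            file.read(a, RECORD_SIZE);        // record i
            file.read(b, RECORD_SIZE);        // record i + 1
            if (!file)
                return false;
            if (std::strncmp(a, b, RECORD_SIZE) > 0) {
                file.seekp(pos, std::ios::beg);
                file.write(b, RECORD_SIZE);   // write the pair back swapped
                file.write(a, RECORD_SIZE);
                swapped = true;
            }
        }
    }
    file.flush();
    return file.good();
}
```

Calling bubble_sort_file("binary.txt") once after the copy step would leave the records in the order strncmp gives, which is plain byte order, so uppercase letters sort before lowercase ones.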
I can imagine doing this with a simple array. But i've no idea how its done inside a file.

That ("inside a file") still confuses me, too.

You could read the keywords from a text file into an array and then write the whole of the array out to a binary file. That file could then be read back. But it's still memory you're accessing to read or write the data in the array.

Andy
Well yeah, I can't do that because I would be copying the entire content of the file into operative memory and then I'd copy it to a binary file...

One of my colleagues told me that, if the component in that file has a constant size, I can use the file as an array by moving the cursor to component_size * i. I would get the i+1 component like that.

Gosh, I'm so confused right now.
If you're talking about using istream::seekg to set the position and then extracting the record found there, it doesn't seem like an especially sensible thing to do. But I guess it would work.
http://www.cplusplus.com/reference/istream/basic_istream/seekg/

And to write to a specific position, you set the stream position using ostream::seekp
http://www.cplusplus.com/reference/ostream/ostream/seekp/

But note that the file is still going to be in memory. For the C++ IO streams library to work, it opens the file (using the underlying system call) and reads at least part of the file into a buffer. The calls you make read the data out of that buffer. So it's not direct file access!

And C++ programmers usually use 0-based indices. So moving to component_size * i should get you the component at index i, counting from 0 (cf. an array of 4 elements - a[0], a[1], a[2], a[3])
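As a rough illustration of the seekg/seekp idea, assuming 30-byte zero-padded records, something like the following would let a file be indexed like an array. The helper names read_record and write_record are invented for this sketch, as is the RECORD_SIZE constant.

```cpp
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>

const std::size_t RECORD_SIZE = 30; // assumed fixed record length in bytes

// Read the record with 0-based index `index` into `out`.
// Returns false if the seek or read fails (e.g. index is past the end).
bool read_record(std::istream& file, std::size_t index, std::string& out)
{
    char buffer[RECORD_SIZE + 1] = {0};  // extra byte guarantees a terminator
    file.clear();                        // drop any earlier eof/fail state
    file.seekg(static_cast<std::streamoff>(index * RECORD_SIZE), std::ios::beg);
    if (!file.read(buffer, RECORD_SIZE))
        return false;
    out = buffer;                        // copies up to the first '\0'
    return true;
}

// Overwrite (or append) the record at 0-based index `index` with `word`,
// zero-padded to RECORD_SIZE bytes; longer words are truncated.
bool write_record(std::ostream& file, std::size_t index, const std::string& word)
{
    char buffer[RECORD_SIZE] = {0};
    word.copy(buffer, RECORD_SIZE - 1);  // leave at least one '\0' at the end
    file.clear();
    file.seekp(static_cast<std::streamoff>(index * RECORD_SIZE), std::ios::beg);
    file.write(buffer, RECORD_SIZE);
    return file.good();
}
```

With a std::fstream opened with ios::in | ios::out | ios::binary, read_record(file, 0, word) fetches what an array would call a[0], and read_record(file, 3, word) fetches a[3]. Only one record is ever copied into your own buffer, although the stream still keeps its internal buffer.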

Andy
Okay.
I found out that we are supposed to use seekg etc. to do the task. Also, by direct file access the lecturer meant that the input is taken from a binary file, not the text file or something.

So this is how far I've got. I believe I'm close to the end.

#include <fstream>
#include <iostream>
using namespace std;

void Binary_Search(string filename, string SearchVal)
{
    fstream file;
    int first, last, middle;
    string CurrString;
    bool found = false;
    //file.open(filename);
	file.seekg(0,ios::beg);
	first=file.tellg();
	file.seekg(0,ios::end);
	last=file.tellg();
	file.seekg(0,ios::beg);
	while(first<=last)
	{
		middle=(first+last)/2;
		cout<<first<<" "<<last<<" "<<" "<<middle;  //**
		file.seekg(middle);
		file >> CurrString;
		if(CurrString == SearchVal)
        {
            found = true;
            break;
        }
		if(CurrString < SearchVal) first = middle + 1;
		if(CurrString > SearchVal) last = middle - 1;
    }
    if(found == true) cout << SearchVal << " It is a C++ reserved word."<< endl;
    else cout << SearchVal << " It's not a c++ reserved word."<< endl;

}

int main ()
{
    fstream fin ("rezervetie.txt", ios::in);
    fstream fout ("binarie.txt", ios::binary);
    string ievade;
    fin >> (ievade);
    while(fin)
    {
        fout.write((string*)&ievade, 30);
        fin >> (ievade);
    }
    cout <<"Content of the file is now inside the binary file."<< endl;
    string vards;
    cout <<" \nInput the word you want to search for "<< endl;
    cin >> vards;
    cout << vards<< endl;
    Binary_Search("binarie.txt",vards);
    fin.close();
    fout.close();
    return 0;
}


But I'm facing a problem that I can't seem to fix by myself.
  fstream fin ("rezervetie.txt", ios::in);
    fstream fout ("binarie.txt", ios::binary);
    string ievade;
    fin >> (ievade);
    while(fin)
    {
        fout.write((string*)&ievade, 30);
        fin >> (ievade);
    }


This part of the code gives me an error that fout.write has no matching function.
I tried to do it like fout << ((string*)&ievade, 30); which gave me no errors, but it didn't write anything to the binary file.
I haven't looked at Binary_Search() in detail, but as you haven't opened the binary file it's hardly going to work.

If you use an fstream rather than an ofstream (like I do in my code below), you could pass it to Binary_Search(), but it might make more sense here to close the file in main() and then reopen to read in Binary_Search().

Also, as you're working with a binary file, you'll need to use istream::read() -- with a char buffer -- to read the file. Not operator>>

And I'm not sure about the way you work out the position of the last record; seekg() to beg and end will get you the file size (in bytes!). Then use file size / record size, etc.
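The file size / record size calculation might look something like this. It is a sketch under the assumption of 30-byte records; count_records and RECORD_SIZE are names chosen here for illustration only.

```cpp
#include <fstream>
#include <string>

const std::streamoff RECORD_SIZE = 30; // assumed fixed record length in bytes

// Returns the number of complete records in the file,
// or -1 if the file cannot be opened or its size cannot be determined.
long count_records(const std::string& filename)
{
    std::ifstream file(filename.c_str(), std::ios::binary);
    if (!file)
        return -1;

    file.seekg(0, std::ios::end);              // jump to the end of the file
    const std::streamoff bytes = file.tellg(); // position there = size in bytes
    if (bytes < 0)
        return -1;

    return static_cast<long>(bytes / RECORD_SIZE);
}
```

For a binary search the valid indices then run from 0 to count_records(...) - 1, and the byte offset of record i is i * RECORD_SIZE.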

Anyway...

This part of code. Gives me an error that fout.write has no matching function.

Well, the compiler's correct.

Checking
http://www.cplusplus.com/reference/ostream/ostream/write/
you'll see there's only one overload of write and it takes a const char*

fout << ((string*)&ievade, 30);

This is a very poor bit of code!

operator<< is primarily for working with text files; operator<< on an int, double, etc. inserts a string representation of the value into the output stream, not the actual binary value.

And you can't cast a random pointer to a string* and expect it to work! :-(

You don't even know how big it is!!
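To make the text versus binary point concrete, here is a small sketch comparing what operator<< and write() put into a stream for the same int. The function names as_text and as_raw_bytes are invented for this example, and a string stream stands in for the file so the result is easy to inspect.

```cpp
#include <sstream>
#include <string>

// operator<< formats the value as text: 1234 becomes the characters "1234".
std::string as_text(int value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// write() copies the raw bytes of the object: sizeof(int) of them,
// in whatever byte order the machine uses.
std::string as_raw_bytes(int value)
{
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str();
}
```

as_text(1234) is four characters long whatever the platform, while as_raw_bytes(1234) is always sizeof(int) bytes, which is the fixed-size property a record-based file relies on.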

#include <iostream>
#include <fstream>
#include <algorithm> // for fill_n
#include <string>
using namespace std;

int main() {
    // use ifstream and ofstream rather than fstreams
    ifstream fin ("rezervetie.txt"); // ios::in is implicit here
    ofstream fout ("binarie.dat", ios::binary); // ios::out is implicit here
    // binarie is not text so use different extension
    string ievade;
    const int bufferSize = 30;
    char buffer[bufferSize] = {0}; // buffer which we own (value init to zero)
    // fin >> (ievade); // move this
    while(fin >> ievade)
    {
        // as we don't know what's going on inside the std::string
        // after its buffer ends, we need to copy the string into
        // our buffer so we can control all the bytes we write to
        // the binary file.
        fill_n(buffer, bufferSize, '\0'); // fill buffer with null char

        // copy from string to our buffer
        ievade.copy(buffer, bufferSize - 1);
        // note that string::copy() does not append a null character at
        // the end, so leave at least one zero at end of buffer.

        // write buffer to binary file
        fout.write(buffer, bufferSize);

        //fin >> (ievade);// don't need this
    }
    return 0;
}


With input file (rezervetie.txt)

one
two
three
four
five
supercalifragilisticexpialidocious


I get the following binary file (binarie.dat) --dumped with Hex Edit

 00000000  6F 6E 65 00 00 00 00 00-00 00 00 00 00 00 00 00  *one.............*
 00000010  00 00 00 00 00 00 00 00-00 00 00 00 00 00 74 77  *..............tw*
 00000020  6F 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  *o...............*
 00000030  00 00 00 00 00 00 00 00-00 00 00 00 74 68 72 65  *............thre*
 00000040  65 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  *e...............*
 00000050  00 00 00 00 00 00 00 00-00 00 66 6F 75 72 00 00  *..........four..*
 00000060  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  *................*
 00000070  00 00 00 00 00 00 00 00-66 69 76 65 00 00 00 00  *........five....*
 00000080  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  *................*
 00000090  00 00 00 00 00 00 73 75-70 65 72 63 61 6C 69 66  *......supercalif*
 000000A0  72 61 67 69 6C 69 73 74-69 63 65 78 70 69 61 6C  *ragilisticexpial*
 000000B0  69 64 6F 00                                      *ido.*


Andy

PS You could write your own app to dump files as hex?
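A bare-bones dumper along those lines could look like the sketch below. It is not the tool used for the dump above, so the layout only roughly matches; hex_dump is a made-up name, and it prints 16 bytes per row with the printable characters between asterisks.

```cpp
#include <cctype>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>

// Write a hex dump of `filename` to `out`, 16 bytes per row.
// Returns false if the file cannot be opened.
bool hex_dump(const std::string& filename, std::ostream& out)
{
    std::ifstream file(filename.c_str(), std::ios::binary);
    if (!file)
        return false;

    const int ROW = 16;
    char row[ROW];
    unsigned long offset = 0;
    // read() fails on the final short row, so also check gcount()
    while (file.read(row, ROW) || file.gcount() > 0) {
        const int got = static_cast<int>(file.gcount());
        out << std::hex << std::uppercase << std::setfill('0')
            << std::setw(8) << offset << ' ';
        for (int i = 0; i < ROW; ++i) {
            if (i < got)
                out << ' ' << std::setw(2)
                    << static_cast<unsigned>(static_cast<unsigned char>(row[i]));
            else
                out << "   ";                 // pad a short last row
        }
        out << "  *";
        for (int i = 0; i < got; ++i) {
            const unsigned char ch = static_cast<unsigned char>(row[i]);
            out << (std::isprint(ch) ? row[i] : '.');
        }
        out << "*\n";
        offset += static_cast<unsigned long>(got);
    }
    out << std::dec << std::nouppercase << std::setfill(' '); // restore defaults
    return true;
}
```

Calling hex_dump("binarie.dat", std::cout) after the copy step gives a quick way to confirm that each record starts on a multiple of 30 bytes.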
Thanks a lot.
It seems to copy the text file content to the binary file just fine now (checked with my hex editor).

Now what bothers me is the binary search function.
I got stuck with opening the binary file...
So, here's my code:

#include <fstream>
#include <iostream>
#include <algorithm>
using namespace std;

void Binary_Search(string filename, string SearchVal)
{
    fstream file;
    int first, last, middle;
    string CurrString;
    bool found = false;
    file.open(filename);
	file.seekg(0,ios::beg);
	first=file.tellg();
	file.seekg(0,ios::end);
	last=file.tellg();
	file.seekg(0,ios::beg);
	while(first<=last)
	{
		middle=(first+last)/2;
		cout<<first<<" "<<last<<" "<<" "<<middle;  //**
		file.seekg(middle);
		file >> CurrString;
		if(CurrString == SearchVal)
        {
            found = true;
            break;
        }
		if(CurrString < SearchVal) first = middle + 1;
		if(CurrString > SearchVal) last = middle - 1;
    }
    if(found == true) cout << SearchVal << " is a C++ reserved word."<< endl;
    else cout << SearchVal << " Not a C++ reserved word."<< endl;

}

int main ()
{
    ifstream fin ("rezervetie.txt");
    ofstream fout ("binarie.dat", ios::binary);
    string ievade;
    const int bufferSize = 30;
    char buffer[bufferSize] = {0};
    while(fin >> ievade)
    {
        fill_n(buffer, bufferSize, '\0');
        ievade.copy(buffer, bufferSize - 1);
        fout.write(buffer, bufferSize);
    }
    string vards;
    cout <<" \nPlease input the word to compare if its C++ reserved word or not."<< endl;
    cin >> vards;
    Binary_Search("binarie.dat",vards);
    fin.close();
    fout.close();
    return 0;
}


So basically, the file name is passed to the function. After that, when I try to open the file using "file.open(filename);" it gives me an error which I can't seem to fix by myself.
The error message: error: no matching function for call to 'std::basic_fstream<char>::open(std::string&)'|

fstream's open only takes a const char*. Try file.open( filename.c_str() );

Also, please get in the habit of checking error codes and NULL pointers.
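For streams, "checking error codes" mostly means testing the stream state after each operation that can fail. A minimal sketch, with open_for_reading as a hypothetical helper name:

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Open `filename` for binary reading and report failure instead of
// silently carrying on with a stream that is not usable.
bool open_for_reading(std::ifstream& file, const std::string& filename)
{
    file.open(filename.c_str(), std::ios::binary); // c_str() also works pre-C++11
    if (!file.is_open()) {
        std::cerr << "Could not open " << filename << '\n';
        return false;
    }
    return true;
}
```

The same pattern applies to read(), write() and the seek calls: test the stream (for example with if (!file)) right after the call and stop if it failed.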
fstream's open only takes a const char*

For completeness, this was the case pre-C++11

From:

http://www.cplusplus.com/reference/fstream/fstream/open/

C++11 tab:

void open (const char* filename,
           ios_base::openmode mode = ios_base::in | ios_base::out);
void open (const string& filename,
           ios_base::openmode mode = ios_base::in | ios_base::out);


Andy

Guess I'm stuck on the last part.
So, if each line's length is 30 bytes and I know how many lines the binary file has, how do I, let's say, tell that this is the 1st line, this is the second line, etc.?

Basically, I've imagined it like this. Binary search returns the line index or something, or just puts the cursor at the beginning of the line. Then it copies the whole line into a char array of size 30, and then using strncmp() I'd compare it with the input string.

#include <iostream>
#include <fstream>
using namespace std;

void Binary_Search(const string& filename, string SearchVal)
{
    std::fstream file;
    int first, last, middle, BuffSize = 30, lines = 1;
    char buffer[BuffSize];
    bool found = false;
    file.open( filename.c_str());
    if (file.is_open())
    {
        cout << "The file is opened"<< endl;
    }
    else
    {
        cout << "Error opening file"<< endl;
    }
    char c;
    file.get(c);
    while(file)
    {
        if(c == '\n')
        {
            lines++;
        }
        file.get(c);
    }
    cout << "\n"<<lines;//The amount of lines inside the binary file.
    /*
    So the first one should be 0, the last one should be equal to lines.
    This is how I'm planning to get the attributes to do the binary search.
    The problem is, let's say, it has to compare line 40 with the input value
    to figure out if the input value is greater or smaller than the string at line 40.
    I've no idea how to get to that line 40... to get the string.
    */
};


int main()
{
    const unsigned int BUFFER_SIZE = 30;
    char buffer[BUFFER_SIZE];
    fstream fin ("rezervetie.txt", ios::in);
    fstream fout ("Binary.bin", ios::out);
    while (fin)
    {
        fin.read (buffer, BUFFER_SIZE);
        fout.write (buffer, 30);
    };
    fin.close ();
    fout.close ();
    string ievade;
    cout << "Enter the word to compare: ";
    cin >> ievade;
    Binary_Search("Binary.bin", ievade);

}

I think you're going to want to read a line at a time, perhaps with getline().
See comments in code.

Andy

#include <iostream>
#include <fstream>
#include <algorithm> // for fill_n
#include <cstdlib> // for srand(), rand()
#include <string>
using namespace std;

void Binary_Search (const string& filename, string SearchVal)
{
    //fstream file; // why std:: here and not elsewhere?
    // ifstream as we're just reading it
    // ios::binary as we're treating it as binary, yes?
    ifstream file (filename.c_str(), ios::binary);
    // don't need these variables yet
    //int first, last, middle, BuffSize = 30, lines = 1;
    //char buffer[BuffSize];
    //bool found = false;
    //file.open( filename.c_str());
    if (file.is_open())
    {
        cout << "The file is opened"<< endl;
        cout << "\n";
    }
    else
    {
        cout << "Error opening file"<< endl;
        cout << "\n";
        return; // no point continuing Binary_Search() if file failed to open!
    }
    //char c;
    //file.get(c);
    //while (file)
    //{
    //    if(c == '\n')
    //    {
    //        lines++;
    //    }
    //    file.get(c);
    //}
    //cout << "\n"<<lines;//The amount of lines inside the binary file.

    // The binary file doesn't have any lines in it! (see hex dump above)

    const unsigned int RECORD_SIZE = 30; // was BUFFER_SIZE
    char buffer[RECORD_SIZE] = {0}; // zero init buffer	

    int recordCount  =  0;
    int recordWanted = -1;

    while (file.read(buffer, RECORD_SIZE))
    {
        if(SearchVal == buffer)
        {
            recordWanted = recordCount;
            // if this was just a naive search loop could bail out now...
        }
        
        // as it's a char buffer and we know the string is null
        // terminated as we packed the buffer with zeroes when we
        // wrote the file (not a totally safe assumption, but...) 
        cout << recordCount << " : " << buffer << "\n";

        // refill buffer with zeroes for next time round
        fill_n (buffer, RECORD_SIZE, 0);

        ++recordCount;
    }

    cout << "\n";
    cout << "file contains " << recordCount << " records\n";
    cout << "\n";
    if (recordWanted == -1)
        cout << "record wanted could not be found\n";
    else
        cout << "record wanted is at index " << recordWanted << "\n";
    cout << "\n";

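    // the read loop above only ends when read() fails, which leaves
    // failbit and eofbit set; clear them or the seekg() calls below do nothing
    file.clear();

    cout << "5 random records:\n";
    cout << "\n";

    // recordCount > 0 guards the modulo below against an empty file
    for(int i = 0; recordCount > 0 && i < 5; ++i)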
    {
        int recordIndex = rand() % recordCount;
        file.seekg(recordIndex * RECORD_SIZE, ios::beg);
        fill_n (buffer, RECORD_SIZE, 0);
        file.read(buffer, RECORD_SIZE);
        cout << recordIndex << " : " << buffer << "\n";
    }

    cout << "\n";

    /*
    So the first one should be 0, the last one should be equal to lines.
    This is how I'm planning to get the attributes to do the binary search.
    The problem is, let's say, it has to compare line 40 with the input value
    to figure out if the input value is greater or smaller than the string at line 40.
    I've no idea how to get to that line 40... to get the string.
    */
} // functions don't need ; at end

void Create_Bin_File ()
{
    // use ifstream and ofstream rather than fstream
    ifstream fin ("rezervetie.txt"); // ios::in presumed
    ofstream fout ("Binary.bin", ios::binary); // ios::out presumed, and is binary

    const unsigned int RECORD_SIZE = 30; // was BUFFER_SIZE
    char buffer[RECORD_SIZE] = {0}; // zero init buffer

    // use getline() not read() for the input text file
    // don't test fin here and then call getline in loop
    while (fin.getline (buffer, RECORD_SIZE))
    {
        fout.write (buffer, RECORD_SIZE);
        // refill buffer with zeroes for next time round
        fill_n (buffer, RECORD_SIZE, 0);
    } // ; not needed here
    //fin.close (); no need for close now, as handled by destructor
    //fout.close ();
}

int main ()
{
    srand(123U); // as it's a test use fixed seed for now (as repeatable)

    Create_Bin_File (); // factor out...

    string ievade;
    cout << "Enter the word to compare: ";
    cin >> ievade;
    Binary_Search ("Binary.bin", ievade);

    return 0; // as I'm neurotic
}
