Out of bounds memory block?

I've been writing something that will let me read an entire file and organize its data into 8-byte binary blocks. I want to be able to manipulate individual bits however I see fit and then feed all of it back into an output file. To do this I made three classes, each of which creates instances of the previous one to store the data in the format I want. The three classes are _byte, _block, and _blockman. _byte and _block are self-explanatory: they have functions to display the information they hold (for debugging) and public variables that can be changed. _blockman is the block management class, which instantiates enough blocks to hold the contents of the file (up to 2 GB). The constructor takes a filename; the file is loaded into memory and divided up. The problem I'm having is that either too many blocks are instantiated (an extra block is made that holds nothing but 0s), or just enough blocks are made but there are unexpected bytes at the end of the last block, after what should be the end of the file.

The main() that I'm testing all this with.

#include <cstdlib>
#include <fstream>
#include <iostream>
#include "blocks.cpp"

// This was made to create byte and 8x8 bit block objects, complete with
// functions to set and display them.

using namespace std;

int main() {
   int i;
   ofstream test("test.txt", ios::out | ios::trunc);
   test << "blah\nThis is a test of the \'ofstream test\'.\n";
   test << "This will also be used to test \'class _blockman\'.\nblah\na";
   test.close();
   _blockman myfile("test.txt");
   cout << "Size of file is " << myfile.mysize << endl;
   cout << myfile.allocatedblocks << " 8x8 bit blocks were allocated for use with test.txt\n";
   for(i=0;i<myfile.allocatedblocks; i++) {
      cout << "\nBlock " << i;
      myfile.myblock[i].printblock();
   }
   return EXIT_SUCCESS;
}


At the end of this running I get:

Block 12
Block Values:
01100001 97 a
01101000 104 h
00001010 10 

01100001 97 a
00110110 54 6
00101001 41 )
00111101 61 =
01000011 67 C
Press any key to continue . . .


I have no idea where the "6)=C" comes from. By all rights these should still be the zeroed-out values from when the bytes were created!

The code in question is telling me that the string I wrote to "test.txt" is 104 characters long, when it's actually 100.

class _blockman { //manages the setting and manipulating of multiple blocks as it pertains to file io
protected:
   int i, j, blocksneeded;
   int beg, end, size;
   int setblock(int, int, int);
   int setblock(int, int, char); 
   char *memblock;
   int manage();
   ifstream input;
public:
   int mysize;
   int allocatedblocks;
   _block *myblock;
   _blockman(const char*);
   ~_blockman();
};

_blockman::_blockman(const char* filename) { //Works for files up to 2 GB
   input.open(filename);
   if(input.is_open()) {
      beg = (int) input.tellg();
      input.seekg(0, ios::end);
      end = (int) input.tellg();
      size = end-beg;
      mysize = size; 
      memblock = new char[size];
      input.seekg(0,ios::beg);
      input.read(memblock, size);
      input.close();
      float fsize;
      if(size % 8 == 0) {
         blocksneeded = size/8;
      }else{
         fsize = size/8;
         size = (int) fsize;
         blocksneeded = size+1;
         size*=8;
      }
      try {
         myblock = new _block[blocksneeded];
         allocatedblocks = blocksneeded;
      }
      catch (bad_alloc&) {
        cout << "Error allocating memory." << endl;
        system("Pause");
        allocatedblocks = -1;
      }
   }else {
      cout << "Error opening file!\n";
      system("Pause");
      allocatedblocks = -1;
   }
   manage();
}
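A side note on the block count above: the float detour is not needed, since integer division already truncates. A minimal sketch of the same calculation as a ceiling division, assuming 8-byte blocks as in the post (blocksNeeded and checkBlocksNeeded are hypothetical names, not part of the original classes):

```cpp
#include <cassert>
#include <cstddef>

// Number of 8-byte blocks needed to hold `size` bytes.
// A partial trailing block counts as one more whole block.
std::size_t blocksNeeded(std::size_t size) {
    const std::size_t kBlockBytes = 8;
    return size / kBlockBytes + (size % kBlockBytes != 0 ? 1 : 0);
}

// Sanity checks using the two file sizes mentioned in this thread.
void checkBlocksNeeded() {
    assert(blocksNeeded(100) == 13);  // 12 full blocks + 4 leftover bytes
    assert(blocksNeeded(104) == 13);  // exactly 13 full blocks
    assert(blocksNeeded(96) == 12);
}
```

Either size gives 13 blocks, which matches the output above, where the last block printed is Block 12.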
The string contains 96 text characters, plus 4 newlines. In the file the newline may be represented as a CR-LF pair, that is two characters, bringing the total to 104.
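The arithmetic can be checked with a short sketch. It assumes every '\n' is written to disk as a CR-LF pair, which is what a text-mode stream does on Windows; sizeWithCrLf and checkSizeWithCrLf are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Size the text would occupy on disk if each '\n' becomes "\r\n".
std::size_t sizeWithCrLf(const std::string& text) {
    const std::size_t newlines = static_cast<std::size_t>(
        std::count(text.begin(), text.end(), '\n'));
    return text.size() + newlines;  // one extra '\r' per newline
}

// The exact string written to test.txt in the original post.
void checkSizeWithCrLf() {
    const std::string text =
        "blah\nThis is a test of the 'ofstream test'.\n"
        "This will also be used to test 'class _blockman'.\nblah\na";
    assert(text.size() == 100);         // 96 characters + 4 newlines
    assert(sizeWithCrLf(text) == 104);  // 4 newlines take 2 bytes each
}
```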

But that's relatively trivial; I don't know how it affects the execution of your program.
#include "blocks.cpp"

Don't do this. Put your class declaration into a header file and the implementation into a .cpp file. Include the header file in any .cpp file that needs to use an object of that type.

If you are using an IDE, you could do this automatically with a class wizard.

The easiest way to find runtime errors is to use the debugger: you can keep a watchlist of variables, set breakpoints, and step through the code one line at a time.

This should be easy if a debugger is in your IDE.

HTH
@Chervil: That is indeed the case, I took off two \n and the count dropped by four. However there is still the issue with the seemingly random bytes at the end of the last block. I don't understand how they got there. Here is the manage function.

int _blockman::manage() {
   if(allocatedblocks == -1) {
      exit(147);
   }
   for(i=0;i<=blocksneeded;i++) {
      for(j=0;j<8;j++) {
         if((i*8+j)<mysize) {
            setblock(i, j, memblock[i*8+j]);
         }else break;
      }
   }
   return 0;
}


This makes a reference to _block::setblock(int block, int byte, char input) which goes all the way to the _byte level to set the individual value.

void _byte::setvalue(char carg) {
   cvalue = carg;
   int iarg = (int) carg;
   ivalue = iarg;
   for(i=0;i<8 && iarg>0;i++) {
      if(iarg>=bitvalues[i]) {
         cbinvalue[i] = '1';
         ibinvalue[i] = 1;
         iarg-=bitvalues[i];
      }else{
         cbinvalue[i] = '0';
         ibinvalue[i] = 0;
      }
   }
   cbinvalue[8] = '\0';
   sbinvalue = cbinvalue;
}
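One thing to watch in setvalue: the loop stops as soon as iarg reaches 0, so any remaining digits keep whatever they held before. If plain char is signed on the platform, a byte of 128 or more converts to a negative int and the loop never runs at all. A minimal sketch of an alternative using std::bitset, assuming the intended output is the most significant bit first (as in the printed blocks); toBinaryString and checkToBinaryString are hypothetical names:

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Convert one byte to its 8-character binary string, MSB first.
// The cast to unsigned char keeps bytes >= 128 from going negative.
std::string toBinaryString(char c) {
    return std::bitset<8>(static_cast<unsigned char>(c)).to_string();
}

// Values taken from the block dump earlier in the thread.
void checkToBinaryString() {
    assert(toBinaryString('a') == "01100001");   // 97
    assert(toBinaryString('\n') == "00001010");  // 10
    assert(toBinaryString(static_cast<char>(0xE9)) == "11101001");  // 233
}
```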


Edit: When I took the endlines out I got this:

Block 12
Block Values:
00001010 10

01100001 97 a
00111101 61 =
01111000 120 x
00000000 0
00000000 0
00000000 0
00000000 0
Press ENTER to continue...


How do the CR-LF pairs work?
@TheIdeasMan: Is that a naming convention thing? I've heard that before on this project, but I haven't gotten a reason.
It is an organisation thing, plus including .cpp files will cause you problems.

It is also unnecessary: the compiler only needs the header file to see which functions are part of a class. The linker then finds the implementation in the object file compiled from the .cpp file.

This concept isn't helped by authors of books & websites which show declaration & implementation all rolled into 1 file in small examples.

Consider the example where a .cpp file needs to use 10 different types of objects. Say a typical .cpp file for a class has 500 lines of code for its 10 functions. If you include all the .cpp files, you will have 5000 lines included.

Also consider the case where objects are needed in multiple files: it is much more efficient to just include the header.

Normally header files are much smaller than .cpp files, because they just contain the declarations. So if you include the 10 headers, it might only be 500 lines.
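A rough sketch of the split, shown in one listing so the pieces can be seen together. The class Byte and the file names byte.h and byte.cpp are hypothetical stand-ins for the classes in this thread; in a real project each marked section would live in its own file.

```cpp
#include <cassert>

// ---- byte.h (hypothetical header): declaration only ----
#ifndef BYTE_H
#define BYTE_H

class Byte {
public:
    explicit Byte(unsigned char value);
    bool bit(int index) const;  // index 0 is the least significant bit
private:
    unsigned char value_;
};

#endif  // BYTE_H

// ---- byte.cpp (hypothetical implementation file) ----
// In a real project this file would begin with: #include "byte.h"
Byte::Byte(unsigned char value) : value_(value) {}

bool Byte::bit(int index) const {
    return ((value_ >> index) & 1u) != 0;
}

// ---- any other .cpp file includes only byte.h to use the class ----
void checkByte() {
    Byte b(97);  // 'a' is 01100001
    assert(b.bit(0));
    assert(!b.bit(1));
    assert(b.bit(5) && b.bit(6));
}
```

The include guard (#ifndef / #define / #endif) keeps the declaration from being seen twice when several headers include the same file.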

HTH
Ok, that makes sense. Thank you much! Would you mind explaining the whole CR-LF pairs thing? I don't think I understand that, but it may explain the extra bytes at the end of the last block.
This varies with the operating system. Windows uses a carriage-return/linefeed pair, while other OSes may use just a linefeed.
Sample text:
test
file

Represented as hexadecimal like this:
 t  e  s  t \r \n  f  i  l  e
74 65 73 74 0D 0A 66 69 6C 65
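
The same dump can be produced in code. A minimal sketch (toHex and checkToHex are hypothetical helper names) that formats each byte as two uppercase hex digits:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format every byte of `bytes` as two hex digits, separated by spaces.
std::string toHex(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof buf, "%02X", value);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// The sample text above with a Windows-style line ending.
void checkToHex() {
    assert(toHex("test\r\nfile") == "74 65 73 74 0D 0A 66 69 6C 65");
}
```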


If you want to view/ edit files in hex, try
http://sourceforge.net/projects/hexplorer/
(for Windows).
On Windows, a newline is 2 characters: a carriage return and a line feed.

On Unix / Linux a newline is 1 character: the line feed.

Some functions convert CR/LF into a single newline. So perfectly good encryption code can lose 1 char for every CR/LF, and when the data is decrypted, the file will be shorter by this amount.

Good luck!!!
Ah, ok. Is it possible that carriage return and linefeed are being counted as just a linefeed in my program, and it's running into something else? The count went down by 4 when I took out 2 \n. Isn't that usually caught by windows as an access violation?
I guess it is counting them as CR-LF, but the values don't match up, they're different every time I compile.
I'd suggest that you should be opening the file in binary mode. That way, each byte is just a byte.

When you open in text mode, some interpretation/translation of the data may occur so what you end up with in memory may not be the same as the file stored in disk.

(the initial part where you create the file is ok in text mode, since you are in fact dealing with text).
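A likely explanation for the stray bytes, pieced together from the numbers in this thread: the size computed from tellg() reflects the 104 bytes on disk, but a text-mode read() on Windows collapses each CR-LF into one character and delivers only 100. The last 4 elements of a buffer created with new char[size] are then never written and hold whatever was in memory. Below is a minimal sketch of a binary-mode read that also trims the buffer to the count actually read; readAllBytes and checkReadAllBytes are hypothetical names, and crlf_demo.bin is a scratch file created only for the check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Read a whole file as raw bytes, with no newline translation.
std::vector<char> readAllBytes(const std::string& filename) {
    std::ifstream input(filename.c_str(), std::ios::binary | std::ios::ate);
    std::vector<char> buffer;
    if (!input) return buffer;                      // could not open
    const std::streamoff size = input.tellg();      // opened at end: file size
    if (size <= 0) return buffer;
    buffer.resize(static_cast<std::size_t>(size));  // elements start as 0
    input.seekg(0, std::ios::beg);
    input.read(buffer.data(), static_cast<std::streamsize>(size));
    // gcount() reports how many bytes read() really delivered.
    buffer.resize(static_cast<std::size_t>(input.gcount()));
    return buffer;
}

void checkReadAllBytes() {
    const char* name = "crlf_demo.bin";  // hypothetical scratch file
    {
        std::ofstream out(name, std::ios::binary | std::ios::trunc);
        out << "a\r\nb\r\n";             // 6 bytes, written untranslated
    }
    const std::vector<char> bytes = readAllBytes(name);
    assert(bytes.size() == 6);
    assert(bytes[1] == '\r' && bytes[2] == '\n');
    std::remove(name);
}
```

Using std::vector instead of new char[size] also means any unread tail would be zeros rather than leftover memory, and nothing needs to be deleted by hand.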
Lol, yeah I thought it would be easier to test everything in text mode, but apparently not! Thanks guys, I really appreciate it! :)
Yup, that's what it ended up being: an interpretation issue. Loaded the file with ios::binary and those random bytes disappeared. Thank you Chervil and TheIdeasMan!