Writing Binary data files

Hi,

I am supposed to read and write a data file. Right now I am using ASCII format, but I need to reduce the size considerably. Someone suggested that I read and write the file in binary format.

I Googled, but couldn't find any precise and simple answers. My code is already "extremely well written" to read and write everything in ASCII format. Is it possible to make it read and write in binary just by introducing some flags somewhere and including packages?

Also, two questions about binary data files:
1) Does the code read, write and broadcast (for parallel programs) them faster compared to ASCII files?
2) Can I use the binary file created on one architecture on other architectures? I want it to run on i686, x86 and ia64.

Thanks
"Binary" vs "ASCII" (or "text") is not always a considerable saving in file size, particularly if you are writing a lot of string data.

The difference is essentially that textual data is human readable, whereas binary data is machine readable.

The integer value 42 is written "42" (two bytes), but it can be stored in a single byte value.
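To make the size comparison concrete, here is a minimal sketch that counts the bytes produced by each approach. The helper names text_size and binary_size are made up for illustration, and an in-memory stream stands in for a file.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Number of bytes produced when the value is written as text with operator<<.
std::size_t text_size(int value) {
    std::ostringstream out;
    out << value;
    return out.str().size();
}

// Number of bytes produced when the value is written raw with write().
// T decides the size: one byte for std::uint8_t, four for std::int32_t, etc.
template <typename T>
std::size_t binary_size(T value) {
    std::ostringstream out(std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str().size();
}
```

With these, text_size(42) is 2 and binary_size<std::uint8_t>(42) is 1, but binary_size<std::int32_t>(42) is 4, so binary only wins when the chosen type is narrower than the typical text form.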

Floating point values are almost always better written as textual representations or as a specific binary representation, instead of whatever your OS/compiler uses.

So, to answer your questions:
1) Maybe. The less there is to transfer the faster it goes. So whichever file format is smallest wins.
2) Yes, but again, you must be specific about how data is stored in your binary file. For example, are multi-byte values stored in Big Endian or Little Endian format? Etc.
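The byte-order point can be seen directly by looking at how one integer sits in memory. This is only an illustrative sketch: raw_bytes and host_is_little_endian are hypothetical helper names, and it assumes the machine is either plain little endian or plain big endian.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Returns the bytes of `value` exactly as they sit in this machine's memory,
// which is what a raw write() of the variable would put in the file.
std::string raw_bytes(std::uint32_t value) {
    char buf[sizeof value];
    std::memcpy(buf, &value, sizeof value);
    return std::string(buf, sizeof value);
}

// True if the least significant byte is stored first on this machine.
bool host_is_little_endian() {
    return raw_bytes(1u)[0] == 1;
}
```

On a little-endian machine raw_bytes(0x01020304) yields the bytes 04 03 02 01, and on a big-endian one 01 02 03 04, so a file written raw on one kind of machine is misread on the other unless the format fixes the order.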

To change your code to handle binary format instead of textual data, you need to open your files with the ios::binary flag, and you need to rewrite the functions that do the actual I/O to use the iostream unformatted I/O methods (read(), write(), get(), and/or put()). This should be a fairly small change, and you can even set it up so that the method to use (textual or binary) is chosen by a flag value somewhere.
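One way the flag idea might look, as a rough sketch rather than a recommendation: the Format enum and the write_double/read_double names are invented here, and the binary branch is a raw dump in the machine's own representation, so it is not portable across byte orders.

```cpp
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>

enum class Format { Text, Binary };  // hypothetical flag type

// Writes one double either as text or as raw bytes, depending on the flag.
void write_double(std::ostream& out, double value, Format fmt) {
    if (fmt == Format::Binary) {
        out.write(reinterpret_cast<const char*>(&value), sizeof value);
    } else {
        // max_digits10 digits are enough for the text to read back exactly.
        // Note that setprecision stays in effect on the stream afterwards.
        out << std::setprecision(std::numeric_limits<double>::max_digits10)
            << value << '\n';
    }
}

// Reads one double in the matching format. Returns false on failure.
bool read_double(std::istream& in, double& value, Format fmt) {
    if (fmt == Format::Binary) {
        in.read(reinterpret_cast<char*>(&value), sizeof value);
    } else {
        in >> value;
    }
    return static_cast<bool>(in);
}
```

The file itself would still need to be opened with ios::binary when the binary format is selected, so the flag has to reach the code that opens the stream as well.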

You might want to google around "c++ serialization" for more.

Hope this helps.
Thanks Duoas.

Replacing my regular function for writing ASCII files, which was something like:
ofstream OutFile;
OutFile.open(my_file, ios::out);
OutFile << my_double;
OutFile.close();

with the following:
ofstream OutFile;
OutFile.open(my_file, ios::out | ios::binary);
OutFile.write( (char*)&my_double, sizeof(double));
OutFile.close();

did it for me. My data file was mostly doubles. I am seeing a reduction of more than 50% in file size, which is good. Also, to my surprise, I did not have to do anything more, and this binary file was read successfully on all the different architectures.

For reading the binary file, the syntax was:
ifstream InFile;
InFile.open(my_file, ios::in | ios::binary);
InFile.read( (char*)&my_double, sizeof(double));
InFile.close();
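For a file holding many doubles, the same calls extend to a whole array, and checking the stream state catches open failures and truncated files. A minimal sketch, assuming the file contains nothing but raw doubles. The function names save_doubles and load_doubles are made up for this example.

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Writes all values with a single write() call. Returns false on any error.
bool save_doubles(const std::string& path, const std::vector<double>& data) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(double)));
    out.close();  // close() sets failbit if the final flush fails
    return static_cast<bool>(out);
}

// Reads doubles until end of file. Returns false if the file cannot be
// opened or ends in the middle of a value.
bool load_doubles(const std::string& path, std::vector<double>& data) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) return false;
    data.clear();
    double value;
    while (in.read(reinterpret_cast<char*>(&value), sizeof value)) {
        data.push_back(value);
    }
    // A clean end of file leaves gcount() at 0; otherwise a value was cut off.
    return in.eof() && in.gcount() == 0;
}
```

Like the snippets above, this dumps the machine's own representation, so the file is only readable on machines with the same byte order and double format.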


Thanks
Glad you got it working, but you should be aware that simply dumping a double to file using write() and reading it directly with read() will absolutely fail whenever you compile your program on a machine that uses a different byte order or when the machine's FP representation differs.

Make sure you document this as part of your file's format.
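If the format should not depend on the writing machine, one common approach is to fix the byte order in the file and convert explicitly. The sketch below picks little endian and assumes doubles are IEEE 754 binary64 stored with the same byte order as integers, which holds on mainstream platforms but is an assumption. The names write_double_le and read_double_le are invented for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes IEEE 754 binary64 doubles");

// Writes the 64 bits of the double least significant byte first,
// whatever the byte order of the machine running the code.
void write_double_le(std::ostream& out, double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int i = 0; i < 8; ++i) {
        out.put(static_cast<char>((bits >> (8 * i)) & 0xFFu));
    }
}

// Reads the value back. Returns false if eight bytes could not be read.
bool read_double_le(std::istream& in, double& value) {
    unsigned char bytes[8];
    if (!in.read(reinterpret_cast<char*>(bytes), 8)) return false;
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
    }
    std::memcpy(&value, &bits, sizeof value);
    return true;
}
```

The architectures mentioned earlier in the thread are normally run little endian with IEEE 754 doubles, which would explain why the raw dump happened to work there.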
Thanks Duoas.
I don't remember the details, but some guy told me about his raxor-uber library he was writing to store binary data quickly for multi-platform use. What he basically did was:

1) mmap a file to some memory location
2) redirect the TCP/IP stack buffer for network packets to this place.
3) send the data to the IP stack...
At this point I got dizzy and lost interest. :-D

Ciao, Imi.