Writing/Reading String Objects To/From Files

Hi Folks,
I am using the code below to write a single instance of the class "Employee" to a file in binary mode. The write part seems to work fine; however, when I try to read the single employee object from the file back into memory, I get a "double free or corruption" error.

I think this has to do with the fact that I am using a string data member in the Employee class, but I don't understand what is going wrong. I have read that strings can vary in length and use dynamic memory allocation, but if I write a single employee object to a file with data member 'name' equal to "John", it should be the exact same size when I read it back in, right?

The code below works with no issues when I omit the string data member. Why is that? Where is the memory for the string object being "double released" when I read the employee object back into memory from the file?

I am using Linux Mint 15, Eclipse June and GCC 4.7.3 with the -std=c++11 option.

Thanks!

#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;

class Employee
{
public:
	int number;
	string name;
};

int main()
{
//Write employee data to a file in binary format
Employee employee;
employee.number = 1;
employee.name = "John";

ofstream output("output.dat", ios::out | ios::binary );

if(!output)
{
	cerr<<"File could not be opened"<<endl;
	exit(EXIT_FAILURE);
}

output.write(reinterpret_cast<const char*>(&employee), sizeof(Employee));
output.close();


//Read employee data from a file using binary format
ifstream input("output.dat", ios::in |ios::binary);

if(!input)
{
	cerr<<"File could not be opened"<<endl;
	exit(EXIT_FAILURE);
}

Employee employeFromFile;

input.read(reinterpret_cast <char *>(&employeFromFile), sizeof(Employee));

cout<<employeFromFile.number<<endl;
cout<<employeFromFile.name<<endl;
}
The problem is that string does not store the value ("John") within the instance for name. string dynamically allocates space for the value and stores the pointer to that dynamic memory within the string instance.

When you read the file back into employeFromFile, you're overwriting employeFromFile.name with the contents (including the pointer to "John") of employee.name. At the end of main, both employee and employeFromFile are destroyed. Since both instances point to the same dynamic memory containing "John" (which was never written to the file), you get an error when the second destructor tries to release memory that has already been released.
> I have read that strings can vary in length and use dynamic memory allocation
A string holds an internal pointer to where its data lives. When you write a std::string object this way, you write that pointer, not the content.
In a different run that address is meaningless (it does not contain the string) and may be invalid (it was never returned by new).

> Where is the memory for the string object being "double released" ?
You are casting to char* in order to mess with the internals of the string. (Don't do that; use the interface provided.)
As a result, the internal pointers of 'employee.name' and 'employeFromFile.name' both point to the same address.
When their destructors are invoked, they both try to free the same address.

Another look: sizeof(Employee) is a constant value. Where are you taking the length of the string into account?
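To make that point concrete, here is a minimal sketch. The helper name rawBytesWritten is made up for illustration; it simply reports how many bytes output.write(..., sizeof(Employee)) would copy for a given object.

```cpp
#include <cstddef>
#include <string>

// Same layout as the Employee in the question.
struct Employee
{
	int number;
	std::string name;
};

// Hypothetical helper: the number of bytes a raw write of the object would copy.
// sizeof is fixed at compile time, so the length of e.name cannot affect it.
std::size_t rawBytesWritten(const Employee& e)
{
	return sizeof(e);
}
```

Calling it on an employee named "John" and on one whose name is a thousand characters long gives the same number, so the raw write cannot be carrying the characters of the longer name.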

You need to read up about object serialization.

Serialization
http://en.wikipedia.org/wiki/Serialization

C++ FAQ -- Serialization and Unserialization
http://www.parashift.com/c++-faq/serialization.html

If you're using classes then I assume you buy into object-oriented principles. If that's the case, you should be getting the object to save itself to disk, not saving it from outside. That allows the class/object to use its internal knowledge to convert its data into a suitable form for storage and retrieval.

And with object serialization, a class is responsible for how it stores its information. This is usually done piece by piece, one data member at a time.

Your main routine should end up something like this (I've also tightened up the encapsulation a bit), where saveToStream and readFromStream have these signatures:

	bool saveToStream(ostream& output);
	bool readFromStream(istream& input);


int main()
{
	bool ret = false;
	{
		//Write employee data to a file in binary format
		Employee employee(1, "John");
		ofstream output("output.dat", ios::out | ios::binary);
		if(!output)
		{
			cerr<<"File could not be opened"<<endl;
			return EXIT_FAILURE;
		}
		ret = employee.saveToStream(output);
	}
	if(ret)
	{
		//Read employee data from a file using binary format
		ifstream input("output.dat", ios::in |ios::binary);
		if(!input)
		{
			cerr<<"File could not be opened"<<endl;
			return EXIT_FAILURE;
		}
		Employee employeeFromFile;
		ret = employeeFromFile.readFromStream(input);
		if(ret)
		{
			cout<<employeeFromFile.getNumber()<<endl;
			cout<<employeeFromFile.getName()<<endl;
		}
		else
		{
			cout<<"read failed"<<endl;
		}
	}
	return 0;
}
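The class itself is not shown in the thread. As a rough sketch only, here is one way the two member functions could be written, assuming a simple layout of my own choosing: the number, then the length of the name, then the characters of the name. The fixed-width integer types and the roundTrip helper are assumptions for illustration, not part of the original code. The values are written in the machine's native byte order, so a file produced this way is not portable between different architectures.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Employee
{
public:
	Employee() : number(0) {}
	Employee(int num, const std::string& nm) : number(num), name(nm) {}

	int getNumber() const { return number; }
	const std::string& getName() const { return name; }

	// Writes the number, then the name's length, then the name's characters.
	bool saveToStream(std::ostream& output) const
	{
		const std::int32_t num = number;
		const std::uint32_t length = static_cast<std::uint32_t>(name.size());
		output.write(reinterpret_cast<const char*>(&num), sizeof(num));
		output.write(reinterpret_cast<const char*>(&length), sizeof(length));
		output.write(name.data(), length);
		return output.good();
	}

	// Reads the fields back in the same order; the object is only modified
	// once everything has been read successfully.
	bool readFromStream(std::istream& input)
	{
		std::int32_t num = 0;
		std::uint32_t length = 0;
		input.read(reinterpret_cast<char*>(&num), sizeof(num));
		input.read(reinterpret_cast<char*>(&length), sizeof(length));
		if(!input)
			return false;

		// Note: real code should sanity-check length before allocating.
		std::string buffer(length, '\0');
		if(length > 0)
			input.read(&buffer[0], length);
		if(!input)
			return false;

		number = num;
		name.swap(buffer);
		return true;
	}

private:
	int number;
	std::string name;
};

// Hypothetical helper: saves to an in-memory stream and reads straight back.
bool roundTrip(const Employee& source, Employee& destination)
{
	std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
	return source.saveToStream(buffer) && destination.readFromStream(buffer);
}
```

Because the string's characters are written explicitly and read into a string that the new object owns, each Employee ends up with its own memory and nothing is freed twice.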


Andy

PS I also replaced exit(EXIT_FAILURE) with return EXIT_FAILURE, as it's better from a C++ perspective.

When I call return in main(), destructors will be called for my locally scoped objects. If I call exit(), no destructor will be called for my locally scoped objects! Re-read that. exit() does not return. That means that once I call it, there are "no backsies." Any objects that you've created in that function will not be destroyed. Often this has no implications, but sometimes it does, like closing files (surely you want all your data flushed to disk?).

From: return statement vs exit() in main()
http://stackoverflow.com/questions/461449/return-statement-vs-exit-in-main