Need help saving a vector to a file

Ok, I've come quite a long way with my little program. I've patted myself on the back for the problems I was able to solve myself and have been grateful for the help I've received here. Now I'm stuck wanting to save information to a simple txt file. I'm following a tutorial from a somewhat dated book but it seems to apply (perhaps it doesn't!). Here are my save and load functions:

void Orthography::LoadOrthographies(vector<Orthography>& orthographies)
{
	ifstream inFile("orthographies.txt", ios::binary);

	if (inFile)
	{
		inFile.read((char*)orthographies, sizeof(orthographies) * orthographies.size());
	}
}

void Orthography::SaveOrthographies(vector<Orthography>& orthographies)
{
	ofstream outFile("orthographies.txt", ios_base::binary);

	if (outFile)
	{
		outFile.write((char*)orthographies, sizeof(orthographies) * orthographies.size());

		outFile.close();
	}
}

I am getting red lines under "orthographies" after the (char*) in both functions. The error is "no suitable conversion function from "std::vector<Orthography, std::allocator<Orthography>>" to "char*" exists". Ok, fair enough. Unfortunately I have no idea where to go from here. I've read up a bit on ifstream and ofstream but I can't seem to find anything to help me out here.

All that aside, I am not 100% positive that I am getting the orthography object into the function correctly. My gut is telling me to do it the way I have it (i.e. by reference).

closed account (Dy7SLyTq)

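#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using std::endl;
using std::ofstream;
using std::string;
using std::vector;

// Needs an operator<< overload for Type and a compiler with range-based for (C++11).
template<class Type>
void SaveVecToFile(const string &FileName, const vector<Type> &ObjContainer)
{
	ofstream File(FileName);

	// Write one element per line; const auto& avoids copying each element.
	for(const auto &CurrentObj : ObjContainer)
		File << CurrentObj << endl;
}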

naraku9333 (2163)

I'm not particularly experienced with reading/writing binary files but you can try

inFile.read((char*) &orthographies[0], sizeof(orthographies) * orthographies.size());
...
outFile.write((char*) &orthographies[0], sizeof(orthographies) * orthographies.size());

andywestken (4087)

If your first snippet is supposed to be reading elements into a vector, rather than reading the vector class itself, then you need:

// get the count, which I assume you've saved
size_t count = 0;
inFile.read((char*)&count, sizeof(size_t));

// make sure the vector is big enough
orthographies.resize(count);

// pass the address of the first element and use the sizeof a single element
// (not the vector) multiplied by the number of elements.
inFile.read((char*)&orthographies[0], sizeof(Orthography) * orthographies.size());

But I am not convinced a binary format is the right one for your needs. It is certainly not the way to write "a simple txt file." And it will only work if your Orthography struct contains only simple types, like int and double. If it contains strings or vectors, or uses pointers, then it just won't work.

Is your Orthography class/struct basically a mapping of one (Unicode) string fragment to another? If this is the case, are the struct members wstrings? If there's more to it than that, please post its definition. (Edit: is Orthography a map<string, string> ??)

DTSCode's code is in the right ball park, but unfortunately of little use to you as you're using Visual C++ 2010, which doesn't support range-based for loops. Note that it also requires you to provide an overload of operator<< for your Orthography type.

Andy

PS Yes, references are good. You could even use a const ref with SaveOrthographies.

closed account (Dy7SLyTq)

so just curious, why doesn't mine work?

andywestken (4087)

Because Visual Studio 2010 is too stupid!

Andy

Does MSVC10 Visual Studio 2010 support C++ range based loops
http://stackoverflow.com/questions/6898859/does-msvc10-visual-studio-2010-support-c-range-based-loops

closed account (Dy7SLyTq)

ah sorry thanks for explaining that

edit: fixed word issue

Ulfhedhin (104)

Yes, my CharacterMatch class has two wstring members and that is pretty much it. I'd tried the map class as you suggested but it won't work for me. The way I am using the strings, I won't always have a unique key.

You are probably right in that binary isn't the way to go. I was following a book :). What would you recommend as far as how to save it and set it up?

andywestken (4087)

If your structure is two wstrings, then it's not possible to use the approach you tried above. The binary approach only works when everything is inside the actual struct. So it would work with

struct Orthography {
    wchar_t from[bufferSize];
    wchar_t to[bufferSize];
};

but not

struct Orthography {
    wchar_t* from;
    wchar_t* to;
};

struct Orthography {
    wstring from;
    wstring to;
};

When you say that it's "pretty much it", what do you mean? Is there something else in the struct?

If your strings are only ever a few chars (1 or 2? less than 8? ??), you could use buffers rather than strings. But even then I think you should save your data in a text file rather than a binary file.

As far as multiple keys go, you could use multimap, rather than map.
http://www.cplusplus.com/reference/map/multimap/

But will there be an arbitrarily large number of values with the same key, or only ever a few? Can you place an upper limit?
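
As a rough illustration of the multimap idea, the sketch below stores several outputs under the same input and fetches them all with equal_range. MatchMap and outputsFor are names invented for this example, not anything from your code.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias: one input string may map to several outputs.
typedef std::multimap<std::wstring, std::wstring> MatchMap;

// Collect every output stored under the given input (empty if none).
std::vector<std::wstring> outputsFor(const MatchMap& m, const std::wstring& key) {
    std::vector<std::wstring> out;
    std::pair<MatchMap::const_iterator, MatchMap::const_iterator> range =
        m.equal_range(key);
    for (MatchMap::const_iterator it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}
```

Entries are added with m.insert(std::make_pair(from, to)); unlike map, a second insert with the same key adds another entry rather than being ignored.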

Andy

PS Actually, due to the way most wstring are implemented, the binary file approach will prob work for small strings but not large strings, as they store small strings in a buffer and larger strings in heap memory. But you're not supposed to know/assume that.
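
Rather than guessing, you can ask the compiler whether a type is safe to copy byte by byte. This is only a sketch: it assumes a standard library that provides std::is_trivially_copyable (a C++11 trait that may not be available in Visual C++ 2010), and FixedMatch, PointerMatch, StringMatch and rawCopyable are made-up names mirroring the three structs above.

```cpp
#include <string>
#include <type_traits>

struct FixedMatch   { wchar_t from[8];   wchar_t to[8];   };
struct PointerMatch { wchar_t* from;     wchar_t* to;     };
struct StringMatch  { std::wstring from; std::wstring to; };

// True when T may be copied as raw bytes. This is necessary for the
// read()/write() approach but not sufficient: PointerMatch passes the
// check, yet writing it to disk would only save addresses, not the text.
template<class T>
bool rawCopyable() {
    return std::is_trivially_copyable<T>::value;
}
```

Here rawCopyable<StringMatch>() is false, which is the compiler's way of saying the binary approach is off the table for wstring members.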

andywestken (4087)

The following code saves Orthography values in a vector to a text file, one per line.

1. It uses more or less the approach DTSCode suggested, but adjusted to take account of the fact that Visual C++ 2010 does not support range-based for loops.

2. An insertion operator has been defined for Orthography, here assumed to be just two wstrings, for illustrative purposes.

3. I made the save and read functions normal functions, rather than templating them, as I was being a bit lazy.

4. All strings are Unicode and are saved to and read from a Unicode text file.

5. You'll see the way the files are opened is a bit odd. This appears to be how you have to open them to read and write Unicode (I am still a little bit unsure of this, but the information I've found to date all points this way.)

6. I am assuming that the slash char will never be mapped, so I can use it as an easy to find separator.

7. While I use operator<< to write the file, I am using getline rather than operator>> to read them back, and then manually splitting the strings, as I think this is less fiddly to make robust.

Andy

// coded for Visual C++ 2010, etc.

#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <vector>
#include <cstdio>  // for _fileno
#include <io.h>    // for _setmode
#include <fcntl.h> // for _O_U16TEXT
using namespace std;

struct Orthography {
    Orthography() {}
    Orthography(const wstring& f, const wstring& t) : from(f), to(t) {}
    wstring from;
    wstring to;
};

wostream& operator<<(wostream& wos, const Orthography& orth) {
    wos << orth.from << L"/" << orth.to;
    return wos;
}

bool parseStr(const wstring& str, Orthography& orth) {
    size_t pos = str.find(L'/');
    if(pos == str.npos)
        return false;
    orth.from = str.substr(0, pos);
    orth.to   = str.substr(pos + 1);
    return true;
}

void saveOrthographyToFile(const wstring& filePath, const vector<Orthography>& orthographies) {
    FILE* fp = _wfopen(filePath.c_str(), L"w,ccs=UNICODE");
    wofstream ofs(fp);

    const size_t count = orthographies.size();
    for(size_t index = 0; count > index; ++index)
        ofs << orthographies[index] << endl;

    fclose(fp);
}

void readOrthographyFromFile(const wstring& filePath, vector<Orthography>& orthographies) {
    FILE* fp = _wfopen(filePath.c_str(), L"r,ccs=UNICODE");
    wifstream ifs(fp);

    wstring line;
    while(getline(ifs, line)) {
        Orthography orth;
        if(parseStr(line, orth))
            orthographies.push_back(orth);
    }

    fclose(fp);
}

void saveValues(const wstring& filePath) {
    vector<Orthography> orthographies;
    orthographies.push_back(Orthography(L"a", L"ą"));
    orthographies.push_back(Orthography(L"y", L"ai"));
    orthographies.push_back(Orthography(L"v", L"aw"));
    orthographies.push_back(Orthography(L"f", L"ng"));
    orthographies.push_back(Orthography(L"th", L"þ"));

    wcout << L"Save data to file : " << filePath << endl;
    saveOrthographyToFile(filePath, orthographies);
    wcout << endl;
}

void readAndDisplayValues(const wstring& filePath) {
    vector<Orthography> orthographies;

    wcout << L"Read data from file : " << filePath << endl;
    readOrthographyFromFile(filePath, orthographies);
    wcout << endl;

    ios_base::fmtflags old_flags = wcout.setf(ios_base::left, ios_base::adjustfield);

    wcout << L"Data read:" << endl;
    const size_t count = orthographies.size();
    for(size_t index = 0; count > index; ++index) {
        const Orthography& ortho = orthographies[index];
        wcout << setw(2) << ortho.from << L" -> " << setw(2) << ortho.to << endl;
    }
    wcout << endl;

    wcout.setf(old_flags);
}

int wmain() {
    _setmode(_fileno(stdout), _O_U16TEXT);

    const wchar_t filePath[] = L"othographies.txt";

    saveValues(filePath);
    readAndDisplayValues(filePath);

    return 0;
}

Ulfhedhin (104)

As far as "pretty much it", I did have a "m_Name" variable that stored a name for each CharacterMatch object but I never used it and ultimately removed it. Beyond that, it is just two wstrings.

Yes, the strings will be small. For example, the longest string I can think of at the moment is 3 characters long. For example, some historical sources have the language written in two to three character syllables separated by dashes (IE "bla-bla-bla"). So adding a few more characters onto that would serve in the "just in case" capacity should I come across something more than that.

Most of the time, it will be a simple character swap. For example, a "d" will become an "a". Next would be a character from an orthography that would become two in the current one. For example, a "g" would become a "th". Next would be two characters which would become one. For example, "dh" would become "ð" (the "eth" character). So most of the time we have 1 to 1, 1 to 2, and 2 to 1.

No, there won't be a large number of values with the same key. Actually, it is very rare (only once or twice) but it does occur.

I'll stick with a text file since you recommend it. Like I said, I was only following a book and had no idea if it would have been the best route to go :).

I'll poke through the code you posted and try to apply it. Unless anything I've said about my objects changes anything as far as your approach.

andywestken (4087)

As your strings are of limited size, you probably want to use wchar_t buffers rather than wstrings. As you're using Visual C++ 2010, you can use a std::array rather than a C-style array.

If you make this change (wstring to array) in the code I posted previously, the functions saveOrthographyToFile and readOrthographyFromFile, and everything after them, stay the same except for the wcout line inside the loop in readAndDisplayValues, where from and to become from() and to(). Changing them to accessors made sense as it's tidier than the alternative, e.g. &(ortho.from[0]), etc.

The Orthography class (changed from struct) does have more work to do. And the insertion operator and parseStr function have also had to change. See below.

I selected a buffer size of 8 wchar_ts as that's the same as wstring uses for its fixed buffer. This gives you up to 7 chars to play with; if that's not enough, then up the value.
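
To show what "up to 7 chars" means in practice, here is a small sketch of a fixed buffer round trip. kBufSize, toBuffer and fromBuffer are names invented for this example; the assumption is that anything longer than the buffer allows should simply be truncated so the terminating zero is never overwritten.

```cpp
#include <array>
#include <cstddef>
#include <string>

const std::size_t kBufSize = 8; // hypothetical size: 7 chars plus the terminator

// Copy a string into a zero-filled buffer, keeping at most kBufSize - 1 chars.
std::array<wchar_t, kBufSize> toBuffer(const std::wstring& s) {
    std::array<wchar_t, kBufSize> buf = std::array<wchar_t, kBufSize>();
    const std::size_t len = s.size() < kBufSize - 1 ? s.size() : kBufSize - 1;
    for (std::size_t i = 0; i < len; ++i)
        buf[i] = s[i];
    return buf;
}

// Rebuild a wstring from the buffer; reading stops at the first L'\0'.
std::wstring fromBuffer(const std::array<wchar_t, kBufSize>& buf) {
    return std::wstring(buf.data());
}
```

An array<wchar_t, 8> holds the same eight elements as wchar_t myArray[8]; the main practical difference is that it can be copied, assigned and returned from functions like any other value.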

Andy

Extra headers:

#include <array>
#include <algorithm>

Changed code (inc new helper function toArray)

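template<size_t N, class V>
array<wchar_t, N> toArray(const V& v) {
    array<wchar_t, N> d = array<wchar_t, N>(); // value-initialized: all zeros
    // copy at most N - 1 chars so the buffer cannot overflow and stays null-terminated
    const size_t len = v.size() < N - 1 ? v.size() : N - 1;
    copy(v.begin(), v.begin() + len, d.begin());
    return d;
}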

class Orthography {
public:
    static const size_t N = 8;

    Orthography()
    : m_from(array<wchar_t, N>()), m_to(array<wchar_t, N>()) {}
    Orthography(const Orthography& that)
    : m_from(that.m_from), m_to(that.m_to) {}
    Orthography(const wstring& f, const wstring& t)
    : m_from(toArray<N>(f)), m_to(toArray<N>(t)) {}

    Orthography& operator=(const Orthography& that) {
        m_from = that.m_from;
        m_to   = that.m_to;
        return *this;
    }

    const wchar_t* from() const {
        return &m_from[0];
    }
    const wchar_t* to() const {
        return &m_to[0];
    }

private:
    array<wchar_t, N> m_from;
    array<wchar_t, N> m_to;
};

wostream& operator<<(wostream& wos, const Orthography& orth) {
    wos << orth.from() << L"/" << orth.to();
    return wos;
}

bool parseStr(const wstring& str, Orthography& orth) {
    size_t pos = str.find(L'/');
    if(pos == str.npos)
        return false;
    wstring f = str.substr(0, pos);
    wstring t = str.substr(pos + 1);
    orth = Orthography(f, t);
    return true;
}

Ulfhedhin (104)

Ok, I am poking through the code and trying to make sense of what you are doing. Goofy question, is all of that code meant to go into a header file?

I've read a bit on overloading operators so this will give me good practice. I'm also a little fuzzy on how "this" works. It was one of those things that I read about but since I've never used it, it hasn't really "sunk in" yet. The same applies to templates.

I've never used array before (as an include) but I think I see what you are doing with it. Is it really as simple as array<wchar_t, N> making an array of wchar_t's of N size (in this case 8)? Is this another way of doing wchar_t myArray[8]?

I may be wrong (still studying the code) but are you filling orthographies with character matches without using the CharacterMatch objects like I was using? If so, I'm thinking that would be better since I wouldn't have to hassle with another class.

Ulfhedhin (104)

Ok, I've had a bit of time to try to wrap my head around your code but I still have some questions.

For example, I'm not sure what this is doing:

Orthography()
    : m_from(array<wchar_t, N>()), m_to(array<wchar_t, N>()) {}

I can tell that this is a constructor but I'm not sure what the colon is for and what is going on after it. I see that m_from and m_to are member variables of type array<wchar_t, N> but I'm not sure of the syntax involved here with what you are doing with the (array<wchar_t, N>()) after them.

I think I've got a handle on this one:

Orthography(const Orthography& that)
    : m_from(that.m_from), m_to(that.m_to) {}

That one references an orthography object and, near as I can tell, you are getting the m_from and m_to from the "that" orthography object and putting them into the m_from and m_to. Again, I am not sure what is going on with the syntax here with the parentheses. What are those parentheses doing in code like m_from(that.m_from)?

This one:

Orthography(const wstring& f, const wstring& t)
    : m_from(toArray<N>(f)), m_to(toArray<N>(t)) {}

This looks to be using the template code you had above:

template<size_t N, class V>
array<wchar_t, N> toArray(const V& v)
{
    array<wchar_t, N> d = array<wchar_t, N>();
    copy(begin(v), end(v), begin(d));
    return d;
}

I'm still pretty new to templates so I'm not entirely sure what you have going on here.

This last function looks like it overloads the = operator.

Orthography& operator=(const Orthography& that)
{
    m_from = that.m_from;
    m_to   = that.m_to;
    return *this;
}

The only thing is I am still fuzzy on what "this" is doing exactly. I've read up on it and everything points to (ha!) saying that it is a hidden pointer to the object you are working with. So my question is what exactly is it doing here? I can see that this function compares member variables but what is it returning when it returns "this"? Since this is an = operator, is it returning something like a true or false value?

The other two functions:

const wchar_t* from() const
{
    return &m_from[0];
}

const wchar_t* to() const
{
    return &m_to[0];
}

Those seem to be simple "get" functions.

I'm still working on the rest of the functions. With all of these constructors and such, I'm fuzzy on how I should implement them. Does this work in the same way that I had before but it is just coded differently? This new code of yours doesn't seem to use the CharacterMatch objects. Do they not need to?

andywestken (4087)

#1

Orthography(const Orthography& that)
    : m_from(that.m_from), m_to(that.m_to) {}

This code is using the constructor initializer list. You should always prefer this form to the equivalent

Orthography(const Orthography& that) {
    m_from = that.m_from;
    m_to = that.m_to;
}

as the members are always initialized in a single step in the former case, whereas they can be initialized and then assigned to in the latter.
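
If you want to see the difference for yourself, here is a small sketch that counts what happens to a member in each form. Tracer, ViaInitList and ViaAssignment are throwaway names made up for this demonstration.

```cpp
// Hypothetical member type that counts how it gets constructed and assigned.
struct Tracer {
    static int defaults;
    static int copies;
    static int assigns;

    Tracer() { ++defaults; }
    Tracer(const Tracer&) { ++copies; }
    Tracer& operator=(const Tracer&) { ++assigns; return *this; }
};
int Tracer::defaults = 0;
int Tracer::copies   = 0;
int Tracer::assigns  = 0;

// Initializer list: the member is copy-constructed in a single step.
struct ViaInitList {
    Tracer t;
    explicit ViaInitList(const Tracer& src) : t(src) {}
};

// Assignment in the body: the member is default-constructed first,
// then assigned to, so two steps happen instead of one.
struct ViaAssignment {
    Tracer t;
    explicit ViaAssignment(const Tracer& src) { t = src; }
};
```

For something like a wstring member the extra step is only wasted work, but const members and reference members cannot be assigned to at all, so for those the initializer list is the only option.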

#2

Orthography()
    : m_from(array<wchar_t, N>()), m_to(array<wchar_t, N>()) {}

Same deal regarding the init list. But the template class array<>'s default constructor does not zero-init the data array it holds. So I am using value initialization to create zeroed temporary instances of array<> and using them to init the member variables.

Value initialization also works with built-in types, e.g.

int i = int(); // set i to 0

Basically, when you use empty brackets after a type which has no explicit constructor, then the variable is "value initialized", which means it's filled with zeros.
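
A few more cases of the same thing, as a quick sketch. Pair and the three helper functions are names invented here just so there is something to call.

```cpp
#include <array>

// Hypothetical plain struct with no constructors of its own.
struct Pair {
    int a;
    int b;
};

// Each function returns a value-initialized temporary, i.e. all zeros.
int zeroInt() { return int(); }
Pair zeroPair() { return Pair(); }
std::array<wchar_t, 8> zeroBuffer() { return std::array<wchar_t, 8>(); }
```

By contrast, a local declared as plain "Pair p;" or "std::array<wchar_t, 8> buf;" is left with indeterminate contents, which is why the constructors above spell out the empty brackets.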

#3

Here...

    Orthography& operator=(const Orthography& that)
    {
        m_from = that.m_from;
        m_to   = that.m_to;
        return *this;
    }

this is a pointer to the instance (it's only hidden in the sense that it's only visible inside the class). So you could code the method as

    Orthography& operator=(const Orthography& that)
    {
        this->m_from = that.m_from;
        this->m_to   = that.m_to;
        return *this;
    }

It is possible to code operator= to return void, for example. But the convention is to return a reference to the instance, as this means that the class will behave like a normal, built-in type, e.g.

int m;
int n;
m = n = 3; // set m and n to 3

It's not syntax I tend to use, but it is allowed by C++. So most people write classes so they follow suit.
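
The same chaining with a class of your own looks like this. Counter is a made-up class for the sketch; the point is only that operator= hands back the object that was assigned to, not a true/false result.

```cpp
// Hypothetical class with a conventional copy assignment operator.
struct Counter {
    int value;

    Counter() : value(0) {}
    explicit Counter(int v) : value(v) {}

    Counter& operator=(const Counter& that) {
        if (this != &that)      // nothing to do when assigning to itself
            value = that.value;
        return *this;           // a reference to the left-hand object
    }
};
```

With that in place, "a = b = c;" is evaluated as "a = (b = c);": the inner assignment returns a reference to b, which then becomes the right-hand side for a.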

Andy

Further reading:

c++ value initialization
http://stackoverflow.com/questions/5697296/c-value-initialization

Ulfhedhin (104)

Will this work without the CharacterMatch class? I'm thinking so based on what you posted but I'm not 100% sure since I'm not sure if I need to completely do away with my code while implementing yours.

How would this work in the program? I'm not quite sure how to use this as far as adding "character matches" (not the CharacterMatch objects that I think I have to do away with). If I am reading it right, each match will be put into their respective arrays and I can get at them via their index, right?

andywestken (4087)

Will this work without the CharacterMatch class?

Not sure, I don't have enough info to go on.

I based the code I wrote on your opening post, which uses a vector<Orthography>.

How do your CharacterMatch and Orthography classes relate to each other??

If I am reading it right, each match will be put into their respective arrays and I can get at them via their index, right?

The writing code writes the contents of the individual Orthography elements in a vector to disk.

The reading code reads the data back, constructs a new Orthography, and adds it to the end of a vector.

So you can get at the elements by index.

Andy

Ulfhedhin (104)

The CharacterMatch class is simply this:

#include "CharacterMatch.h"

// Constructors
CharacterMatch::CharacterMatch()
{

}

CharacterMatch::CharacterMatch(wstring wInput, wstring wOutput)
{
	m_Input = wInput;
	m_Output = wOutput;
}

// Destructor
CharacterMatch::~CharacterMatch()
{

}


// Accessor Functions
wstring CharacterMatch::GetInput() const
{
	return m_Input;
}

wstring CharacterMatch::GetOutput() const
{
	return m_Output;
}

Basically it is just a class that contains two wstring variables and nothing else. I'm thinking that what you've done might negate the need for the class due to the use of the arrays. Or was that just so the information could be saved? It seems to me that if my Orthography object contains two arrays (as you did), I wouldn't need a dedicated CharacterMatch class because the arrays would handle the storage and retrieval of the strings. I'm basing this on my understanding of your constructor:

Orthography(const wstring& f, const wstring& t)
    : m_from(toArray<N>(f)), m_to(toArray<N>(t)) {}

If I'm right, I'm thinking I need to come up with a way to add my character matches (with a "match" being a particular index in the "from" array to the same index in the "to" array). I'll also need to verify that any new "match" isn't a duplicate entry.

Does it sound like I'm in the right ballpark?

andywestken (4087)

Your CharacterMatch looks just like what I assumed your Orthography class would look like.

What is your Orthography, then?

with a "match" being a particular index in the "from" array to the same index in the "to" array

A given Orthography (or CharacterMatch ) should hold both the from and to strings.

The arrays I'm using are just a way of holding short strings (i.e. sequence of a few chars). They are not arrays of strings.
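
On the duplicate check you mentioned, something along these lines might do. This is only a sketch under my own assumptions: Match stands in for your CharacterMatch, addMatch is a made-up helper, and "duplicate" is taken to mean the same from and the same to, so that one input can still map to more than one output.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for CharacterMatch: one match holds both strings.
struct Match {
    std::wstring from;
    std::wstring to;
};

// Append a match unless an identical from/to pair is already stored.
// Returns true if the match was added, false if it was a duplicate.
bool addMatch(std::vector<Match>& matches,
              const std::wstring& from, const std::wstring& to) {
    for (std::size_t i = 0; i < matches.size(); ++i) {
        if (matches[i].from == from && matches[i].to == to)
            return false;
    }
    Match m;
    m.from = from;
    m.to   = to;
    matches.push_back(m);
    return true;
}
```

Each element of the vector is then one complete match, and matches[i].from and matches[i].to always belong together.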

Andy

PS Note that your CharacterMatch should really use the constructor initializer list, like this (though probably laid out with more style than is possible here...)

CharacterMatch::CharacterMatch(wstring wInput, wstring wOutput) :
	m_Input(wInput),  m_Output(wOutput)
{
}

Ulfhedhin (104)

Ahh, ok. I thought your arrays were arrays of "matches", not arrays that held the short strings.

The way I had it set up was I had a CharacterMatch class which, as you saw, only held 2 wstring variables. My Orthography class has a vector array that holds CharacterMatch objects. What has me confused is if the code you posted to save the orthographies to file works with that. You said that my CharacterMatch looks like what you thought my Orthography class looked like. How so? Can these be combined?

I see what you did with the CharacterMatch constructor since it is laid out in the way you said was preferable to:

CharacterMatch::CharacterMatch(wstring wInput, wstring wOutput)
{
	m_Input = wInput;
	m_Output = wOutput;
}

So I'll be changing it right now :). Also, I'll post my Orthography class code (before your additions) along with my CharacterMatch code so you can see where I was (trying to!) go with it.

CharacterMatch.h

class CharacterMatch
{
public:
	// Constructor(s)/Destructor
	CharacterMatch();
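	// An initializer list needs a function body, so this constructor is defined inline here
	// (the out-of-line definition in CharacterMatch.cpp must then be removed).
	CharacterMatch(wstring wInput, wstring wOutput) : m_Input(wInput), m_Output(wOutput) {}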
	~CharacterMatch();

	// Accessor functions
	wstring GetInput() const;
	wstring GetOutput() const;

	// Class functions


private:
	// Members
	wstring m_Input;
	wstring m_Output;
};

Orthography.h

class Orthography
{
public:

	// Constructor(s)/Destructor
	Orthography();
	Orthography(wchar_t* name);
	~Orthography();

	// Accessor functions
	wstring GetName() const;
	vector<CharacterMatch> GetCharacterMatches() const;

	// Class functions
	void AddCharacterMatch(CharacterMatch match);
	//void LoadOrthographies(vector<Orthography>& orthographies);
	//void SaveOrthographies(vector<Orthography>& orthographies);

	template<class Type>
	void SaveOrthographies(vector<Type>& orthographies);

private:
	// Members
	wstring m_Name;

	vector<CharacterMatch> m_CharacterMatches;
	
};
