Needing help with parsing

I am working on a program that can swap out one orthography (writing system) for another to help with language translations. I have an Orthography class which contains a vector array of CharacterMatch objects (see my code for those here: http://www.cplusplus.com/forum/windows/108112/)

I've been working on this program for a couple months and my initial parsing portion of my program works great. However it is hard-coded into the program and I'd like to make it flexible. So after a lot of thinking, trial and error, searching, and some help from the posters here, I think I finally got the code for my Orthography and CharacterMatch classes hammered out to where a user can enter their own character matches into the program. Now what I am having trouble with is coming up with a way to use a particular orthography object and iterate through the character matches to produce the results I want.

I have been staring at this code for days now and really am none the wiser and I feel like I am not seeing an obvious solution.

The idea is for users to enter their own character matches. For example, the user can enter a "g" and have it parse a string and replace all "g" characters with "th". But here is where part of my problem lies. What if a user wants to not only swap out a "g" with a "th" but also another character? I am trying to account for multiple swaps and that means adding more strings to the m_iResults vector array as needed (before, I was doubling them). In short, if there are multiple possible outputs, I'm wanting to show each possible combination.

Hopefully I've explained this well enough. And hopefully my own attempt isn't so off track as to throw off those attempting to help :).

vector<wstring> Parser::OrthographyParser(const wstring& wstrInput, const Orthography& orthography)
{
	// Set the size of the vector array to 1 (m_iCount)
	m_iCount = 0;
	m_Results.resize(m_iCount);

	// Now we check the orthography against itself.  We need to do the multiple
	// letter checks first.
	// We will use the size of the wstrInput string to determine the max number of times to
	// run the loop.

	// Variables to use in this function
	// Count variable
	int cnt = 0;
	int cnt2 = 0;
	wstring temp1;
	wstring temp2;
	int index = 0;

	// Figure out the longest string in the orthography inputs
	vector<CharacterMatch>::const_iterator matchIter;
	for (matchIter = orthography.GetCharacterMatches().begin(); matchIter != orthography.GetCharacterMatches().end(); matchIter++)
	{
		temp1 = matchIter->GetInput();

		if (temp1.size() > cnt)
		{
			cnt = temp1.size();
		}
	}

	// Empty temp
	temp1 = L'';

	// Remove all of the dashes (if any)
	for (int i = 0; i < wstrInput.size(); i++)
	{
		if (wstrInput[i] == L'-')
		{
			temp1.resize(i);
			temp1 += L'';
		}
		else
		{
			temp1.resize(i);
			temp1 += wstrInput[i];
		}
	}

	for (int i = 0; i < temp1.size(); i++)
	{
		for (int j = cnt; j > 0; j--)
		{
			// Create a buffer to hold the portion of temp we are checking against
			vector<wchar_t> buf(STR_LEN);

			// Populate that buffer
			for (int k = (0 + index); k < (cnt + index); k++)
			{
				if (index == 0)
					buf[k] = temp1[k];
				else
					buf[k - index] = temp1[k];
			}

			temp2 = &buf[0];

			vector<CharacterMatch>::const_iterator iter;
			for (iter = orthography.GetCharacterMatches().begin(); iter != orthography.GetCharacterMatches().end(); iter++)
			{
// TRYING TO FIGURE OUT WHAT TO DO IN HERE

				if (iter->GetInput() == temp2)
				{
					

				}

			}
		}

		index++;
	}

	return m_Results;
}
I've been tinkering with this over the past few weeks and I'm still stuck so I'm going to bump this thread again. Although I think I'm better off than I was, I'm still not able to get this to work the way I'm needing it to.

What I am trying to do is have the function find the largest possible string in the Orthography and check for that first. Then I want it to work backwards to the next size, all the way down to one character. So for example, if my largest input string was 3 characters (such as "dah"), it would check the first three characters in the wstrInput string for "dah". If it didn't find one, it would then check the first 2 characters for any 2 character matches in the Orthography. And if it didn't find any, then it would check for any 1 character matches.

That is what I am wanting it to do...cycle through each possible string and if necessary, work backwards. I am trying to keep track of my index position for this but it is proving troublesome. Any help would be GREATLY appreciated.
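The longest-first scan described above can be sketched roughly as follows. This is only an illustration under assumptions, not the Orthography/CharacterMatch code from the thread: MatchTable is a hypothetical stand-in (a std::map from input sequence to a single output), and ReplaceLongestFirst is a made-up name. At each position it tries the longest candidate length that still fits in the remaining text, then shorter ones, and copies the character through unchanged if nothing matches.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for an Orthography: input sequence -> single output.
typedef std::map<std::wstring, std::wstring> MatchTable;

std::wstring ReplaceLongestFirst(const std::wstring& text, const MatchTable& table)
{
    // Length of the longest input sequence in the table.
    std::size_t maxLen = 0;
    for (MatchTable::const_iterator it = table.begin(); it != table.end(); ++it)
        maxLen = std::max(maxLen, it->first.size());

    std::wstring result;
    std::size_t index = 0;
    while (index < text.size())
    {
        bool matched = false;

        // Start with the longest length that still fits in the remaining text,
        // then work backwards down to a single character.
        std::size_t len = std::min(maxLen, text.size() - index);
        for (; len > 0 && !matched; --len)
        {
            MatchTable::const_iterator it = table.find(text.substr(index, len));
            if (it != table.end())
            {
                result += it->second;   // append the replacement
                index += len;           // skip over the matched input
                matched = true;
            }
        }

        if (!matched)
        {
            result += text[index];      // no rule applies: copy the character
            ++index;
        }
    }
    return result;
}
```

With a table containing "wah" -> "wa", "dah" -> "da" and "d" -> "t", the input "wahdahd" would come out as "wadat": the two 3-character rules win where they fit, and the 1-character rule handles the trailing "d".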

Here is my current function:

vector<wstring> Parser::OrthographyParser(const wstring& wstrInput, const Orthography& orthography)
{
	// Set the size of the vector array to 1 (m_iCount)
	m_iCount = 1;
	m_Results.resize(m_iCount);

	// Set the first item in the vector array to an empty string
	m_Results[0] = L"";

	// Map to hold the orth inputs and the number of times each one occurs in the user's input
	map<wstring, int> matches;

	bool match = false;
	bool wstrInputEnd = false;

	int iMatchSize = 0;
	int index = 0;

	// Find the largest input string in the orthography
	vector<CharacterMatch>::const_iterator matchIter1;
	for (matchIter1 = orthography.GetCharacterMatches().begin(); matchIter1 != orthography.GetCharacterMatches().end(); matchIter1++)
	{
		wstring temp;

		temp = matchIter1->GetInput();

		if (temp.size() > iMatchSize)
		{
			iMatchSize = temp.size();
		}
	}

	while(!wstrInputEnd)
	{
		//match = false;

		// Now iterate through the string looking for a match starting with the largest input string and work backwards
		vector<CharacterMatch>::const_iterator matchIter2;
		for (matchIter2 = orthography.GetCharacterMatches().begin(); matchIter2 != orthography.GetCharacterMatches().end(); matchIter2++)
		{
			// We are iterating through the inputs.  We are using the index variable to keep track of where
			// we are in the string.
		
			// Create a string based on the index position and the size we are comparing against.
			wstring temp1;
			wstring temp2 = wstrInput;
			for (int i = 0; i < iMatchSize; i++)
			{
				temp1 += temp2[i + index];
			}

			// Now check the inputs from the orthography against the newly created temp string
			if (matchIter2->GetInput() == temp1)
			{
				// If there was a match, set the match flag to true.
				match = true;

				m_Results[0] += matchIter2->GetOutput();
			}

			// If there was a match, set the index position accordingly.
			if (match)
			{
				index += iMatchSize;
			}

			// If the index is greater or equal to the wstrInput size, exit the loop
			if (index == wstrInput.size())
				wstrInputEnd = true;

			if (iMatchSize > (wstrInput.size() - iMatchSize) && match)
			{
				iMatchSize = wstrInput.size() - iMatchSize;
			}

			match = false;
		}

		// Now we need to see if the inputted string is long enough for another runthrough with
		// the same size
		//iMatchSize--;
	}


	return m_Results;
}
Have a look at the boost library.
There's a parser lib (Spirit) and maybe you can use the regular expressions lib (boost.regex)

Spirit:
http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/introduction.html

regex:
http://www.boost.org/doc/libs/1_54_0/libs/regex/doc/html/boost_regex/introduction_and_overview.html
What you need to do is sort the vector of CharacterMatches so that they start with the longest input and get shorter. That way, when you search from the beginning to the end of the vector of CharacterMatches, you find the longest match first.

In the code below, the sorting is done by a call to a method called Orthography::PrepareForUse(). In practice it would be better done by an internal method which is triggered when you load a file, edit the set, etc. (this might have ramifications for the way the GUI works with the Orthography class?)

The code includes sorting using the std::sort algorithm plus a suitable predicate (provided as a lambda function) and also two ways of finding the string: using std::find_if and "long hand".
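A side note on the sorting step: std::sort does not promise to keep entries that compare equal in their original order, so two matches with the same input (say, two different outputs for "s") may end up in either order relative to each other. If the order in which alternatives were entered matters, std::stable_sort is one option. The sketch below is only an illustration under assumptions: Match is a hypothetical stand-in for CharacterMatch (a pair of input and output strings), and LongerInputFirst and SortMatches are made-up names.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for CharacterMatch: first = input, second = output.
typedef std::pair<std::wstring, std::wstring> Match;

// Longer inputs first; inputs of equal length in descending lexical order.
bool LongerInputFirst(const Match& lhs, const Match& rhs)
{
    if (lhs.first.size() != rhs.first.size())
        return lhs.first.size() > rhs.first.size();
    return lhs.first > rhs.first;
}

void SortMatches(std::vector<Match>& matches)
{
    // stable_sort keeps entries with identical inputs in insertion order.
    std::stable_sort(matches.begin(), matches.end(), LongerInputFirst);
}
```

The comparator takes its arguments by const reference, which is what the standard algorithms expect of a predicate that only inspects the elements.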

I'm not sure I follow your code, but I think it should be doing more or less what you want??

Andy

Where CharacterMatch.h contains the definition of class CharacterMatch from this post:

vector iterators incompatible
http://www.cplusplus.com/forum/windows/108112/#msg587154

#include<iostream>
#include<string>
#include<vector>
#include<map>
#include<array>
#include<algorithm>
#include<cwchar>   // wcslen

#include "CharacterMatch.h"

using namespace std;

#define USE_SORT_ALGORITHM
#define USE_FIND_ALGORITHM

void TestParser();

int main()
{
    TestParser();

    return 0;
}

class Orthography;

// stripped down Parser
class Parser
{
private:
    size_t m_iCount;
    vector<wstring> m_Results;

public:
    Parser() {}
    ~Parser() {}

    vector<wstring> OrthographyParser(const wstring& wstrInput, const Orthography& orthography);
};

// stripped down Orthography
class Orthography
{
private:
    std::wstring m_Name;
    std::vector<CharacterMatch> m_CharacterMatches;

public:
    Orthography(const std::wstring& Name) : m_Name(Name) {}
    ~Orthography(){}

    void AddCharacterMatch(const CharacterMatch& charMatch)
    {
        m_CharacterMatches.push_back(charMatch);
    }

    const vector<CharacterMatch>& GetCharacterMatches() const
    {
        return m_CharacterMatches;
    }

    void PrepareForUse();
};

#ifndef USE_SORT_ALGORITHM
// use function for predicate
bool CompareInputLenThenVal(const CharacterMatch& lhs, const CharacterMatch& rhs)
{
    const wstring& strL = lhs.GetInput();
    const wstring& strR = rhs.GetInput();
    return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
}
#endif

void Orthography::PrepareForUse()
{
#ifdef USE_SORT_ALGORITHM
    // use C++ lambda function for predicate
    auto CompareInputLenThenVal = [](const CharacterMatch& lhs, const CharacterMatch& rhs) -> bool
    {
        const wstring& strL = lhs.GetInput();
        const wstring& strR = rhs.GetInput();
        return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
    };
#endif

    // predicate evaluates to true if the LHS CharacterMatch's input string is longer
    // than that of the RHS, or if the strings have equal length and the LHS string
    // is lexically greater than the RHS.
    sort(m_CharacterMatches.begin(), m_CharacterMatches.end(), CompareInputLenThenVal);
}

vector<wstring> Parser::OrthographyParser(const wstring& wstrInput, const Orthography& orthography)
{
    // Set the size of the vector array to 1 (m_iCount)
    m_iCount = 1;
    m_Results.resize(m_iCount); // TODO get rid of m_iCount and use m_Results.size() / resize() instead

    // Set the first item in the vector array to an empty string
    m_Results[0] = L"";

#if 0 // not used
    // Map to hold the orth inputs and the number of times each one occurs in the user's input
    map<wstring, int> matches;
#endif

    size_t index = 0; // size_t better than int here

    const vector<CharacterMatch>& charMatches = orthography.GetCharacterMatches();

#ifdef USE_FIND_ALGORITHM
    auto MatchFirstInput = [&wstrInput, &index] (const CharacterMatch& charMatch) -> bool
    {
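        const wchar_t* input = charMatch.GetInput();
        const size_t length = wcslen(input);
        // Only compare if the whole input fits in what is left of wstrInput;
        // otherwise equal() would read past the end of the string.
        return (length <= wstrInput.size() - index)
            && equal(input, input + length, wstrInput.begin() + index);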
    };
#endif

    while(index < wstrInput.size())
    {
        // typedef, to save typing
        typedef vector<CharacterMatch>::const_iterator iter_type;

#ifdef USE_FIND_ALGORITHM
        // using find_if algorithm
        iter_type matchIter_match = find_if(charMatches.begin(), charMatches.end(), MatchFirstInput);
#else
        // long hand...
        // use iter rather than bool
        iter_type matchIter_match = charMatches.end();

        // Now iterate through the string looking for a match starting with the largest input string
        // and work backwards (THIS ASSUMES VECTOR IS CORRECTLY SORTED)
        for (iter_type matchIter2 = charMatches.begin(); matchIter2 != charMatches.end(); matchIter2++)
        {
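            const wchar_t* input = matchIter2->GetInput();
            const size_t length = wcslen(input);
            // Skip inputs longer than the remaining text so equal() stays in bounds
            if((length <= wstrInput.size() - index)
                && equal(input, input + length, wstrInput.begin() + index))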
            {
                matchIter_match = matchIter2;
                break;
            }
        }
#endif

        // Then...
        if (matchIter_match != charMatches.end())
        {
            // If there was a match, append the "output" value to the result and adjust
            // the index position accordingly.
            const wchar_t* input  = matchIter_match->GetInput();
            const wchar_t* output = matchIter_match->GetOutput();
            m_Results[0] += output;
            index += wcslen(input);
        }
        else
        {
            // If there was no match found, just copy current char across and update
            // the index by one and try again...
            m_Results[0] += wstrInput[index];
            ++index;
        }
    }


    return m_Results;
}

void TestParser()
{
    Orthography ortho(L"Test");
    ortho.AddCharacterMatch(CharacterMatch(L"oo" , L"u"));
    ortho.AddCharacterMatch(CharacterMatch(L"u" , L"w"));
    ortho.AddCharacterMatch(CharacterMatch(L"i" , L"y"));
    ortho.AddCharacterMatch(CharacterMatch(L"sc" , L"sh"));
    ortho.AddCharacterMatch(CharacterMatch(L"rs" , L"rz"));
    ortho.AddCharacterMatch(CharacterMatch(L"s" , L"zz"));
    ortho.AddCharacterMatch(CharacterMatch(L"sch", L"sk"));
    ortho.AddCharacterMatch(CharacterMatch(L"ith" , L"iz"));

    {
        const vector<CharacterMatch>& charMatches = ortho.GetCharacterMatches();

        wcout << L"Test char matches" << endl;
        vector<CharacterMatch>::const_iterator matchIter;
        for (matchIter = charMatches.begin(); matchIter != charMatches.end(); ++matchIter)
        {
            const wchar_t* input  = matchIter->GetInput();
            const wchar_t* output = matchIter->GetOutput();
            wcout << input << L" -> " << output << endl;
        }
        wcout << endl;
    }

    ortho.PrepareForUse();

    {
        const vector<CharacterMatch>& charMatches = ortho.GetCharacterMatches();

        wcout << L"Sorted char matches" << endl;
        vector<CharacterMatch>::const_iterator matchIter;
        for (matchIter = charMatches.begin(); matchIter != charMatches.end(); ++matchIter)
        {
            const wchar_t* input  = matchIter->GetInput();
            const wchar_t* output = matchIter->GetOutput();
            wcout << input << L" -> " << output << endl;
        }
        wcout << endl;
    }

    {
        wstring text = L"school is out, let's scoot! down with teachers!";

        Parser parser;
        vector<wstring> results = parser.OrthographyParser(text, ortho);

        wcout << L"input text:" << endl;
        wcout << text << endl;
        wcout << endl;

        wcout << L"output text:" << endl;
        const size_t count = results.size();
        for(size_t index = 0; index < count; ++index)
        {
            wcout << results[index] << endl;
        }
        wcout << endl;
    }
}


Test char matches
oo -> u
u -> w
i -> y
sc -> sh
rs -> rz
s -> zz
sch -> sk
ith -> iz

Sorted char matches
sch -> sk
ith -> iz
sc -> sh
rs -> rz
oo -> u
u -> w
s -> zz
i -> y

input text:
school is out, let's scoot! down with teachers!

output text:
skul yzz owt, let'zz shut! down wiz teacherz!
It is working! Thank you!!! And I don't blame you if you had trouble following my code. Looks like I was way off LOL. And your code is VERY advanced for me so I don't feel too bad for not solving this immediately. But I'm sure going to study it! To be honest I am surprised that I got it working as easily as I did. I must be getting better :).

The only thing I am missing now is a way to double my output like I did with my hard-coded orthographies. For example, let's say I have these matches in an orthography:

wah -> wa
dah -> da
dah -> dą

And then I input "wahdah". It should come out with these results:

wada
wadą

I accomplished this with my hard-coded orthographies with this function:

void Parser::DoubleOutput(wstring str1, wstring str2)
{
	// Take in the wstring variables that will be outputted
	// Multiply the count by 2
	m_iCount *= 2;

	// Resize both string vector arrays
	m_Results.resize(m_iCount);

	// Copy the contents to both halves of the newly doubled array
	int cnt = 0;
	for (int i = ((m_iCount / 2)); i < (m_iCount); i++)
	{
		m_Results[i] = m_Results[cnt];
		cnt++;
	}

	for (int i = 0; i < (m_iCount / 2); i++)
	{
		m_Results[i] += str1;
		m_Results[i + (m_iCount / 2)] += str2;
	}
}


But after going over that, it doesn't look like it would work here. Any suggestions?

I had an error come up but I can't get it to happen again. At first I thought it was from trying to enter a letter that I didn't have in the orthography but that wasn't it. I'll keep an eye out for it.

And thanks again!
When you find a match (i.e. matchIter_match != charMatches.end()), you can look at the CharacterMatch entries that follow it. As they're sorted, all the matches for dah will be next to each other, and the search will have found the first of them.

Do you have to handle more than two in some cases? Or only ever two?

Andy
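To make the "look at the following entries" idea concrete, here is a minimal sketch under assumptions. Match is a hypothetical stand-in for CharacterMatch (a pair of input and output strings) and CollectOutputs is a made-up helper name. It relies on the vector already being sorted so that entries with the same input sit next to each other, and on `first` being the position of the first entry that matched.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for CharacterMatch: first = input, second = output.
typedef std::pair<std::wstring, std::wstring> Match;

// Starting at the first matching entry, gather the outputs of every
// neighbouring entry that shares the same input string.
std::vector<std::wstring> CollectOutputs(const std::vector<Match>& sorted,
                                         std::size_t first)
{
    std::vector<std::wstring> outputs;
    for (std::size_t i = first;
         i < sorted.size() && sorted[i].first == sorted[first].first;
         ++i)
    {
        outputs.push_back(sorted[i].second);
    }
    return outputs;
}
```

The returned vector has one element when the input has a single output, two when it should double the results, and so on, so the caller does not need to special-case the count.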
The idea behind this is to show various possibilities to help with translating the text I put into the program. What I'm wanting the program to do is take in terms as they are spelled from various sources (usually each source has its own orthography) and return possibilities in the current system. So to do that, I am wanting to have, if necessary, multiple possible outputs for me to study.

Each time there is a multiple possibility, the number of results needs to double. I hope I'm making sense with this. I'll try to show another example.

Let's say I have an orthography that has 2 outputs for the "g" character and 2 outputs for the "p" character.

g -> th
g -> ð
p -> p (IE it stays the same)
p -> b

Now if I input "gapa" into my program, the m_Results need to be:

thapa
ðapa
thaba
ðaba

See what I'm after now? That is what my last code snippet did with my other hard-coded orthographies and it worked great. It has gotten tricky now that I'm trying to make it more flexible by having the user submit their own custom orthographies.

With these multiple possibilities being outputted, I can look at them and look for patterns I am more familiar with that could help me with translations.
I knew what you were after; we discussed this doubling mechanism in an earlier thread.

I was just wondering if one input sequence might be mapped to 3 or even more output sequences. The code could equally well be made to triple, quadruple, ... the output, rather than just double it.

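The suggestion I made in my last post still stands; lines 149-152 of my code (the four lines that fetch the input and output, append the output to m_Results[0], and advance index) need to be expanded to look at the CharacterMatch's which follow the one which was found.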

Handling just one or two output strings should be easy; it'll take a bit more work to handle more than two but it shouldn't be that hard.

Andy
I think I see what you are saying. I'll play around with it and see what I can come up with :).
Ok, I've been looking at this for the past few days and I just can't get a handle on it. I don't know if it is just a case of the code being so foreign to me (it is a lot to take in LOL) that it will just take time to get used to it and understand it or if I'm at some kind of block where I'm locked in to the way I did it before and my mind is trying to shoehorn my old code to work here somehow.

Right now, as near as I can tell, the code stops when it finds a match and doesn't continue through the CharacterMatches to see if there are any more matches. What I don't understand is how to tell it to keep looking all the way through the rest of the matches. I think part of what is messing me up is that I am not familiar with the #ifdef, #else, #if, and #endif directives and how they work.

I figure the code needs to keep searching through the CharacterMatches to try to find another match. THEN I can worry about trying to double the output.

What do you think?
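On the preprocessor directives mentioned above: they choose which lines the compiler sees at all, before any of the code runs. The small sketch below is illustrative only; USE_SHORT_GREETING, Greeting and Disabled are made-up names, not anything from the parser code.

```cpp
#include <string>

// Hypothetical switch for this sketch; comment it out to select the other branch.
#define USE_SHORT_GREETING

std::string Greeting()
{
#ifdef USE_SHORT_GREETING
    // Compiled only when USE_SHORT_GREETING is defined.
    return "hi";
#else
    // Compiled only when USE_SHORT_GREETING is NOT defined.
    return "hello there";
#endif
}

std::string Disabled()
{
#if 0
    // "#if 0" switches a block off without deleting it; this line is never compiled.
    return "never compiled";
#endif
    return "still here";
}
```

So in the posted parser, defining USE_FIND_ALGORITHM means only the find_if version of the search is compiled, and the "long hand" loop after #else is skipped entirely. The two are alternatives, not steps that both run.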
The code does need to carry on the search.

But it continues from the first match, using the iterator returned by the find operation, rather than trying to continue the search using find.

Remember that this relies on the CharacterMatches being correctly sorted, as I mentioned previously.

Then

anywestken wrote:
lines 149-152 of my code need to be expanded to look at the CharacterMatch's which follow the one which was found.

That is

        // etc

        // Then...
        if (matchIter_match != charMatches.end())
        {
            // If there was a match, append the "output" value to the result and adjust
            // the index position accordingly.
            const wchar_t* input  = matchIter_match->GetInput();
            const wchar_t* output = matchIter_match->GetOutput();
            m_Results[0] += output;
            index += wcslen(input);
        }
        else
        {
            // If there was no match found, just copy current char across and update
            // the index by one and try again...
            m_Results[0] += wstrInput[index];
            ++index;
        }

        // etc 


needs to be replaced by

        // etc

        // Then...
        if (matchIter_match != charMatches.end())
        {
            // If there was a match, append the "output" value to the result and adjust
            // the index position accordingly.
            const wchar_t* input  = matchIter_match->GetInput();

            const wchar_t* output_1 = matchIter_match->GetOutput();
            const wchar_t* output_2 = nullptr;
            
            ++matchIter_match; // increment iterator

            // if there is another CharacterMatch following the one you found
            // and if it has the same input
            // then get it, too
            if(    (matchIter_match != charMatches.end())
                && (0 == wcscmp(input, matchIter_match->GetInput())))
                output_2 = matchIter_match->GetOutput();

            if(output_2 == nullptr)
            {
                size_t count = m_Results.size();
                for(size_t i = 0; i < count; ++i)
                    m_Results[i] += output_1;
            }
            else
            {
                DoubleOutput(output_1, output_2);
            }

            index += wcslen(input);
        }
        else
        {
            // If there was no match found, just copy current char across and update
            // the index by one and try again...
            size_t count = m_Results.size();
            for(size_t i = 0; i < count; ++i)
                m_Results[i] += wstrInput[index];

            ++index;
        }

        // etc 


Note that this only handles the case where you have one or two possible outputs for any given input.

If you need to generalize it to work with an arbitrary number of possible outputs, you need to rework this code to use a loop and modify your DoubleOutput method to do more than just double.

Andy
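One possible shape for the generalisation mentioned above, as a sketch under assumptions rather than the thread's actual code: MultiplyOutput is a hypothetical name, and it takes the results vector as a parameter instead of using the m_Results and m_iCount members. Given N existing results and K alternative outputs it produces N * K results, in the same order DoubleOutput uses (all results with the first alternative, then all with the second, and so on).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generalisation of DoubleOutput: every existing result is
// extended once per alternative.
void MultiplyOutput(std::vector<std::wstring>& results,
                    const std::vector<std::wstring>& alternatives)
{
    if (alternatives.empty())
        return;  // nothing to append, leave the results untouched

    const std::size_t oldCount = results.size();
    std::vector<std::wstring> expanded;
    expanded.reserve(oldCount * alternatives.size());

    // Outer loop over alternatives, inner loop over the old results.
    for (std::size_t a = 0; a < alternatives.size(); ++a)
        for (std::size_t r = 0; r < oldCount; ++r)
            expanded.push_back(results[r] + alternatives[a]);

    results.swap(expanded);
}
```

With one alternative this simply appends to every result, with two it behaves like the doubling above, and with three or more it triples, quadruples, etc., so the single-output and double-output branches could in principle collapse into one call.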
Output and code.

Andy

Test char matches
oo -> u
u -> w
u -> VV
i -> y
sc -> sh
rs -> rz
s -> zz
s -> CC
sch -> sk
ith -> iz

Sorted char matches
sch -> sk
ith -> iz
sc -> sh
rs -> rz
oo -> u
u -> w
u -> VV
s -> zz
s -> CC
i -> y

input text:
school is out, let's scoot! down with teachers!

output text:
skul yzz owt, let'zz shut! down wiz teacherz!
skul yCC owt, let'zz shut! down wiz teacherz!
skul yzz oVVt, let'zz shut! down wiz teacherz!
skul yCC oVVt, let'zz shut! down wiz teacherz!
skul yzz owt, let'CC shut! down wiz teacherz!
skul yCC owt, let'CC shut! down wiz teacherz!
skul yzz oVVt, let'CC shut! down wiz teacherz!
skul yCC oVVt, let'CC shut! down wiz teacherz!


Part 1 of 2 (now too long for a single post...)

#include<iostream>
#include<string>
#include<vector>
#include<map>
#include<array>
#include<algorithm>
#include<cwchar>   // wcslen, wcscmp

#include "CharacterMatch.h"

using namespace std;

#define USE_SORT_ALGORITHM
#define USE_FIND_ALGORITHM

void TestParser();

int main()
{
    TestParser();

    return 0;
}

class Orthography;

// stripped down Parser
class Parser
{
private:
    size_t m_iCount;
    vector<wstring> m_Results;

    void DoubleOutput(const wchar_t* str1, const wchar_t* str2);

public:
    Parser() {}
    ~Parser() {}

    vector<wstring> OrthographyParser(const wstring& wstrInput, const Orthography& orthography);
};

// stripped down Orthography
class Orthography
{
private:
    std::wstring m_Name;
    std::vector<CharacterMatch> m_CharacterMatches;

public:
    Orthography(const std::wstring& Name) : m_Name(Name) {}
    ~Orthography(){}

    void AddCharacterMatch(const CharacterMatch& charMatch)
    {
        m_CharacterMatches.push_back(charMatch);
    }

    const vector<CharacterMatch>& GetCharacterMatches() const
    {
        return m_CharacterMatches;
    }

    void PrepareForUse();
};

#ifndef USE_SORT_ALGORITHM
// use function for predicate
bool CompareInputLenThenVal(const CharacterMatch& lhs, const CharacterMatch& rhs)
{
    const wstring& strL = lhs.GetInput();
    const wstring& strR = rhs.GetInput();
    return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
}
#endif

void Orthography::PrepareForUse()
{
#ifdef USE_SORT_ALGORITHM
    // use C++ lambda function for predicate
    auto CompareInputLenThenVal = [](const CharacterMatch& lhs, const CharacterMatch& rhs) -> bool
    {
        const wstring& strL = lhs.GetInput();
        const wstring& strR = rhs.GetInput();
        return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
    };
#endif

    // predicate evaluates to true if the LHS CharacterMatch's input string is longer
    // than that of the RHS, or if the strings have equal length and the LHS string
    // is lexically greater than the RHS.
    sort(m_CharacterMatches.begin(), m_CharacterMatches.end(), CompareInputLenThenVal);
}

vector<wstring> Parser::OrthographyParser(const wstring& wstrInput, const Orthography& orthography)
{
    // Set the size of the vector array to 1 (m_iCount)
    m_iCount = 1;
    m_Results.resize(m_iCount); // TODO get rid of m_iCount and use m_Results.size() / resize() instead

    // Set the first item in the vector array to an empty string
    m_Results[0] = L"";

#if 0 // not used
    // Map to hold the orth inputs and the number of times each one occurs in the user's input
    map<wstring, int> matches;
#endif

    size_t index = 0; // size_t better than int here

    const vector<CharacterMatch>& charMatches = orthography.GetCharacterMatches();

#ifdef USE_FIND_ALGORITHM
    auto MatchFirstInput = [&wstrInput, &index] (const CharacterMatch& charMatch) -> bool
    {
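        const wchar_t* input = charMatch.GetInput();
        const size_t length = wcslen(input);
        // Only compare if the whole input fits in what is left of wstrInput;
        // otherwise equal() would read past the end of the string.
        return (length <= wstrInput.size() - index)
            && equal(input, input + length, wstrInput.begin() + index);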
    };
#endif

    while(index < wstrInput.size())
    {
        // typedef, to save typing
        typedef vector<CharacterMatch>::const_iterator iter_type;

#ifdef USE_FIND_ALGORITHM
        // using find_if algorithm
        iter_type matchIter_match = find_if(charMatches.begin(), charMatches.end(), MatchFirstInput);
#else
        // long hand...
        // use iter rather than bool
        iter_type matchIter_match = charMatches.end();

        // Now iterate through the string looking for a match starting with the largest input string
        // and work backwards (THIS ASSUMES VECTOR IS CORRECTLY SORTED)
        for (iter_type matchIter2 = charMatches.begin(); matchIter2 != charMatches.end(); matchIter2++)
        {
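            const wchar_t* input = matchIter2->GetInput();
            const size_t length = wcslen(input);
            // Skip inputs longer than the remaining text so equal() stays in bounds
            if((length <= wstrInput.size() - index)
                && equal(input, input + length, wstrInput.begin() + index))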
            {
                matchIter_match = matchIter2;
                break;
            }
        }
#endif

        // Then...
        if (matchIter_match != charMatches.end())
        {
            // If there was a match, append the "output" value to the result and adjust
            // the index position accordingly.
            const wchar_t* input  = matchIter_match->GetInput();

            const wchar_t* output_1 = matchIter_match->GetOutput();
            const wchar_t* output_2 = nullptr;
            
            ++matchIter_match; // increment iterator

            // if there is another CharacterMatch following the one you found
            // and if it has the same input
            // then get it, too
            if(    (matchIter_match != charMatches.end())
                && (0 == wcscmp(input, matchIter_match->GetInput())))
                output_2 = matchIter_match->GetOutput();

            if(output_2 == nullptr)
            {
                size_t count = m_Results.size();
                for(size_t i = 0; i < count; ++i)
                    m_Results[i] += output_1;
            }
            else
            {
                DoubleOutput(output_1, output_2);
            }

            index += wcslen(input);
        }
        else
        {
            // If there was no match found, just copy current char across and update
            // the index by one and try again...
            size_t count = m_Results.size();
            for(size_t i = 0; i < count; ++i)
                m_Results[i] += wstrInput[index];

            ++index;
        }
    }

    return m_Results;
}
Part 2 of 2

void Parser::DoubleOutput(const wchar_t* str1, const wchar_t* str2) // was using wstring
{
    // str1 and str2 are the two alternative outputs to append.
    // Multiply the count by 2
    m_iCount *= 2;

    // Resize the results vector to the doubled size
    m_Results.resize(m_iCount);

    // Copy the contents to both halves of the newly doubled array
    size_t cnt = 0; // was int (similarly for i)
    for (size_t i = ((m_iCount / 2)); i < (m_iCount); i++)
    {
        m_Results[i] = m_Results[cnt];
        cnt++;
    }

    for (size_t i = 0; i < (m_iCount / 2); i++)
    {
        m_Results[i] += str1;
        m_Results[i + (m_iCount / 2)] += str2;
    }
}

void TestParser()
{
    Orthography ortho(L"Test");
    ortho.AddCharacterMatch(CharacterMatch(L"oo" , L"u"));
    ortho.AddCharacterMatch(CharacterMatch(L"u" , L"w"));  // doubled
    ortho.AddCharacterMatch(CharacterMatch(L"u" , L"VV")); // doubled
    ortho.AddCharacterMatch(CharacterMatch(L"i" , L"y"));
    ortho.AddCharacterMatch(CharacterMatch(L"sc" , L"sh"));
    ortho.AddCharacterMatch(CharacterMatch(L"rs" , L"rz"));
    ortho.AddCharacterMatch(CharacterMatch(L"s" , L"zz")); // doubled
    ortho.AddCharacterMatch(CharacterMatch(L"s" , L"CC")); // doubled
    ortho.AddCharacterMatch(CharacterMatch(L"sch", L"sk"));
    ortho.AddCharacterMatch(CharacterMatch(L"ith" , L"iz"));

    {
        const vector<CharacterMatch>& charMatches = ortho.GetCharacterMatches();

        wcout << L"Test char matches" << endl;
        vector<CharacterMatch>::const_iterator matchIter;
        for (matchIter = charMatches.begin(); matchIter != charMatches.end(); ++matchIter)
        {
            const wchar_t* input  = matchIter->GetInput();
            const wchar_t* output = matchIter->GetOutput();
            wcout << input << L" -> " << output << endl;
        }
        wcout << endl;
    }

    ortho.PrepareForUse();

    {
        const vector<CharacterMatch>& charMatches = ortho.GetCharacterMatches();

        wcout << L"Sorted char matches" << endl;
        vector<CharacterMatch>::const_iterator matchIter;
        for (matchIter = charMatches.begin(); matchIter != charMatches.end(); ++matchIter)
        {
            const wchar_t* input  = matchIter->GetInput();
            const wchar_t* output = matchIter->GetOutput();
            wcout << input << L" -> " << output << endl;
        }
        wcout << endl;
    }

    {
        wstring text = L"school is out, let's scoot! down with teachers!";

        Parser parser;
        vector<wstring> results = parser.OrthographyParser(text, ortho);

        wcout << L"input text:" << endl;
        wcout << text << endl;
        wcout << endl;

        wcout << L"output text:" << endl;
        const size_t count = results.size();
        for(size_t index = 0; index < count; ++index)
        {
            wcout << results[index] << endl;
        }
        wcout << endl;
    }
}
Ok, I've finally been able to work this into my code, and I have an error. It may stem from my not being sure that I'm integrating your code with mine correctly. You've put everything into a single file, which is throwing me off a bit when it comes to placing things in my separate files.

As of right now, the error I'm getting is when I am trying to use the PrepareForUse() function. You have it after you output the matches to the screen. But from what I can see, it needs to go in my OrthographyParser function. The orthography gets passed in and you mentioned that it needs to be sorted before it parses, right? With that in mind, I get a red line under "orthography" in my line:

orthography.PrepareForUse();

That says "object has type qualifiers that are not compatible with the member function." This line is in my Parser cpp file, inside OrthographyParser. If I comment out this line, it doesn't double the output. I'm wondering if this is because the orthography hasn't been sorted.

Here are my Orthography and Parser files with your code placed in it.

Orthography.h
#ifndef ORTHOGRAPHY_H
#define ORTHOGRAPHY_H

#define _CRT_SECURE_NO_WARNINGS

//=====================================================
// Includes
//=====================================================
#include "CharacterMatch.h"
#include <vector>

#define USE_SORT_ALGORITHM
#define USE_FIND_ALGORITHM

class Orthography
{
public:

	// Constructor(s)
	Orthography();
	Orthography(const std::wstring& name);

	// Destructor
	~Orthography();

	// Accessor functions
	const std::wstring& GetName() const;
	const std::vector<CharacterMatch>& GetCharacterMatches() const;

    // Class functions
	bool AddCharacterMatch(const CharacterMatch& match);
	void DeleteCharacterMatch(const std::wstring& matchName);

	// Storage functions
	bool WriteToStream(std::wostream& wos) const;
	bool ReadFromStream(std::wistream& wis);

	void PrepareForUse();

private:

	std::wstring m_Name;
	std::vector<CharacterMatch> m_CharacterMatches;
	
};


void WriteOrthographiesToFile(const std::vector<Orthography>& orthographies, const std::wstring& filepath);
void ReadOrthographiesFromFile(std::vector<Orthography>& orthographies, const std::wstring& filepath);





#endif // ORTHOGRAPHY_H 


Orthography.cpp
#include "Orthography.h"
#include <algorithm> // sort
#include <cwchar>    // wcslen
#include <fstream>
using namespace std;

const wchar_t orthoBegin[] = L"BEGIN ORTHOGRAPHY:";
const wchar_t orthoEnd  [] = L"END ORTHOGRAPHY";

// Constructor
Orthography::Orthography()
{
	
}

Orthography::Orthography(const wstring& name) : m_Name(name)
{
	// m_Name is already set by the initializer list above
}

// Destructor
Orthography::~Orthography()
{

}


#ifndef USE_SORT_ALGORITHM
// use function for predicate
bool CompareInputLenThenVal(const CharacterMatch& lhs, const CharacterMatch& rhs)
{
    const wstring& strL = lhs.GetInput();
    const wstring& strR = rhs.GetInput();
    return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
}
#endif

void Orthography::PrepareForUse()
{
#ifdef USE_SORT_ALGORITHM
    // use C++ lambda function for predicate
    auto CompareInputLenThenVal = [](const CharacterMatch& lhs, const CharacterMatch& rhs) -> bool
    {
        const wstring& strL = lhs.GetInput();
        const wstring& strR = rhs.GetInput();
        return ((strL.length() > strR.length()) || ((strL.length() == strR.length()) && (strL > strR)));
    };
#endif

    // predicate evaluates to true if the LHS CharacterMatch's input string is longer
    // than that of the RHS, or if the strings have equal length and the LHS string
    // is lexically greater than the RHS.
    sort(m_CharacterMatches.begin(), m_CharacterMatches.end(), CompareInputLenThenVal);
}



// Accessor Functions
const wstring& Orthography::GetName() const
{
	return m_Name;
}

const vector<CharacterMatch>& Orthography::GetCharacterMatches() const
{
	return m_CharacterMatches;
}


// Class functions
bool Orthography::AddCharacterMatch(const CharacterMatch& match)
{
	// Make sure that the user isn't trying to enter a duplicate match
	vector<CharacterMatch>::iterator iter;
	for (iter = m_CharacterMatches.begin(); iter != m_CharacterMatches.end(); iter++)
	{
		if (iter->GetName() == match.GetName())
		{
			return false;
		}
	}

	m_CharacterMatches.push_back(match);

	return true;
}

void Orthography::DeleteCharacterMatch(const wstring& matchName)
{
	vector<CharacterMatch>::iterator iter;
	for (iter = m_CharacterMatches.begin(); iter != m_CharacterMatches.end(); iter++)
	{
		if (iter->GetName() == matchName)
		{
			m_CharacterMatches.erase(iter);
			break;
		}
	}
}

bool Orthography::WriteToStream(wostream& wos) const
{
	bool ret_val = true;
	wos << orthoBegin << m_Name << endl;
	// from saveOrthographyToFile
	const size_t count = m_CharacterMatches.size();
	for(size_t index = 0; count > index; ++index)
		wos << m_CharacterMatches[index] << endl;
	// end
	wos << orthoEnd << endl;
	return ret_val;
}

bool Orthography::ReadFromStream(wistream& wis)
{
	bool ret_val = true;
	bool got_end = false;
	{
		wstring line;
		getline(wis, line);
		if(0 == line.find(orthoBegin))
			m_Name = line.substr(wcslen(orthoBegin));
		else
			ret_val = false;
	}
	if(ret_val)
	{
		// based on readOrthographyFromFile
		wstring line;
		while(getline(wis, line)) {
			if(line == orthoEnd) {
				got_end = true;
				break;
			}
			CharacterMatch chmatch;
			ret_val = parseStr(line, chmatch);
			if(!ret_val)
				break;
			m_CharacterMatches.push_back(chmatch);
		}
	}
	if(ret_val) {
		ret_val = got_end;
	}
	return ret_val;
}

void ReadOrthographiesFromFile(vector<Orthography>& orthographies, const wstring& filePath)
{
	FILE* fp = _wfopen(filePath.c_str(), L"r,ccs=UNICODE");
	if (!fp)
		return; // file could not be opened; leave orthographies unchanged
	wifstream ifs(fp);

	for( ; ; )
	{
		Orthography orth;
		if(!orth.ReadFromStream(ifs))
			break;
		orthographies.push_back(orth);
	}

	fclose(fp);
}

void WriteOrthographiesToFile(const vector<Orthography>& orthographies, const wstring& filePath)
{
	FILE* fp = _wfopen(filePath.c_str(), L"w,ccs=UNICODE");
	if (!fp)
		return; // file could not be opened; nothing is written
	wofstream ofs(fp);

	const size_t count = orthographies.size();
	for(size_t index = 0; count > index; ++index)
	{
		const Orthography& orth = orthographies[index];
		orth.WriteToStream(ofs);
	}

	fclose(fp);
}


Next post will have the Parser code.
Parser.h
#ifndef PARSER_H
#define PARSER_H

#include "Orthography.h"
using namespace std;

class Parser
{
public:
	// Constructor(s)/Destructor
	Parser();
	~Parser();

	// Accessor functions
	int GetCount();
	vector<wstring> GetResults();

	// Class functions
	vector<wstring> OrthographyParser(const wstring& wstrInput, const Orthography& orthography);
	
	void DoubleOutput(const wchar_t* str1, const wchar_t* str2);

	void SetCount(int count);

private:
	// Member variables
	size_t m_iCount;
	vector<wstring> m_Results;
};


#endif // PARSER_H 


Parser.cpp
#include "Parser.h"
#include <algorithm> // find_if, equal
#include <cwchar>    // wcslen, wcscmp

// Constructor
Parser::Parser()
{
	m_iCount = 0;
}

// Destructor
Parser::~Parser()
{

}


// Accessor functions =================================
int Parser::GetCount()
{
	return m_iCount;
}

vector<wstring> Parser::GetResults()
{
	return m_Results;
}


// Class functions ====================================

// Set the count
void Parser::SetCount(int count)
{
	m_iCount = count;
}



void Parser::DoubleOutput(const wchar_t* str1, const wchar_t* str2) // was using wstring
{
    // Take in the wstring variables that will be outputted
    // Multiply the count by 2
    m_iCount *= 2;

    // Resize both string vector arrays
    m_Results.resize(m_iCount);

    // Copy the contents to both halves of the newly doubled array
    size_t cnt = 0; // was int (similarly for i)
    for (size_t i = ((m_iCount / 2)); i < (m_iCount); i++)
    {
        m_Results[i] = m_Results[cnt];
        cnt++;
    }

    for (size_t i = 0; i < (m_iCount / 2); i++)
    {
        m_Results[i] += str1;
        m_Results[i + (m_iCount / 2)] += str2;
    }
}



vector<wstring> Parser::OrthographyParser(const wstring& wstrInput, const Orthography& orthography)
{
    // Set the size of the vector array to 1 (m_iCount)
    m_iCount = 1;
    m_Results.resize(m_iCount); // TODO get rid of m_iCount and use m_Results.size() / resize() instead

    // Set the first item in the vector array to an empty string
    m_Results[0] = L"";

#if 0 // not used
    // Map to hold the orth inputs and the number of times each one occurs in the user's input
    map<wstring, int> matches;
#endif

    size_t index = 0; // size_t better than int here

    const vector<CharacterMatch>& charMatches = orthography.GetCharacterMatches();

	// orthography.PrepareForUse(); // I THINK this goes here!!! ERROR: object has type qualifiers that are not compatible with the member function

#ifdef USE_FIND_ALGORITHM
    auto MatchFirstInput = [&wstrInput, &index] (const CharacterMatch& charMatch) -> bool
    {
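        const wchar_t* input = charMatch.GetInput();
        const size_t len = wcslen(input);
        // Reject empty inputs and inputs longer than the remaining text, so that
        // equal() never reads past the end of wstrInput.
        return (len > 0) && (len <= wstrInput.size() - index)
            && equal(input, input + len, wstrInput.begin() + index);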
    };
#endif

    while(index < wstrInput.size())
    {
        // typedef, to save typing
        typedef vector<CharacterMatch>::const_iterator iter_type;

#ifdef USE_FIND_ALGORITHM
        // using find_if algorithm
        iter_type matchIter_match = find_if(charMatches.begin(), charMatches.end(), MatchFirstInput);
#else
        // long hand...
        // use iter rather than bool
        iter_type matchIter_match = charMatches.end();

        // Now iterate through the string looking for a match starting with the largest input string
        // and work backwards (THIS ASSUMES VECTOR IS CORRECTLY SORTED)
        for (iter_type matchIter2 = charMatches.begin(); matchIter2 != charMatches.end(); matchIter2++)
        {
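            const wchar_t* input = matchIter2->GetInput();
            const size_t len = wcslen(input);
            // skip empty inputs and inputs longer than the remaining text
            if((len > 0) && (len <= wstrInput.size() - index)
                && equal(input, input + len, wstrInput.begin() + index))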
            {
                matchIter_match = matchIter2;
                break;
            }
        }
#endif

        // Then...
        if (matchIter_match != charMatches.end())
        {
            // If there was a match, append the "output" value to the result and adjust
            // the index position accordingly.
            const wchar_t* input  = matchIter_match->GetInput();

            const wchar_t* output_1 = matchIter_match->GetOutput();
            const wchar_t* output_2 = nullptr;
            
            ++matchIter_match; // increment iterator

            // if there is another CharacterMatch following the one you found
            // and if it has the same input
            // then get it, too
            if(    (matchIter_match != charMatches.end())
                && (0 == wcscmp(input, matchIter_match->GetInput())))
                output_2 = matchIter_match->GetOutput();

            if(output_2 == nullptr)
            {
                size_t count = m_Results.size();
                for(size_t i = 0; i < count; ++i)
                    m_Results[i] += output_1;
            }
            else
            {
                DoubleOutput(output_1, output_2);
            }

            index += wcslen(input);
        }
        else
        {
            // If there was no match found, just copy current char across and update
            // the index by one and try again...
            size_t count = m_Results.size();
            for(size_t i = 0; i < count; ++i)
                m_Results[i] += wstrInput[index];

            ++index;
        }
    }

    return m_Results;
}


Now what I do in my program is create a couple global variables in my main.cpp file:
// Create global parser object
Parser g_Parser;

// Vector array to hold the user-created orthographies
vector<Orthography> g_Orthographies;


Now when the parse button is pressed, to make a long story short, it gets to this line :) :
// Determine which orthography will be used.
vector<Orthography>::iterator orthIter;
for (orthIter = g_Orthographies.begin(); orthIter != g_Orthographies.end(); orthIter++)
{
	if (orthIter->GetName() == &buf[0])
	{
		g_Parser.OrthographyParser(input, *orthIter);
	}
}


That will find the appropriate orthography selected in the dropdown menu and feed it into the OrthographyParser function. I'm just not sure that I'm implementing your code the way it needs to be. Any idea what I need to tweak here?

And by the way, thanks for ALL the help you've given!
You're getting the error "object has type qualifiers that are not compatible with the member function" because you're trying to call a non-const method on an Orthography that has been passed to the function OrthographyParser by const reference. An object accessed through a const reference cannot have its internal state changed, and changing it is obviously required for a sort.
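
To make the const issue concrete, here is a minimal sketch. `Table`, `CountItems` and `SortTable` are made-up names standing in for Orthography and the functions that receive it; they are not from the code in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Orthography, reduced to the part that matters here.
class Table
{
public:
    void Add(const std::wstring& s) { m_items.push_back(s); }

    // Non-const: it reorders m_items, so it cannot be called through a const reference.
    void PrepareForUse() { std::sort(m_items.begin(), m_items.end()); }

    // Const: read-only access is allowed on const objects.
    const std::vector<std::wstring>& Items() const { return m_items; }

private:
    std::vector<std::wstring> m_items;
};

// Takes the table by const reference, as OrthographyParser does.
std::size_t CountItems(const Table& table)
{
    // table.PrepareForUse(); // would not compile: discards the const qualifier
    return table.Items().size(); // fine: Items() is a const member function
}

// Takes the table by non-const reference, so modifying calls are allowed.
void SortTable(Table& table)
{
    table.PrepareForUse();
}
```

Uncommenting the marked line should reproduce the same kind of error the compiler is reporting for orthography.PrepareForUse().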

I didn't say that the PrepareForUse() function needed to be called just before parsing; I said the CharacterMatches had to be sorted before they were used to process your string.

andywestken wrote:
In the code below, the sorting is done by a call to a method called Orthography::PrepareForUse(). In practice it would be better done by an internal method which is triggered when you load a file, edit the set, etc (this might have possible ramifications for the way the GUI works with the Orthography class?)

I did the sort after populating the test orthography.

I expected you to sort the CharacterMatches (a) after they were loaded from file and (b) after they've been edited.
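
A rough sketch of what "sort after load and after edit" could look like when the class does it internally. `Match`, `SortedMatches`, `Add` and `Load` are hypothetical simplifications, not the real Orthography interface; only the ordering rule is taken from PrepareForUse().

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified match: just an input and an output string.
struct Match
{
    std::wstring input;
    std::wstring output;
};

class SortedMatches
{
public:
    // Editing path: re-sort after every change so callers never see unsorted data.
    void Add(const Match& m)
    {
        m_matches.push_back(m);
        Sort();
    }

    // Loading path: take everything read from a file, then sort once.
    void Load(std::vector<Match> loaded)
    {
        m_matches = std::move(loaded);
        Sort();
    }

    // Const accessor, so a parser can still receive the object by const reference.
    const std::vector<Match>& Get() const { return m_matches; }

private:
    // Longest input first; equal lengths ordered by descending string value,
    // the same rule as the predicate in PrepareForUse().
    void Sort()
    {
        std::sort(m_matches.begin(), m_matches.end(),
                  [](const Match& lhs, const Match& rhs)
                  {
                      if (lhs.input.length() != rhs.input.length())
                          return lhs.input.length() > rhs.input.length();
                      return lhs.input > rhs.input;
                  });
    }

    std::vector<Match> m_matches;
};
```

With this arrangement the parsing function never needs to modify the object, so its const reference parameter can stay as it is.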

But if it's easier for you to sort it just before use, then pass it to OrthographyParser by non-const ref, i.e. alter the method signature to
vector<wstring> OrthographyParser(const wstring& wstrInput, Orthography& orthography);

(this might need you to ripple this change through your code?)

And using unsorted CharacterMatches may well mean no doubling, depending on the order. If you manually order them, it should work though.
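
To illustrate why the order matters, here is a small sketch of the lookup step on its own. `Rule` and `OutputsAt` are made-up names; the function mimics what the parser does at one position: take the first rule whose input matches, plus the rule right after it if that one has the same input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified rule: input text and its replacement.
struct Rule
{
    std::wstring input;
    std::wstring output;
};

// Returns the outputs chosen at position pos: the first matching rule's output,
// plus the next rule's output if that rule has the same input.
// Assumes pos is not past the end of text.
std::vector<std::wstring> OutputsAt(const std::vector<Rule>& rules,
                                    const std::wstring& text,
                                    std::size_t pos)
{
    std::vector<std::wstring> outputs;
    for (std::size_t i = 0; i < rules.size(); ++i)
    {
        const std::wstring& in = rules[i].input;
        // compare() clips the substring at the end of text, so a rule that is
        // longer than the remaining text simply fails to match.
        if (!in.empty() && text.compare(pos, in.length(), in) == 0)
        {
            outputs.push_back(rules[i].output);
            if (i + 1 < rules.size() && rules[i + 1].input == in)
                outputs.push_back(rules[i + 1].output);
            break;
        }
    }
    return outputs;
}
```

With the rules in the order "s", "sch", "s", the text "school" matches the first "s" rule, the following rule is "sch" rather than the second "s", and so only one output comes back and "sch" is never considered. Sorted longest first, "sch" is tried first and the two "s" rules sit next to each other.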

I haven't debugged your code this time, but I did diff it against my test. The code looks fine to me (pretty much in step.) So the only immediate issue is the sorting.

Andy
I think I see what you're saying. I'm going to try to go with your initial idea (having them sorted after they are loaded and after they've been edited) since, well, I'm thinking you know best LOL :).

So with that being the case, would it be as simple as including the PrepareForUse() function inside these functions:

//=======================================================
// Save and Load functions
//=======================================================
void SaveValues(const wstring& filePath)
{
    WriteOrthographiesToFile(g_Orthographies, filePath);
}

void ReadAndDisplayValues(const wstring& filePath)
{
    vector<Orthography> orthographies;

    //wcout << L"Read data from file : " << filePath << endl;
    ReadOrthographiesFromFile(orthographies, filePath);
    //wcout << endl;

    ios_base::fmtflags old_flags = wcout.setf(ios_base::left, ios_base::adjustfield);

    //wcout << L"Data read:" << endl
          //<< endl;
    const size_t ortho_count = orthographies.size();
    for(size_t ortho_index = 0; ortho_count > ortho_index; ++ortho_index)
	{
        Orthography& ortho = orthographies[ortho_index];
        // << L"Orthography : " << ortho.GetName() << endl;

		// Add the orthography to the g_Orthographies vector



        vector<CharacterMatch> chmatches = ortho.GetCharacterMatches();
        const size_t chmatch_count = chmatches.size();
        for(size_t chmatch_index = 0; chmatch_count > chmatch_index; ++chmatch_index)
		{
            CharacterMatch& chmatch = chmatches[chmatch_index];
            //wcout << setw(8) << chmatch.GetName() << L" : "
                  //<< setw(2) << chmatch.GetInput() << L" -> "
                  //<< setw(2) << chmatch.GetOutput() << endl;
        }
        //wcout << endl;
    }
    //wcout << endl;

	g_Orthographies = orthographies;

    wcout.setf(old_flags);
}


It is late for me right now but at the moment I am thinking that perhaps I would iterate through the orthographies and sort them before they are saved so they would be saved in sorted form and when they are loaded they would be sorted again (just in case). Does that sound like what you were thinking?
It is late for me right now but at the moment I am thinking that perhaps I would iterate through the orthographies and sort them before they are saved so they would be saved in sorted form and when they are loaded they would be sorted again (just in case). Does that sound like what you were thinking?


Yes

Plus sorting after the orthographies are edited using your app (does it support this functionality?)

Andy
Bah, looks like the post I wrote the other day didn't go through. Well, to recap, I said that I was able to get it working and that I'd test it out by trying to break it LOL. I'd also typed out a very elaborate and heartfelt thank you :).

Well, I tried to break it and failed. It is working beautifully! So for now I am just tweaking things here and there and trying to polish things up a bit. I think I am going to take a good hard look at the code you provided so I can try to understand how it works. Then I am going to try to use it to work with not only orthographies but also linguistic shifting. So I'm going to try to figure out how to get those functions to save orthographies to one file and linguistic shifting to another. That'll give me a great opportunity to work with your code. Speaking of which, just so I can gauge my "newbiness"...how advanced was that code you provided? I'm trying to see how far along I am :).

I may also try to toy around with inputting text from a file as well as having each word as its own drop-down box populated with each of the possible combinations from the parse (we talked about this in passing in another thread). I may also see about implementing a linguistic alphabet as well. In short, I'm going to keep working with this program to continue to get more experience programming while at the same time using it now that it is technically ready for the task I made it for.

Again, thank you VERY much for your time and patience!!!! :)
how advanced was that code you provided?

It's pretty routine code. Once you have a good grasp of the basic usage of the standard containers, iterators, and algorithms (along with the comparable elements of the C++ language) then you could see yourself as an intermediate programmer.

There were other places I would have used algorithms, but avoided them to keep things easier to follow.
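
As one possible illustration (not the code from this thread), here is a sketch of how standard algorithms could collect every output for a matched input, which would also allow more than two outputs per input. `Rule`, `CollectOutputs` and `Expand` are hypothetical names, and the sketch assumes the rules are already sorted so that equal inputs are adjacent.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical simplified rule: input text and its replacement.
struct Rule
{
    std::wstring input;
    std::wstring output;
};

// Collects the outputs of the whole run of adjacent rules that share the
// input of *first (assumes equal inputs are adjacent, i.e. the rules are sorted).
std::vector<std::wstring> CollectOutputs(std::vector<Rule>::const_iterator first,
                                         std::vector<Rule>::const_iterator last)
{
    std::vector<std::wstring> outputs;
    if (first == last)
        return outputs;

    const std::wstring& input = first->input;
    // find_if locates the end of the run: the first rule with a different input.
    auto runEnd = std::find_if(first, last,
                               [&input](const Rule& r) { return r.input != input; });
    // transform copies each rule's output into the result vector.
    std::transform(first, runEnd, std::back_inserter(outputs),
                   [](const Rule& r) { return r.output; });
    return outputs;
}

// Replaces results with every combination of an existing result and an output.
// The layout matches DoubleOutput: all results with the first output come first.
void Expand(std::vector<std::wstring>& results, const std::vector<std::wstring>& outputs)
{
    std::vector<std::wstring> expanded;
    expanded.reserve(results.size() * outputs.size());
    for (const std::wstring& out : outputs)
        for (const std::wstring& res : results)
            expanded.push_back(res + out);
    results.swap(expanded);
}
```

With exactly two outputs, Expand produces the same result as DoubleOutput; with one output it just appends to every string.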

Andy

PS If you haven't already done so, you should eliminate the separate count member variable (m_iCount) and use the value returned by vector<>::size()
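
A minimal sketch of that change, written as a free function that takes the results vector as a parameter (an assumption made only to keep the example self-contained; in the real class it would operate on m_Results):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Doubles results in place without a separate count variable: the original
// entries get str1 appended, and a copy of each gets str2 appended.
void DoubleOutput(std::vector<std::wstring>& results,
                  const wchar_t* str1, const wchar_t* str2)
{
    const std::size_t oldCount = results.size(); // the vector already knows its size
    results.reserve(oldCount * 2);               // avoid reallocation while appending
    for (std::size_t i = 0; i < oldCount; ++i)
    {
        results.push_back(results[i] + str2);    // copy for the second half
        results[i] += str1;                      // original stays in the first half
    }
}
```

The resulting layout is the same as before: the first half ends in str1 and the second half ends in str2.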