Trying to get at a string index from a specific string in a string array

store a user's character matches / so other users can create their own swapping systems / I'm thinking a data file will work fine for now

Sounds totally feasible. A couple of approaches even occur to me off the top of my head.

Are you familiar with INI files? And maybe the INF files used by one of the Windows installers (there are loads of them in C:\Windows\inf), which use the same sort of format? In this format, you'd end up with something like:

; test.rules
;
; example ini-file style data file

[rules]
rule.nasals
rule.v_with_w

[rule.nasals]
mapping_type = replace_after
mapping_trigger = n
mappings = rule.nasals.mappings

[rule.nasals.mappings]
a = ą
i = į
u = ų

[rule.v_with_w]
mapping_type = replace_after
mapping_trigger = v
mappings = rule.v_with_w.mappings

[rule.v_with_w.mappings]
v = w


Which you read using the Win32 calls GetPrivateProfileSection, GetPrivateProfileString, etc.
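
To give a feel for the data those calls hand back, here is a portable sketch that parses the same layout by hand with the standard library. It is not how the Win32 functions work internally, and the names IniSection, IniData, Trim and ParseIni are made up for illustration. It only understands [section] headers, key = value lines, bare keys, and ; comments, and it uses narrow strings, so a real rules file containing characters like ą would need a wide or UTF-8 aware variant.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types and names for illustration; not part of any Windows API.
typedef std::vector<std::pair<std::string, std::string> > IniSection;
typedef std::map<std::string, IniSection> IniData;

// Strip leading and trailing spaces, tabs and carriage returns.
std::string Trim(const std::string& s)
{
    const char* ws = " \t\r";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse INI-style text into sections of (key, value) pairs, keeping file order.
// A line without '=' (like "rule.nasals" under [rules]) gets an empty value.
IniData ParseIni(const std::string& text)
{
    IniData data;
    std::string section; // lines before the first [section] go under ""
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
    {
        line = Trim(line);
        if (line.empty() || line[0] == ';')
            continue; // blank line or comment
        if (line[0] == '[' && line[line.size() - 1] == ']')
        {
            section = Trim(line.substr(1, line.size() - 2));
            data[section]; // make sure empty sections exist too
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            data[section].push_back(std::make_pair(line, std::string()));
        else
            data[section].push_back(
                std::make_pair(Trim(line.substr(0, eq)), Trim(line.substr(eq + 1))));
    }
    return data;
}
```

Fed the example file above, ParseIni would give a "rules" section listing the rule names and one section per rule, which the program could then walk to build its mappings.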

(Writing this raises a number of questions. If you want it to be generalisable, then you need to be able to handle a reasonable number of conditions and swaps: swapping a single char with a pair (and vice versa); swapping one longer sequence with another?; triggering (not sure what term to use here?) on a leading char; or on a trailing char; or even on a leading and trailing char. Hmmm... Do you know about regular expressions?)
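
For a taste of what a regular expression can do here, this is a small sketch using std::wregex from the <regex> header. It assumes one possible reading of the replace_after rule in the data file above (an 'a' directly after the trigger 'n' becomes ą), which may not be what the real rule should mean. NasalizeAfterN is a made-up name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replace every 'a' that directly follows an 'n'
// with a-ogonek (U+0105). This is only one possible reading of the
// "replace_after" rule sketched in the data file.
std::wstring NasalizeAfterN(const std::wstring& input)
{
    // Match "n" followed by "a"; the 'n' is captured so it can be put back.
    const std::wregex pattern(L"(n)a");
    // $1 is the captured 'n'; \u0105 is a-ogonek.
    return std::regex_replace(input, pattern, std::wstring(L"$1\u0105"));
}
```

The same pattern could be built at run time from the trigger and mapping read out of the rules file, which is where regular expressions start to pay for themselves.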

I have taken care of the global variable issue

:-)

I also tried messing with a full-blown windows application

You did create a Win32 Application, yes? (Not a Windows Forms application.)

If you just want the app to look more or less like Notepad it's pretty easy.

Are you using Visual Studio 2010 Professional? You mentioned the toolbox in another post, which isn't provided with the Express edition, so I guess so.

If that is the case, you could also use MFC or WTL to implement your app, rather than just plain Win32.

Andy

INI file
http://en.wikipedia.org/wiki/INI_file

GetPrivateProfileString function
http://msdn.microsoft.com/en-us/library/windows/desktop/ms724353%28v=vs.85%29.aspx

GetPrivateProfileSection function
http://msdn.microsoft.com/en-us/library/windows/desktop/ms724348%28v=vs.85%29.aspx
I tweaked the code in #msg568993 to remove some unnecessary code:
http://www.cplusplus.com/forum/windows/105326/#msg568993

for-loop on line 66 was

                // Now cycle through first half of the strings and nasalize last char
                for (size_t k = 0; k < (iCount_old); ++k)
                {
                    if (k < (iCount_old))
                    {
                        // replace with a with ogonek
                        wstring& result = Results[k];
                        size_t pos_last = result.length() - 1;
                        result[pos_last] = ch_nasal;
                    }
                }


now

                // Now cycle through first half of the strings and nasalize last char
                for (size_t k = 0; k < (iCount_old); ++k)
                {
                    // replace with a with ogonek
                    wstring& result = Results[k];
                    size_t pos_last = result.length() - 1;
                    result[pos_last] = ch_nasal;
                }


The if-condition was left over from when the parser changed the "a" to either an "ą" or an "aa". It's not needed now that the choice is just "ą" or "a". (In the older version of the parser, the loop termination condition was iCount = iCount_old * 2.)

Andy
About all that I know about INI and INF files is that they exist. I've tweaked the values in INI files in the past but only rarely. I've never seen code like the rules and mapping stuff you posted but I think I see the idea behind it. I don't know how to implement something like that, though. And I have no idea what "regular expressions" are :). I suspect I'll find out though :).

As far as the global variable issue, when I said I'd solved it, that was a bit premature. I'd set up my functions to use wstring& as you suggested. I tested it out with one of the more complicated conversions and it seemed to work fine. It was at that point that I said I'd taken care of the global issue. But when I tried it with smaller strings (the one I tried was a 9 character long string), it kept telling me it was out of range. I have no idea why it would work with a longer string but not a shorter one and no matter how much I poked through the code, I couldn't see where the problem was. But when I went back to feeding wchar_t* into those functions instead of wstring&, it started working fine again. Not sure what is going on there.

I'll give you two functions that would be used in my program.

// Main parsing function.  This function will determine
// which parser to use based on user input.
void Parser::MainParser(wchar_t* wstrInput, int str_len, int iSource)
{
	switch (iSource)
	{
	case PHONETIC: // Phonetic
		Phonetic(wstrInput, str_len);
		break;

	case SOURCE1: // Source1 orthography
		Source1(wstrInput, str_len);
		break;

	case SOURCE2: // Source2 orthography
		Source2(wstrInput, str_len);
		break;

	case SOURCE3: // Source3 orthography
		Source3(wstrInput, str_len);
		break;

	case SOURCE4: // Source4 orthography
		Source4(wstrInput, str_len);
		break;
	}
}


That function determines which function will be used. Here is my doubling function in case it is needed:

// Function to double the output strings and append the inputted strings
// to the ends of each.
void Parser::DoubleOutput(wstring str1, wstring str2)
{
	// Take in the wstring variables that will be outputted
	// Multiply the count by 2
	m_iCount *= 2;

	// Resize both string vector arrays
	m_Results.resize(m_iCount);

	// Copy the contents to both halves of the newly doubled array
	int cnt = 0;
	for (int i = ((m_iCount / 2)); i < (m_iCount); i++)
	{
		m_Results[i] = m_Results[cnt];
		cnt++;
	}

	for (int i = 0; i < (m_iCount / 2); i++)
	{
		m_Results[i] += str1;
		m_Results[i + (m_iCount / 2)] += str2;
	}
}


And here is one of my specific orthography functions.

// Function for SOURCE4 orthography.
vector<wstring> Parser::Source4(wchar_t* wstrInput, int str_len)
{
	// Set the size of the vector array to 1 (m_iCount)
	m_iCount = 1;
	m_Results.resize(m_iCount);

	// Set the first item in the vector array to an empty string
	m_Results[0] = L"";

	// Loop through the string and convert as necessary
	for (int i = 0; i < str_len; i++)
	{
		// Cycle through each character
		// Vowels first
		if (wstrInput[i] == 'a')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"e";
			}
		}
		else if (wstrInput[i] == 'l')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"a";
			}
		}
		else if (wstrInput[i] == 'd')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"a";
			}
		}
		else if (wstrInput[i] == 'e')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"i";
			}
		}
		else if (wstrInput[i] == 'y')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"ai"; // diphthong
			}
		}
		else if (wstrInput[i] == 'i')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"i";
			}
		}
		else if (wstrInput[i] == 'o')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"o";
			}
		}
		else if (wstrInput[i] == 'w')
		{

			// Double the output with these characters
			DoubleOutput(L"u", L"w");

		}
		else if (wstrInput[i] == 'u')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += 0x0105; // nasal a (with ogonek)
			}
		}
		else if (wstrInput[i] == 'v')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"aw";
			}
		}

		// Now we're getting into the consonants
		else if (wstrInput[i] == 'p')
		{

			// Double the output with these characters
			DoubleOutput(L"b", L"p");

		}
		else if (wstrInput[i] == 'c')
		{

			// Double the output with these characters
			DoubleOutput(L"ch", L"j");

		}
		else if (wstrInput[i] == 't')
		{

			// Double the output with these characters
			DoubleOutput(L"t", L"d");

		}
		else if (wstrInput[i] == 'f')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"ng";
			}
		}
		else if (wstrInput[i] == 'g')
		{
			//wstring eth = L"0";
			//eth[0] = 0x00F0;
			// Double the output with these characters
			DoubleOutput(L"th", L"ð"); // eth

		}
		else if (wstrInput[i] == 'h')
		{

			// Double the output with these characters
			DoubleOutput(L"h", L"x");

		}
		else if (wstrInput[i] == 'j')
		{

			// Double the output with these characters
			DoubleOutput(L"s", L"sh");

		}
		else if (wstrInput[i] == 'k')
		{

			// Double the output with these characters
			DoubleOutput(L"g", L"k");

		}
		else if (wstrInput[i] == 'm')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"m";
			}
		}
		else if (wstrInput[i] == 'n')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"n";
			}
		}
		else if (wstrInput[i] == 'r')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"r";
			}
		}
		else if (wstrInput[i] == 's')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"s";
			}
		}
		else if (wstrInput[i] == 'x')
		{
			for (int j = 0; j < m_iCount; j++)
			{
				m_Results[j] += L"x";
			}
		}
		else
		{
			m_Results.resize(1);
			m_Results[0] = L"Characters not recognized!";
			return m_Results;
		}
	}

	// Is the text box blank?  Tell them.  May rework this to a message box.
	if (m_Results[0] == L"")
		m_Results[0] = L"You didn't enter any characters!";

	return m_Results;
}


I still need to get rid of the "magic numbers" :). But if you take those functions and replace wchar_t* with wstring& in their arguments, for whatever reason it has the problem I mentioned above. But if I keep it as it is here, it works with no "out of range" issues.

I'll look into those links you posted and see what I can get out of them :).

And when I tried to make a program earlier, I was trying a Windows Forms Application. Should I go with a Win32 project? And yes, I am using Visual Studio 2010 Professional.
You're already using std::vector; are you also familiar with std::map? Or iterators? Or how to use the table lookup approach?

Parser::MainParser() looks fine -- though maybe you could come up with more descriptive names than Source1, ... (or is that all you've got???)

Parser::DoubleOutput could be recoded to use std::copy, if you want.
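
Something along these lines, as a rough sketch. It is written as a free function that takes the results vector by reference, because the Parser class itself isn't shown here; that signature is an assumption, and in the real code the vector would be m_Results and the count would come from its size.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Double the number of result strings: the second half becomes a copy of
// the first half, then str1 is appended to the first half and str2 to the
// second half. (Free-function stand-in for Parser::DoubleOutput.)
void DoubleOutput(std::vector<std::wstring>& results,
                  const std::wstring& str1,
                  const std::wstring& str2)
{
    const std::size_t old_count = results.size();
    results.resize(old_count * 2);

    // Copy [0, old_count) onto [old_count, 2 * old_count).
    // The iterators are taken after resize(), so they are valid.
    std::copy(results.begin(), results.begin() + old_count,
              results.begin() + old_count);

    for (std::size_t i = 0; i < old_count; ++i)
    {
        results[i] += str1;
        results[i + old_count] += str2;
    }
}
```

Taking the count from results.size() also means there is no separate m_iCount to keep in step with the vector.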

But Parser::Source4 is way too long for what it's doing. If you used either a map or a lookup table, you could fold the function right, right down. And make it less susceptible to bugs, too.

What you need to do is look up what to do with a char: e.g. 'a' -> append "e"; 'y' -> append "ai"; 'w' -> double output "u", "w"; etc.

Then you can call the appropriate helper function. You already have DoubleOutput (if I read it right?), so you just need e.g.

void AppendOutput(const wstring& str)
{
	for (int j = 0; j < m_iCount; j++)
	{
		m_Results[j] += str;
	}
}
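
Putting the lookup idea together with the two helpers, a table-driven version might look roughly like this. It's only a sketch: Mapping, kMappings, FindMapping and Convert are names I've made up, the table holds just a handful of the rows from Parser::Source4, and it uses free functions rather than Parser members.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One row per input character. 'second' is 0 when the character has a
// single rendering (append) and non-null when it has two (double the output).
struct Mapping
{
    wchar_t        from;
    const wchar_t* first;
    const wchar_t* second;
};

// A few rows taken from Parser::Source4; the full table would list them all.
const Mapping kMappings[] =
{
    { L'a', L"e",  0    },
    { L'y', L"ai", 0    },
    { L'w', L"u",  L"w" },
    { L'p', L"b",  L"p" },
    { L'm', L"m",  0    },
};
const std::size_t kMappingCount = sizeof(kMappings) / sizeof(kMappings[0]);

// Linear search of the table; returns 0 if the character has no row.
const Mapping* FindMapping(wchar_t ch)
{
    for (std::size_t i = 0; i < kMappingCount; ++i)
    {
        if (kMappings[i].from == ch)
            return &kMappings[i];
    }
    return 0;
}

// Build every candidate output for 'input'.
// Returns false if the input contains a character that is not in the table.
bool Convert(const std::wstring& input, std::vector<std::wstring>& results)
{
    results.assign(1, std::wstring());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const Mapping* m = FindMapping(input[i]);
        if (m == 0)
            return false;

        const std::size_t old_count = results.size();
        if (m->second != 0)
        {
            // Two renderings: duplicate the results gathered so far.
            results.resize(old_count * 2);
            for (std::size_t j = 0; j < old_count; ++j)
                results[j + old_count] = results[j];
        }
        for (std::size_t j = 0; j < old_count; ++j)
        {
            if (m->second != 0)
                results[j + old_count] += m->second;
            results[j] += m->first;
        }
    }
    return true;
}
```

Adding a new character then means adding one row to the table instead of another else-if branch.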


Regarding Win32 project / Windows Forms Application : the former is for normal C++ code (with SendMessage, etc.), the latter for C++/CLI. You can mix the two (a C++/CLI GUI with native algorithmic code, for example), but that might get a bit confusing if you're learning both at the same time.

Basically, for the code you've been discussing here on cplusplus.com, you need a Win32 app.

Andy

PS Note that when you do this:

wstrInput[i] == 'v'

where wstrInput is a wchar_t*, then the compiler is promoting the char 'v' to a wchar_t and then comparing it with the wchar_t in wstrInput.

For example, if I do this

wcout << 'ą' << endl;

then Visual C++ warns me (my source file is Unicode):
warning C4566: character represented by universal-character-name '\u0105'
cannot be represented in the current code page (1252)

Need to use

wcout << L'ą' << endl;

Also this (an int literal)

0x0105; // nasal a (with ogonek)

should really be (a wchar_t literal)

L'\x0105'; // nasal a (with ogonek)
I am a little familiar with iterators. Haven't really had the opportunity to use them. Or perhaps I have had the opportunity but have done it the hard way by not using them LOL. As far as std::map, no, sure haven't used that before. I haven't used "table lookup" either. I'll look into them and see if I can swap my code out and use those. The same goes for std::copy as well. This has been a heckuva learning experience for me and I get introduced to new things every time you mention something.

I do have more descriptive names for my functions but I changed them when I posted them here since their actual names would allow those on the internet to deduce who I am if they looked hard enough :).

I am going to see about the Win32 project. You are right in that the Windows Forms Application had all kinds of gibberish that I am not familiar with. One thing at a time :).

Yea, I still need to swap out the "magic numbers" and standardize some things. I'll post the new (hopefully more streamlined!) function in a bit. Any thoughts on why the program was yelling at me when I was trying to use wstring&? You'd mentioned that wstring& would be a better way to go and I would definitely like to do that if I can. All this wchar_t* converting to wstring converting to LPTWSTR or whatever it is converting to whatever else I need is very confusing LOL.
iterators / maps / table lookup / regular expressions

Given the kind of code you're working on, these could all be of use at some point.

since their actual names would allow those on the internet to deduce who I am if they looked hard enough :).

Ah, ok

magic numbers

I think that using the literal characters in your code might not be such a bad thing, as that's the whole point of your code: to deal with character mappings. You just need to be sure your source files use a suitable encoding.

Andy

PS Regarding the out of range error: the assert should tell you what index you were trying to use. The problem is that the error is still there when you use a wchar_t*, but it is not spotted by the debug checks.

You should switch back to the wstring and then, when you hit the assert, look at the call stack to work out which bit of your code is calling the string operator[]. And check the assert, or the operator[] call itself, to see what index is being passed.
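
A portable way to see the same effect is wstring::at(), which throws std::out_of_range on a bad index in any build. The sketch below just assumes the bad index comes from a separately passed length that is larger than the string; I can't tell from the posted code whether that is the actual cause. FirstBadIndex is a made-up name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a parser loop that trusts a separately passed
// length. With a wchar_t* an oversized str_len reads past the end silently;
// with wstring::at() the same mistake throws std::out_of_range.
// Returns the first index that is out of range, or -1 if every index is valid.
int FirstBadIndex(const std::wstring& input, int str_len)
{
    for (int i = 0; i < str_len; ++i)
    {
        try
        {
            (void)input.at(static_cast<std::size_t>(i)); // range-checked access
        }
        catch (const std::out_of_range&)
        {
            return i; // this is the index the debug check would complain about
        }
    }
    return -1;
}
```

Temporarily swapping operator[] for at() in the parser and catching the exception is another way to find the offending index if the call stack is hard to read.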
I've never really used asserts (which is probably a bad thing LOL) so I don't know much about their implementation.

I looked into maps and how to iterate through them. Would what you are talking about be as simple as doing something like this?

map<wstring, wstring> orthography;
orthography[L"g"] = L"th";
// and so on for each character 


And then iterate through them like this?
map<wstring, wstring>::iterator it = orthography.find(L"g");
if (it != orthography.end())
    m_Results[0] += it->second; // append the mapped text to a result string


I haven't quite nailed it down in my mind but I can't shake the feeling that there would be a clever way to use the for loop with this and feed the wstrInput into it somehow.
The problem with map::operator[] is that it will always return a value for a key, even if a value has not been previously set. If you need to find out if a value has been added to the map, you need to use map::find().

map::find()
http://www.cplusplus.com/reference/map/map/find/
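
A tiny sketch of the difference, with made-up names (MappingTable, HasKeyViaFind, ValueViaIndex):

```cpp
#include <map>
#include <string>

typedef std::map<wchar_t, std::wstring> MappingTable; // hypothetical name

// Looks a key up with find(); the map is left untouched on a miss.
bool HasKeyViaFind(const MappingTable& table, wchar_t key)
{
    return table.find(key) != table.end();
}

// Looks a key up with operator[]; on a miss this inserts an empty
// wstring for the key and returns it.
std::wstring ValueViaIndex(MappingTable& table, wchar_t key)
{
    return table[key];
}
```

Note that ValueViaIndex has to take a non-const reference: operator[] can modify the map, so it isn't available on a const map at all.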

And if you need to guard against overwriting existing entries, you should use map::insert.

map::insert()
http://www.cplusplus.com/reference/map/map/insert/

Note that I've written out the example below long hand. In practice, I (like a lot of people) use typedefs to compact my code. e.g. (but with more domain relevant names.)

    // snip

    typedef map<wchar_t, wstring> map_type;
    typedef map_type::iterator    map_type_iter;

    map_type mappings;

    // snip

    for (int i = 0; i < map_count; i++)
    {
        pair<map_type_iter, bool> ret = mappings.insert(make_pair(map_from[i], map_to[i]));

    // snip 


Andy

#include <cstdlib>  // _countof (Visual C++ specific)
#include <iostream>
#include <map>
#include <string>
#include <utility>  // pair, make_pair
using namespace std;

int main()
{
    const wchar_t  map_from[] = {L'l', L'd', L'e', L'y' , L'd'};
    const wchar_t* map_to  [] = {L"a", L"a", L"i", L"ai", L"val for dupe key" };
    const size_t map_count = _countof(map_from); // should not really use parallel arrays...

    map<wchar_t, wstring> mappings;

    wcout << L"set up mappings..." << endl;
    wcout << endl;

    for (size_t i = 0; i < map_count; i++)
    {
        pair<map<wchar_t, wstring>::iterator, bool> ret
            = mappings.insert(make_pair(map_from[i], map_to[i]));

        if(!ret.second)
        {
            wcout << L"skipping dupe key : " << map_from[i] << L" -> " <<  map_to[i] << endl;
            wcout << L"existing value mapping : " << ret.first->first << L" -> " <<  ret.first->second << endl;
            wcout << endl;
        }
    }

    const wchar_t look_for[2] = {L'a', L'l'};
    const size_t look_for_count = _countof(look_for);

    for (size_t i = 0; i < look_for_count; i++)
    {
        wcout << L"looking for " << look_for[i] << L" ..." << endl;
        map<wchar_t, wstring>::iterator iter = mappings.find(look_for[i]);
        if(iter != mappings.end())
        {
            wcout << L"found it! :-) : " << iter->first << L" -> " << iter->second << endl;
        }
        else
        {
            wcout << L"could not find anything? :-(" << endl;
        }
        wcout << endl;
    }

    return 0;
}


Output:

set up mappings...

skipping dupe key : d -> val for dupe key
existing value mapping : d -> a

looking for a ...
could not find anything? :-(

looking for l ...
found it! :-) : l -> a
Wow, going to take a bit to go over that code :).