Reading a string from a text file

Hi everyone!
I have a problem working with strings in a text file.
For example, I have this text:
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799590" cy="1799590"/><wp:extent cx="5486400" cy="3200400"/>

I'd like to read the text file from beginning to end and get the values of cx and cy.
How should I do that?
Thanks for your help!
Tokenize on the angle brackets and then tokenize on the substring 'cx=' and 'cy='.
Ideally you'd probably want to use an XML parser library, iterate over each node named "wp:extent", and then extract the values of the attributes "cx" and "cy".
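The tokenizing approach mentioned above can be sketched roughly as follows. This is only a minimal sketch, not a real XML parser: the helper names attributeValue and extractExtents are made up for illustration, and it assumes every value is wrapped in double quotes and that cx and cy sit in the same tag.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the text between the quotes that follow key
// (for example key = "cx=\""), or an empty string if key is not in the token.
std::string attributeValue(const std::string& token, const std::string& key)
{
    std::size_t start = token.find(key);
    if (start == std::string::npos)
        return "";
    start += key.size();                       // first character of the value
    std::size_t end = token.find('"', start);  // closing quote
    if (end == std::string::npos)
        return "";
    return token.substr(start, end - start);
}

// Hypothetical helper: splits the line on '<' so that each token holds one tag,
// then collects the (cx, cy) pair from every token that carries both attributes.
std::vector<std::pair<std::string, std::string>> extractExtents(const std::string& line)
{
    std::vector<std::pair<std::string, std::string>> result;
    std::istringstream stream(line);
    std::string token;
    while (std::getline(stream, token, '<'))
    {
        std::string cx = attributeValue(token, "cx=\"");
        std::string cy = attributeValue(token, "cy=\"");
        if (!cx.empty() && !cy.empty())
            result.emplace_back(cx, cy);
    }
    return result;
}
```

Called on the sample line from the first post, extractExtents should return two pairs: ("1799590", "1799590") and ("5486400", "3200400").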
I'm not very experienced with XML files so I copied this text to .txt file.
slepeckypes wrote:
I'm not very experienced with XML files so I copied this text to .txt file.
That's silly. Not only do lots of editors have XML syntax support, so that looking at it, the data seems organized and readable, but there are many perfectly good XML parsers out there, too, in almost any programming language. They'll extract any data you want in an instant.
Ok I'm a goof but I took a stab at it
This works if all the values are 7 digits.

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char* argv[])
{
    if (argc != 2)
    {
        cout << "Incorrect arguments\n";
        return 1;
    }

    ifstream inputfile(argv[1]);
    if (!inputfile.is_open())
    {
        cout << "File not open" << endl;
        return 1;
    }

    cout << "CX" << "\tCY" << endl;

    string line;
    while (getline(inputfile, line)) // stops cleanly at end of file
    {
        // i + 4 skips the 3-character key and the opening quote.
        // The loop bound keeps all 7 digits inside the string.
        for (size_t i = 0; i + 4 + 7 <= line.size(); i++)
        {
            if (line.compare(i, 3, "cx=") == 0)
            {
                // cx= found at i: print the 7 digits after the quote
                cout << line.substr(i + 4, 7);
            }

            if (line.compare(i, 3, "cy=") == 0)
            {
                // cy= found at i: print the 7 digits after the quote
                cout << "\t" << line.substr(i + 4, 7) << endl;
            }
        }
    }

    return 0;
}


The output should look like this

C:\Temp>test123 cxy.txt
CX CY
1799590 1799590
5486400 3200400
1799591 1799591
5486401 3200401
1799592 1799592
5486402 3200402


Oh yea, my test file

<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799590" cy="1799590"/><wp:extent cx="5486400" cy="3200400"/>
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799591" cy="1799591"/><wp:extent cx="5486401" cy="3200401"/>
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799592" cy="1799592"/><wp:extent cx="5486402" cy="3200402"/>
There are many powerful tools for searching for specific patterns with regular expressions. You can use 'awk', for example, which lets you do the task you are asking about and much more. You can learn more about it here: https://www.gnu.org/software/gawk/
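If you would rather stay in C++, the same regular-expression idea can be tried with std::regex from the standard <regex> header (C++11). The following is just a sketch under some assumptions: the function name matchExtents is made up, and the pattern assumes cy always follows cx directly, separated only by whitespace, as in the sample text.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns every (cx, cy) pair found in text.
// Group 1 captures the digits of cx, group 2 the digits of cy.
std::vector<std::pair<std::string, std::string>> matchExtents(const std::string& text)
{
    static const std::regex pattern(R"re(cx="(\d+)"\s+cy="(\d+)")re");

    std::vector<std::pair<std::string, std::string>> result;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it)
    {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

Because the pattern uses \d+ instead of a fixed width, values of 6, 7 or 8 digits are all matched the same way.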
This works if all the values are 7 digits.

The values are from 6 to 8 digits long.
Then just modify the code to get all the digits between the quotes.

If the numbers you want always sit at the same column positions, another way would be to grab a fixed range of characters. Since I don't have your data file, I can't say.
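Reading all the digits between the quotes, whatever their count, could look something like this. It is a rough sketch rather than a drop-in replacement for the program above; the function name valuesOf is made up, and it assumes each value is written as key="digits".

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: for key = "cx" it looks for every cx=" in the line
// and collects the run of digits that follows, however long it is.
std::vector<std::string> valuesOf(const std::string& line, const std::string& key)
{
    std::vector<std::string> values;
    const std::string opener = key + "=\"";

    std::size_t pos = line.find(opener);
    while (pos != std::string::npos)
    {
        std::size_t start = pos + opener.size();  // first digit of the value
        std::size_t end = start;
        while (end < line.size() &&
               std::isdigit(static_cast<unsigned char>(line[end])))
        {
            ++end;                                // stop at the closing quote
        }
        values.push_back(line.substr(start, end - start));
        pos = line.find(opener, end);             // continue after this value
    }
    return values;
}
```

For one line of the test file, valuesOf(line, "cx") and valuesOf(line, "cy") each return two strings, which can then be printed side by side like the CX and CY columns above.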
slepeckypes, don't be afraid of trying libraries. Trust me, step 1 is choosing an XML library. A lot of them are lightweight, too. For example, https://github.com/zeux/pugixml

You'll look back on this and be like, "oh, all i needed to do was include this one header file, compile my lib against that one, and write <10 lines ...?"
I agree with icy1. An XML library is the easiest way to go. pugixml is easy to use and well documented.
However, it would require a complete, well-formed and valid XML document.
With the snippet you showed, every XML lib will fail.
If you only have a snippet, then it would be better to follow SamuelAdams's way.
I agree with everyone above. There are much better and faster ways, but given a one-line example, I hacked it. The first idea I came up with worked, so I left it there.
This is what I would do: use string operations to find cx=" and read every digit that follows, then do the same for cy.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

void pause();

// Finds the next occurrence of key (e.g. "cx=\"") at or after position and
// returns the characters up to the closing " sign. position is moved past the
// value so the next call continues from there. Returns "" if key is not found.
string readValue(const string& text, const string& key, size_t& position)
{
    size_t start = text.find(key, position);
    if (start == string::npos)
        return "";

    start += key.size(); // skip the key and the opening " sign

    size_t end = text.find('\"', start); // closing " sign
    if (end == string::npos)
        end = text.size();

    position = end;
    return text.substr(start, end - start);
}

int main()
{
    ifstream file("readFromFile.txt");

    if (file.fail())
    {
        cerr << "\nCould not open readFromFile.txt";
        exit(1);
    }
    cout << "\nFile successfully opened!";

    string fileString;
    getline(file, fileString); // read the first line from the file

    size_t position = 0;
    string cx1 = readValue(fileString, "cx=\"", position); // first cx in the string
    string cx2 = readValue(fileString, "cx=\"", position); // next cx after the first

    position = 0; // start again from the beginning for cy
    string cy1 = readValue(fileString, "cy=\"", position); // first cy in the string
    string cy2 = readValue(fileString, "cy=\"", position); // next cy after the first

    cout << "\n\ncx1 = " << cx1
         << "\ncy1 = " << cy1
         << "\ncx2 = " << cx2
         << "\ncy2 = " << cy2;

    pause();

    return 0;
}

void pause()
{
    std::cout << "\n\n\n Press enter to continue...";
    std::cin.sync();
    std::cin.get();
    std::cout << "\n\n";
}


Hope this helps,

Regards,

Hoogo;