Parse a txt file

Hello

I have a text file with the following format:

//MAIN OBJECT
Object = ControlNetwork
  Created      = 2013-01-28T12:26:17

//FIRST CONTROL POINT OBJECT
  Object = ControlPoint
    PointId     = 1

    Group = ControlMeasure
      Img_Name	   = img1
      X_Coord      = 802
      Y_Coord      = 725
    End_Group

    Group = ControlMeasure
      Img_Name 	  = img2
      X_Coord     = 480
      Y_Coord     = 708
    End_Group
//END OF FIRST OBJECT
  End_Object

//SECOND CONTROL POINT OBJECT
  Object = ControlPoint
    PointId     = 2

    Group = ControlMeasure
      Img_Name 	    = image1
      X_Coord       = 317
      Y_Coord       = 130
    End_Group

    Group = ControlMeasure
      Img_Name 	    = image2
      X_Coord       = 128
      Y_Coord       = 116
    End_Group
//END OF SECOND OBJECT
  End_Object

//END OF MAIN OBJECT
End_Object
End


I want to get the X and Y coordinates of each control point in the two images.

For example, from the file above I want to retrieve the following information:

- PointID = 1, with point coordinates 802,725 (from image1) and 480,708 (from image2)

- PointID = 2, with point coordinates 317,130 (from image1) and 128,116 (from image2)

Such a text file can contain millions of points, so what is the best way to parse these files?

Thanks
The control point objects are nested in a main object. Does each text file contain only one main object, or can it contain an arbitrary number of main objects (each of which contains an arbitrary number of control point objects)?

You can use a vector to store the control point objects as you parse through your file, something like this:

#include <vector>

struct controlPoint
{
	int pointID;
	int coordX; // one ControlMeasure only; the sample file has two per point,
	int coordY; // so a second pair (or a nested vector) is needed for both
};

int main()
{
	std::vector<controlPoint> controlPoints;

	controlPoint temp; // holds the data parsed for one control point

	// ... (parse the file for a control point object and store the data in temp)

	temp.pointID = 0; // placeholder: PointId parsed from the file
	temp.coordX = 0;  // placeholder: X_Coord parsed from the file
	temp.coordY = 0;  // placeholder: Y_Coord parsed from the file

	controlPoints.push_back(temp); // store a copy of temp in the controlPoints vector

	// ... (temp can now be reused for the next control point object you parse;
	// repeat until all the control points are stored in the vector)
	return 0;
}
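
To make the parsing step concrete, here is a minimal sketch of a line-by-line reader for the format shown in the question. It assumes the layout is exactly as posted: one "key = value" pair per line, comment lines starting with "//", and ControlMeasure groups only inside ControlPoint objects. The names Measure, ControlPoint, trim, parseControlNetwork and parseControlNetworkText are made up for this example, and std::stoi will throw if a coordinate is not a number.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; the names are not from the original post.
struct Measure {
    std::string image;
    int x = 0;
    int y = 0;
};

struct ControlPoint {
    int id = 0;
    std::vector<Measure> measures;  // one entry per ControlMeasure group
};

// Strip leading and trailing spaces, tabs and line-ending characters.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Read the stream one line at a time and collect every ControlPoint object.
std::vector<ControlPoint> parseControlNetwork(std::istream& in) {
    std::vector<ControlPoint> points;
    ControlPoint point;
    Measure measure;
    bool inPoint = false;
    bool inMeasure = false;
    std::string line;

    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line.compare(0, 2, "//") == 0) continue;  // blank or comment

        if (line == "End_Group") {
            if (inPoint && inMeasure) point.measures.push_back(measure);
            inMeasure = false;
            continue;
        }
        if (line == "End_Object") {
            // Only a ControlPoint is stored; the main object's End_Object is ignored.
            if (inPoint) points.push_back(std::move(point));
            inPoint = false;
            continue;
        }

        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // e.g. the final "End" line
        const std::string key = trim(line.substr(0, eq));
        const std::string value = trim(line.substr(eq + 1));

        if (key == "Object") {
            if (value == "ControlPoint") { inPoint = true; point = ControlPoint(); }
        } else if (key == "Group") {
            if (value == "ControlMeasure") { inMeasure = true; measure = Measure(); }
        } else if (inMeasure) {
            if (key == "Img_Name") measure.image = value;
            else if (key == "X_Coord") measure.x = std::stoi(value);
            else if (key == "Y_Coord") measure.y = std::stoi(value);
        } else if (inPoint && key == "PointId") {
            point.id = std::stoi(value);
        }
    }
    return points;
}

// Convenience wrapper for parsing text that is already in memory.
std::vector<ControlPoint> parseControlNetworkText(const std::string& text) {
    std::istringstream in(text);
    return parseControlNetwork(in);
}
```

For a real file you would pass a std::ifstream to parseControlNetwork. Because the reader keeps only the current line and the current point in its working state, the input file itself is never loaded whole; the memory used is that of the resulting vector.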


If your file contains more than one main object, you'll have to keep track of the nesting level and of the object you are currently in (by parsing the "Object =" and "End_Object" markers and counting their occurrences in a suitable variable). You may also want to maintain a similar vector for the file's main objects, each entry of which holds the controlPoints vector you fill while parsing that main object.
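
The nesting bookkeeping can be sketched with a single depth counter, roughly as follows. This is only an illustration: it assumes "Object", "=" and the object type are separated by whitespace as in the posted file, and the names MainObjectSummary and scanObjects are invented for the example. It records each main object and counts the objects nested directly inside it, rather than storing their contents.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of one main (top-level) object.
struct MainObjectSummary {
    std::string type;      // text after "Object =", e.g. "ControlNetwork"
    int childObjects = 0;  // objects nested directly inside this one
};

// Walk the stream and use a depth counter to tell main objects from nested ones.
std::vector<MainObjectSummary> scanObjects(std::istream& in) {
    std::vector<MainObjectSummary> mains;
    int depth = 0;  // 0 = outside any object, 1 = inside a main object, ...
    std::string line;

    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string first, eq, value;
        words >> first;  // skips leading whitespace; stays empty on a blank line

        if (first == "Object") {
            words >> eq >> value;  // eq receives "=", value the object type
            if (depth == 0) {
                MainObjectSummary summary;
                summary.type = value;
                mains.push_back(summary);
            } else if (depth == 1) {
                ++mains.back().childObjects;
            }
            ++depth;
        } else if (first == "End_Object") {
            if (depth > 0) --depth;  // tolerate a stray End_Object
        }
    }
    return mains;
}
```

In a full parser, the same depth test would decide whether a finished control point is appended to the current main object's vector or whether a new main object entry is started.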

If you are working with a very large number of points, and you want to use the data in your program for memory-intensive calculations (rather than just extracting it and saving it to a new file, or something similar), you may want to implement the above with dynamically allocated arrays. Vectors carry some memory overhead: as they grow, they allocate more memory than they currently need, so that new entries can be stored without a fresh allocation for each one (i.e., vector::capacity can be equal to or greater than vector::size for a given vector). If the number of points is known in advance, calling vector::reserve once before filling the vector also avoids the repeated regrowth.
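
As a rough illustration of the capacity/size point, the sketch below fills a vector after a single reserve call and reports how much allocated element storage is unused. PointRecord, makeRecords and spareBytes are hypothetical names chosen for this example. Note that reserve only guarantees a capacity of at least the requested count, so the spare amount depends on the standard library implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record type mirroring the struct used earlier in the thread.
struct PointRecord {
    int pointID;
    int coordX;
    int coordY;
};

// Fill a vector with `count` records after reserving the storage once.
std::vector<PointRecord> makeRecords(std::size_t count) {
    std::vector<PointRecord> records;
    records.reserve(count);  // capacity() is now at least count
    for (std::size_t i = 0; i < count; ++i) {
        // No reallocation happens here, because size() never exceeds the reserved capacity.
        records.push_back({static_cast<int>(i), 0, 0});
    }
    return records;
}

// Bytes of element storage that are allocated but not currently used.
std::size_t spareBytes(const std::vector<PointRecord>& v) {
    return (v.capacity() - v.size()) * sizeof(PointRecord);
}
```

Calling spareBytes on a vector that was grown with push_back alone, without reserve, typically gives a larger number, which is the overhead described above.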