Identifying values between tags [Help Needed]

Hey all. Recently I have been trying to build my own home-brewed way of finding the values between tags, and the solution must be able to return multiple results. For this example, I am trying to parse a human-readable configuration file. Here is an example of the file I am trying to read.

[Bank]
    [Account]
         [Balance]50[/Balance]
         [Rep]23[/Rep]
    [/Account]

    [Account]
         [Balance]54[/Balance]
         [Rep]13[/Rep]
    [/Account]
[/Bank]


At first, I thought it would be best to immediately isolate the values with regex. I later discovered that this is not a very reliable solution. The best solution I have so far is to create a string with the entire bank inside, then to create an array with each account, looping for each element inside each account. I am unsure as to how to continue, but I will post some examples of code I have tried.

#include <string>
#include <regex>

// Your string
std::string str = "[Bank]test[/Bank]";

// Your regex, in this specific scenario
// Will NOT work for nested <column> tags!
std::regex rgx("[Bank](.*?)[/Bank]");
std::smatch match;

// Try to match it
if(std::regex_search(str.begin(), str.end(), match, rgx)) {
  // You can use `match' here to get your substring
  
  cout << "match";
};


This code does not work, and it cannot loop through multiple regex matches. I already know how to load the configuration from a file, but I currently have no way of parsing the data in it. Help is much appreciated. Thanks, Drew.
Simple demo to find the account data:
#include <iostream>
#include <algorithm>
#include <string>

using namespace std;

int main()
{
  std::string str = "[Bank]"
                    "[Account]"
                    "[Balance]50[/Balance]"
                    "[Rep]23[/Rep]"
                    "[/Account]"
                    "[Account]"
                    "[Balance]54[/Balance]"
                    "[Rep]13[/Rep]"
                    "[/Account]"
                    "[/Bank]";

  auto first = str.find("[Account]");
  while (first != string::npos)
  {
    auto last = str.find("[/Account]", first+10);
    if (last == string::npos)
       break;
    string data = str.substr(first+9, last - first-9);
    cout << data << "\n";
    first = str.find("[Account]", last + 10);
  }
}

Output

[Balance]50[/Balance][Rep]23[/Rep]
[Balance]54[/Balance][Rep]13[/Rep]


It is probably worth creating functions to extract the account, balance and rep separately.
I have now got further on my journey. Here is the code I have so far:

#include <iostream>
#include <algorithm>
#include <string>
#include <vector>

using namespace std;

class Data { //The vector is strictly temporary to copy all results down. All relevant data should be stored in permanant array or vector

public:

	std::vector<string> Result;

	void Parse(std::string str, std::string input) //input being name of value to isolate.
	{
		std::string delim1 = "[" + input + "]";
		std::string delim2 = "[" + input + "/]";


		Result.clear();

		auto first = str.find(delim1);
		while (first != string::npos)
		{
			auto last = str.find(delim2, first + delim2.size());
			if (last == string::npos)
				break;
			string data = str.substr(first + delim1.size(), last - first - delim1.size());

			Result.push_back(data);

			first = str.find(delim1, last + delim2.size());
		}
	}


} Data;


int main()
{
	std::string str = "[Account] [Balance]20[/Balance] [Reputation]17[/Reputation] [/Account] [Account] [Balance]20[/Balance] [Reputation]15[/Reputation] [/Account]"; 

	Data.Parse(str, "Account"); //Calls to set vector to contain each account.


	std::string Accounts[20][3]; //Creates new array to hold each account and its data. the first column is the raw data, second is rep, third is balance.

	int accountcount = Data.Result.size();


	for (int i = 0; i < accountcount; i++) //Loops for each element in the vector. (each account)
	{
		Accounts[i][0] = Data.Result.at(i); //adds each result into the account list.

											
	}

	for (int i = 0; i < accountcount;) //For each account...
	{
		i = i + 1;
		Data.Parse(Accounts[i][0], "Reputation"); //Sets vector to the reputation of current account.
														 //After parsing this data, we must take the vector and store in array.

		Accounts[i][1] = Data.Result.at(1); //takes reputation inside vector, and saves it into accounts.

		Data.Parse(Accounts[i][0], "Balance"); //Sets vector to the balance of current Account.

		Accounts[i][2] = Data.Result.at(1); //takes balance in vector, and saves to array.
	}

	cout << Accounts[1][0];
}


I am trying to basically loop through each account, printing the values of balance and reputation in between. The issue I am now facing is that I cannot seem to find a way to get the above code to function correctly.
You can start by changing line 17 (the delim2 definition) to:
std::string delim2 = "[/" + input + "]";

Why are you using the public variable Result instead of putting it inside the Parse function and returning it as a result?
Maybe like this:
#include <iostream>
#include <algorithm>
#include <string>

using namespace std;


string read_tag(const string& str, const string& tag)
{
  const string opentag = "[" + tag + "]";
  const string closetag = "[/" + tag + "]";

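  // Check the open tag before using its position in further arithmetic.
  auto first = str.find(opentag);
  if (first == string::npos)
    return "";

  // Search from the end of the open tag so an empty value is still found.
  auto start = first + opentag.length();
  auto last = str.find(closetag, start);
  if (last == string::npos)
    return "";

  return str.substr(start, last - start);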
}

int main()
{
  std::string str = "[Bank]"
                    "[Account]"
                    "[Balance]50[/Balance]"
                    "[Rep]23[/Rep]"
                    "[/Account]"
                    "[Account]"
                    "[Balance]54[/Balance]"
                    "[Rep]13[/Rep]"
                    "[/Account]"
                    "[/Bank]";

  auto first = str.find("[Account]");
  while (first != string::npos)
  {
    auto last = str.find("[/Account]", first+10);
    if (last == string::npos)
      break;
    string data = str.substr(first+9, last - first-9);
    if (!data.empty())
    {
      cout << "Balance: " << read_tag(data, "Balance") << "\n";
      cout << "Rep: " << read_tag(data, "Rep") << "\n";
    }
    first = str.find("[Account]", last + 10);
  }
}


Output:

Balance: 50
Rep: 23
Balance: 54
Rep: 13
#include <iostream>
#include <vector>
#include <string>
using namespace std;


//======================================================================


// Add to a collection of delimited items
void delimited( string text, string before, string after, vector<string> &collection )
{                                                
   string::size_type p, q = 0;   // unsigned positions, so the npos comparisons are exact
   while ( true )
   {
      p = text.find( before, q );      if ( p == string::npos ) return;
      p += before.size();
      q = text.find( after, p );       if ( q == string::npos ) return;
      collection.push_back( text.substr( p, q - p ) );
      q += after.size();
   }
}


//======================================================================


// Overload for a single delimited item (if there are several it will return the first)
string delimited( string text, string before, string after )
{                                                
   string::size_type p, q = 0;   // unsigned positions, so the npos comparisons are exact
   p = text.find( before, q );      if ( p == string::npos ) return "";
   p += before.size();
   q = text.find( after, p );       if ( q == string::npos ) return "";
   return text.substr( p, q - p );
}


//======================================================================


string trim( string s, string junk = " " )
{
   auto i = s.find_first_not_of( junk );
   if ( i == string::npos ) return "";

   auto j = s.find_last_not_of( junk );
   return s.substr( i, j - i + 1 );
}


//======================================================================

struct Account
{
   string name;
   double balance;
   double rep;
};

ostream &operator << ( ostream &strm, const Account &a )
{
   return strm << "Name: " << a.name << '\n' << "Balance: " << a.balance << '\n' << "Rep: " << a.rep << '\n';
}


//======================================================================


int main()
{
   string bank = "  [Bank]\n"
                 "      [Account]\n"
                 "          [Name]Lastchance[/Name]\n"
                 "          [Balance]50[/Balance]\n"
                 "          [Rep]23[/Rep]\n"
                 "      [/Account]\n"
                 "               \n"
                 "      [Account]\n"
                 "          [Name]Cplusplus[/Name]\n"
                 "          [Balance]54[/Balance]\n"
                 "          [Rep]13[/Rep]\n"
                 "      [/Account]\n"
                 "  [/Bank]\n";


   // Get a vector of strings for whole accounts
   vector<string> accountStrings;
   delimited( bank, "[Account]", "[/Account]", accountStrings );


   // Generate account objects - TO DO: ERROR CHECKING
   vector<Account> accounts;
   for ( string a : accountStrings )
   {
      string name    = trim( delimited( a, "[Name]"   , "[/Name]"    ) );
      double balance = stod( delimited( a, "[Balance]", "[/Balance]" ) );
      double rep     = stod( delimited( a, "[Rep]"    , "[/Rep]"     ) );
      accounts.push_back( { name, balance, rep } );
   }


   // Output
   for ( const Account &a : accounts ) cout << a << "\n";
}


Name: Lastchance
Balance: 50
Rep: 23

Name: Cplusplus
Balance: 54
Rep: 13
ok. Here is what I have so far:

#include <iostream>
#include <vector>
#include <string>
using namespace std;


class Parse {
    
    public:
    
    std::vector<string> Result;
    
    std::string FinalR[200][5];
    
    void Parse(std::string str, std::string input)
    {
		std::string delim1 = "[" + input + "]";
		std::string delim2 = "[/" + input + "]";
		
		void Handle(str, delim1, delim2)
    }
    
	void Handle(std::string str, std::string delim1, std::string delim2) //input being name of value to isolate.
	{

		Result.clear();

		auto first = str.find(delim1);
		while (first != string::npos)
		{
			auto last = str.find(delim2, first + delim2.size());
			if (last == string::npos)
				break;
			string data = str.substr(first + delim1.size(), last - first - delim1.size());

			Result.push_back(data);

			first = str.find(delim1, last + delim2.size());
		}
	}
    
    void ParseAll(std::string str, std::string subject, std::string object)
    {
        void Parse(str, subject); //Parses to isolate the subject
        
        std::string subject = Result.at(1); //Sets the subject to the subject.
        
        void Parse(str, object); //Isolates each object into the array of beauty
        
        int count = Result.size();
        
        for (for (int i = 0; i < count; i++) { //Loops for each account.
         
        FinalR[i][0] = Result.at(i);  //Sets the first column for raw data for error checking later.
                 
         void Handle([k][0], "[", "]"); //Sets vector to contain each sub object.
         
         
         int count = Result.size();
            for (int k = 0; k < count; k++) //Loops for each item in the count.
            {
              FinalR[i][k] = Result.at(k);
            }
            
        }
        
        
        std::string object = Result.at(i);
        
    }
    
    
    
} P, Parse, Handle, Handler;


I appreciate all of the replies, but I was hoping I could simply pass the subject name, object-name, and then receive a vector with each data-set inside the given hierarchy. Sadly, this code has a few bugs that I cannot seem to solve quite perfectly. Once again, thanks so much for the extended help!
Thanks everyone for the help, but I ended up deciding to simply use JSON because I am wasting way too much time on this.