Reading txt file into 3d vector

Hi people,
is it possible to read a txt file with 3 different delimiters (, ; #) into a 3D vector of x,y values (sf::Vector2f)?
The lines of the txt file have varying lengths, so a fixed-size array is not an option (C++0x standard).
So far I only have an approach that does not work correctly.

The txt file looks like this (the numbers are just examples for better understanding):

1,2,3,4,5,6;7,8,9,10,11,12#
13,14,15,16,17,18#


What I would like to do is save the values in the following way:

temptempworldcoordinates3d.push_back(sf::Vector2f(1,2));
temptempworldcoordinates3d.push_back(sf::Vector2f(3,4));
temptempworldcoordinates3d.push_back(sf::Vector2f(5,6));

--> Semicolon symbol:

tempworldcoordinates3d.push_back(temptempworldcoordinates3d);
temptempworldcoordinates3d.clear();

temptempworldcoordinates3d.push_back(sf::Vector2f(7,8));
temptempworldcoordinates3d.push_back(sf::Vector2f(9,10));
temptempworldcoordinates3d.push_back(sf::Vector2f(11,12));

--> Hash symbol:

tempworldcoordinates3d.push_back(temptempworldcoordinates3d);
worldcoordinates3d.push_back(tempworldcoordinates3d);

temptempworldcoordinates3d.clear();
tempworldcoordinates3d.clear();

temptempworldcoordinates3d.push_back(sf::Vector2f(13,14));
temptempworldcoordinates3d.push_back(sf::Vector2f(15,16));
temptempworldcoordinates3d.push_back(sf::Vector2f(17,18));

--> Hash symbol:

tempworldcoordinates3d.push_back(temptempworldcoordinates3d);
worldcoordinates3d.push_back(tempworldcoordinates3d);

temptempworldcoordinates3d.clear();
tempworldcoordinates3d.clear();


How do I read this from the text file and tokenize it correctly?
Is there a way to do it without external libraries and without too much code?
Any help appreciated!


My approach:
#include <SFML/Graphics.hpp>
#include <iostream>
#include <sstream>
#include <fstream>
#include <algorithm>

std::vector<std::vector<std::vector<sf::Vector2f>>> worldcoordinates3d;
std::vector<std::vector<sf::Vector2f>> tempworldcoordinates3d;
std::vector<sf::Vector2f> temptempworldcoordinates3d;

int main()
{
	sf::RenderWindow window(sf::VideoMode(500,500), "Map", sf::Style::Close);
	window.setFramerateLimit(40);

	while (window.isOpen())
	{
		sf::Event event;
		while (window.pollEvent(event))
		{
			switch (event.type)
			{
			case sf::Event::Closed:
				window.close();
				break;
			case sf::Event::KeyPressed:
				if(sf::Keyboard::isKeyPressed(sf::Keyboard::L))
				{
				std::ifstream infile;
				infile.open("worldmapcoordinates.txt");
				std::string line;
				sf::Vector2f vector2fvar = sf::Vector2f(0,0);
				std::string str = "";
				float floatvarx = 0, floatvary = 0;
				
				while(std::getline(infile, line)) 
				{
					std::size_t prev = 0, pos;
					while ((pos = line.find_first_of("#,;", prev)) != std::string::npos)
					{
						if (pos > prev)
						{
							str = line.substr(prev, pos-prev);
							
							std::istringstream (str) >> floatvarx;
							vector2fvar.x = floatvarx;
							
							std::istringstream (str) >> floatvary;
							vector2fvar.y = floatvary;						
							
							temptempworldcoordinates3d.push_back(vector2fvar);
							
							prev = pos+1;
						}
					}
					if (prev < line.length())
					{
						str = line.substr(prev, std::string::npos);
						
						std::istringstream (str) >> floatvarx;
						vector2fvar.x = floatvarx;
						
						std::istringstream (str) >> floatvary;
						vector2fvar.y = floatvary;
						
						temptempworldcoordinates3d.push_back(vector2fvar);
					}
				}
					for(int i=0;i<temptempworldcoordinates3d.size();i++)
					{
						std::cout << temptempworldcoordinates3d[i].x << " " << temptempworldcoordinates3d[i].y << std::endl;
					}
				}
			}
		}
	window.clear();
	window.display();
	}
	return 0;
}

You could extract all the numbers from each line of the file using std::regex. The matches come back as std::string, so apply std::stringstream (or std::stoi) to convert them to int:
#include <iostream>
#include <string>
#include <fstream>
#include <regex>
#include <iterator>

int main()
{
    std::ifstream inFile {"F:\\test.txt"};
    const std::regex re {"\\d+"};   // one or more digits; compiled once, outside the loop

    std::string line;
    while (std::getline(inFile, line))   // stops at end of file or on a read error
    {
        auto numbers = std::sregex_iterator(line.begin(), line.end(), re);
        auto endNumbers = std::sregex_iterator();

        // Each match is one run of digits, still held as a std::string.
        for (std::sregex_iterator i = numbers; i != endNumbers; ++i)
        {
            std::cout << i->str() << " ";
        }

        std::cout << "\n";
    }
}

basic overview of C++ regex: http://www.cplusplus.com/reference/regex/ECMAScript/
more detailed C++ regex: http://www.informit.com/articles/article.aspx?p=2064649
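
The snippet above prints the numbers but loses the grouping. One way to keep it is to let the regex match the ';' and '#' delimiters as well. What follows is only a sketch under a few assumptions: Vec2 is a stand-in for sf::Vector2f so that it compiles without SFML, the names parseWithRegex, Group, Record and World are made up for illustration, and the values are non-negative integers as in the sample file.

```cpp
#include <regex>
#include <string>
#include <vector>

// Stand-in for sf::Vector2f so the sketch compiles without SFML (assumption).
struct Vec2 { float x; float y; };

using Group  = std::vector<Vec2>;     // points up to the next ';' or '#'
using Record = std::vector<Group>;    // groups up to the next '#'
using World  = std::vector<Record>;   // all records in the input

// Hypothetical helper: tokenizes `text` with std::regex and rebuilds the nesting.
World parseWithRegex(const std::string& text)
{
    // Group 1 matches a number, group 2 matches a ';' or '#' delimiter.
    // Commas and line breaks match neither and are skipped by the iterator.
    static const std::regex token{R"((\d+)|([;#]))"};

    World world;
    Record record;
    Group group;
    bool haveX = false;   // true while an x value is waiting for its y
    float x = 0.0f;

    for (std::sregex_iterator it{text.begin(), text.end(), token}, end{}; it != end; ++it)
    {
        const std::smatch& m = *it;
        if (m[1].matched)
        {
            const float value = std::stof(m[1].str());
            if (haveX) group.push_back(Vec2{x, value});
            else       x = value;
            haveX = !haveX;
        }
        else
        {
            // Both delimiters close the current group (if it holds any points).
            if (!group.empty()) record.push_back(group);
            group.clear();
            haveX = false;   // an unpaired x in front of a delimiter is dropped
            if (m[2].str() == "#")
            {
                // '#' additionally closes the current record.
                world.push_back(record);
                record.clear();
            }
        }
    }
    return world;   // anything after the last '#' is not stored in this sketch
}
```

Passing the two sample lines as one string (for example the whole file read into a std::string) should give two records: the first with two groups of three points each, the second with a single group of three points. Swapping Vec2 for sf::Vector2f should only require changing the type alias and the push_back call.
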
You're looking to parse something which isn't trivial. (Admittedly, it's not particularly complex, but it's not trivial.)

It's often useful to write out a formal grammar. From what you've written, I can guess it'd look something like this:

start              ::= { triplet list } ;
point              ::= integer, literal comma, integer ;
coordinate triplet ::= point, literal comma, point, literal comma, point ;
triplet list       ::= coordinate triplet, ( (";", triplet list)
                       | ("#", any amount of whitespace) ) ;
literal comma      ::= "," ;


This is a BNF-style grammar (Backus-Naur Form, here with EBNF-like extensions such as the braces for repetition). (See https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form )
It's pretty easy to understand and write, and has the advantage of being unambiguous. It's also excellent documentation of the format you're willing to accept. The grammar of C++ itself is described using Backus-Naur form (see http://www.nongnu.org/hcb/ , or the more definitive one at http://eel.is/c++draft/gram )

In English, the above says that the input format (called start) is made up of any number of triplet lists. A triplet list is a coordinate triplet, followed by either a.) a semicolon followed by another triplet list, or b.) a hash character, possibly followed by some whitespace.
It also says that a coordinate triplet is a set of three points separated by commas. A point is two integers separated by a comma.

How does this help? It makes it easy to convert into code:
# include <string>
# include <vector>
# include <iostream>
# include <iomanip>

// Input format:
// START              ::= { triplet list } ;
// point              ::= integer, literal comma, integer ;
// coordinate triplet ::= point, literal comma, point, literal comma, point ;
// triplet list       ::= coordinate triplet, ( (';', triplet list)
//                        | ('#', any amount of whitespace) ) ;
// literal comma      ::= "," ;

// Example input:
// 1,2,3,4,5,6;7,8,9,10,11,12#
// 13,14,15,16,17,18#

namespace {
  struct literal { constexpr literal(char c): ch{c} {} char ch; };
  constexpr literal comma(','), semicolon(';'), hash('#');

  std::istream& operator>>(std::istream& str, literal const& c) {
    char ch{}; str >> ch;
    if (str && ch != c.ch) str.setstate(std::ios::failbit);
    return str;
  }
}

struct point { int x, y; };
std::istream& operator>>(std::istream& str, point& p)
{ return str >> p.x >> comma >> p.y; }

struct coordinate_triplet { point p[3]; };
std::istream& operator>>(std::istream& str, coordinate_triplet& t)
{ return str >> t.p[0] >> comma >> t.p[1] >> comma >> t.p[2]; }

struct triplet_list { std::vector<coordinate_triplet> ts; };

std::istream& operator>>(std::istream& str, triplet_list& lst) {
  lst.ts.clear(); // main() reuses one object, so drop the previous record first
  for (coordinate_triplet tmp; str >> tmp; ) {
    lst.ts.push_back(tmp);
    char c {}; // handle alternation
    if (str >> c) {
      switch (c) {
        // another triplet follows: read it.
      case ';': { std::cout << "trying to read another triplet\n"; continue; }

        // end of this list.
      case '#': { std::cout << "no more triplets\n"; return str; }

        // anything else violates the grammar.
      default: { str.setstate(std::ios::failbit); return str; }
      }
    }
  }
  return str;
}

int main() {
  std::vector<triplet_list> records;
  
  for (triplet_list t; std::cin >> t; )
    records.push_back(t);

  std::cout << records.size() << " records successfully read.\n";
}

Live demo:
http://coliru.stacked-crooked.com/a/ae54440042eb1f6b

This approach works for the simplest of formats, but it will not scale easily. For more complex formats, regular expressions or a proper parser generator could be a better choice.

Another approach might be to (a) read the file line by line into a std::string, (b) change every non-digit char of this string to whitespace, and (c) read the resulting string through a std::istringstream:
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <cctype>
#include <vector>
#include <algorithm>
#include <iterator>

int main()
{
    std::ifstream inFile {"F:\\test.txt"};
    std::vector<int> numbers;

    std::string line;
    while (std::getline(inFile, line))   // stops at end of file or on a read error
    {
        // Blank out every delimiter so only whitespace-separated digits remain.
        for (auto& elem : line)
        {
            // Cast first: std::isdigit is undefined for negative char values.
            if (!std::isdigit(static_cast<unsigned char>(elem))) elem = ' ';
        }

        std::istringstream stream {line};
        std::copy(std::istream_iterator<int>(stream), {}, std::back_inserter(numbers));
    }
}

Note that I'm only focusing on extracting the numbers from the file here.
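
If the nesting from the original question is needed as well, the same "turn delimiters into whitespace" idea can be combined with std::getline and a custom delimiter: split on '#' first, then on ';', and only then replace the commas. This is a rough sketch, not a finished parser. Vec2 is an assumed stand-in for sf::Vector2f, and parseNested, Group, Record and World are names invented for the example.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for sf::Vector2f so the sketch compiles without SFML (assumption).
struct Vec2 { float x; float y; };

using Group  = std::vector<Vec2>;     // points between two ';'
using Record = std::vector<Group>;    // groups between two '#'
using World  = std::vector<Record>;   // all records in the input

// Hypothetical helper: splits on '#', then on ';', then reads x/y pairs.
World parseNested(std::istream& in)
{
    World world;
    std::string recordText;
    // Outer split: each '#' ends one record (a record may span several lines).
    while (std::getline(in, recordText, '#'))
    {
        Record record;
        std::istringstream recordStream{recordText};
        std::string groupText;
        // Middle split: ';' separates the point groups inside a record.
        while (std::getline(recordStream, groupText, ';'))
        {
            // Inner split: commas become spaces so operator>> can read floats.
            std::replace(groupText.begin(), groupText.end(), ',', ' ');
            std::istringstream groupStream{groupText};
            Group group;
            float x = 0.0f, y = 0.0f;
            // Reads complete pairs only; a trailing unpaired value is ignored.
            while (groupStream >> x >> y) group.push_back(Vec2{x, y});
            if (!group.empty()) record.push_back(group);
        }
        // Skips the empty remainder after the final '#' (usually just a newline).
        if (!record.empty()) world.push_back(record);
    }
    return world;
}
```

It can be called with an std::ifstream opened on the coordinate file, or with an std::istringstream for testing. Unlike a strict grammar, this version does not report malformed input: it silently stops reading a group at the first token that is not a number, and it accepts a last record that is missing its closing '#'.
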