Reading a .txt file into Vector

Oct 16, 2019 at 2:24pm
Hello, this is a simple, quick question. What is the best way to write the while loop that reads a file? I can do it as shown below, but it doesn't look good (for example, what if I had data1 to data10?).

    ifstream in("duomenys.txt");
    int data1, data2;
    while(in >> data1 && in >> data2)
Oct 16, 2019 at 2:37pm
Assuming your data is whitespace-delimited (spaces, tabs, newlines)

(for example what if I had like data1 to data10).
ifstream in("duomenys.txt");
int data[10];
for (int i = 0; i < 10; i++)
{
    if (in >> data[i])
    {
        // success path
    }
    else
    {
        // failed to read in number, something went wrong
    }
}


If you don't know how many pieces of data you have, use a vector.
e.g.
// Example program
#include <iostream>
#include <fstream>
#include <vector>

int main()
{
    std::ifstream fin("t.txt");
    
    std::vector<int> data;
    
    int element;
    while (fin >> element)
    {
        data.push_back(element);
    }
}
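
For comparison, the same read-until-failure loop can also be written with stream iterators. This is only a sketch of an alternative, not something posted in the thread: read_all_ints is a made-up name, and it takes any std::istream so it can be tried on a string as well as a file.

```cpp
#include <istream>
#include <iterator>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <vector>

// Hypothetical helper (name is an assumption): collects every
// whitespace-delimited int from `in` until extraction fails or input ends.
std::vector<int> read_all_ints(std::istream& in)
{
    // A default-constructed istream_iterator acts as the end-of-stream marker.
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Calling read_all_ints on an std::ifstream behaves like the while loop above: it stops at the first token that is not an int, or at end of file.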
Last edited on Oct 16, 2019 at 2:39pm
Oct 16, 2019 at 2:44pm
That doesn't work if I have 2 different vectors. For example, my .txt file is:

10 25
78 3
14 65
78 1
98 2

The 1st column should then be assigned to a vector seq1, and the second column to seq2.

Here is the example code that I had with the while loop:

    ifstream in("duomenys.txt");
    int data1, data2;
    while(in >> data1 && in >> data2)
    {
        seq1.push_back(data1);
        seq2.push_back(data2);
    }
    in.close();



vector 1
10 78 14 78 98
vector 2
25 3 65 1 2
Oct 16, 2019 at 2:48pm
So you're saying that
10 25 42
78 3 1
14 64 82
78 1 3
98 2 4

is also a valid file, each column is a separate sequence, and you don't know in advance how many columns there are?
Oct 16, 2019 at 2:53pm
Yeah, it's a valid file. What I'm trying to say is that if I had, say, 100000 sequences, it would make no sense to do:

while(in >> data1 && in >> data2 && in >> data3 && in >> data4 && in >> data5............)

And then

seq1.push_back(data1);
seq2.push_back(data2);
............................


etc.
Oct 16, 2019 at 2:55pm
Okay, it's a bit more complicated if you don't know in advance how many columns there are.
I would use getline combined with a stringstream to parse how many columns there are from the first line, and then make a vector of sequences (vector of vector) for each column. I'll post an example in a bit.
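
The column-counting step on its own might look roughly like this. It is a minimal sketch: count_columns is a hypothetical name, and it assumes the caller has already fetched the first line with std::getline.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts the whitespace-separated tokens in one line.
// Tokens are read as strings, so this counts columns without checking
// whether each one is a valid number.
std::size_t count_columns(const std::string& line)
{
    std::istringstream iss(line);   // splits the line on spaces and tabs
    std::size_t columns = 0;
    for (std::string token; iss >> token; )
        ++columns;
    return columns;
}
```

Because the function only looks at a string, the first line stays available to the caller, who can then parse its values into the first element of each sequence.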
Oct 16, 2019 at 3:09pm
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

int main()
{
    using Sequence = std::vector<int>;
    
    std::ifstream fin("d.txt");
    
    // first goal: Figure out how many columns there are
    // by reading in and parsing the first line of the file
    std::vector<Sequence> sequences;
    {
        std::string first_line;
        std::getline(fin, first_line);
        std::istringstream iss(first_line); // used to separate each element in the line
        
        int element;
        while (iss >> element)
        {
            sequences.push_back(Sequence()); // add empty sequence
            sequences.back().push_back(element); // insert first element
        }
    }
    
    // First line and all sequences are now created.
    // Now we just loop for the rest of the way.
    bool end = sequences.empty(); // nothing to read if the first line produced no columns (avoids looping forever)
    while (!end)
    {
        for (size_t i = 0; i < sequences.size(); i++)
        {
            int element;
            if (fin >> element)
            {
                sequences[i].push_back(element);
            }
            else
            {
                // end of data.
                // could do extra error checking after this
                // to make sure the columns are all equal in size
                end = true;
                break;
            } 
        }
    }

    // print results
    for (size_t i = 0; i < sequences.size(); i++)
    {
        std::cout << "seq " << i << ":";
        for (int elem : sequences[i])
        {
            std::cout << ' ' << elem;
        }
        std::cout << '\n';
    }
}
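
The extra error checking mentioned in the comment could be sketched as follows. If the file ends in the middle of a row, the earlier columns end up one element longer than the later ones, so comparing sizes catches that case. columns_equal_size is a hypothetical name, not part of the program above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true when every column holds the same number of
// elements. An empty set of columns counts as consistent.
bool columns_equal_size(const std::vector<std::vector<int>>& sequences)
{
    for (std::size_t i = 1; i < sequences.size(); i++)
    {
        if (sequences[i].size() != sequences[0].size())
            return false;   // this column is shorter or longer than the first
    }
    return true;
}
```

It could be called right after the reading loop, for example to print a warning before the results.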
Last edited on Oct 16, 2019 at 3:12pm
Oct 16, 2019 at 3:18pm
#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include <vector>
using namespace std;

using TYPE = string;                   // the type of your data; use string if unknown

int main()
{
   string filename = "duomenys.txt";
   vector< vector<TYPE> > data;
   
   ifstream in( filename );
   for ( string line; getline( in, line ); )
   {
      stringstream ss( line );
      vector<TYPE> row;
      for ( TYPE d; ss >> d; ) row.push_back( d );
      data.push_back( row );
   }

   cout << "Your data:\n";
   for ( auto &row : data )
   {
      for ( auto &item : row ) cout << setw( 10 ) << item << ' ';
      cout << '\n';
   }
}


Your data:
        10         25         42 
        78          3          1 
        14         64         82 
        78          1          3 
        98          2          4 

Oct 16, 2019 at 3:27pm
You guys write really advanced code! :) Anyway, I'll study both of your examples. Thank you!
Oct 16, 2019 at 3:30pm
Yep!

btw I purposefully avoided making a stringstream each time just as a self-challenge, but that is definitely more concise code. One difference, though, is that it stores the data row by row instead of column by column.
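
If the row-based result is needed as one sequence per column, the conversion is a transpose. The sketch below is only illustrative: rows_to_columns is a made-up name, it uses int rather than the TYPE alias, and it assumes that a short row simply contributes nothing to the columns it lacks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: turns row-major data into one vector per column.
std::vector<std::vector<int>> rows_to_columns(const std::vector<std::vector<int>>& rows)
{
    // The widest row decides how many columns exist.
    std::size_t width = 0;
    for (const auto& row : rows)
        width = std::max(width, row.size());

    std::vector<std::vector<int>> columns(width);
    for (const auto& row : rows)
        for (std::size_t c = 0; c < row.size(); c++)
            columns[c].push_back(row[c]);   // element c of each row goes to column c
    return columns;
}
```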
Last edited on Oct 16, 2019 at 3:34pm
Oct 16, 2019 at 3:35pm
@DdavidDLT,
You do know that the Python program to do the same is just
import numpy as np
data = np.loadtxt( "duomenys.txt" )
print( data )

C++ has a way to go on the usability front.
Last edited on Oct 16, 2019 at 3:36pm
Oct 16, 2019 at 3:36pm
loadtxt assumes 'rectangular' data?
SciPy wrote:
Each row in the text file must have the same number of values.
Ah yep

To be fair, NumPy is not part of the Python standard library. You could also find/make a C++ library that could call "loadtxt" to load in data from a file.
Last edited on Oct 16, 2019 at 3:40pm
Oct 16, 2019 at 3:40pm
Ganado wrote:
loadtxt assumes 'rectangular' data?


According to the reference, yes. (https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html )
Each row in the text file must have the same number of values.


This is what it produced for the same input file (which I borrowed from you!):
[[10. 25. 42.]
 [78.  3.  1.]
 [14. 64. 82.]
 [78.  1.  3.]
 [98.  2.  4.]]



I think there's a numpy.genfromtxt that can handle "missing values".


Ganado wrote:
To be fair, NumPy is not part of the Python standard library.

True, but most scientists and engineers will have NumPy, SciPy and matplotlib as standard!
Last edited on Oct 16, 2019 at 3:43pm
Oct 16, 2019 at 4:12pm
ganado wrote:
You could also find/make a C++ library that could call "loadtxt" to load in data from a file.


Voilà!

#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include <vector>
using namespace std;

using TYPE = double;

//------------------------------------------------

template< typename T > vector< vector<T> > loadtxt( const string &filename )
{
   vector< vector<T> > data;
   ifstream in( filename );
   for ( string line; getline( in, line ); )
   {
      stringstream ss( line );
      vector<T> row;
      for ( T d; ss >> d; ) row.push_back( d );
      data.push_back( row );
   }
   return data;
}

//------------------------------------------------

template< typename T > void print( const vector< vector<T> > &data )
{
   for ( auto &row : data )
   {
      for ( auto &item : row ) cout << setw( 10 ) << item << ' ';
      cout << '\n';
   }
}

//======================================================================

int main()
{
   auto data = loadtxt<TYPE>( "duomenys.txt" );
   print( data );
}


        10         25         42 
        78          3          1 
        14         64         82 
        78          1          3 
        98          2          4 
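
The loadtxt above stops parsing a line at the first token it cannot convert and accepts rows of any width. A stricter variant, closer to the "same number of values in every row" rule quoted earlier, might look like the sketch below. The name loadtxt_strict, the choice of std::runtime_error, and reading from an std::istream instead of a filename are all assumptions made for illustration.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stricter variant: throws instead of silently accepting a
// malformed token or a row whose width differs from the first row.
template <typename T>
std::vector<std::vector<T>> loadtxt_strict(std::istream& in)
{
    std::vector<std::vector<T>> data;
    for (std::string line; std::getline(in, line); )
    {
        std::istringstream ss(line);
        std::vector<T> row;
        for (T d; ss >> d; )
            row.push_back(d);

        if (!ss.eof())      // extraction stopped before the end of the line
            throw std::runtime_error("unparsable value in: " + line);
        if (row.empty())
            continue;       // skip blank lines
        if (!data.empty() && row.size() != data.front().size())
            throw std::runtime_error("row width differs in: " + line);

        data.push_back(row);
    }
    return data;
}
```

It can be used with a file by passing an std::ifstream, for example loadtxt_strict<double>(in), wrapped in a try/catch if the input is not trusted.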
