Reading a record from a CSV file

I have a CSV data file with a lot of records. I need to read each record in C++ and put it into a structure so that I can process it. I have no experience with databases and do not know where to start. It would be great if you know of a blog or tutorial with an example that does exactly this kind of data extraction. There is a lot of information about how to read a file, but it is either too complex, not relevant to CSV files, or does not do what I need. My data looks like this:

TABLENAME*PURPOSE*EFILENAME*RECCOUNT*PERIOD*DFILENAME*COMPRESSION
"MC"*"20100914_NHPUB_MC_2007.TXT"*11212976*"2007 INCURRED"*"20100914_NHPUB_MC_2007.TXT.gz"*"WINZIP 10.0"
"PC"*"20100914_NHPUB_PC_2007.TXT"*5819119*"2007 INCURRED"*"20100914_NHPUB_PC_2007.TXT.gz"*"WINZIP 10.0"
"HGDX"*"20100914_NHPUB_REF_HGDX_DIM.TXT"*11113*"2005-2009 INCURRED"*"20100914_NHPUB_REF_HGDX_DIM.TXT.gz"*"WINZIP 10.0"

Thanks
This is one way:
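#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main() {
    // Open the input file; "data.csv" is only a placeholder name.
    ifstream file("data.csv");
    if (!file) {
        cerr << "cannot open input file\n";
        return 1;
    }

    string line;
    // Test the stream itself rather than .good(), so that a final line
    // without a trailing newline is still processed.
    while (getline(file, line)) {
        istringstream s(line);
        string entree;
        // Same here: .good() would drop the last field of every line.
        while (getline(s, entree, '*'))
            cout << entree << '\n'; // do stuff with or store 'entree'
    }
}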


See: http://cplusplus.com/reference/string/getline
http://cplusplus.com/reference/iostream/istringstream

See also: http://cplusplus.com/reference/string/string
See also: http://cplusplus.com/reference/iostream/ifstream
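
Since the question asks about putting each record into a structure, here is a minimal sketch of that step built on the same getline approach. The names `Record`, `unquote`, `split` and `parse_record` are made up for illustration, and the field layout is an assumption: the sample rows have six fields while the header lists seven names, so the mapping below is only a guess. It also assumes no field contains a '*' inside quotes.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record layout; the member names are guesses based on the header line.
struct Record {
    std::string table_name;
    std::string extract_file;
    long long   record_count = 0;
    std::string period;
    std::string data_file;
    std::string compression;
};

// Remove one pair of surrounding double quotes, if present.
std::string unquote(const std::string& s) {
    if (s.size() >= 2 && s.front() == '"' && s.back() == '"')
        return s.substr(1, s.size() - 2);
    return s;
}

// Split a line on the delimiter. A delimiter inside a quoted field is NOT handled.
std::vector<std::string> split(const std::string& line, char delim = '*') {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, delim))
        fields.push_back(unquote(field));
    return fields;
}

// Fill 'out' from one data line. Returns false if the line does not have
// exactly six fields or the third field is not a number.
bool parse_record(const std::string& line, Record& out) {
    const std::vector<std::string> f = split(line);
    if (f.size() != 6) return false;
    try {
        out.record_count = std::stoll(f[2]);
    } catch (const std::exception&) {
        return false;
    }
    out.table_name   = f[0];
    out.extract_file = f[1];
    out.period       = f[3];
    out.data_file    = f[4];
    out.compression  = f[5];
    return true;
}
```

Calling `parse_record` on every line read with `getline` and pushing the successful results into a `std::vector<Record>` would give one structure per row. The header line is rejected by the field-count check because it has seven names.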
> There are a lot of information about how to read a file or ...
> but they are either so complex or not relevant to CSV files ...

Parsing a delimited file that contains quoted fields is not trivial:
a quoted field can contain the delimiter, quotes can be nested,
a field may contain an escaped quote character, and so on.
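
To make the problem concrete, here is a rough hand-written splitter that copes with the two simplest cases: a delimiter inside double quotes, and a backslash escaping the next character. The function name `split_quoted` is made up for illustration. This is a sketch under those assumptions only; it does not handle nested quotes, unterminated quotes, or fields that span several lines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split 'line' on 'delim'. Text inside double quotes is taken literally, and a
// backslash makes the following character literal. Quote marks are dropped.
std::vector<std::string> split_quoted(const std::string& line, char delim = '*') {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '\\' && i + 1 < line.size()) {
            current += line[++i];        // keep the escaped character as-is
        } else if (c == '"') {
            in_quotes = !in_quotes;      // entering or leaving a quoted section
        } else if (c == delim && !in_quotes) {
            fields.push_back(current);   // delimiter outside quotes ends the field
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);           // the last field has no trailing delimiter
    return fields;
}
```

A plain split on '*' would cut "WINZIP 10.*" into two pieces; tracking whether the current position is inside quotes is what keeps it together here.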

If this is production code, use a library to do the heavy lifting. For instance Boost tokenizer: http://www.boost.org/doc/libs/1_51_0/libs/tokenizer/escaped_list_separator.htm

An example to parse a line (specific to your sample data):

#include <vector>
#include <string>
#include <boost/tokenizer.hpp>
#include <iostream>

std::vector<std::string> tokenize( const std::string& line )
{
    // escape char is \ , fields are separated by * , some fields may be quoted with "
    boost::escaped_list_separator<char> sep( '\\', '*', '"' ) ;
    boost::tokenizer< boost::escaped_list_separator<char> > tokenizer( line, sep ) ;
    return std::vector<std::string>( tokenizer.begin(), tokenizer.end() ) ;
}

int main() // minimal test driver
{
    const std::string line = R"###("MC"*abcd\\\*.txt*11212976*"2007 INCURRED"*"MC_2007.TXT.gz"*"WINZIP 10.*"*"this*is*a*single*comment*field")###" ;
    for( const auto& token : tokenize(line) ) std::cout << token << '\n' ;
}

Output:
MC
abcd\*.txt
11212976
2007 INCURRED
MC_2007.TXT.gz
WINZIP 10.*
this*is*a*single*comment*field


Thank you guys.