CSV Creator

Morning all!

So I'll admit I'm pretty new to C++.
I normally write in VBA for work but I did have a play around with PHP and C (but never got too good!).
I thought learning C++ would be great for my development, so I wanted to create something I could see a real-world use for.

Because my job revolves around CSV data files, I thought I would create a CSV Toolkit! My first job was a CSV creator, which, yes, I have done, thanks to the WWW. But I need some help:


#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <string.h>
#include <stdlib.h>
#include <sstream>

using namespace std;

const char NEWLINE = 0x0D + 0x0A;

void ReplaceStringInPlace(string& subject, const string& search,
                          const string& replace) {
    size_t pos = 0;
    while ((pos = subject.find(search, pos)) != string::npos) {
         subject.replace(pos, search.length(), replace);
         pos += replace.length();
    }
}

int create() {

    int headerCount = 0;
    char response;
    string writingFile;

//    if (argc == 3) {
//       if ( strcmp(argv[1], "-f") == 0) {
//           writingFile = argv[2];
//        } else {
//            writingFile = "test2.csv";
//        }
//    } else {
//        writingFile = "test2.csv";
//    }
    writingFile = "default.csv";
    system("cls");
    cout << "'***********************************************************'" << endl;
    cout << "'       Welcome to the CSV Toolkit - Create a CSV File      '" << endl;
    cout << "'***********************************************************'" << endl;
    cout << endl << endl;
    cout << "\tHow many columns appear in your file: ";
    cin >> headerCount;
    vector<string> header;
    cout << "\t\tThere are " << headerCount << " columns in your file? (Y/N): ";
    cin >> response;
    if(response == 'Y') {
        int rowCount = 0;
        cout << "\tHow many rows (including the header) appear in your file: ";
        cin >> rowCount;
        cout << "\t\tThere are " << rowCount << " rows in your file? (Y/N): ";
        cin >> response;
        if(response == 'Y') {
            int i = 0;
            system("cls");
            cout << "'***********************************************************'" << endl;
            cout << "'       Welcome to the CSV Toolkit - Populate CSV File      '" << endl;
            cout << "'***********************************************************'" << endl;
            cout << endl << endl;
            cout << "\tColumns: " << headerCount << endl;
            cout << "\tRows: " << rowCount << endl;
            cout << "\tfile: " << writingFile << endl << endl;
            ofstream myfile;
            myfile.open(writingFile.c_str(), ios::binary | ios::out);
            string vals;
            vector< vector<string> > buff;
            while(i != rowCount) {
                int x = 0;
                vector<string> temp;
                while(x != headerCount) {
                    stringstream ss;
                    ss.str("");
                    if (i == 0) {
                        cout << "\tWhat is header number " << (x + 1) << ": ";
                    } else {
                        cout << "\tWhat is item " << i << "'s " << header[x] << ": ";
                    }
                    cin.sync();
                    getline(cin,vals);
                    ss << "\"" << vals << "\"";
                    if (i == 0) header.push_back(ss.str());

                    temp.push_back(ss.str());
                    if (x != (headerCount - 1)) {
                        temp.push_back(",");
                    }
                    x++;
                }
                buff.push_back(temp); // Store the array in the buffer
                x = 0;
                i++;
            }
            for(vector<vector<string> >::iterator it = buff.begin(); it != buff.end(); ++it) {

                for(vector<string>::iterator jt = it->begin(); jt != it->end(); ++jt) {
                    myfile << *jt;
                }
                myfile << "\r\n";
            }
            myfile.close();
            cout << endl << "Execution Complete.";
            } else {
                cout << endl << "oh";
            }
        } else {
            cout << endl << "oh";
        }
    return 0;
}


As you can see, at the moment, my code asks the user how many columns and rows will be in the CSV, then takes the input and writes them.

I understand all of my code with the exception of the latter quarter where it starts talking about vector iterators = *jt blah blah.


I feel I need to understand this in order to be able to "tidy up".

My goals:

* instead of adding the quotes (") during the input stage, I would like to move them to the output stage
* I would like to be able to escape quotes
* I would like to turn this into a function/class
This looks a bit over-engineered in the sense that a 2D array (vector of vectors) isn't really necessary. All that is needed is a vector to hold the headings, and a separate vector to hold the current line.

I would break this into separate functions, one to get each line of data values, and another to output a single line (either the headings or the data) to the file. The latter is where I would add the quotation marks.

A couple of other comments. Firstly there is no main() function, so this isn't a valid C++ program. Change int create() to int main().

Everything here is ordinary text, so there's no need to open the file in binary mode. It would be sufficient to simply put
 
    ofstream myfile(writingFile.c_str());
which declares and opens the stream for output, in text mode.

The newline character can simply be specified as '\n' rather than as a pair of values. I realise that Windows does use the CRLF pair for line endings, but in text mode the translation is done automatically.

In the input loop (line 79 of your listing) you have cin.sync(); which is not a reliable way to do what is needed. The behaviour of cin.sync() is implementation-defined, which means the program may seem to work, but if you try to use a different compiler it may not. A better approach to clearing the input buffer here is something like cin.ignore(1000, '\n'); which will ignore up to 1000 characters, or until the newline character is found. Its purpose here is to get rid of the unwanted newline which remains in the input buffer after a previous cin >>.

Use of system("cls"); is not recommended. For one thing it is slow, but the use of system() in general can create a security hole, so it's best to avoid the habit.


I had a few other comments, but that will suffice for now I think.

Edit: As for your question about "vector iterators = *jt blah blah":

In some ways an iterator can be thought of as being a bit like a pointer. However they are also one of the building blocks used by the different containers such as std::list or std::set in the C++ standard library. One of the good things about the iterator is that once you know how to use it with one type of container, such as std::vector, it becomes relatively straightforward to reuse that knowledge with the other container types.

http://www.cplusplus.com/doc/tutorial/pointers/
http://www.cplusplus.com/reference/stl/
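
To make the iterator idea concrete, here is a minimal sketch. The name joinAll is hypothetical and not something from the original program. It walks a container from begin() to end() and uses *it to reach each element, much as the inner loop of the original code does with *jt. It is written as a template only so that the one loop can be shown working with both std::vector and std::list:

```cpp
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: concatenate every element of a container of strings
// by walking it with an iterator. The loop is identical for std::vector,
// std::list and the other standard containers.
template <typename Container>
std::string joinAll(const Container& items)
{
    std::string result;
    for (typename Container::const_iterator it = items.begin();
         it != items.end(); ++it)
    {
        result += *it;   // *it yields the element the iterator refers to
    }
    return result;
}
```

Calling joinAll on a vector<string> holding "a", "," and "b" gives "a,b", and the same call works unchanged on a list<string>. With a C++11 compiler the iterator type can be written as auto, or the whole loop replaced by a range-based for.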
Here's a possible approach. It uses some C++11 features which means it requires a recent compiler. This is a fairly minimal version, I omitted some of the text.
#include <iostream>
#include <iomanip>
#include <vector>
#include <fstream>
#include <string>

using namespace std;

void getHeadings(vector<string> & v, int cols);
void getData(vector<string> & data, const vector<string> & head, int row);
void writeLine(vector<string> & v, ostream & os);

int main() 
{
    int headerCount = 0;
    int rowCount = 0;
    string writingFile = "default.csv";
    vector<string> header;
    vector<string> data;
            
    cout << "\tHow many columns: ";
    cin >> headerCount;
    
    cout << "\tHow many rows of data: ";
    cin >> rowCount;
  
    cin.ignore(1000, '\n'); // clear the input buffer
    
    ofstream myfile(writingFile);

    getHeadings(header, headerCount);
    writeLine(header, myfile);
    
    for (int row=0; row<rowCount; row++ ) 
    {
        data.clear();
        getData(data, header, row);
        writeLine(data, myfile);
    }
   
    myfile.close();
    cout << endl << "Execution Complete.";
 
    return 0;
}

void getHeadings(vector<string> & head, int cols)
{
    cout << "Please Enter Column Headings" << endl;
    string text;
    for (int col=0; col<cols; col++)
    {
        cout << "Column " << setw(3) << col+1 << ": ";
        getline(cin, text);
        head.push_back(text);
    }
}

void getData(vector<string> & data, const vector<string> & head, int row)
{
    cout << "\nPlease Enter Values for Row " << row+1 << endl;
    string text;
    for (auto label : head)
    {
        cout << setw(12) << label << ": ";
        getline(cin, text);
        data.push_back(text);        
    }
}

void writeLine(vector<string> & v, ostream & os)
{
    string delim = "";
    for (auto word : v)
    {
        os << delim << '\"' << word << '\"';
        delim = ","; 
    }
    os << '\n';
}
Sound advice.
I'm trying to write the code all myself in a way I'll understand and learn, so...

How are these looking:

#include <iostream>
#include <vector>
#include <string>
#include <ofstream>


void setHeaders(vector<string> headings, int colCount){
    string textVal;
    cout << "Please enter header(s): " << endl;
    for (unsigned int h = 0; h < colCount; h++){
        cout << "\t" << h+1 << ". ";
        getline(cin, textVal);
        headings.push_back(textVal);
    }
}

void setItems(vector<string> items, const vector<string> &head, int rowNum){
    string textVal;
    cout << "Please enter items for entry " << rowNum+1 << ":";
    for (unsigned int e = 0; e < head.size(); e++){
        cout << "\t" << e+1 << ". ";
        getline(cin, textVal);
        items.push_back(textVal);
    }
}

void writeLine(vector<string> writer, fileStream){
    for(unsigned int w = 0; w < writer.size(); w++){
        filestream << "\"" << writer[w] << "\"";
    }
        filestream << "\n";
}
I can certainly understand you wanting to write the code for yourself - that's the best way to learn.

However, don't be reluctant to let the compiler assist you. Of course the compiler can't read your mind, but it can be a useful help. For example, #include <ofstream> isn't correct and the compiler will tell you so.

As for the other parts of the code: when you want a function to modify the contents of a variable passed to it, such as the vector headings, the variable should be passed by reference. See the tutorial page on functions if you're not sure how to do that:
http://www.cplusplus.com/doc/tutorial/functions/

Also, const can be used to let the compiler know that you do not want to modify a variable, even though it is passed by reference. (I overlooked one place in my own code above where I should have done this).
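
A small sketch of the three ways of passing a vector may make the difference clearer. The function names here are made up for illustration and do not appear in either program above:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// By value: the function receives a copy, so the caller's vector is unchanged.
void addByValue(std::vector<std::string> items)
{
    items.push_back("lost");
}

// By reference: the function works directly on the caller's vector.
void addByReference(std::vector<std::string>& items)
{
    items.push_back("kept");
}

// By const reference: no copy is made, and the compiler rejects any attempt
// to modify the vector inside the function.
std::size_t countItems(const std::vector<std::string>& items)
{
    return items.size();
}
```

After calling addByValue(v) on an empty vector v, it is still empty; after addByReference(v) it holds one element. This is why setHeaders(vector<string> headings, ...) in the code above appears to do nothing.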

In function writeLine(), the parameter fileStream is used without a type. Again the compiler can help you to pick up problems such as this.

Still - I'm not aiming to be overly critical, just trying to guide you a little bit.
No, no, critical is perfect.

Here's where I am:

#include <iostream>
#include <vector>
#include <string>
#include <fstream>

using namespace std;

ofstream filePath("default.csv", ostream::out);

void setHeaders(vector<string> & headings, int colCount){
    string textVal;
    cout << "Please enter header(s): " << endl;
    for (int h = 0; h < colCount; h++){
        cout << "\t" << h+1 << ". ";
        getline(cin, textVal);
        headings.push_back(textVal);
    }
}

void setItems(vector<string> & items, const vector<string> &head, int rowNum){
    string textVal;
    cout << "Please enter items for entry " << rowNum+1 << ":";
    for (unsigned int e = 0; e < head.size(); e++){
        cout << "\t" << e+1 << ". ";
        getline(cin, textVal);
        items.push_back(textVal);
    }
}

void writeLine(vector<string> & writer, ostream & os){
    string delim = "";
    for(unsigned int w = 0; w < writer.size(); w++){
        os << delim << "\"" << writer[w] << "\"";
        delim = ",";
    }
        os << "\n";
}


int main(){


    int headerCount = 0;
    int rowCount = 0;
    vector<string> headings;
    vector<string> items;

    cout << "How many columns: ";
    cin >> headerCount;
    cout << "How many rows: ";
    cin >> rowCount;

    setHeaders(headings, headerCount);
    writeLine(headings, filePath);

    for (int counter = 0; counter < rowCount; counter++){
        items.clear();
        setItems(items, headings, counter);
        writeLine(items, filePath);
    }

    return 0;
}



Only issue I'm getting is with the cin.ignore(1000, '\n')

In my for loops, it is working great for the first cin, but is causing me to have to press enter twice after that.
and I obviously can't set '\n' to an empty value...
How would I get around this?

I don't see any place in that latest code where you use cin.ignore ?

Having said that, you need to consider why you need to use it at all. It's due to the difference between the way that cin >> ... and getline(cin, ... deal with whitespace, and in particular with the newline character which gets stored in the input buffer each time the user presses the "enter" key.

You just need to remove it from the buffer once, after the last cin >>, in order that the following getline will work correctly.
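
As a rough illustration of why once is enough, the sketch below reads a count with >> and then a number of lines with getline. Both function names are invented for this example, and the functions take any istream so they can be tried with an in-memory std::istringstream instead of the keyboard. The second function calls ignore before every getline, which is one way to end up having to press Enter twice:

```cpp
#include <iostream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying this out
#include <string>
#include <vector>

// Reads a count with >>, discards the leftover newline ONCE, then reads
// that many lines with getline.
std::vector<std::string> readLines(std::istream& in)
{
    int count = 0;
    in >> count;
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // once only

    std::vector<std::string> lines;
    std::string text;
    for (int i = 0; i < count && std::getline(in, text); ++i)
    {
        lines.push_back(text);
    }
    return lines;
}

// For comparison: ignore before every getline. The first call removes the
// leftover newline, but each later call throws away a whole line of input.
std::vector<std::string> readLinesIgnoringEachTime(std::istream& in)
{
    int count = 0;
    in >> count;

    std::vector<std::string> lines;
    std::string text;
    for (int i = 0; i < count; ++i)
    {
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        if (!std::getline(in, text))
            break;
        lines.push_back(text);
    }
    return lines;
}
```

Given the input "2", "Name", "Age", "City" (one per line), readLines returns Name and Age, while readLinesIgnoringEachTime returns Name and City because the second ignore swallows "Age". Using numeric_limits<streamsize>::max() instead of 1000 simply means a line of any length is discarded.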

Edit:
In this line ofstream filePath("default.csv", ostream::out); the type ofstream already indicates that this is an output file, so the second parameter ostream::out is redundant: it needlessly repeats the same information.

It is declared as a global variable, which is probably ok here, though generally the use of lots of global variables is regarded as bad practice. (If something is passed as a parameter, you have complete control over which functions are allowed access to that variable).

I notice you have preferred the use of the tab character '\t' rather than the options available using for example setw(). For something fairly simple like this, that's probably ok, though it's certainly worth learning about setw() and related formatting capabilities.
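
For anyone who has not met setw() before, here is a brief sketch of the kind of formatting it allows. The two helper names are hypothetical, and the output goes to a string stream only so the result is easy to inspect; the same manipulators work with cout:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Numbered prompt such as "  7. " with the number right-aligned in a field
// of the given width. setw affects only the next item written.
std::string numberedPrompt(int number, int width)
{
    std::ostringstream out;
    out << std::setw(width) << number << ". ";
    return out.str();
}

// Left-aligned label padded to a fixed width, e.g. "Name    : ".
std::string paddedLabel(const std::string& label, int width)
{
    std::ostringstream out;
    out << std::left << std::setw(width) << label << ": ";
    return out.str();
}
```

With a width of 3, the prompts for items 7 and 12 line up on the full stop, which a tab character cannot guarantee once the numbers differ in length.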
setw() huh, I like it.

Well here we are:

#include <iostream>
#include <vector>
#include <string>
#include <fstream>
#include <iomanip>

using namespace std;

ofstream filePath("default.csv");

void setHeaders(vector<string> & headings, int colCount){
    string textVal;
    cout << "Please enter header(s): " << endl;
    cout << setw(10) << 1 << ". ";
    cin.ignore(1000, '\n');
    getline(cin, textVal);
    headings.push_back(textVal);
    for (int h = 1; h < colCount; h++){
        cout << setw(3) << h+1 << ". ";
        getline(cin, textVal);
        headings.push_back(textVal);
    }
}

void setItems(vector<string> & items, const vector<string> &head, int rowNum){
    string textVal;
    cout << "Please enter items for entry " << rowNum+1 << ":" << endl;
    for (unsigned int e = 0; e < head.size(); e++){
        cout << setw(10) << e+1 << ". ";
        getline(cin, textVal);
        items.push_back(textVal);
    }
}

void writeLine(vector<string> & writer, ostream & os){
    string delim = "";
    for(unsigned int w = 0; w < writer.size(); w++){
        os << delim << "\"" << writer[w] << "\"";
        delim = ",";
    }
        os << "\n";
}

int main(){

    int headerCount = 0;
    int rowCount = 0;
    vector<string> headings;
    vector<string> items;

    cout << "How many columns: ";
    cin >> headerCount;
    cout << "How many rows: ";
    cin >> rowCount;

    setHeaders(headings, headerCount);
    writeLine(headings, filePath);

    for (int counter = 0; counter < rowCount; counter++){
        items.clear();
        setItems(items, headings, counter);
        writeLine(items, filePath);
    }

    return 0;

}


Solved my cin.ignore issues too!
Thanks for all of your help.
Now I just need to work out how to escape commas and quotes.
And I would also like to learn how I can optimise my code and enforce best practices!
Now I just need to work out how to escape commas and quotes.

The first step is to determine what it is that needs to be done. For example, you could take your favourite spreadsheet program and enter some values which contain commas and/or quotes. Then save or export the sheet as a csv file. Now open the file in a text editor to see how the spreadsheet program has encoded those tricky details.

The second step then is to write some code which will perform a similar transformation of text.
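
As one possible shape for that second step, the sketch below assumes the convention described in RFC 4180, which many spreadsheet programs follow: the field is wrapped in double quotes and every embedded double quote is written twice. That is an assumption, so it is worth comparing against what your own spreadsheet exports. The name quoteField is made up for this example:

```cpp
#include <string>

// Hypothetical helper: quote one field for CSV output, assuming the
// RFC 4180 convention of doubling any embedded double quote.
std::string quoteField(const std::string& field)
{
    std::string out = "\"";
    for (std::string::size_type i = 0; i < field.size(); ++i)
    {
        if (field[i] == '"')
            out += "\"\"";      // an embedded quote becomes two quotes
        else
            out += field[i];
    }
    out += '"';
    return out;
}
```

In writeLine the expression "\"" << writer[w] << "\"" could then become quoteField(writer[w]). Commas need no further treatment under this convention, because a comma inside a quoted field is not read as a separator.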

An alternative, which I often prefer, is to use the tab character '\t' instead of a comma, to separate each field. (Or use the pipe '|' character). Save the file as plain text (as it cannot properly be called a csv) and it should be relatively straightforward to import it into other programs.

As for optimising your code and enforcing best practices, I'd say the main priority is producing code which is readable and understandable. Of course it should also be error-free. But other factors are less easy to summarise - it might take a book, or several books, to cover those.
Some hopefully constructive comments on your code:

Lines 14-16 (the start of setHeaders): I don't see any reason to input the first heading separately. The first heading can be handled within your loop just as you did in setItems. Be sure to change the starting value of your loop variable accordingly; the cin.ignore call then belongs right after the last cin >> in main.

Lines 20, 30 (the getline calls in the two loops): You don't check that the getline operation succeeded.

Line 38 (the output statement in writeLine): Assumes all columns are quoted. You might want to consider only quoting character fields and outputting numeric fields without the quotes. You might also want to consider whether the field has embedded quotes or backslashes which might need to be escaped.

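Line 39 (delim = ","): Because the delimiter is written before each field rather than after it, this does not leave a trailing comma after the last column. That is worth keeping in mind if you restructure the loop, since some CSV imports don't like empty trailing headings/columns.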
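
The first two comments could be combined along these lines. This is only a sketch: readHeadings is a hypothetical replacement for setHeaders, the prompts are left out for brevity, and it assumes the leftover newline has already been discarded in main after the last cin >>:

```cpp
#include <iostream>
#include <sstream>   // std::istringstream, handy for trying this out
#include <string>
#include <vector>

// Read colCount headings in a single loop that starts at 0.
// Returns false if input ends or fails before all headings were read.
bool readHeadings(std::istream& in, std::vector<std::string>& headings,
                  int colCount)
{
    std::string text;
    for (int h = 0; h < colCount; ++h)
    {
        if (!std::getline(in, text))
            return false;        // end of input or a stream error
        headings.push_back(text);
    }
    return true;
}
```

The caller can then write something like if (!readHeadings(cin, headings, headerCount)) and report the problem instead of silently writing an incomplete file.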
