speed question

I wrote a program that writes to a comma-delimited file, and it takes forever to finish running. Below is a simplified version of the program that compiles. Is there a way to change the code so that it finishes sooner? I'm also thinking of buying a new laptop. If I did, and wanted this program to run as fast as possible, what hardware should I get? For example, should I get an SSD, as much RAM as possible, or the fastest processor? Which component would matter most?

#include <string>
#include <iostream>
#include <vector>
#include <fstream>

using namespace std;

int main()
{
    string h="hello";
    vector<vector<string> > array;
    for(int i = 0; i<1000000; i++)
    {
        vector<string> myvector;
        for(int j = 0; j<11; j++)
        {
            myvector.push_back(h);
        }
        array.push_back(myvector);
    }
    ofstream myfile;
    myfile.open("C:\\file3423.txt", std::fstream::out | std::fstream::trunc);
    for(size_t g=0;g<array.size();g++)
    {
        myfile << array[g][0] << "," << array[g][1] << "," << array[g][2] << "," << array[g][3] << "," << array[g][4] << "," << array[g][5] << "," << array[g][6] << "," << array[g][7] << "," << array[g][8] << "," << array[g][9] << "," << array[g][10] << "," << endl;
    }
    myfile.close();
    array.clear();
    return 0;
}
What exactly is this program supposed to do?
The first step would be to replace endl with '\n'. endl flushes the stream every time it is used, which is very rarely needed, and this is certainly not one of those cases.

Then of course make sure you have optimization enabled.

Then use a profiler to find the next bottleneck.

As posted, this program runs on my 5+ year old home PC in 7.8 seconds; with '\n' instead of endl, it runs in 4.9 seconds.
Since you're writing 66 MB of data to a file, I'd tend to say disk performance is the largest factor.
You can also reduce the number of memory reallocations caused by vector by reserving a specified capacity before actual usage.

This is probably closer to how I'd have written it:

#include <string>
#include <iostream>
#include <vector>
#include <fstream>
#include <cstdio> // for puts

using namespace std;

int main()
{
	const size_t vector_count = 1000000U;
	const size_t string_count = 11U;
	string h = "hello";
	puts("Constructing data...");
	vector<vector<string>> array;
	array.reserve(vector_count);
	for (size_t i = 0; i != vector_count; i++)
	{
		vector<string> myvector;
		myvector.reserve(string_count);
		for (size_t j = 0; j != string_count; j++)
			myvector.push_back(h);

		array.push_back(myvector);
	}
	puts("Writing data to file...");
	ofstream myfile;
	myfile.open("file3423.txt", std::fstream::out | std::fstream::trunc);
	for (const vector<string> &strings : array)
	{
		for (const string &s : strings) // renamed so the variable does not shadow the type name
			myfile << s << ','; // Edit: Forgot the comma!

		myfile << endl;
	}
	puts("Data write completed. Closing...");
	myfile.close();
	return 0;
}


Edit: What Cubbi suggests would also help performance, since you'd no longer be constantly flushing the stream.

Edit 2: What ne555 suggests about vector::reserve is more reliable, and probably more efficient.
I/O operations are expensive.
First step: get rid of the vectors and write your strings directly to the file.

Second, use Cubbi's suggestion. That lets the output buffer fill up before it is flushed and written to the file.

If you are going for a new laptop, get one with an SSD and an HDD in RAID 0. The SSD will provide the increase in read/write speeds.
> I/O operations are expensive
> First step, get rid of the vectors
'foo' is expensive, so modify 'bar'

> directly write your strings into the file
Unless you mean myfile.write( big_chunk, sizeof(big_chunk) );, which has a different meaning, I don't see how you could access a string more "directly".


@Cubbi: I would swap the first and second step.
If the code is good enough there is no need to modify it

> specifying a start capacity when constructing, then clearing it.
I think that the capacity is not guaranteed to remain.
If you want to reserve, then reserve http://www.cplusplus.com/reference/vector/vector/reserve/
> First step, get rid of the vectors and directly write your strings into the file

I'm under the impression that using vector was specifically part of his goal.

>> specifying a start capacity when constructing, then clearing it.
> I think that the capacity is not guaranteed to remain.
> If you want to reserve, then reserve http://www.cplusplus.com/reference/vector/vector/reserve/

You're correct -- post modified.