Trouble evaluating the time taken by a quicksort

I'm trying to measure the elapsed time of a quicksort for input files of different lengths. To do that, I generate 100 files with random inputs and then sort each one. To measure the time, I grab the time right before the sort is called and again right afterwards and subtract the two values. But every time I run the program the difference is 0. I have no idea what I'm doing wrong, and I'd appreciate any suggestions. Here's the code I'm using:

#include "header.h"

void RunTest(vector<float>& inputs, int test_number, ofstream& output_file)
{
	string name;
	GenerateInputFiles("input", 100, test_number);

	//the time variables, used for keeping track of the elapsed time for each sort
	timeval initial;
	timeval final;
	long initial_ms;
	long final_ms;
	long difference;
	long average;

	for(int i = 0; i < 100; i++)
	{
		inputs.clear();
		name = "input" + GetStringFromInt(i) + ".txt";
		GetInputs(inputs, name);
		//start keeping track of the time
		gettimeofday(&initial, NULL);
		initial_ms = (initial.tv_sec * 1000) + (initial.tv_usec / 1000);
		Sort(inputs, 0, inputs.size() - 1);
		//end timer
		gettimeofday(&final, NULL);
		final_ms = (final.tv_sec * 1000) + (final.tv_usec / 1000);
		difference = final_ms - initial_ms;
		output_file << "\nElapsed Time: " << difference << "\n";
		PrintVector(inputs, output_file);
	}
}
Your 'clock' doesn't have a high enough resolution to time a single sort accurately.
Is there any way I can deal with that? I need to get the average run time over 100 input files for different numbers of inputs (10, 100, 1000, etc.) and then graph them against each other.
If you need to get the average run time, time multiple sorts (with the same number of inputs) in a batch, and divide the total time by the number of sorts in the batch.
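Something along these lines, maybe (just a sketch; it assumes your Sort() and that the input vectors are already loaded, so file I/O isn't part of the timing):

#include <sys/time.h>
#include <vector>
using namespace std;

void Sort(vector<float>& v, int low, int high); // your quicksort, from header.h

// Time the whole batch in one go, then divide by the batch size.
double AverageSortTimeMs(vector<vector<float> >& datasets)
{
	timeval initial, final;

	gettimeofday(&initial, NULL);
	for (size_t i = 0; i < datasets.size(); i++)
		Sort(datasets[i], 0, datasets[i].size() - 1);
	gettimeofday(&final, NULL);

	double total_ms = (final.tv_sec - initial.tv_sec) * 1000.0
	                + (final.tv_usec - initial.tv_usec) / 1000.0;
	return total_ms / datasets.size();
}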
I am timing multiple sorts right now. Each pass through the for loop sorts one file and records the elapsed time. The problem is that the elapsed time is 0 for every sort, so the average over all 100 sorts is still going to be 0 (which is why I haven't bothered storing anything in the variable 'average' yet). For bigger inputs (1000 is the biggest right now) the value is occasionally 1.

@ne555:

How would I use the microsecond resolution?
> I am timing multiple sorts right now.

No. You're timing individual sorts, many times over. You need to time multiple sorts at once. Set everything up to sort beforehand, then time a loop in which all of the sorts are executed, and divide the total time the loop took by the number of sorts done in the loop.

Also, I'm not familiar with gettimeofday, but it looks like you may be running afoul of integer math. Change the 1000 in your divisions to 1000.0.
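For example, if tv_usec happened to be 6540:

long   ms_int   = 6540 / 1000;    // integer division: 6 (the fractional part is discarded)
double ms_float = 6540 / 1000.0;  // floating-point division: 6.54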
Ah, that makes more sense than what I thought, thanks. Would it be fine to just take the time before and after the for loop, then? I assume the answer is yes, since each sort is so fast that I'm having trouble timing it on its own, but would the other operations in the loop possibly mess up the results?
See what I added in the edit.
> but it looks like you may be running afoul of integer math. Change 1000 in your divisions to 1000.0
Or don't divide at all.
I'm not all that familiar with it either; it's just what I found with a Google search. But it returns whole seconds plus microseconds, so multiplying the seconds by 1000 converts them to milliseconds, and dividing the microseconds by 1000 converts them to milliseconds too. That made sense to me, but I might be missing something obvious again. Why should I not divide?

Anyway, I changed things a bit so that only the sorts are timed. Here's what I have:
#include "header.h"

void RunTest(vector<float>& inputs, int test_number, ofstream& output_file)
{
	string name;
	GenerateInputFiles("input", 100, test_number);

	//the time variables, used for keeping track of the elapsed time for each sort
	timeval initial;
	timeval final;
	long initial_ms;
	long final_ms;
	long difference;

	//a vector of input vectors used to store the inputs from each text file
	vector<vector<float> > vector_of_inputs;

	//get the inputs from each text file
	for(int i = 0; i < 100; i++)
	{
		inputs.clear();
		name = "input" + GetStringFromInt(i) + ".txt";
		GetInputs(inputs, name);
		vector_of_inputs.push_back(inputs);
	}

	//start keeping track of the time
	gettimeofday(&initial, NULL);
	initial_ms = (initial.tv_sec * 1000.0) + (initial.tv_usec / 1000.0);

	for(int i = 0; i < 100; i++)
	{
		inputs.clear();
		inputs = vector_of_inputs[i];
		Sort(inputs, 0, inputs.size() - 1);
		//PrintVector(inputs);
	}

	//end timer
	gettimeofday(&final, NULL);
	final_ms = (final.tv_sec * 1000.0) + (final.tv_usec / 1000.0);
	difference = final_ms - initial_ms;

	output_file << "\nElapsed Time: " << difference << "\n";
}


As far as I can tell it's working, but the values are still really small (not necessarily wrong, but worth asking about). I haven't divided to get the average yet, so these are the total elapsed times. What I have right now is this:

for 10 inputs: 0 milliseconds
for 100 inputs: 2 milliseconds
for 1000 inputs: 27 milliseconds
for 10000 inputs: 414 milliseconds

Is there anything that I'm obviously doing wrong?
> Why should I not divide?
Because integer division returns an integer.
Suppose you've got 12412 microseconds: you would report just 12 milliseconds (losing precision).
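A sketch of what I mean, keeping the raw counts (the LL forces 64-bit arithmetic so the multiplication can't overflow a 32-bit long):

long long elapsed_us = (final.tv_sec - initial.tv_sec) * 1000000LL
                     + (final.tv_usec - initial.tv_usec);   // microseconds, nothing thrown away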

                inputs.clear();
                inputs = vector_of_inputs[i];
                Sort(inputs, 0, inputs.size() - 1);
could be simply Sort(vector_of_inputs[i], 0, vector_of_inputs[i].size()-1); so you don't do that extra copy
Understood, thanks. I changed it so that I multiply seconds by 1000 to get milliseconds and then by 1000 again to get microseconds (two multiplications because it's easier on the eyes), and then add that to the microseconds returned by gettimeofday(). As far as I can tell that makes sense, and it's accurate enough to get a value for the smallest input size as well. Do you have any thoughts on that?
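In code, what I described looks roughly like this (I made the variables long long so the seconds-to-microseconds multiplication can't overflow):

long long initial_us = (initial.tv_sec * 1000LL * 1000) + initial.tv_usec;
long long final_us   = (final.tv_sec   * 1000LL * 1000) + final.tv_usec;
long long difference = final_us - initial_us;   // elapsed time in microseconds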