Operator Speed

Hello.
I want to write a class for a 2D array.

This is part of the class definition.
class CMatrix{

public:
	explicit CMatrix( int rows, int cols ){
		m_pArray = new float[ rows * cols ];
		m_nRows  = rows;
		m_nCols  = cols;
	}

	~CMatrix(){
		delete [] m_pArray;
	}

	float& operator()( int i, int j ){
		return m_pArray[ (i + j * m_nRows) ];
	}

	float operator()(int i, int j) const{
		return m_pArray[ (i + j * m_nRows) ];
	}

	float *GetPr(){
		return m_pArray;
	}

private:
	float *m_pArray;
	int m_nRows, m_nCols;
};


I defined operator() to access the array in CMatrix.
operator() is defined inside the class body, so it is an inline function.
So I expected access through operator() to take about the same time as access through the pointer.
But using operator() is 2 times slower than using the pointer (in Release mode).

This is the part that compares direct access with operator access.
#include <cstdio>      // printf
#include <sys/time.h>  // gettimeofday, struct timeval

static inline long myclock();

int main()
{
	long t, dt;

	int nRows(16384), nCols(8192);
	CMatrix src( nRows, nCols ), dst( nRows, nCols );
	float *psrc( src.GetPr() ), *pdst( dst.GetPr() );

	t = myclock();
	for( int j = 0; j < nCols; j++ ){
		for( int i = 0; i < nRows; i++ )
		{
			dst( i, j ) = src( i, j );
		}
	}
	dt = myclock() - t;
	printf("Operator Access %ld.%03ld s\n",  dt / 1000, dt % 1000);

	t = myclock();
	for( int j = 0; j < nCols; j++ ){
		for( int i = 0; i < nRows; i++ )
		{
			pdst[ i + j*nRows ] = psrc[ i + j*nRows ];
		}
	}
	dt = myclock() - t;
	printf("  Direct Access %ld.%03ld s\n",  dt / 1000, dt % 1000);

	return 0;
}

static inline long myclock()
{
	struct timeval tv;
	gettimeofday (&tv, NULL);
	return (tv.tv_sec * 1000 + tv.tv_usec / 1000);
}


Why is the access time different?
First, compare
m_pArray[ (row + col * m_nRows) ];
pdst[ i + j*col ]


After making them equivalent, I could not explain the difference. So I swapped lines 12 and 24 of the test code and got the same result: the first loop is still the slower one.
Probably because operators are implemented as function calls. If the call is not inlined, the CPU may incur a few cache misses as it follows the pointers around, which would cause a small difference in the time taken. Also keep in mind that pushing the argument values onto the stack and copying them into the function adds a bit of overhead. However, how much of a time difference are we talking about here? None of these things should have any noticeable impact in normal operation. Then again, you are repeating this a LOT of times...
http://coliru.stacked-crooked.com/a/ef852a350554213a
operator 1.65023
direct 0.398849
operator 0.399388
direct 0.402081
operator 0.399494
direct 0.401038

I simply ran the tests several times.
Note that only the first run takes a long time.

You could also invert the test, putting direct access before operator access. The result would be similar: only the first run would take a long time.

That means something happens during the first run that does not happen later (the engine was cold).
So your test is not adequate and the measurements are not meaningful.
Thank you.

If I compile this code with -O2, the processing time of the two methods is similar.

operator 0.132604
  direct 0.129492


But if I compile this code with -O3, using operator() is 2 times slower than using the pointer.
 operator 0.13763
   direct 0.06625
1. Do not use the wall clock (gettimeofday(), clocks in <chrono>) to measure performance. These measure elapsed time and not processor time.

2. Accessing a large chunk of memory for the first time involves a penalty (cache misses); this can distort the results.

With this added, just before the tests
    // to avoid distortion in results due to cache misses 
    // when accessing the chunk of memory for the first time 
    std::uninitialized_copy( psrc, psrc+(nRows*nCols), pdst ) ;

and using std::clock() (approximate processor time):

echo 'clang++ -O2:' && clang++ -std=c++11 -stdlib=libc++ -O2 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out
echo 'g++ -O2:' && g++-4.8 -std=c++11 -O2 -Wall -Wextra -pedantic-errors main.cpp && ./a.out
echo 'clang++ -O3:' && clang++ -std=c++11 -stdlib=libc++ -O3 -Wall -Wextra -pedantic-errors main.cpp -lsupc++ && ./a.out
echo 'g++ -O3:' && g++-4.8 -std=c++11 -O3 -Wall -Wextra -pedantic-errors main.cpp && ./a.out
clang++ -O2:
Operator Access: 0.24 secs.    Direct Access: 0.25 secs.

g++ -O2:
Operator Access: 0.4 secs.    Direct Access: 0.39 secs.

clang++ -O3:
Operator Access: 0.23 secs.    Direct Access: 0.24 secs.

g++ -O3:
Operator Access: 0.25 secs.    Direct Access: 0.23 secs.

http://coliru.stacked-crooked.com/a/91278343a0d4219f
Thank you.