Strange execution time in C++ code. Any ideas?

I have two versions of the same code (10 million iterations):

First version.
bool HArrayFixRAM::insert(uint* key, uint value)
{
	if (pLastContentCell + RowLen > pLastContentCellOnLastPage)
	{
		ContentPage* pLastContentPage = new ContentPage();

		pContentPages[ContentPagesCount++] = pLastContentPage;

		if (ContentPagesCount == ContentPagesSize)
		{
			reallocateContentPages();
		}

		pLastContentCell = pLastContentPage->pContent;
		pLastContentCellOnLastPage = pLastContentCell + MAX_SHORT;
	}

	//insert value ============
	uint keyOffset = 0;

	uint headerOffset = key[0] >> HeaderBits;
	ContentCell* pContentCell = pHeader[headerOffset];

	if (!pContentCell)
	{
		pHeader[headerOffset] = pLastContentCell;

		pLastContentCell->Type = (ONLY_CONTENT_TYPE + KeyLen);

		//fill key
		for (; keyOffset < KeyLen; keyOffset++, pLastContentCell++)
		{
			pLastContentCell->Value = key[keyOffset];
		}

		pLastContentCell->Value = value;
		pLastContentCell++;

		return true;
	}
	
	return true;
}

Execution time: 850 ms

Second version:

bool HArrayFixRAM::insert(uint* key, uint value)
{
	if (pLastContentCell + RowLen > pLastContentCellOnLastPage)
	{
		ContentPage* pLastContentPage = new ContentPage();

		pContentPages[ContentPagesCount++] = pLastContentPage;

		if (ContentPagesCount == ContentPagesSize)
		{
			reallocateContentPages();
		}

		pLastContentCell = pLastContentPage->pContent;
		pLastContentCellOnLastPage = pLastContentCell + MAX_SHORT;
	}

	//insert value ============
	uint keyOffset = 0;

	uint headerOffset = key[0] >> HeaderBits;
	ContentCell* pContentCell = pHeader[headerOffset];

	// Condition commented out: the block below now always executes
	//if (!pContentCell)
	{
		pHeader[headerOffset] = pLastContentCell;

		pLastContentCell->Type = (ONLY_CONTENT_TYPE + KeyLen);

		//fill key
		for (; keyOffset < KeyLen; keyOffset++, pLastContentCell++)
		{
			pLastContentCell->Value = key[keyOffset];
		}

		pLastContentCell->Value = value;
		pLastContentCell++;

		return true;
	}
	
	return true;
}

Execution time: 405 ms

It's really strange. Logically it should be the other way around: the second version always executes one extra block, yet it runs twice as fast!
Any ideas, fellows?
Compare the assembly and the CPU clocks per call, not the C++ source or the wall-clock time.

I don't see anything obvious. It could be something on the compiler side: for example, removing the condition may have tripped the heuristic that decides whether the function qualifies for inlining. It could also be something going on with the cache. Did you run the test a bunch of times? The computer may simply have decided to check whether the internet was still there in the middle of the first run.
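
To rule out a one-off disturbance, each version can be timed several times and the fastest run kept. The following is only a minimal sketch: `bestOfMicros` is a hypothetical helper, not something from the original code, and it still measures wall-clock time, so it complements rather than replaces a look at the generated assembly.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of the original code): runs `work`
// `repeats` times and returns the fastest run in microseconds.
// Taking the minimum filters out runs disturbed by other processes.
template <typename Work>
std::int64_t bestOfMicros(Work work, int repeats)
{
	assert(repeats > 0);

	std::int64_t best = -1;

	for (int i = 0; i < repeats; i++)
	{
		auto start = std::chrono::steady_clock::now();
		work();
		auto stop = std::chrono::steady_clock::now();

		std::int64_t us =
			std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();

		if (best < 0 || us < best)
		{
			best = us;
		}
	}

	return best;
}
```

It would be called with a lambda that performs the 10 million inserts on a freshly constructed array, once for each version of `insert`, so that both start from the same state.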

I'm not sure, but I think I understand what is happening.
This condition changes the semantics of the algorithm at its root.
With the condition

if (!pContentCell)

every earlier iteration has to be executed first, because we never know whether a previous iteration changed the state of that array cell or not.

Without the condition, many iterations can be executed in parallel, because the last iteration overwrites the array cell anyway.
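
The difference between the two versions can be reduced to a small sketch. Everything here is an assumption made for illustration: `header` is a simplified stand-in for `pHeader` that stores slot numbers instead of pointers, and `fillIfEmpty` / `fillAlways` are hypothetical names. The sketch only shows that the two loops do different work; whether the dependency on earlier stores really explains the timings is a guess that has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for pHeader: 0 means "empty", any other value is a slot number.

// Like the first version: every iteration loads header[h] and branches on it,
// so it depends on what earlier iterations stored in that cell.
std::size_t fillIfEmpty(std::vector<std::uint32_t>& header,
                        const std::vector<std::uint32_t>& keys)
{
	assert(!header.empty());
	std::size_t written = 0;

	for (std::size_t i = 0; i < keys.size(); i++)
	{
		std::size_t h = keys[i] % header.size();

		if (header[h] == 0)
		{
			header[h] = static_cast<std::uint32_t>(i + 1);
			written++;
		}
	}

	return written;
}

// Like the second version: the cell is overwritten without being read,
// so no iteration needs the result of an earlier store.
std::size_t fillAlways(std::vector<std::uint32_t>& header,
                       const std::vector<std::uint32_t>& keys)
{
	assert(!header.empty());
	std::size_t written = 0;

	for (std::size_t i = 0; i < keys.size(); i++)
	{
		std::size_t h = keys[i] % header.size();

		header[h] = static_cast<std::uint32_t>(i + 1);
		written++;
	}

	return written;
}
```

Note that the two loops are not equivalent: when two keys land in the same cell, the conditional loop keeps the first writer and skips the second key, while the unconditional loop keeps the last writer.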