Control Optimization Problems

Hi,

I'm compiling my program with various optimization options. Whenever I compile with -O0, my program runs fine and produces the correct results. However, any optimization level above -O0 causes it to produce incorrect results. I assume my results, or intermediate results, are being optimized out, but I don't know how to resolve this. How can I use higher optimization levels without affecting my final results?

Here's a sample from my makefile (this produces the correct results):

g++ -std=c++11 -o sim -O0 -Wall -lm main.o file1.o file2.o file3.o file4.o file5.o

This and higher levels produce incorrect final results:
g++ -std=c++11 -o sim -O1 -Wall -lm main.o file1.o file2.o file3.o file4.o file5.o
I would like to know this as well: In my own code, I always leave out the optimization. It is much harder to debug code that has been optimized.

Good luck!
I'd have to see the program you are running, but the only thing I can think of that would cause that is that you are relying on UB somewhere.
@koothkeeper: Thanks! I'll keep looking to figure out what's wrong, but I hope someone in the forum who has experienced this themselves might know the solution.

@firedraco: I don't know what UB is. When I was initially debugging my code in Eclipse 3.8 and stepping through to observe my program's values, certain sections showed a value as <optimized out> or 0 when my optimization level was set to -O3. Here are two of the sections where I noticed this, which led me to lower the optimization level in order to see the generated values:

code #1:
vector<uint32_t>::iterator it = find(table.at(_index).begin(), table.at(_index).end(), 0);	// store PC at invalid location
if (it != table.at(_index).end())
{
	auto pos = distance(table.at(_index).begin(), it);   // <--- "pos" is <optimized out>
	table.at(_index).at(pos) = _tag;
}


code #2: returns from function with a value of 0
calcIndex(string& address)
{
	uint32_t dec;

	// Convert hex address to decimal integer
	istringstream convert(address);
	convert >> hex >> dec;

	// Calculate tag value
	_tag = dec >> (_iBits + 2);
	_index = ((_imask & dec) >> 2) & ((uint32_t)pow(2, _iBits) - 1);
}
UB is short for undefined behavior, things like dereferencing invalid pointers, writing to memory that you don't own, etc.

code #1:
pos is being removed because you can simply do:
table.at(_index).at(distance(table.at(_index).begin(), it)) = _tag;

code #2:
I don't see a return type on that function (you need one), and I don't see you returning anything from that function anyway.
code #1: Why use index at all when you already have the iterator?


I thought that GCC has options to add debugging symbols into code semi-independent of optimizations.


code #2, like firedraco said, not enough context and possible UB.
code #1:
The iterator is used to search a 2D table. At that index location, I have n elements that I need to search through to find a value. What I posted is a very small sample of a bigger program. I use "pos" to tell me at which element location 0 is found, so I can replace the value at that location. Also, it's a vector<vector<uint32_t> > table;

code #2:
It's not designed to return a value. It's a function that I use to initialize member variables that are accessible throughout my class; _index and _tag are the member variables being set. When function 2 calls function 1 and function 1 returns, _index is 0 with the -O3 optimizations. Without optimization (-O0), _index is set to the correct value.
code #1:
I don't see where you are using _pos at all in that loop aside from that one location immediately afterward. You realize _pos is local to the loop scope, right? You aren't trying to modify some global variable called _pos?

code #2:
At first glance, I don't see anything wrong with that function (aside from the missing void). Maybe try breaking out the assignment into multiple steps and see what is happening in non-optimized/optimized mode?
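
One way to act on that suggestion is sketched below. It is not the original class: IndexCalc, its constructor, and the member names without underscores are made up for illustration, and only the arithmetic mirrors the posted function. Each intermediate gets its own named variable so it can be printed or watched at both -O0 and -O3, and the stream extraction is checked so a failed parse cannot go unnoticed.

```cpp
#include <cmath>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the poster's class; names and layout are assumptions.
struct IndexCalc {
    uint32_t iBits;   // number of index bits
    uint32_t imask;   // mask applied before shifting
    uint32_t tag;
    uint32_t index;

    IndexCalc(uint32_t bits, uint32_t mask)
        : iBits(bits), imask(mask), tag(0), index(0) {}

    void calcIndex(const std::string& address) {
        // Step 1: parse the hex string and fail loudly if it is malformed.
        uint32_t dec = 0;
        std::istringstream convert(address);
        if (!(convert >> std::hex >> dec)) {
            throw std::invalid_argument("not a hex address: " + address);
        }

        // Step 2: the tag is everything above the index bits and the 2 low bits.
        tag = dec >> (iBits + 2);

        // Step 3: the same index expression as the post, one operation per line.
        const uint32_t masked  = imask & dec;
        const uint32_t shifted = masked >> 2;
        const uint32_t lowBits =
            static_cast<uint32_t>(std::pow(2.0, static_cast<double>(iBits))) - 1;
        index = shifted & lowBits;
    }
};
```

If masked, shifted, or lowBits differ between the -O0 and -O3 builds for the same input, the first step that differs is the place to look, including whether iBits and imask were initialized before the call.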
code #1:
These are just snippets from my overall code; it's not a post of the entire program. pos is supposed to be local to that block. Its only purpose is to find an element location within the row at that index in the table.

code #2:
Just like with code #1, it's a snippet, not the whole code. I just posted the function. It's a void function.

But this thread is trending away from the purpose of my question. I'm not asking for a code review, just how to overcome the optimization issues that are affecting my program. My snippets are just examples of what I observed while debugging my code. I was researching function attributes that would alter the optimization setting for a single function, but most of the posts I found only applied to GCC 4.2 and I'm running 4.8.2. So if any further posts could remain on topic and offer solutions for controlling optimizations, that would be great. Once again, this isn't a code review of my program. What I presented is merely a glimpse into a much larger program. The snippets make up only 1% of my overall program, but they are the only portions affected by the optimization controls.
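
For the literal question, GCC does offer a per-function knob: the optimize function attribute, along with a matching #pragma GCC optimize. The sketch below assumes GCC (the guard turns the macro into nothing on other compilers), and NO_OPT and extractIndex are illustrative names, not taken from the program above. GCC's own documentation describes this attribute as a debugging aid that is not suitable for production code, so treat it as a way to narrow down which function misbehaves rather than as the fix.

```cpp
#include <cstdint>

// Assumption: GCC. Other compilers see an empty macro and optimize as usual.
#if defined(__GNUC__) && !defined(__clang__)
#  define NO_OPT __attribute__((optimize("O0")))
#else
#  define NO_OPT
#endif

// Hypothetical function. With GCC it is compiled as if with -O0 even when
// the rest of the translation unit is built with -O3.
// Precondition: bits < 32, otherwise the shift itself is undefined behavior.
NO_OPT uint32_t extractIndex(uint32_t address, uint32_t bits) {
    const uint32_t mask = (uint32_t(1) << bits) - 1u;
    return (address >> 2) & mask;
}
```

If the wrong results disappear once a single function is marked this way, that function and the code that feeds it are the first places to check for undefined behavior.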
But the "pos", which I use to tell me which element location 0 is found, so I can use it to replace that location.
auto it = std::find( foo.begin(), foo.end(), 0 );	// find element with value 0
if ( foo.end() != it ) // if such element exists
{
  *it = _tag; // change value of that element
}

Why?
auto pos = std::distance( foo.begin(), it );

assert( foo.begin() + pos == it );
assert( *(foo.begin() + pos) == foo.at(pos) );
assert( *it == foo.at(pos) );


code #2:
At first glance, I don't see anything wrong with that function (aside from the missing void)

Agreed, although one would expect either
class Foo {
  // code

  void calcIndex(string& address) {
    // code
  }
};

// or
void Foo::calcIndex(string& address) {
  // code
}

It is, after all, a member (or friend).


Could the result of pow() be cached in the object rather than recomputed on every call?
An explicit loop can compute integral power without messing with floats.
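
A minimal sketch of that idea, assuming 32-bit unsigned arithmetic is what the program needs. ipow and pow2 are hypothetical helper names, not part of the posted code.

```cpp
#include <cstdint>

// Integer power by repeated multiplication; no floating point involved.
// Overflow wraps modulo 2^32, as unsigned arithmetic always does.
uint32_t ipow(uint32_t base, uint32_t exp) {
    uint32_t result = 1;
    for (uint32_t i = 0; i < exp; ++i) {
        result *= base;
    }
    return result;
}

// For powers of two specifically, a shift does the same job.
// Precondition: bits < 32.
uint32_t pow2(uint32_t bits) {
    return uint32_t(1) << bits;
}
```

Since _iBits presumably does not change after construction, the mask (pow2(_iBits) - 1) could be computed once and stored in a member instead of being recomputed on every call.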

This is scary:
using namespace std;
uint32_t dec;
istringstream convert( address );
convert >> hex; // std::hex
convert >> dec; // not std::dec 

There is a reason for having namespaces, and it is the opposite of obfuscation.
@gradstud -

Perhaps this flag (-flto) will solve your problem: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto-1059
But this thread is trending away from the purpose of me posting my question to the forum. I'm not asking for a code review, just how to overcome the optimization issues that are affecting my program. My snippet posts are just examples of what I observed while debugging my code.


I think you may have misunderstood the significance of some of firedraco's points above. To return to code #1:

if (it != table.at(_index).end())
{
	auto pos = distance(table.at(_index).begin(), it);   // <--- "pos" is <optimized out>
	table.at(_index).at(pos) = _tag;
}


Without any optimizations, on the auto pos line one would expect the compiler to allocate memory for the variable pos and then initialize it with the return value from the distance function, which could be on the stack or even in a register (we neither know nor care; that's the compiler's problem). On the next line, the pos variable must be read and passed as a parameter to the at function. When you turn optimizations on, the compiler does not even need to allocate any memory for pos and can treat the code as though you had written:

table.at(_index).at(distance(table.at(_index).begin(), it)) = _tag;

as firedraco showed above. The result is exactly the same. Note that there is nothing wrong with your original code. It is actually clearer than writing everything on a single line with nested function calls, and it doesn't hurt performance, since the compiler will optimize pos away. That optimization is not the cause of your problem.
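
To make the equivalence concrete, here is a small sketch with both forms side by side. Table, storeWithPos, and storeInline are names invented for this illustration; the bodies follow the snippet from the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

typedef std::vector<std::vector<uint32_t> > Table;

// Form from the post: the position is stored in a named local first.
void storeWithPos(Table& table, std::size_t index, uint32_t tag) {
    std::vector<uint32_t>& row = table.at(index);
    std::vector<uint32_t>::iterator it = std::find(row.begin(), row.end(), 0u);
    if (it != row.end()) {
        auto pos = std::distance(row.begin(), it);  // may show as <optimized out>
        row.at(pos) = tag;
    }
}

// What the optimizer is free to treat it as: no named local at all.
void storeInline(Table& table, std::size_t index, uint32_t tag) {
    std::vector<uint32_t>& row = table.at(index);
    std::vector<uint32_t>::iterator it = std::find(row.begin(), row.end(), 0u);
    if (it != row.end()) {
        row.at(std::distance(row.begin(), it)) = tag;
    }
}
```

Calling either function on equal tables leaves the tables equal afterwards. The debugger's <optimized out> only means the value never got a memory location of its own, not that the store was skipped.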

Now consider the following scenario:

Suppose that elsewhere in your program you have an array that you access through a pointer, but your code has a logic error: the pointer arithmetic has an off-by-one error, so eventually the pointer points one element past the end of the array and is then dereferenced (undefined behavior). Further suppose that some other variable in your program happens to be stored just past the end of the array, perhaps a variable like pos.
When you run the program in debug mode you get one result, but when you run it with optimizations turned on you get a different result, either because the variable that was stored just past the end of the array was optimized away or simply because memory is laid out differently.
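
The scenario can be sketched as follows. This is a hypothetical example, not code from the program in question: the loop bound is wrong on purpose, and the container is accessed with at() so that the overrun throws instead of silently reading whatever happens to sit past the end.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical function with a deliberate off-by-one: "<=" visits index
// values.size(), one past the end. With values[i] that read would be
// undefined behavior and could differ between -O0 and -O3. With
// values.at(i) it throws std::out_of_range at every optimization level.
bool sumDetectsOverrun(const std::vector<int>& values, long& sum) {
    sum = 0;
    try {
        for (std::size_t i = 0; i <= values.size(); ++i) {  // bug: should be "<"
            sum += values.at(i);
        }
    } catch (const std::out_of_range&) {
        return true;   // the overrun was reported instead of going unnoticed
    }
    return false;
}
```

Temporarily swapping operator[] for at() in suspect code is a cheap way to turn this class of bug into a visible failure.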

In getting different results in debug and with optimizations turned on you appear to be interpreting that outcome to mean:

"There's something wrong with the way the compiler does optimizations, so I need to know how to fix the problem with a switch or compiler setting."

But that is extremely unlikely to be the case. Instead, you should interpret what is happening as a warning that there is something wrong with your code. The example of reading past the end of an array was just to show how undefined behavior could lead to your problem. It could easily be that you are relying on undefined behavior (as firedraco pointed out above), i.e., you are using some construct that you think works a particular way, whereas the standard says the behavior of that construct is undefined.

But this thread is trending away from the purpose of me posting my question to the forum. I'm not asking for a code review, just how to overcome the optimization issues that are affecting my program.

Unfortunately, a code review is probably what you are going to need to do.
Good luck.