Will compiler ignore the return statement?

From discussions of i++ vs ++i, I realize that the compiler will indeed be able to optimize and get rid of the return statement if i++ is not being assigned to anything when it's called.

My question is similar.

template <typename T>
T ExpressionTree<T>::set(const std::string& variable, const T& value)
{
    mVariableMap[variable] = value;
    return value;
}

Will the compiler be able to optimize and get rid of the return value in this situation, or not?
For example, one call might be just expression.set("X", some_function(x)); while another part of the program calls the same function and uses its return value for something else.

How would I tell if the final program is indeed getting rid of the useless return statement?
So you are basically saying:
case 1:
expression.set("X", some_function(x)); // return value ignored

case 2:
T variable = expression.set("X", some_function(x)); // return value used

Simple program to test the two cases
#include <iostream>
using namespace std;

int sum(int a, int b)
{
    return a + b;
}
int main()
{
    int a, b, s;
    a = 3;
    b = 5;
    sum(a, b);      // case 1: return value ignored
    2;              // just like case 1: this 2 does not affect the program in any way
    s = sum(a, b);  // case 2: return value used
    cout << "The sum of a and b is " << s << endl;
    cin.ignore();
    return 0;
}


The sum of a and b is 8
Yes, although my function is a bit different, because in your code the compiler could just completely take out line 13 (the bare sum(a,b); call).
Assuming the compiler doesn't completely rip out that call, would it still be able to take out the return value instruction?

(I do realize this will most likely end up just being some micro-optimization but I am still curious of the compiler's abilities.)
> would it still be able to take out the return value instruction?

Do you want the compiler to remove only the instruction for the return?
Hi Ganado,

Just wondering why you don't look at the optimised assembly code? Although you need to look at your function - template functions are different to regular functions.

This is my understanding of how it works; hopefully this can be confirmed by an actual expert, not a dabbler like me! I am sure someone can quote the appropriate bit out of the standard.

In your function, are you worried that mVariableMap[variable] = value; won't be evaluated if the function call is optimised? For this reason, I don't think the function call could be optimised away.

As for optimising just the return, I don't think that would fly either: the compiler is going to create a function (once) for each type of parameter (or combination of types) in a template function, and it is that code which is executed. So I doubt the compiler can optimise on a per-call basis.

Hope all is well :+)
I'm back now. I see what you mean. I indeed was just concerned by the return. It doesn't really matter as I'm sure I wouldn't notice any difference even though the function is called a lot. I'll look at the assembly code if I have time.

Thanks for the replies, both of you.
Gidday,

Another thought: why return the value anyway? You already sent it as a const reference. Just make the function void.

Maybe your actual code is more complicated?
> Will the compiler be able to optimize and get rid of the return value in this situation, or not?

It is a small function. Typically, the call would be inlined.

http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=174
http://www.parashift.com/c++-faq-lite/inline-functions.html
http://www.drdobbs.com/inline-redux/184403879
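For anyone who wants to check this on their own compiler, here is a minimal sketch of the OP's situation. Only set() comes from the original post; the rest of the class (the std::map member, the variables() accessor) and the helper names call_ignoring_result, call_using_result and run_checks are assumptions made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumed minimal stand-in for the OP's class; only set() is from the thread.
template <typename T>
class ExpressionTree {
public:
    T set(const std::string& variable, const T& value);
    const std::map<std::string, T>& variables() const { return mVariableMap; }

private:
    std::map<std::string, T> mVariableMap;
};

template <typename T>
T ExpressionTree<T>::set(const std::string& variable, const T& value)
{
    mVariableMap[variable] = value;  // observable side effect: has to be kept
    return value;                    // copy that a caller may or may not use
}

// Case 1 (hypothetical helper): the return value is discarded.
void call_ignoring_result(ExpressionTree<double>& expression, double x)
{
    expression.set("X", x);
}

// Case 2 (hypothetical helper): the return value is used.
double call_using_result(ExpressionTree<double>& expression, double x)
{
    double z = expression.set("Z", x);
    return z * 2.0;
}

// Both call styles must leave the map in the same, expected state.
void run_checks()
{
    ExpressionTree<double> expression;
    call_ignoring_result(expression, 1.5);
    double doubled = call_using_result(expression, 2.0);
    assert(expression.variables().at("X") == 1.5);
    assert(expression.variables().at("Z") == 2.0);
    assert(doubled == 4.0);
}
```

Compiling this with something like g++ -O2 -S and comparing the assembly of the two helpers is one way to see what a particular compiler does with the unused copy. If set() gets inlined into call_ignoring_result, the copy of the return value can simply disappear there while the map update stays. The outcome depends on the compiler, its version and the flags, so treat it as something to inspect rather than a guarantee.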
Hi JLBorges,

Just curious, will the function still be inlined even though it is a template function, as the OP had it? Or were you referring to shadowCODE's example?

I am struggling to see how a template function could be inlined at all:

Doesn't the compiler create an overload for each combination of argument types, so this means it couldn't be inlined? I mean inlining couldn't happen if the function call was in any kind of loop, or inside a function that is called repeatedly - which is almost certain given the purpose of this function.


Maybe this is an exception to the caveat of "Typically" :+) , more likely there is something I am missing :+D

With regard to all the other types of inlining in the articles - hopefully they do the right thing !

As noted above, I think the OP's function should be void

Regards
> Just curious, will the function still be inlined even though it is a template function, as the OP had it?

http://en.cppreference.com/w/cpp/language/as_if

#include <algorithm>
#include <memory>
#include <functional>

// templates:
//    contains, std::less, std::count, std::begin, std::end, std::minus, std::plus,
//    std::addressof, std::cref,  std::reference_wrapper::get
// 
// nested eight levels deep
//    1. contains 2. less 3. count 4. minus 5. plus 6. addressof 7. cref 8. get
template < typename T, std::size_t N > bool contains(  const T(&a)[N], const T& v )
{
    return std::less<T>() ( std::count( std::begin(a), std::end(a),
                                        std::minus<T>()( std::plus<T>() (
                                                                           *std::addressof( std::cref(v).get() ),
                                                                            15
                                                                        ),
                                                                5
                                                       )
                                      ),
                            23
                          ) ;
}

int foo( int v )
{
    int a[] = { 0, 1, 2, 3, 4, 5, 6, 7 } ;
    return contains( a, v ? 3 : 9 ) ? 1 : 99 ;

    /*
        movl $1, %eax
        ret
    */

    // return 1
}

http://coliru.stacked-crooked.com/a/d14172cd335796bc
@JLBorges

Thanks for your in depth reply :+) , as usual an absolute champion !!

I guess your point was to show that template functions can be inlined, but I wonder whether there might be lots of cases where they can't be.

Have been ruminating a bit on this complex example (you have succeeded in keeping me quiet for quite a while!), as I see it as an analytical challenge:

foo operates on a local variable a[] which has hard coded values, and the values in the template function specification are hard coded.

But also to consider how many times foo is called; and what if v = 0 ? If foo is only called once, and / or v is never 0, then the result might always be the same. Or the result might be independent of the value of v. I haven't done any in depth analysis of the detail of contains, but am instead suggesting that if the input is the same, the result will be the same.

It might simply be that the compiler has determined that line 28 (the return contains( ... ) statement in foo) will always evaluate to 1 in this particular case.

I could be completely wrong, the compiler & JLBorges are vastly smarter than me :+D


Maybe I should start my own topic, but I will break out my other computer & do a little experimenting of my own first; nothing like trying stuff oneself to improve one's education.

Would like to test this:

If foo is a templated function, what happens when it is given different types in a loop?

for (int a = 0; a < 10; ++a) {
      foo( getNextValueWithADifferentTypeFromSomewhere() );
}

> Another thought: why return the value anyway? You already sent it as a const reference. Just make the function void
>
> Maybe your actual code is more complicated?

No, it's not really more complicated. It was more so just an idea, but all it really does is save one line of code for no more efficiency, so it's pointless. :p
I am calling the function like this in some places:
expression.set("C", C);
But in another place like this:
Complex Z = expression.set("Z", expression.result());
From the user's point of view, the latter doesn't really make sense, so I agree that I should just make it a void function, and/or perhaps just make the map be public, or if I want to be fancy, overload the [] operator so the user doesn't have to do expression.map[whatever].
I'll probably just do something like this. =)
Z = expression.result();
expression.set("Z", Z);
// or
expression["Z"] = Z;
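In case it helps, a rough sketch of what the void set() plus an overloaded [] could look like. The class here is a hypothetical cut-down ExpressionTree holding only the variable map; get() and run_checks() are names invented for the example, and double stands in for the Complex type.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical reduced ExpressionTree: just the variable map and its accessors.
template <typename T>
class ExpressionTree {
public:
    // Returns a reference to the stored variable, creating it if needed,
    // which mirrors the behaviour of std::map::operator[].
    T& operator[](const std::string& variable) { return mVariableMap[variable]; }

    // void version of set(), as suggested earlier in the thread.
    void set(const std::string& variable, const T& value) { mVariableMap[variable] = value; }

    // Read-only lookup; throws std::out_of_range if the variable is missing.
    const T& get(const std::string& variable) const { return mVariableMap.at(variable); }

private:
    std::map<std::string, T> mVariableMap;
};

// Both spellings store the value under the given name.
void run_checks()
{
    ExpressionTree<double> expression;
    expression.set("C", 0.25);
    expression["Z"] = 3.0;
    assert(expression.get("C") == 0.25);
    assert(expression.get("Z") == 3.0);
}
```

One design point to be aware of: like std::map, this operator[] inserts a default-constructed value when the name does not exist yet, so merely reading expression["typo"] would silently create a variable. That is why the sketch also has a separate const get().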


The as-if rule is interesting, I remember reading about that when discovering copy elision, but then had forgotten about it.

Feel free to keep talking about the complicated template stuff here ^^
> If foo is a templated function, what happens when it is given different types in a loop?

Nothing startling.

#include <algorithm>
#include <memory>
#include <functional>

// templates:
//    contains, std::less, std::count, std::begin, std::end, std::minus, std::plus,
//    std::addressof, std::cref,  std::reference_wrapper::get
//
// nested eight levels deep
//    1. contains 2. less 3. count 4. minus 5. plus 6. addressof 7. cref 8. get
template < typename T, std::size_t N > bool contains(  const T(&a)[N], const T& v )
{
    return std::less<T>() ( std::count( std::begin(a), std::end(a),
                                        std::minus<T>()( std::plus<T>() (
                                                                           *std::addressof( std::cref(v).get() ),
                                                                            15
                                                                        ),
                                                                5
                                                       )
                                      ),
                            23
                          ) ;
}

template < typename T > int foo( const T* p, std::size_t sz )
{
    int a[] = { 0, 1, 2, 3, 4, 5, 6, 7 } ;
    long long  b[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 } ;

    int n = 0 ;
    for( std::size_t i = 0 ; i < sz ; ++i )
    {
        if( i%2 == 0 )
        {
            if( contains( a, p[i] ? 3 : 9 ) ) n += 100 ;
            else n -= 8 ;
        }

        else
        {
            if( contains( b, p[i] ? 4LL : 7LL ) ) n += 100 ;
            else n -= 4 ;
        }
    }

    return n ;
}

int bar()
{
    char cstr8[35] = "vjvjvjbvkkbkvkvvjvjvjvjvjvjvjvj" ;
    char16_t cstr16[56] = u"vujufgufufufy" ;
    char32_t cstr32[7] = U"yfyfy" ;
    wchar_t wstr[22] = L"fcyyfhyffhyhfh" ;
    int a[45] = { 1, 45, 78, 999 } ;
    double d[67] = {0} ;
    short s[61] = { 10, 11, 12 } ;
    
    int n = 0 ;
    for( int i = 0 ; i < 10 ; ++i ) 
    {
        n += foo( cstr8, 12 ) + foo( cstr16, 32 ) + foo( cstr32, 8 ) ;
        n -= foo( wstr, 12 ) + foo( a, 13 ) + foo( d, 7 ) ;
        
        int temp = foo( s, 10+i ) ;
        if( temp > 5 ) n += 500 ;
        else n *= 7 ;
    }
    
    while( n < 10000 ) n += 79 ;

    while( n > 100000 ) n -= 979 ;

    return n ;
    
    /*
        movl $25000, %eax
    	retq    
    */
    
    // return 25000 ; // 10 * ( (12+32+8-12-13-7) * 100 + 500 )
}

http://coliru.stacked-crooked.com/a/574227139a4c2caa
Hi JLBorges,

Crikey, I hope you don't think that I am trying to argue with you (good grief, that would be crazy :+D ).

TheIdeasMan wrote:
> I could be completely wrong, the compiler & JLBorges are vastly smarter than me :+D


Instead, I try to ask questions in an attempt to further my education (& maybe others too). Also, it is a case of my critical thinking kicking in: what I am hearing doesn't match with my understanding, so I am trying to learn why. Or at least fill in gaps in my knowledge.

So in my last reply, I was trying to reason through to try and make sense of what was happening in a complex example.

What I am trying to get at:

if we have a template function like this:

template <typename T>
T foo(T a) {
   const T b = 5;  
   return a + b;
}


Now if we have some container which has pointers to int, float and double.

We iterate through the container, and call foo for each dereferenced item.

When the compiler encounters an int as a parameter, does it implicitly create a function where all the types are int? Similarly, when it encounters floats or doubles, does it implicitly create functions for those too, so now we have 3 overloaded functions that differ in their type?

Now if our container has 10,000 items in it, the compiler can call the appropriate function?

So I was thinking this function could not be inlined, because there are really 3 of them. That is, we couldn't have this:

for (;;) { //whatever looping / iterating construct
  foo(a); // int
  foo(a); // float
  foo(a); // double
}


But their definitions could be elsewhere, which is what I am thinking the compiler does implicitly.

I can understand how a simple function could be inlined in a loop where the type is always the same. I guess that is what normally happens: there are multiple containers which have items of a particular type (different to the other containers), and there is one template function for all of them. But as I understand it the compiler still implicitly makes a function for each type even if it is inlined. I imagine this is a problem if there were 20 containers of int, and all the functions were inline - there is now 20 times more code. Now I imagine the compiler can work out a tipping point where the cost of extra code exceeds the cost of a function call.

Is that how it works, or do I have that all screwed up?
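A small sketch of the instantiation part of this question, using the foo from this post. The static_asserts and the helper names sum_of_three_calls and run_checks are additions for illustration only. The point it tries to show: the type at every call site is fixed at compile time (a standard container cannot hold a mix of int, float and double directly), so the compiler picks foo<int>, foo<float> or foo<double> per call and is then free to inline each one separately.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
T foo(T a) {
    const T b = 5;
    return a + b;
}

// Each call site names one specific instantiation, chosen at compile time.
static_assert(std::is_same<decltype(foo(1)), int>::value, "selects foo<int>");
static_assert(std::is_same<decltype(foo(1.0f)), float>::value, "selects foo<float>");
static_assert(std::is_same<decltype(foo(1.0)), double>::value, "selects foo<double>");

// Hypothetical helper: three different instantiations used side by side.
// Nothing stops the compiler from inlining all three into this one function.
double sum_of_three_calls(int i, float f, double d) {
    return foo(i) + foo(f) + foo(d);
}

void run_checks() {
    assert(foo(1) == 6);
    assert(foo(2.0f) == 7.0f);
    assert(foo(3.0) == 8.0);
    assert(sum_of_three_calls(1, 2.0f, 3.0) == 21.0);
}
```

So having three instantiations is not in itself an obstacle to inlining. Whether a given call actually gets inlined is still the optimiser's decision, and it can weigh code size against call overhead, which is the "tipping point" idea above.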

Ok, now some questions about your last example:

1. Are you saying that the entire bar function is optimised down to:

 movl $25000, %eax
    	retq 

Or, is that just the asm for the return statement?

I am hoping it is not the latter.

2. Given my understanding above, are there multiple implementations of foo in the asm? One for each type of the first argument that is passed? Maybe it is still more efficient to inline them rather than have 7 function calls?

I could try to check that myself, but the last time I did asm was 20 years ago, and it was 16 bit DOS API - shall we say that things are a bit different now :+o
I do know how to compile to asm, and I could look for call instructions with foo in them, but frankly it is easier to ask - I am sure you would know straight away. And your examples would have lots of code in them.

Anyway, it is late at my end - I need to be up for work in about 6 hours, so in about 20 hours time I can fool around with some simpler code, and see what I can learn.

Thanks for your help for today (and in advance)
> 1. Are you saying that the entire bar function is optimised down to:
> movl $25000, %eax
> retq


Yes.

In terms of C++, all the different instantiations and calls to contains, std::less, std::count, std::begin, std::end, std::minus, std::plus, std::addressof, std::cref, std::reference_wrapper::get, foo as well as the entire body of the function bar have been optimised to a single statement: return 25000;

Those 82 lines are compiled "as if" we had written only one line: int bar() { return 25000 ; }
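A much smaller sketch of the same idea, not the code from the post above. The function names scaled_total and scaled_total_plain are invented for this example, and the arithmetic is taken from the comment at the end of bar(). It assumes C++14 or later, since the constexpr function contains a loop.

```cpp
#include <cassert>

// Assumed stand-in for bar(): every input is visible in the source.
constexpr int scaled_total() {
    int n = 0;
    for (int i = 0; i < 10; ++i) {
        n += (12 + 32 + 8 - 12 - 13 - 7) * 100 + 500;  // 2500 per iteration
    }
    return n;
}

// The language itself can evaluate this at compile time.
static_assert(scaled_total() == 25000, "computable without running the program");

// Same body without constexpr: here folding is only permitted by the
// as-if rule, not required, so it depends on the optimiser.
int scaled_total_plain() {
    int n = 0;
    for (int i = 0; i < 10; ++i) {
        n += (12 + 32 + 8 - 12 - 13 - 7) * 100 + 500;
    }
    return n;
}

void run_checks() {
    assert(scaled_total() == 25000);
    assert(scaled_total_plain() == 25000);
}
```

The two mechanisms are related but not the same: constexpr plus static_assert is a guarantee from the language, while what happened to bar() is an optimisation the compiler is merely allowed to make because the observable behaviour does not change.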
> ..... has been optimised to a single statement: return 25000;


All righty - that's GOLD

BrowniePointsForYou += 1000; :+D

Right, so the compiler is doing more than I thought. Why should I be surprised? I knew there was stuff I didn't think that I knew; and I thought there was stuff that I didn't know. :+D

So to be sure, to be sure - I imagine if all those individual values only became available at runtime (the normal situation - info coming from files, GUI Dialogs etc), then the compiler couldn't optimise much of it, because it depends on what the values are as to what happens in functions. I guess the compiler can sometimes do a lot, but other times not as much.
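To make that contrast concrete, a hedged sketch with two made-up functions (count_nonzero and count_nonzero_fixed are not from the thread). The first depends on data that only exists at run time; the second has all of its input in the source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical: the result depends on values the optimiser cannot see,
// so the loop generally has to stay in the generated code.
int count_nonzero(const std::vector<int>& values) {
    int n = 0;
    for (int v : values) {
        if (v != 0) ++n;
    }
    return n;
}

// Hypothetical: the same loop over hard-coded data. An optimiser is
// allowed to reduce this to a constant result under the as-if rule.
int count_nonzero_fixed() {
    const int fixed[] = {1, 0, 2, 0, 3};
    int n = 0;
    for (int v : fixed) {
        if (v != 0) ++n;
    }
    return n;
}

void run_checks() {
    assert(count_nonzero({1, 0, 2}) == 2);
    assert(count_nonzero({}) == 0);
    assert(count_nonzero_fixed() == 3);
}
```

Even in the run-time case the compiler can still inline the small template calls and simplify around them; what it cannot do is precompute an answer that depends on input it has not seen.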

Your points about optimising and inlining are well taken though, much has been learnt.

Anyway regards, I will sleep well tonight.