Returning a local variable from a function

Hi all,

What's the obvious rule about returning a local value, be it a normal variable or pointer, from a function?

Also, please take a look at this code:

#include <cstdlib>   // for system()
#include <iostream>
using namespace std;

int* f() {
	int i = 6;
	int* p = &i;
	return p;
}

int main()
{
	cout << *f() << endl;
	
	system("pause");
	return 0;
}


Here, when f returns, the local variable i goes out of scope and its storage (the space allocated for it in memory) is released, so the pointer that reaches main is dangling. But *f() still prints the correct value, even though that storage has already been released!

That's so strange to me!
> But still *f() will print the correct value of the resource which has already been freed!
That's just dumb luck.

You're basically getting lucky because the size of the object is small, and you're using the value immediately.

But consider, say, a larger object:
#include <cstdlib>   // for system()
#include <iostream>
using namespace std;

char * f() {
    char msg[] = "this is a long string of some letters and other stuff";
    return msg;
}

int main()
{
	cout << f() << endl;
	
	system("pause");
	return 0;
}


or increasing the separation between the object and its final use:
#include <cstdlib>   // for system()
#include <iostream>
using namespace std;

int* f() {
	int i = 6;
	int* p = &i;
	return p;
}

int main()
{
	int *p = f();
	system("pause");  // or anything to separate instance from use
	cout << *p << endl;
	system("pause");
	return 0;
}


Sooner or later, it will break.

Also, this.

-fno-defer-pop
Always pop the arguments to each function call as soon as that function returns. For machines that must pop arguments after a function call, the compiler normally lets arguments accumulate on the stack for several function calls and pops them all at once.

Disabled at levels -O, -O2, -O3, -Os.

With this behavior, GCC can leave the stack memory of the local variable 'alive' long enough for you to believe that you were being a good coder.
You rent a hotel room. You put a book in the top drawer of the bedside table and go to sleep. You check out the next morning, but "forget" to give back your key. You steal the key!

A week later, you return to the hotel, do not check in, sneak into your old room with your stolen key, and look in the drawer. Your book is still there. Astonishing!

How can that be? Aren't the contents of a hotel room drawer inaccessible if you haven't rented the room?

^Good description; I thought it was funny when I read it here:
https://stackoverflow.com/questions/6441218/can-a-local-variables-memory-be-accessed-outside-its-scope/6445794


You're looking at the value you assigned to that memory location. Even though the memory has been released, your program accessed it anyway, and the value previously assigned to that location was still there. If the memory had been reused for something else by the time you accessed it, you would have read whatever was written there instead; the OS would only shut down your program if the access hit memory the process isn't allowed to touch at all.

I'm not too sure about the nature of it though. Is it possible for the program to access any memory location as long as it's free? If not, why is memory recently deallocated by the program special? It's also possible that the compiler simply doesn't deallocate "i" because it knows that it's going to cout it later.

Read:

http://www.cplusplus.com/forum/beginner/256034/#msg1121219
@frek,

I just had this conversation with the same example elsewhere.

This can illustrate what's wrong, as others have posted.

#include <cstdlib>   // for system()
#include <iostream>
using namespace std;

int* f() {
	int i = 6;
	int* p = &i;
	return p;
}

void f2()
{
	int x = 0;
}

int main()
{
	int *p = f();
	system("pause");  // or anything to separate instance from use
	cout << *p << endl;

	f2();
	cout << *p << endl;

	system("pause");
	return 0;
}


If you first test this code in a debugger, and assuming that debugger doesn't complain to you about stack corruption or at least lets you continue, here's what you'll witness:

As before, "*p" will still show 6. Then, the call to f2 will create "x", initializing it to zero.

At which point the next cout of the same "*p" will show 0, because "x" occupies the same place on the stack as "i" once did.

If you then observe this same code in an optimized build, it will behave differently, likely because the two functions f() and f2() will be inlined, so no stack manipulations will happen.

The "obvious rule" is to return locals by value, so that they are copied. If you must return a pointer, it should point to a location inside memory already owned elsewhere; or, if it does represent newly allocated memory, note that this places responsibility for ownership on the calling code.
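
A minimal sketch of those three cases might look like the following. The function names are made up for illustration, and std::make_unique assumes C++14 or later; this is one way to express the rule, not the only one.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Case 1: return by value. The caller receives its own copy.
int make_value() {
	int i = 6;
	return i;
}

// Case 2: return a pointer into storage the caller already owns.
// Returns nullptr if no element is even.
int* find_first_even(std::vector<int>& values) {
	for (int& v : values) {
		if (v % 2 == 0) return &v;
	}
	return nullptr;
}

// Case 3: return newly allocated memory. Ownership moves to the caller,
// and std::unique_ptr releases it automatically when it goes out of scope.
std::unique_ptr<int> make_owned() {
	return std::make_unique<int>(6);
}
```

In the second case the pointer stays valid only as long as the vector is alive and is not resized, so the lifetime question has moved to the caller rather than disappeared.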

Just because this snippet demonstrates curious behavior doesn't mean there's anything "there" to be considered.

It is just the nature of the functioning of the stack, which you could observe using a debugger to better understand what is really happening.

This is also the kind of behavior and observation upon which viruses are designed.
you can make the local static and return it safely, but then the function may not (read:probably not) be thread safe.

for consistency, it is better to pass the variable in by reference and set it than to return something, if you want this behavior; but the static approach works nicely if you really, really want a y=f(x) style in your code and are single-threaded.
> What's the obvious rule about returning a local value

If you need to return a value, then return a value:
#include <iostream>

int f() {
	int i = 6;
	return i;
}

int main() {
	std::cout << f() << '\n';
}



If someone commits a crime, but hasn't (yet) been caught, then does that mean that their action was legal? No.
jonnin wrote:
you can make the local static and return it safely, but then the function may not (read:probably not) be thread safe.
Initialization of a local static is guaranteed to be thread-safe (as of C++11; until then nothing was guaranteed)
Just to be sure I read that right: if you run 2 copies of the same function in 2 threads and they both return a static reference variable's result, it is assured to be correct? If that is what you just said ... that may be the coolest thing I have learned about new c++ to date!
Just making it thread safe doesn't make it re-entrant.
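#include <iostream>
using namespace std;

int* f(int x) {
	static int i = x;  // initialized only once, on the first call
	return &i;         // every call returns the address of the same object
}
int main( ) {
	int *p1 = f(123);
	int *p2 = f(456);  // does not re-initialize i, so p2 == p1
	cout << *p1 << ", " << *p2 << endl;  // prints "123, 123"
}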

The hidden private buffer is always going to be a PITA.
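
One way around the hidden buffer, sketched here under the assumption that the caller can supply the storage, is to pass the destination in by reference. The names f_into and calls_are_independent are hypothetical and used only for this illustration.

```cpp
#include <cassert>

// Hypothetical variant of f: the caller owns the storage, so each call
// writes to a distinct object instead of sharing one hidden static.
int* f_into(int x, int& out) {
	out = x;
	return &out;
}

// Returns true if two calls produce independent results.
bool calls_are_independent() {
	int a = 0, b = 0;
	int* p1 = f_into(123, a);
	int* p2 = f_into(456, b);
	return *p1 == 123 && *p2 == 456 && p1 != p2;
}
```

Because nothing is shared between calls, the second call can no longer change what the first call's pointer refers to.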
jonnin wrote:
if you run 2 copies of the same function in 2 threads and they both return a static reference variable's result, it is assured to be correct?

initialization of static local variables is thread-safe (even if they throw exceptions or whatever)
#include <cstdio>
#include <thread>
#include <vector>
int& stuff() {
  // the initializer runs exactly once, even with many threads racing here
  static int var = std::printf("will only print once");
  return var;
}
int main() {
  std::vector<std::thread> v;
  for(int n = 0; n<100; ++n) v.emplace_back(stuff);
  for(auto& t: v) t.join();
}

to be fair, it's just the initialization; this one would still be unsafe
  static int var = printf("will only print once"); // safe
  ++var; // not safe
  return var;
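
If the later modification also has to be safe, one option is to make the static a std::atomic. This is only a sketch: next_id and run_threads are hypothetical names, and a mutex would be the usual choice for anything more complex than a counter.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical counter: initialization of the static is thread-safe (C++11),
// and std::atomic makes the increment itself safe as well.
int next_id() {
	static std::atomic<int> counter{0};
	return ++counter;  // atomic read-modify-write; returns the new value
}

// Calls next_id() from several threads and returns how many increments
// those threads performed in total.
int run_threads(int thread_count, int calls_per_thread) {
	const int before = next_id();
	std::vector<std::thread> threads;
	for (int t = 0; t < thread_count; ++t) {
		threads.emplace_back([calls_per_thread] {
			for (int n = 0; n < calls_per_thread; ++n) next_id();
		});
	}
	for (auto& th : threads) th.join();
	return next_id() - before - 1;
}
```

With a plain int in place of the atomic, concurrent increments would be a data race and the count could come up short.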

Thanks. I need to play with this.
@zapshe
Thank you for the link; I understood it to a good extent. The author of the accepted answer there used the question both to answer it and to take a shot at C++ in favour of C#. I wish the numerous C++ experts there had given him an appropriate response. Also, the answer really didn't deserve that many upvotes; he himself was shocked. That's a habit of Stack Overflow. I'm sure that if I had given the same answer in his place, I wouldn't have got more than 10 upvotes. But since he is an expert, he gets rewarded this way regardless of the answer!
@niccolo,
thank you.

I understood your explanations. But a side question: do you ever use inline functions yourself? If so, where and when, mostly? They seem to have numerous drawbacks!

Also, will you explain your last line a bit more, please, in simple language?

@frek,

I frequently use inline functions. Whenever an algorithm starts to become too long, it is often useful to wrap simple operations into an inline function, which tends to improve the ability to describe what is happening in code rather than in commentary. For example,

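inline bool FindLf( char *& p, char *e )
{
  // Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can never
  // be mistaken for '\n' or 0; this works for UTF-8 as well as ASCII.
  while( p < e )
  {
    if ( *p == '\n' || *p == 0 )
      return true;   // p is left pointing at the line feed or null
    ++p;
  }

  return false;      // not found: p has reached e
}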


This trivial (and simplified) example originally appeared within a longer function. The purpose was merely to find a line feed or null character in a string. This is code from maybe 20 years ago, pre-C++11 era. If left within the function that initially used it, the function was longer and commentary described why it was there.

Moving this into an inline function didn't impact performance, but left the space in code it previously occupied with merely a call to FindLf, which is somewhat self-explanatory. Not only is it inline, it is also a non-member function, making it applicable to other classes.

If the line feed isn't found, the function returns false and the pointer "p" ends up moved appropriately. If the line feed is found, the function returns true and again the pointer "p" is placed appropriately.
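
To make the "placed appropriately" part concrete, here is a small usage sketch. FindLf is repeated so the snippet stands alone, and LineLength is a hypothetical helper invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same logic as FindLf above, repeated so the sketch is self-contained.
inline bool FindLf(char*& p, char* e) {
	while (p < e) {
		if (*p == '\n' || *p == 0) return true;
		++p;
	}
	return false;
}

// Hypothetical helper: number of bytes before the first line feed or null
// in [buf, buf + len), or len if neither is present.
std::size_t LineLength(char* buf, std::size_t len) {
	char* p = buf;
	FindLf(p, buf + len);  // p stops at the terminator, or at the end
	return static_cast<std::size_t>(p - buf);
}
```

When the search succeeds, p points at the line feed or null itself; when it fails, p equals e, so in both cases the distance from the start is the length of the line.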

The inline keyword first and foremost declares that the code is probably in a header file, is likely short, and the linker should expect to find duplicates that should automatically resolve to a single representation (avoiding a linker warning/error about duplicate functions). The inline keyword is barely a hint to the compiler about emitting object code as an inline optimization. The compiler will generate inline object code as an optimization for any function already processed which qualifies, even when not declared inline, as long as size is not the optimization priority.

As to the "virus" writing implication, any code which acts upon undefined behavior, which uses memory otherwise not under explicit control, is an attack point (a vulnerability) which opens up an exploitation opportunity for nefarious actors.



I used to use inline / forceinline keywords a lot, but that was a long time ago, when compilers were stupid about optimizations and you could get a lot of performance with these ideas. I have not been able to beat the compiler's speed with these tricks in quite some time... maybe as far back as 2010ish, or the VS 2008 era...

My advice on it is to let the compiler decide until you have a real time need to try to hand-tune / force the code to try to make it faster on a specific target ecosystem (compiler/OS/hardware target are the usual group).

If you need it, compile it both ways and profile the difference.