maps and ranged for, implicit copy??

I ran into this interesting behaviour when trying to feed a thread pool with a map of "jobs". In some cases a map element is implicitly copied and the const reference binds to the copy. There's no warning that this is happening, and reference-to-temporary issues pop up.

Here's a sample of some code that illustrates the condition:
<code>
#include <string>
#include <iostream>
#include <map>

struct constructed
{
    constructed()
    { std::cerr << "constructor" << std::endl; }

    constructed(constructed const &)
    { std::cerr << "copy constructor" << std::endl; }
};

int main()
{
    std::map<std::string, constructed> constructs;
    constructs["first"] = constructed();

    // does not invoke copy constructor: value_type is
    // std::pair<std::string const, constructed>
    std::cout << std::endl << "value_type" << std::endl;
    for (std::map<std::string, constructed>::value_type const & val
         : constructs ) { ; }

    // does not invoke copy constructor
    std::cout << std::endl << "strict type" << std::endl;
    for (std::pair<std::string const, constructed> const & val
         : constructs ) { ; }

    // does not invoke copy constructor
    std::cout << std::endl << "auto" << std::endl;
    for (auto const & val : constructs ) { ; }

    // invokes copy constructor: the key type is missing its const
    std::cout << std::endl << "non strict type" << std::endl;
    for (std::pair<std::string, constructed> const & val
         : constructs ) { ; }

    return 0;
}
</code>
In the last loop, there is a type mismatch between the map's value_type (std::pair<std::string const, constructed>) and the type of "val", so val cannot bind directly to the element of the map. However, as a general rule in C++, a reference to const may be initialized with an object of a different type if a temporary of the referenced type can be constructed from that object, and that's what happens here: val binds to a temporary copy of the element. It's the same as const int& n = 2.5;
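To make the binding rule concrete, here is a minimal sketch that compares addresses instead of logging constructor calls. The function names are made up for illustration and are not from the program above; each function assumes the map is non-empty.

```cpp
#include <map>
#include <string>
#include <utility>

// True when the reference refers to the object stored in the map itself.
bool binds_directly_exact(std::map<std::string, int> const& m)
{
    // Exact element type: the reference binds to the element.
    std::pair<std::string const, int> const& ref = *m.begin();
    return static_cast<void const*>(&ref)
        == static_cast<void const*>(&*m.begin());
}

bool binds_directly_mismatched(std::map<std::string, int> const& m)
{
    // Key type lacks const: pair's converting constructor builds a
    // temporary pair, and the reference binds to that temporary.
    std::pair<std::string, int> const& ref = *m.begin();
    return static_cast<void const*>(&ref)
        == static_cast<void const*>(&*m.begin());
}

// Same rule with scalars.
int value_seen_through_reference()
{
    double d = 2.5;
    int const& n = d;  // n binds to a temporary int holding 2, not to d
    d = 7.0;           // changing d does not affect n
    return n;
}
```

With a map holding at least one element, the first function should report true and the second false, since the temporary is a separate object with its own address.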

This is one of the motivations behind recommending the use of auto type deduction where suitable.
We figured out what's going on, hence the program. However, due to maintainability issues we're not keen on using auto except very sparingly where the maintainer can actually deduce the loop type from the previous several lines.
The "implicit" behaviour is not wanted, we would prefer explicit behaviour so a typo won't force a hidden copy of the data.
For now we put ranged for on the forbidden use list due to this undesired behavior.
That's a good way to lose C++ programmers.
Oh boy, lambdas do it too... and you can't use auto in their parameter lists in C++11. collection::value_type is the way to go there.
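A rough sketch of the lambda case, assuming a hypothetical copy-counting type `job` and a `job_map` typedef (neither is from the original program). It compares a lambda parameter spelled with value_type against one spelled with the non-const key.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the thread's `constructed` type; counts copies.
struct job
{
    static int copies;
    job() {}
    job(job const&) { ++copies; }
    job& operator=(job const&) { return *this; }
};
int job::copies = 0;

typedef std::map<std::string, job> job_map;

// Parameter spelled via value_type: binds to each element, no copy.
int copies_with_value_type(job_map const& jobs)
{
    job::copies = 0;
    std::for_each(jobs.begin(), jobs.end(),
                  [](job_map::value_type const&) {});
    return job::copies;
}

// Parameter with a non-const key: every call converts the element
// into a temporary pair, copying the job once per element.
int copies_with_mismatched_pair(job_map const& jobs)
{
    job::copies = 0;
    std::for_each(jobs.begin(), jobs.end(),
                  [](std::pair<std::string, job> const&) {});
    return job::copies;
}
```

For a map with two entries, the first function should return 0 and the second 2.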

This is all new stuff with C++11. If you don't mind performance hits from excessive copies, possible corruption problems from hidden temporary creation, etc., then it's all fine to use.
In the case discussed here, incorrect use of the for loop, justified by imaginary "maintainability issues", led to the unexpected temporary. Actual modern C++ leads to fewer copies and temporaries than C++03.
The "implicit" behaviour is not wanted, we would prefer explicit behaviour so a typo won't force a hidden copy of the data.

Ranged for loops + auto drastically reduce the opportunity for typos to affect the behavior of your for loop, both within the initialization/condition/update code and the body of the loop itself.
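If the worry is that auto hides the type from the maintainer, one possible compromise (a sketch, not something proposed earlier in this thread) is to keep auto and state the expected type in a static_assert. The `counts_map` typedef and `sum_values` function are hypothetical names for illustration.

```cpp
#include <map>
#include <string>
#include <type_traits>

typedef std::map<std::string, int> counts_map;

int sum_values(counts_map const& counts)
{
    int total = 0;
    for (auto const& entry : counts)
    {
        // Documents the deduced type and fails to compile if it ever
        // stops being a reference to the map's own element type.
        static_assert(
            std::is_same<decltype(entry),
                         counts_map::value_type const&>::value,
            "entry must alias the map element");
        total += entry.second;
    }
    return total;
}
```

The type is then visible in the source, and a mismatch becomes a compile error rather than a silent copy.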
It's still a hack that lowers code maintainability. The core problem that won't get fixed is the default implicit behavior of the language.

I realised that passing a std::pair<int const, int> to a function whose parameter is a std::pair<int, int> results in a copy being generated via implicit conversion (pair's converting constructor), which also affects any class/function STL adapters used in C++03.
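A small sketch of that function-argument case, using only C++03 features. The helper `aliases` and the two wrapper functions are hypothetical names for illustration.

```cpp
#include <utility>

// Hypothetical helper: reports whether the parameter is the same
// object as `original`.
bool aliases(std::pair<int, int> const& param, void const* original)
{
    return static_cast<void const*>(&param) == original;
}

bool exact_type_aliases()
{
    std::pair<int, int> p(1, 2);
    return aliases(p, &p);  // same type: param refers to p itself
}

bool const_key_aliases()
{
    std::pair<int const, int> p(1, 2);
    // Different type: p is converted to a temporary std::pair<int, int>
    // and param refers to that temporary.
    return aliases(p, &p);
}
```

The first wrapper should return true and the second false.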
It's still a hack that lowers code maintainability.

It's not clear what you're calling a hack.

Relevant: http://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/

gotw94 wrote:
As we saw in GotW #92 and #93 and will see again below, the main reasons to declare variables using auto are for correctness, performance, maintainability, and robustness—and, yes, convenience, but that’s in last place on the list.


(Emphasis mine.)
I have a personal distaste for the new use of the auto keyword, based solely on experience of working on code that is heaving with unnecessary use of auto. I say this to make it clear that I am not some kind of auto fanboy. I am, however, a great believer in using auto where it makes sense.

Ranged for loops are exactly the place where auto should be used. They make so much sense in that context. The majority of programmers are rubbish at working out the correct type to use to avoid forcing unnecessary copies. That is exactly what happened in the example code above: the unnecessary copy is there because the programmer got the type wrong.

Mandating not to use ranged for loops because some prefer to be explicitly bad at their job rather than let the compiler help them get it correct is totally back to front and it is the kind of fist-bitingly painful management decision based on doctrine rather than understanding that makes competent programmers want to hurt themselves. If people have problems with it, the solution is education; not banning.

I have great sympathy for the idea that one should be able to tell what type something is by looking at the point of origin. I dislike seeing Python code of the form
variable = functionOne()
for just this reason.

But the ranged for loop has the container of origin right there. Right there. Not even the "previous several lines", but the exact same line.

I don't agree with Herb Sutter that auto should be used everywhere, but it certainly does improve maintainability and robustness. "Explicitly stating a type" and "maintainability" are not the same thing at all.
