What happens when you re-order/delete the vector you're iterating through?

Pages: 12
Hey guys,

Lets say I have a vector myVector = {1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2}; and I decide to, "For each 1 in the vector, add an additional 2 in the vector":
1
2
3
4
5
for(auto i = myVector.begin(); i != myVector.end(); i++)
{
   if(i == 1)
      myVector.push_back(2);
}


Straightforward? Good. Now lets assume I re-order and delete pieces of the vector *as* I'm iterating through it:
1
2
3
4
5
6
7
8
9
10
for(auto i = myVector.begin(); i != myVector.end(); i++)
{
   if(*i == 1)
      myVector.push_back(2);
      if(myVector.size() > 10)
      {
         std::random_shuffle(myVector.begin(), myVector.end());
         myVector.erase(myVector.begin(), myVector.end() + myVector.size() / 2);
      }
}


Questions:

1) What if I delete the element that the "iterator" is currently pointing to in the for loop?
2) If I reorder the vector as I'm iterating through it, how does it keep track of where the iterator should go next?

Last edited on
1) The iterator you have becomes invalid. Don't use it. Don't increment it. It's of no use to you anymore.
2) What does reorder mean? Changing the value of elements is fine.

Your first example (push_back) is also bad. Don't do that either.

Basically, if the vector changes (not the value of elements in it, but the vector), your existing iterators are no longer to be used.
Last edited on
Both the shuffle (obviously) and the erase invalidate iterators. And you're not using erase properly. It returns the new iterator to one-past the one just erased (which may of course be end()).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> v{1,2,1,1,2,1,1,2,1,1,2};
    
    // erase all 1's
    for (auto i = v.begin(); i != v.end(); )
        if (*i == 1)
            i = v.erase(i);
        else
            ++i;

    for (const auto& x: v)
        cout << x << ' ';
    cout << '\n';
}

I guess my question then becomes: Is there a safe way to re-order/delete pieces of a vector that you're currently iterating through? Or is that generally just asking for trouble and bad practice?
Both the shuffle (obviously) and the erase invalidate iterators.

Shuffle is fine. It does not invalidate any iterators.
Last edited on
Your first example (push_back) is also bad. Don't do that either.

Basically, if the vector changes (not the value of elements in it, but the vector), your existing iterators are no longer to be used.


Anyone have official documentation or a breaking example of this assertion? I've never heard this before and it's quite shocking. I'd think iterating through a vector and changing its length on iteration is a very common use-case.
Is there a safe way to re-order


You can change the values of the elements. That's reordering, and it doesn't do anything to any existing iterators.

Resizing the vector trashes iterators. Resizing can happen in a push_back.

https://stackoverflow.com/questions/6438086/iterator-invalidation-rules
Peter87 wrote:
Shuffle is fine. It does not invalidate any iterators.

I guess I see what you mean. All of the "elements" are in the same place, it's just that their values have changed. So all the iterators are sill good.

But it's a strange thing to be shuffling the elements of a container that you are iterating through. I can't think of a reasonable example of doing that.
Last edited on
Straightforward?

Maybe not. Line 3 should be if (*i == 1)

Is there a safe way to re-order/delete pieces of a vector that you're currently iterating through

Maybe I just can't find it, but I wish the reference section of this site was clear about the constraints on iterators. As tpb mentioned, erase() returns an iterator, so you can use that. For re-ordering, I'm not sure what "safe" would even mean. Keep the iterator valid? It probably still is, meaning that it refers to the n'th item. But is that what you want? It probably depends on the application.

Why not avoid the problem completely and just use an index instead?
OK, so it sounds like the general consensus is that re-ordering is fine on iteration, but adding and erasing elements on iteration is bad (unless you use the return value of erase).
Thaaaaaanks :)
push_back can be used while iterating through a vector but you need to be very careful. push_back always invalidates the .end() iterator which rules out using it inside a range-based for loop like your first example originally did. All the other iterators are only invalidated when the vector runs out of capacity (myVector.size() == myVector.capacity()) so you can make it work by reserving enough capacity before the loop.

1
2
3
4
5
6
7
8
9
myVector.reserve(myVector.size() + std::count(myVector.begin(), myVector.end(), 1));
for(auto i = myVector.begin(); i != myVector.end(); i++)
{
	if (*i == 1)
	{
		assert(myVector.size() < myVector.capacity()); // just to make sure we didn't do a mistake
		myVector.push_back(2);
	}
}
Last edited on
I would think using size() would be risky as well.

it *may* be possible to do something really weird with volatile and have it actually work. It may work by accident (undefined behavior?) ?? Not sure.

Eg
volatile vector<int> derp;
derp.push_back(-1);

for(i = 0; i < derp.size(); i++)
{
if(i < 100)
derp.push_back(i);
}

you can try that, I don't have time right this second... it may work, or it may explode... I really don't know! Volatile is a performance killer and should be avoided if at all possible, though.

Last edited on
Sometimes it's just easier to use indices because they are not invalidated as easily.

1
2
3
4
5
6
7
for (std::size_t i = 0; i < myVector.size(); i++)
{
	if (myVector[i] == 1)
	{
		myVector.push_back(2);
	}
}
I don't know how some of this stuff works ... does the compiler know when it can and when it can't optimize x.size() to a constant? I was trying to say the same but I thought you might need volatile to tell it to make the function call to size every iteration (???).

I would think using size() would be risky as well.

I don't think so. If you are suggesting that the call might be optimized away then that would be a bug in the compiler, since it has no right to assume that the loop block doesn't change the size. Also, I'm sure the call to size is just an inlined reference to the variable that holds the size anyway, so there's probably nothing to optimize. It's not like calling strlen on a C-string.
Last edited on
does the compiler know when it can and when it can't optimize x.size() to a constant?

If it doesn't know then it simply can't do that optimization.
Makes sense. Forget the volatile idea then :)
This is just so radical to me. Haha. So, given the cited source on Stackoverflow, it seems that if I want to insert into a container while iterating through it, I need to use std::list.
> it seems that if I want to insert into a container while iterating through it, I need to use std::list.

We can also use std::vector or std::deque. For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> vec { -2, -1, 0, 1, 2, 3, 4, 5, 10, 6, 5, 7, 5, 8, 9, 10, 11, 20, 12, 13, 14, 15, 16 } ;

    // insert 99 before every multiple of 5
    for( auto it = vec.begin() ; it != vec.end() ; ++it ) if( *it%5 == 0 ) it = ++vec.insert( it, 99 ) ;

    for( int v : vec ) std::cout << v << ' ' ;
    std::cout << '\n' ;
}

http://coliru.stacked-crooked.com/a/2d35e91c527e4da3
Pages: 12