Confused about vectors

Greetings all,

I've been writing C++ for a very long time, and I've used vector<> objects quite a bit but I just became flabbergasted at something I never noticed about their fundamental behavior. If I do this:

vector<int> test;
test[40] = 20;
printf("%i\n", test[40]);


Then I get a segfault, which is exactly what I would expect, because I'm trying to assign the 40th element of an empty thing. However, if I do this:

vector<int> test;
test.push_back(1);
test[40] = 20;
printf("%i\n", test[40]);


Now it prints 20 and exits cleanly. I don't understand that, since the vector only has one element in it. And even if I add a clear() line after my push_back, as follows:


vector<int> test;
test.push_back(1);
test.clear();
test[40] = 20;
printf("%i\n", test[40]);


It still prints 20 and exits cleanly! Can anyone tell me why this is so??

TIA,
Caleb
Many implementations allocate underlying storage that's bigger than one int when you push_back just one int. E.g. RW-based libraries make the first allocation for 32 elements. Provide your own allocator that prints out the allocation requests and see what your library does.

No implementation (that I know of) deallocates on clear(). That's what swap-with-temp and shrink_to_fit() are for.

That said, since accessing an array out of bounds is undefined behavior, both outcomes (segfault and clean exit) are correct.
Can anyone tell me why this is so??


A segFault happens when you try to read/write memory that the operating system has not set aside for use by your program. When the vector is of size zero, the "address" of the first element is probably zero (i.e. internally a null pointer) so when you try to dereference location 40, you're asking for memory way out of your allocation - segFault.

When the vector has a size, the first element will be somewhere in your allocated memory and 40 spaces along probably will be too (and depending on your implementation, could be in a section of memory set aside for future use by that vector, but could equally just be some other data, so you'll trash your own data - this is worse than a segFault). Try running this and see if you get the idea:

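#include <vector>
#include <iostream>

using namespace std;

int main()
{
    vector<int> test;
    // data() is valid on an empty vector, whereas &test[0] would itself be
    // an out-of-bounds access. It often (not always) returns a null pointer here.
    cout << "Address of first element (empty)  = " << test.data() << endl;
    test.push_back(1);
    // After push_back the vector owns real storage, so the address changes.
    cout << "Address of first element (size 1) = " << test.data() << endl;
}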
Indeed. With further testing, I've determined that pushing back a single element, at least on my machine/environment and using ints, causes the vector to stretch to a size of 33789 -- maximum index of 33788. What a large (and strange) number. I suppose the library is assuming (or accurately perceiving) a reasonably modern/powerful machine, but it still strikes me as weird.
Correction to my last post: I theorize that what's happening is that it gives me supposedly (but not really) clean results until I hit memory which is not just unallocated but is allocated to another process, *then* I get a segfault.

Moschops: I know what a segmentation fault is. :P I guess what my question really ends up coming down to is the following:

Why does a high-level object like a C++ standard library "vector" not throw a clean error if you try to access an element that's beyond its correct boundaries? It really wouldn't be hard for the internal code to check the element index you're trying to access against the size of the vector, would it?
It really wouldn't be hard for the internal code to check the element index you're trying to access against the size of the vector, would it?


It wouldn't be hard, but it would violate the C++ (and C) principle that you shouldn't have to pay for what you don't use. Bounds checking costs cycles. It's a price I shouldn't have to pay if I don't need it - the onus is on me to not need it.
The subscript operator for a vector behaves the same way as for arrays. If you need some sort of checking, such as throwing an exception, you can use the member function at().
Caleb9849 wrote:
Why does a high-level object like a C++ standard library "vector" not throw a clean error if you try to access an element that's beyond its correct boundaries?

What operator[] simply does in each and every library implementation is access the location n positions after the beginning of the internal array (n is the argument to operator[]).
For range-checks use the at() function!
Often there are ways to turn on bounds checking for operator[]. GCC will do bounds checking if you define _GLIBCXX_DEBUG (pass -D_GLIBCXX_DEBUG as a compiler flag).
Those answers all make sense. Thanks very much :)