Bjarnnnnnnne!!! Seriously, WTF!

Just need to rant quickly, I can't help it:

Why does c++ have a hackish version of vector< bool >, which you cannot safely access from multiple threads, even if each thread is accessing unique indices? And why doesn't every C++ book and resource stress this oddity in bold letters in the opening paragraph.

This is just messed up.
Last edited on
When you say "access", do you mean read-only access?

PS: It's "Bjarne".
Last edited on
Well, in my case it was concurrent write, but each thread writing to their own exclusive section of the vector, at least so I thought.

1
2
3
4
5
6
7
//Not safe!
vector< bool > b(  100, false ); 
#pragma omp parallel for
for( int i = 0; i < 100; ++i )
{
    b[ i ] = true;
}


The C++ reference on this site, just says:

The pointer and iterator types used by the container are not necessarily neither pointers nor conforming iterators, although they shall simulate most of their expected behavior."

"Data races
Simultaneous access to different elements is not guaranteed to be thread-safe (as storage bytes may be shared by multiple bits).


http://www.cplusplus.com/reference/vector/vector-bool/

And cppreference says:

std::vector<bool> behaves similarly to std::vector, but in order to be space efficient, it:

- Does not necessarily store its elements as a contiguous array (so &v[0] + n != &v[n])
- Exposes class std::vector<bool>::reference as a method of accessing individual bits. In particular, objects of this class are returned by operator[] by value.
- Does not use std::allocator_traits::construct to construct bit values.


...

Since its representation may by optimized, std::vector<bool> does not necessarily meet all Container or SequenceContainer requirements. For example, because std::vector<bool>::iterator is implementation-defined, it may not satisfy the ForwardIterator requirement. Use of algorithms such as std::search that require ForwardIterators may result in either compile-time or run-time errors.

http://en.cppreference.com/w/cpp/container/vector_bool

Basically it's not really a vector, and doesn't comply with many things you expect of a vector, but it still uses the name vector? And treating it like a normal vector can lead to some very bad and potentially hard to trace bugs, and undefined behaviour. I assume it's a huge design flaw that we will be stuck with because changing it might break (and probably in even more cases fix) legacy code.
Last edited on
Well, in my case it was concurrent write, but each thread writing to their own exclusive section of the vector, at least so I thought.
Well, yeah. Concurrent writes even to different elements of an std::vector<bool> are obviously not going to be thread-safe. If you know the internal representation of std::vector<bool> you should know that.
If you know the internal representation of std::vector<bool> you should know that.

But it's implementation defined, so you can't assume to know what the internal representation is, nor what the behaviour will be if it's used like a normal vector. Anyways, the issue is that it's not really a vector. If you assume it is and treat it like one, you're in trouble. It's not even a container.
Last edited on
> Why does c++ have a hackish version of vector< bool > ...
> Basically it's not really a vector, and doesn't comply with many things you expect of a vector, but it still uses the name vector? ... I assume it's a huge design flaw that we will be stuck with ...

Scott Myers talks about a 'noble experiment that failed', which 'for one reason or another ... remained in the Standard'
Item 18. Avoid using vector<bool>.

As an STL container, there are really only two things wrong with vector<bool>. First, it's not an STL container. Second, it doesn't hold bools. Other than that, there's not much to object to.
...
You may wonder why vector<bool> is in the Standard, given that it's not a container and all. The answer has to do with a noble experiment that failed, ...
...
I mentioned earlier that proxy objects are often useful in C++ software development. The members of the C++ Standardization Committee were well aware of this, so they decided to develop vector<bool> as a demonstration of how the STL could support containers whose elements are accessed via proxies. With this example in the Standard, they reasoned, practitioners would have a ready reference for implementing their own proxy-based containers. Alas, what they discovered was that it was not possible to create proxy-based containers that satisfy all the requirements of STL containers. For one reason or another, however, their failed attempt at developing one remained in the Standard.

One can speculate on why vector<bool> was retained, but practically speaking, it doesn't matter. What does matter is this: vector<bool> doesn't satisfy the requirements of an STL container; you're best off not using it; and deque<bool> and bitset are alternative data structures that will almost certainly satisfy your need for the capabilities promised by vector<bool>.

- Scott Meyers in 'Effective STL'


There have been proposals to remove (or at least deprecate) the specialisation for vector<bool>
In 1998, admitting that the committee made a mistake was controversial. Since then Java has had to deprecate such significant portions of their libraries that the idea C++ would be ridiculed for deprecating a single minor template specialization seems quaint.
...
vector<bool> is unusual, however, because it is a template specialization. Other than the flip() member function, most practical functionality is the same as for the primary template. Thus removing the specialization will not necessarily break user code, although it may alter the performance characteristics.

I personally think that it would be preferable to remove the vector<bool> specialization rather than deprecate it, since the impact on users is so minimal, and since the sooner vector<bool> goes away, the better.

Beman Dawes 'Fixing vector<bool>' http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2160.html


Sutter calls it 'something a little embarrassing about the C++ standard library'
(If anyone else had written vector<bool>, it would have been called "nonconforming" and "nonstandard." Well, it's in the standard, so that makes it a little bit harder to call it those names at this point, but some of us try anyway in the hopes that it will eventually get cleaned up. The correct solution is to remove the vector<bool> specialization requirement so that vector<bool> really is a vector of plain old bools. Besides, it's mostly redundant: std::bitset was designed for this kind of thing.)
...
Bottom line: If you care more about speed than you do about size, you shouldn't use std::vector<bool>. Instead, you should hack around this optimization by using a std::vector<char> or the like instead, which is unfortunate but still the best you can do.

- Herb Sutter http://www.gotw.ca/gotw/050.htm
But it's implementation defined, so you can't assume to know what the internal representation is
It's well-known that bit packing is allowed for std::vector<bool> by the standard. It's also well-known that pre-C++14 std::strings allowed internal representations other than contiguous arrays. Writing portable code that assumes otherwise for either type is wrong.
cppreference does mention the issue of writing to different parts of vector<bool> from multiple threads, at the top-level container library page:

http://en.cppreference.com/w/cpp/container#Thread_safety

Different elements in the same container can be modified concurrently by different threads, except for the elements of std::vector<bool> (for example, a vector of std::future objects can be receiving values from multiple threads).

...and now it says that on /cpp/container/vector_bool, too.
Last edited on
For the sake of education...

The reason why you can't access a vector<bool> from multiple threads is because of the nature of how a CPU reads from each bit. It's not just reading that individual bit, it's reading the byte that contains that bit.

That said, most modern CPUs (in the past 15 years I think?) can atomically set a bit within an integer. For x86, there's a few ways to do this... one is the "bts" (TEST BIT and SET) instruction and Linux also uses the "orb" (OR byte) instruction (which I believe is what you'd generally use in this case), both needing a lock prefix. There's support all the way up to a 64-bit integer.

The problem here isn't underlying compiler support or lack of hardware support. The problem is being generic. vector<bool> in this case is a monstrosity...

I oddly don't see a library that already does this...? If anyone knows of one, please chime in. You can accomplish this using built-in compiler atomics as far as I know. Something like:

http://ideone.com/51wmQn
Last edited on
Topic archived. No new replies allowed.