Resizable Array Class (Not Vector)?

I am looking to create an object which, generally speaking, behaves like 'std::array' in the sense the internal array has a generally fixed length but which can be resized to a specifically chosen length. While I know there is 'std::vector', benchmarking shows it to be on the order of about 1/3 slower than 'std::array' and tests with 'std::deque' usually perform slower. Objects of this class will not need to be dynamically resized upon adding another item but, at times, the objects will need, essentially, to have their contents discarded and the array resized to from, for example, 14 elements to 18 elements. I'd like to keep all the functionality of 'std::array' if possible; just the ability to clear-and-resize the array added. Can Anyone point Me in the direction of a class which can meet these requirements?

Thanks.
std::vector has (potentially) 2 levels of indirection:

1) Dereferencing the 'this' pointer
2) Indexing the array itself


std::array has (potentially) 1 level of indirection:

1) Dereferencing the 'this' pointer



std::array does not need to have indirection for it's array indexing because it actually contains the array and not merely a pointer to an array. This is why std::array can often outperform std::vector in high performance situations.

Any kind of dynamically sized container is going to need to have that 2nd level of indirection.


So in short... there isn't likely to be anything that is going to outperform vector while still being resizable. If there were a way to make it faster... vector would already be doing it.


One thing you might consider if performance is such a grave concern... is use std::array but make it large enough to hold the biggest conceivable amount of data. Then you keep a separate 'size' variable and "resize" by just using more or less of the array.

Of course this might not actually be faster because larger memory block might mean more cache misses.




Though I'm curious as to what you're doing that vector does not perform well enough for you.
Actually, looking at My library's implementation of 'std::array' and 'std::vector', You seem to have mischaracterized the two a bit but I think that point is somewhat irrelevant.

My problem is, when adding an item to a vector, a check is made to see if the internal array has to be reallocated and, if so, do so and then copy each and every element. Conversely, I want to be able to say:
1
2
3
4
5
6
7
8
9
10
11
12
resizableArray_t ra;
ra.setSize(64);    // temporarily set a hard limit of 64 elements
// ...
// do stuff where I must not access beyond index # 63
// ...
ra.resize(128);    // ditch data at this point and temporarily set a hard limit of 128 elements
// ...
// do stuff where I must not access beyond index # 127
// ...
ra.resize(32);    // ditch data at this point and temporarily set a hard limit of 32 elements
// ...
// and so on 


As far as what I am doing, I am running a set of simulations, each of which require such resizing. The number of simulations literally numbers in the millions. So, every second I can save is extremely helpful. (Note: Before Anyone says, "just use an optimizer", I have run into enough problems with optimizers to know to never trust one with respect to speed but that's a separate issue.)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
const std::size_t ESIMATED_MAXSZ = 1024 ;
std::vector<int> ra(ESIMATED_MAXSZ) ;

ra.resize(64);    // temporarily set a hard limit of 64 elements
// ...
// do stuff where I must not access beyond index # 63
// ...

// ditch data at this point and temporarily set a hard limit of 128 elements
ra.clear() ;
ra.resize(128);
// ...
// do stuff where I must not access beyond index # 127
// ...

// ditch data at this point and temporarily set a hard limit of 32 elements
ra.clear() ;
ra.resize(32);
// ...
// and so on 

Actually, looking at My library's implementation of 'std::array' and 'std::vector', You seem to have mischaracterized the two a bit but I think that point is somewhat irrelevant.


I suppose it depends on the implementation, but most implement it more or less how I described. See this bit on ideone:

http://ideone.com/PY8BOm

It's not irrelevant because it's the key difference between the two classes and is why they can never have equal performance.

If you could make a class that performed as well as std::array but also be resizable... then std::vector would be useless and nobody would bother with it.

You might be able to write your own dynamic array class which does some shortcuts in how elements are constructed/destructed when the array is resized. And in that area you might be able to outperform vector. Though I wouldn't expect to gain much ground there.


My problem is, when adding an item to a vector, a check is made to see if the internal array has to be reallocated and, if so, do so and then copy each and every element. Conversely, I want to be able to say:


You can do what JLBorges' suggests and resize the vector upfront so it doesn't need to reallocate and copy later (or at least... it'll make it less likely that it'll have to).

Or you can use 'reserve' which does more or less the same thing (allocates space without actually changing the number of elements in the vector).

Reserve actually exists for this exact reason.
Last edited on
Any marginal difference in performance between std::vector<> and a std::array<> (for equivalent functionality) is because the number of elements in a std::array<> is a constant known at compile time. The extra level of indirection is usually irrelevant; it needs to be done at most once per function.

std::vector<> performance would be comparable to that of a c-style array passed to a function (where the number of elements in the array is not hard-coded).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#include <vector>

int foo( const std::vector<int>& vector )
{
    int s = 0 ;
    for( std::size_t i  = 0 ; i < vector.size() ; ++i ) s += vector[i] ;
    return s ;
}

int foo( const int carray[], std::size_t size )
{
    int s = 0 ;
    for( std::size_t i  = 0 ; i < size ; ++i ) s += carray[i] ;
    return s ;
}

http://coliru.stacked-crooked.com/a/fd2449d5921648c7

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#include <vector>
#include <memory>
#include <ctime>
#include <iostream>


int foo( const std::vector<int>& vector ) ;
int foo( const int carray[], std::size_t size ) ;

int main()
{
    constexpr std::size_t N = 1024*1024*32 ;
    constexpr std::size_t M = 32 ;
    
    int a = 0 ;
    {
        std::vector<int> seq(N) ;
        const auto start = std::clock() ;
        for( std::size_t i = 0 ; i < M ; ++i ) a += foo(seq) ;
        const auto end = std::clock() ;
        std::cout << "vector: " << ( end - start ) / double(CLOCKS_PER_SEC) * 1000 << " millisecs.\n" ;
    }
        
    int b = 0 ;
    {
        std::unique_ptr< int[] > carray( new int[N]{} ) ;
        const auto start = std::clock() ;
        for( std::size_t i = 0 ; i < M ; ++i ) b += foo( carray.get(), N ) ;
        const auto end = std::clock() ;
        std::cout << "carray: " << ( end - start ) / double(CLOCKS_PER_SEC) * 1000 << " millisecs.\n" ;
    }

    return a + b ;
}

http://coliru.stacked-crooked.com/a/4519ae4ab662ea2e
Any marginal difference in performance between std::vector<> and a std::array<> (for equivalent functionality) is because the number of elements in a std::array<> is a constant known at compile time.


I'm not sure I follow how having a fixed number of elements would make accessing them faster.

Then again understanding the speed of memory accesses is insane and often defies logic.
> I'm not sure I follow how having a fixed number of elements would make accessing them faster.

It may. Loops may be completely unrolled, sse instructions may be used more efficiently (no run-time calculation of tails is needed) etc. AFAIK, any difference would be marginal, even in cases where the number of elements is small.
Ahh.. Yes that makes sense.

=)
And it is more likely that stack frame, where stack allocated array is, will be in processor cache. And if there is only one level of indirection used (like in simple C-style arrays) it is more likely that data prefetch will occur. I'm not sure if modern CPU can reliable deduct need for prefetch in case of double (triple if not inlined?: function call, dereferencing *this, dereferencing resulting pointer) indirection.
Topic archived. No new replies allowed.