Multidimentional arrays are evil

Score: 4.2/5 (91 votes)

Multidimentional arrays are evil.

I see a lot of noobs get sucked into the vortex that is multidimentional (hereon MD) arrays. MD arrays are a "feature" of C/C++ which allow you to use multiple indeces when looking up things in an array. A typical use is to have a 2D array to represent a grid or a game map with 1 index for X coords and another for Y coords. Another common usage is for a 'Matrix' class or somesuch.

The thing these coders don't know (or don't realize) is that MD arrays suck. They're syntax poison. They're confusing, hard to work with, and (depending on the implementation) may actually consume more memory and yield worse performance.

After seeing more and more posts about MD arrays pop up on these forums, I've decided to sketch out this article to shine some light on MD arrays, and provide a reasonable alternative.

-----------------------------------

MD Array Flavor 1: Straight
The most obvious way to make an MD array is just a straight allocation:

1
2
3

int evil[5][7];

evil[3][2] = 5;

Seems harmless enough. And it is, for the most part. This is the friendliest flavor of MD arrays.

However even with this form, problems creep up. What if you need to pass this array to another function? The way you'd do it is like so:

int func(int p[][7])
{
  p[2][3] = 1;
}

//...

func(evil);

This will pass 'evil' by pointer to the 'func' function. This isn't all that horrible... but one of its limitations is that it will only accept an array of [x][7]. What if you want to call the function with an MD array of a different size... like int evil[4][5];? Bottom line is you can't. You'd have to take the MD array out of the function completely and turn it into a 1D array and pass the width as a seperate parameter:

int func2(int p[], int width)
{
  // p[2][3] = 1;         // no good anymore, 1D array
  p[ (2*width) + 3] = 1;  // the 1D apprach to do the same thing

/*  It's worth noting here that the above (y*width)+x calculation is basically what
    MD arrays compile to anyway.  The compiler just hides it from you with MD arrays.
     Or at least it's true with Straight Flavor MD arrays.  Other flavors are more sinister.
*/
}

//...
int evil[4][5];
func2(evil[0],5);  // ugly

You might notice I marked that last line as ugly. It's because it is. It is confusing, has unclear syntax, and it is very easy to make a mistake and pass the wrong value as the width.

The other problem with this flavor is that it is completely incompatible with other MD flavors. As you'll soon see.

MD Array Flavor 2: Nested New (dynamic)
Once people graduate from straight MD arrays, they'll start wondering how to make the size of their arrays dynamic. This is where the dreaded "nested new" technique comes in:

// let's say we want to dynamically make int[Y][X]:
int** superevil = new int*[Y];
for(int i = 0; i < Y; ++i)
  superevil<i> = new int[X];

// now we can do this:
superevil[2][3] = 1;

// but cleanup is just as ugly as allocation:
for(int i = 0; i < Y; ++i)
  delete[] superevil<i>;
delete[] superevil;

Note that this actually makes a pointer to a pointer, and not really a straight 2D array. And while "superevil" and "evil" (from the previous section) are both accessed with [double][brakets], they both function very differently under the hood.

Nested New MD arrays are inefficient. In addition to allocating all the space for the normal 2D array, you're also allocating space for POINTERS. Not only that, but in order to access any element in the array, you have to dereference two pointers!

And really... look at the messy allocation/destruction code that's required.

And let's go back to calling functions... let's take our two functions from the previous section:

1
2

int func(int p[][7]);
int func2(int p[], int width);

How would you call those functions with 'superevil'?

Guess what. You can't. That's right... you're hosed. Nested New arrays are not really 2D arrays in the same sense that straight arrays are. Straight MD arrays only require 1 dereference, but nested new requires 2 dereferences. In order to pass this to a function, you'd need to make another one:

int func3(int** p,int width);

But now you might notice that func3 doesn't work with straight arrays. They're just totally incompatible.

The one saving grace to the nested new approach is that it doesn't require the entire MD array to be contiguous, which means you have a better chance of getting the memory you need if memory is badly fragmented. But this mild benefit is seldom applicable, and is overshadowed by its faults.

MD Array Flavor 3: Vectors of Vectors
The more STL savvy coder will opt for a vector-of-a-vector approach to MD arrays, rather than Nested New:

// to make a [5][6] array:
vector< vector<int> > stillevil( 5, vector<int>(6) );

stillevil[2][3] = 1;

Not a lot to say about this one. Like Nested New it's incompatible with other MD array flavors. If you need to pass this array to a function, none of the above functions will work.

Also, like Nested New, it's inefficent in that is requires additional memory and has double indirection.

On the plus side, you don't need cleanup code, and the allocation code is not as ugly (but still pretty ugly!)

Killing MD: 1D Mimics
An MD array can be simulated in a 1D array as long as you know the dims. For example, if you want a 2x3 array, it would conceptually look like the following:

0 1 2 3 4 5
This array has 6 elements (2 * 3). And you'll notice that each row starts with a multiple of the width (2). Therefore you can simulate a 2D array in a 1D array like so:

// int evil[5][6];      // desired 2D array
int mimic[5*6];         // mimicing 1D array

// evil[2][3] = 1;      // desired [2][3] index
mimic[ (2*6) + 3] = 1;  // mimiced [2][3] index

All that math looks kind of dumb and inefficient. Really, though, the compiler does all that math for MD arrays too, so this isn't any less efficient. It is pretty ugly though, granted.

So you might be asking yourself... "what is the benefit of this?"

Well, unlike MD array flavors which are all different and totally incompatible, the respective 1D flavors are all compatible!

Consider the following:

int func2(int p[], int width);  // remember this function?
                                // it'll work with all flavors of mimicing 1D arrays!

int a[ 5 * 6 ];
func2( a, 6 );  // works
//---

int* b = new int[5*6];
func2( b, 6 );  // works
//---

vector<int> c(5*6);
func2( &c[0], 6 );  // works (but is a little ugly, granted)

See how graceful it all is? Aside from the ugly math in the indexing, everything flows together so much nicer. This is the first step to killing MD arrays. The next step is even more fantastic.

Killing MD: Abstracted array classes
So while 1D arrays are easier to work with, They do suffer from a few problems:

- look up requires you to know the dims of the array (can't do y*width without knowing the width).
- look up syntax is ugly and easy to botch. It only gets worse as you get into larger dimensions. Imagine a mimiced 3D array: z*width*height + y*width + x.

C++ provides an excellent solution to this problem: classes. Just make an Array2D/Matrix/whatever class to wrap all of this behavior.

(Or don't -- there are probably many premade ones available online that you can use, I never actually bothered looking. Boost probably has one, I'd imagine -- I'll have to research this more some day. If you are reading further in this article I'm going to assume you don't have / don't want to use such a library and want to write your own)

While you can't overload the [bracket] operator to take two parameters*, you can overload the parenthesis operator. So syntax is a little different:

// int bad[4][5];
Array2D<int> good(4,5);

// bad[1][2] = 3;
good(1,2) = 3;

(* Yes you can overload the brakets and return a weird type to make double brakets work, but it causes all sorts of problems and defeats all the elegance of using a fully contained class)

So how do you make this Array2D class? As long as you don't want bells and whistles, it can be very simple.

Here is a very simple class with no bells or whistles:

template <typename T>
class Array2D
{
public:
  // constructor
  Array2D(unsigned wd,unsigned ht)
    : nWd(wd), nHt(ht), pAr(0)
  {
    if(wd > 0 && ht > 0)
      pAr = new T[wd * ht];
  }

  // destructor
  ~Array2D()
  {
    delete[] pAr;
  }

  // indexing (parenthesis operator)
  //  two of them (for const correctness)

  const T& operator () (unsigned x,unsigned y) const
  {  return pAr[ y*nWd + x ];   }

  T& operator () (unsigned x,unsigned y)
  {  return pAr[ y*nWd + x ];   }

  // get dims
  unsigned GetWd() const { return nWd; }
  unsigned GetHt() const { return nHt; }


  // private data members
private:
  unsigned nWd;
  unsigned nHt;
  T*       pAr;

  // to prevent unwanted copying:
  Array2D(const Array2D<T>&);
  Array2D& operator = (const Array2D<T>&);
};

That's all there is to it!