Pointer craft

Published by Catfish

Nov 14, 2011 (last update: Nov 19, 2011)

About this article

I believe that competition leads to improvement.
There are three other articles on pointers and how they relate to arrays, besides mine and Moschops'.
Then there's the dedicated section in the Documentation.
So I'll try to keep this as short and to the pointer as possible.
(This article assumes you know the basics of C++ programming.)

Pointer facts

A pointer is a variable. It stores a number. That number represents a memory address.
Therefore we say it points to some data.
Pointers can have a type (e.g. int, char) or they can be void.
The type hints at how you want to interpret the data that is pointed to.
If you use void, you may need to specify a type later.
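
As a minimal sketch of that last point, assuming the void pointer really does point to an int (the name read_as_int is made up for this illustration):

```cpp
// Hypothetical helper, not part of any library: reads an int through a void pointer.
int read_as_int(const void *p)
{
    // void carries no type information, so we must name a type before dereferencing
    const int *pi = static_cast<const int *>(p);
    return *pi;
}
```

If p does not actually point to an int, the result is undefined, which is why void pointers call for extra care.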

Declaring a pointer

You declare a pointer just as you would any other variable, but add an asterisk (*) between the type and the name.

Example:

void * function(int *i)
{
    void *v;     // we don't know what type of data v will point to
    v = i + 500; // pointer arithmetic
    return v;    // return the resulting memory address
}

The function() above takes a pointer as parameter.
The value of i is the memory address it contains.
After we do the pointer arithmetic we'll have a new memory address.
We use void as the type because we haven't decided how to treat the data that v points to.

Pointer arithmetic

Pointer arithmetic refers to addition or subtraction between a pointer and an integer.
The value of a pointer is the memory address it holds. It is expressed in bytes.
Most types occupy more than one byte in memory (e.g. float typically uses four bytes).
The integer represents how many elements of the pointer's type we're shifting the address by.
Finally the address shifts by the number of bytes needed to store that number of elements.

Example:

float *pf = reinterpret_cast<float *> (100);
// force pf to contain the value 100 (0x64 in hexadecimal)
// we assume that (sizeof (float) == 4) bytes, which is typical but not guaranteed

pf += 1; // shift pf forward by one float
// pf is now 104 (0x68)
pf -= 2; // shift pf backward by two floats
// pf is now 96 (0x60)

char *pc = reinterpret_cast<char *> (100); // pc is 100 (0x64)
// notice that (sizeof (char) == 1) byte, always

pc += 1; // pc is now 101 (0x65)
pc -= 2; // pc is now 99 (0x63)

// standard C++ does not allow arithmetic on void pointers, because void has no size
// (some compilers accept it as an extension and treat sizeof (void) as 1)

// caution, you should never assign a custom address to a pointer
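
The same arithmetic can be tried safely on a real array. This is only a sketch, and the helper names byte_step and third_element are invented for the illustration:

```cpp
#include <cstddef>

// Hypothetical helper: how many bytes apart are two neighbouring floats?
std::ptrdiff_t byte_step()
{
    float data[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    float *pf = data;     // points to data[0]
    float *next = pf + 1; // points to data[1]

    // viewing both addresses as char pointers lets us count in bytes
    return reinterpret_cast<char *>(next) - reinterpret_cast<char *>(pf);
}

// Hypothetical helper: pointer arithmetic used to reach the third element.
float third_element()
{
    float data[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    float *pf = data;
    return *(pf + 2); // same as data[2]
}
```

Here byte_step() returns sizeof (float), whatever that happens to be on your platform.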

`NULL` and `nullptr`

The rule of initializing variables applies to pointers as well.
The convention is to use NULL (or nullptr in C++11) to give the pointer a neutral value.

Example:

int *i1;        // caution, i1 has a junk value
int *i2 = NULL; // we mark i2 as unused
i1 = NULL;      // same for i1

NULL most often is the value 0.
Well-designed functions should check if a given pointer is NULL before using it.
Since C++11, nullptr is the preferred replacement for NULL, because it has a pointer type of its own and cannot be mistaken for an integer.
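
A minimal sketch of such a check follows. The function safe_length is hypothetical and written only for this illustration:

```cpp
#include <cstddef>

// Hypothetical function: returns the length of a C string, or 0 if given no string.
std::size_t safe_length(const char *s)
{
    if (s == NULL) // with C++11 you would write: s == nullptr
        return 0;

    std::size_t n = 0;
    while (s[n] != '\0') // count characters up to the terminator
        ++n;
    return n;
}
```

Returning 0 for a null pointer is just one possible policy. Depending on the program, reporting an error may be the better choice.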

Reference facts

While pointers are a concept inherited from C, references were introduced by C++.
A reference can be described as an alias for an existing variable of the same type.
References do not contain a memory address you can change.
References cannot be re-aliased to another variable.

Declaring a reference

You declare a reference as you would a pointer, but with an ampersand (&) instead of an asterisk (*).

Example:

int a;       // regular variable a
int &ra = a; // reference, must be initialized at declaration
ra = -1;     // now a is -1, too
a = 55;      // now ra is 55, too
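
To see why a reference cannot be re-aliased, consider this small sketch (the function name rebind_attempt is made up for the illustration):

```cpp
// Hypothetical demonstration: "assigning" to a reference writes through it.
int rebind_attempt()
{
    int a = 1;
    int b = 2;
    int &r = a; // r is an alias for a, from now on

    r = b;      // this does not make r refer to b, it copies b's value into a
    b = 99;     // changing b afterwards has no effect on r or a

    return a;   // a holds the value that was copied from b
}
```

Calling rebind_attempt() returns 2, which shows that r kept aliasing a the whole time.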

What's a reference good for?

It can serve as a better pointer. References cannot be made invalid as easily as pointers can.
A typical use for references is as a safer alternative to pointers in function parameters.

Example:

void die_string_die(std::string &s)
{
    s.clear();
}
// notice that the real string is not copied as a local variable,
// so when we change s inside our function, the real string changes as well

Passing by reference is tempting because skipping the copy saves memory and time.
To prevent accidental changes to the original variable, programmers declare the reference as const.

Old school C programmers will do the same for pointers, but they still have to check if their pointer is NULL.
And even if it isn't, they still have no guarantees it is valid.

Example:

void safe(const std::string &s) {}

void still_unsafe(const std::string *s)
{
    if (s == NULL)
        return; // we surely can't use s now

    // s is not NULL, but what if it's still invalid?
    // (e.g. it points to a string that was already destroyed)
}
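
Here is a slightly fuller sketch of the two styles side by side. Both function names are hypothetical, chosen only for this comparison:

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: counts spaces without copying or changing the string.
std::size_t count_spaces(const std::string &s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s[i] == ' ')
            ++n;

    // s.clear(); // would not compile, because s is const
    return n;
}

// The pointer flavour of the same function needs an extra check.
std::size_t count_spaces_ptr(const std::string *s)
{
    if (s == NULL)
        return 0; // nothing to count

    return count_spaces(*s); // dereference and reuse the reference version
}
```

Note that the pointer version can only guard against NULL. A pointer to a destroyed string would pass the check and still misbehave.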

The dereference (`*`) and reference (`&`) operators

I wrote the previous sections because both C and C++ made the uninspired choice of recycling the asterisk (*) and ampersand (&) as operators.
So I wanted to clear up their role in declarations, before moving on to operations.

The dereference operator (*) is used on pointers, to manipulate the data at the memory location they contain.
The reference operator (&), formally called the address-of operator, is used on regular variables, to get their memory address.
You can reference a pointer to get its own memory address, which is why you can have pointers to pointers.
But dereferencing a regular (non-pointer) variable will not compile.

Example:

int i;       // regular variable i
int *pi;     // pointer to int
int **ppi;   // pointer to pointer to int
int ***pppi; // this is ridiculous, avoid doing things like this

pi = &i;     // apply reference to i, to get i's memory address
ppi = &pi;   // apply reference to pi, to get pi's own memory address
pppi = &ppi; // apply reference to ppi, to get ppi's own memory address

*pi = 5;     // apply dereference to pi, to change the data pointed to by pi

// i has the value 5

**ppi = -17; // apply dereference to ppi twice, i is now -17
***pppi = 9; // apply dereference to pppi three times, i is now 9

C array facts

Arrays can be described as a chain with a known number of elements, of the same type.
They are sometimes described as "constant pointers", because in most expressions their name converts to the memory address of the first element, and that address cannot be changed.
The size of an array cannot be changed, either.

The old limitation in using arrays was that their size had to be known at compile time.
This has no longer been the case in C since the C99 standard, but the designers of C++ decided not to adopt VLAs (Variable-Length Arrays).
The "variable" in VLA means that the size is a variable, and not that the size is variable.

Declaring an array

A simple one-dimensional array is declared by using square brackets.
The size can be deduced if you provide an initializer list, otherwise you need to specify the size yourself.

Example:

int ia1[] = {0, 1, 2, 3};     // size deduced to be 4
int ia2[4] = {5};             // size is 4, contents are {5, 0, 0, 0}
int ia3[40];                  // caution, size is 40 but elements are junk
int ia4[40] = {};             // size is 40, all elements are 0
char ca1[] = "car";           // caution, a '\0' character is added to the end, size is 4
char ca2[] = {'c', 'a', 'r'}; // size is 3
// and so on...

char *pc = ca1; // no need to reference ca1, because it returns a memory address

ia1[1] = -3; // changes second element in ia1 (counting starts from 0)
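
Two properties described above can be checked with a short sketch. The helper names element_count and name_is_first_address are invented here:

```cpp
#include <cstddef>

// Hypothetical helper: number of elements in a local array.
std::size_t element_count()
{
    int ia[] = {0, 1, 2, 3};
    // total size in bytes divided by the size of one element
    return sizeof ia / sizeof ia[0];
}

// Hypothetical helper: the array's name converts to the address of its first element.
bool name_is_first_address()
{
    int ia[] = {0, 1, 2, 3};
    int *p = ia; // no need for &, just like pc = ca1
    return p == &ia[0];
}
```

The sizeof trick only works on the array itself. Once you hold just a pointer such as p, the element count is no longer available.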

Dynamic memory allocation

In the absence of VLAs and if for some reason we don't want to use the STL containers, we can allocate memory dynamically.
We do this when the number of elements we need to store is unknown at compile time.

The preferred use for pointers remains pointing to a given variable.
But they can also be used to construct chains containing an arbitrary number of elements.

Example:

#include <cstddef>
// for std::size_t (which is an unsigned integral type, like unsigned int)
#include <iostream>
// for std::cin

int main()
{
    std::size_t ne = 0; // number of elements

    std::cin >> ne; // let the user input desired length

    double *pd; // declare a pointer to double

    pd = new double[ne]; // new[] allocates memory to store ne doubles,
                         // and returns the starting memory address

    // ... pd now acts as a doubles array of size ne ...
    // caution, keep the address in pd unchanged, because delete[] needs it

    delete[] pd; // delete[] frees the memory new[] allocated
                 // caution, omitting this step can cause a memory leak
    return 0;
}
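
For comparison, this is roughly what the same job looks like with std::vector, one of the STL containers. It is a sketch only, and make_doubles is a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a sequence of ne doubles, all set to zero.
std::vector<double> make_doubles(std::size_t ne)
{
    std::vector<double> vd(ne); // allocates room for ne doubles, each 0.0
    return vd;                  // no delete[] needed, the vector frees its own memory
}
```

The vector also remembers its own size (vd.size()), which a raw pointer from new[] cannot do.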

Function pointers

Since functions have addresses too, we can have a pointer to a function.
One use for this is a primitive form of polymorphism.
The following example highlights the use of dispatch tables.

Example:

#include <iostream>
#include <cstdlib>
#include <cstddef>

void good(int i)
{
    std::cout << "I fed " << i << " little kittens today." << std::endl;
}

void neutral(int i)
{
    std::cout << "I drove " << i << " miles yesterday." << std::endl;
}

void evil(int i)
{
    std::cout << "I steal public toilet paper rolls every day." << std::endl;
}

// notice that the "type" of a function is its signature,
// and all the functions above have the same signature: void name(int )

int main()
{
    void (*wondering[])(int ) = {good, neutral, evil};
    // on the left we have an array of pointers to a function of signature: void name(int )
    // on the right we have the initializer list with the three functions

    size_t user_input = 0;

    std::cout << "GOOD\t== 0\nNEUTRAL\t== 1\nEVIL\t== 2\n\nYour choice is:" << std::endl;
    std::cin >> user_input;

    if (user_input > 2)
        user_input = 2; // just in case...

    (*wondering[user_input])(10);
    // notice how we don't call a specific function for the user

    system("PAUSE"); // you may remove this line if on Linux
    return EXIT_SUCCESS;
}

Conclusion

If you're a C programmer, pointers and arrays can be useful tools.

However, since you're most likely a C++ programmer, you should leave pointer hackery alone.
Use pointers to point to an existing variable (object), and do so only for the benefits of speed and lower memory usage.
And remember that in some cases, you can use references instead of pointers.

As for C arrays, you should avoid using them, as well. C++11 provides std::array which is an excellent replacement.
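
A minimal sketch of std::array in use, assuming a C++11 compiler (the function sum4 is made up for the illustration):

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: sums a fixed-size array without losing track of its size.
int sum4(const std::array<int, 4> &a)
{
    int total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) // the array knows its own size
        total += a[i];
    return total;
}
```

Unlike a C array, a std::array passed to a function keeps its size as part of its type, so there is no separate length parameter to get wrong.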
