Help me wrap my head around pointers and iterators.

Pages: 123
As a lot of you have seen before, I have issues understanding pointers. They just don't click for me. I understand the concept of memory location, I understand that a value is stored in that location, I understand you can dereference a pointer and get the value, or you can reference a variable to get the location in memory. I know there is a lot of times in C++ when you need to use pointers, but I constantly try to find ways around them because I don't know how to use them.

I don't have any exact code to be working with, but a lot of times when someone posts on the forums and I start reading their code, I just get lost. Most recently, I was trying to help someone who created a list class from scratch. I have a basic understanding of how a list works, but the thought of implementing it myself is just scary. How do you tell the compiler to set aside adjacent blocks of memory to be used to traverse through the elements? Why do I see a lot of dereferencing taking place on the left side of an assignment operator?

Those are all relevant to pointers, specifically. I am still having issues with iterators as well. Simply, the iterators will go through a collection, most commonly, an STL container, or similar. In an array, you access elements by using the [] operator and the index. This is also the case for a vector. Why are these accessible through the index, but other containers aren't? Are the elements still not sequential? Can't you have an iterator to traverse through these like you can any other container? Does a pointer not do the same thing as the iterator does? Is there a physical way to get the element number of a container from the actual iterator?

Specifically, I created a list. I used an iterator to display all of the elements of the list using the range based for loop (which is a god send in my eyes) and renames the variable with the name of the iterator. But to know which element I'm on, I needed to use a seperate variable, i, and increment it seperately.

I know this is a long winded post, but this really sums up a good bit of stuff I don't understand. Pointers are my biggest problem. I just started learning about iterators and their uses, which seem very simple, but very powerful as well, so I still have a lot to learn. Reference material is great, and trying to dumb the ideas and concepts down is even better. I am really lost when it comes to looking this stuff up, and I just read Moschops' article on pointers, and it helped a little, but still hasn't "clicked" with me yet.
Last edited on
How do you tell the compiler to set aside adjacent blocks of memory to be used

With new. They don't have to be adjacent in memory, that's the point of a linked list.

Why do I see a lot of dereferencing taking place on the left side of an assignment operator?

Because the left side is a pointer and you want to assign to the object pointed to.

Why are these accessible through the index, but other containers aren't?

How would you implement operator[] for a list? It would run in linear time. Lists aren't suitable for random access, so there's no real point in providing that operator.

Is there a physical way to get the element number of a container from the actual iterator?

Yes, std::distance.

But to know which element I'm on, I needed to use a seperate variable, i, and increment it seperately.

Yes, but you can create a foreach + index macro for that.
With new. They don't have to be adjacent in memory, that's the point of a linked list.

So, an array and vector are adjacent, but all other containers aren't. That explains why an array has to be initialized with a const (I know there are workarounds) and a vector resize is processor heavy since it must locate adjacent blocks, correct?

A list is a sequence of pointers, 0-n, that point to values in memory somewhere, so when you access it, it returns the dereferenced value? A list is nothing more than a container of pointers? Are the rest of the STL containers the same way?

Because the left side is a pointer and you want to assign to the object pointed to.

Something I don't understand is this:
1
2
3
*myPointer = value; // set's the value at the memory location of myPointer equal to value
myPointer = &value; // Changes myPointer's memory location to value
// *myPointer than becomes an alias for value? 

And:
1
2
3
char *myPointer, *myPointer2;
*myPointer = *myPointer2 // Only changes the values to be equal?
myPointer = myPointer2; // Sets the memory location of myPointer to the same as myPointer2? 

I just feel that I have an extra hard time understand exactly the purpose of it. Like, I understand the examples usually well enough, but in practice, it's very hard to use it, and it's even harder for me to read other's code along with coming up with examples on how to use it.

How would you implement operator[] for a list? It would run in linear time. Lists aren't suitable for random access, so there's no real point in providing that operator.

I just found out about how every container has it's own iterators, and only random access containers are capable of using the [] operator. I still understand the difference between the two, but I can't grasp how a pointer works for the random access containers, but not for linked containers. I feel like a pointer-to-pointer should work, in a sense, as the same as the way a regular pointer works on, say, an array or vector.

Yes, std::distance.

There is so many things that I ask about that should make sense. This is one of those times. I didn't know distance existed, but sure enough, it sits right there on the iterator page. I'm kind of disappointed that there isn't just a way to derefence the iterator (maybe this is why I confuse it for a pointer) to get the "index".

Yes, but you can create a foreach + index macro for that.

Not quite sure what you mean by that. I was thinking about doing a basic for loop to iterate through the elements, but then I still have no idea how to know what "index" or pass on the loop I'm at without declaring a separate variable. I feel, anyways, that this should be something to do easily, at least easier than declaring a separate header for std::distance or creating even another variable (I feel my code creates a lot of unnecessary variables).

Edit: I'm reading Professional C++ and ran across a section about loops and iterations. This one specifically talks about iterators with the new std::array. It says that all STL containers can use the STL algorithms. My question is, on the following, where does iter come from? Specifically, what is the difference between that and i?
1
2
3
4
5
6
7
8
9
10
11
12
13
#include <iostream>
#include <array>
using namespace std;
int main()
{
    array<int, 3> arr = {9, 8, 7};
    cout << “Array size = “ << arr.size() << endl;
    for (auto i : arr)
        cout << i << endl;
        cout << *iter << endl;
    // Maybe I'm wrong, but shouldn't there be SRO's here?
    return 0;
}


Edit 2: There was also a segment about using the range based for loop with C style arrays. What type does auto turn into?
1
2
3
4
int arr[] = {1, 2, 3, 4};
for(auto& i : arr) {
    i += 2;
}


Edit: The code in the book was wrong, cout << *iter << endl; isn't even supposed to be there.
Last edited on
A list is a sequence of pointers, 0-n, that point to values in memory somewhere, so when you access it, it returns the dereferenced value? A list is nothing more than a container of pointers?
[snip]
I still understand the difference between the two, but I can't grasp how a pointer works for the random access containers, but not for linked containers.


I think you have the wrong idea about how vectors and lists actually work.

A vector is stored in a group of linear memory. There is exactly one pointer, which points to the first element. If the pointer to the first element is 'x', the pointer to the 2nd element must be 'x+1' because it has to be stored immediately after the first element, so it's address will be one unit higher than the address of the previous element... always.

For a visual representation, here's a 10 entry block of memory that represents how a 10 element vector might look:


   x   x+1  x+2  x+3  x+4  x+5  x+6  x+7  x+8  x+9  <- address of data
+----+----+----+----+----+----+----+----+----+----+
|    |    |    |    |    |    |    |    |    |    |
|  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 |   <- actual data
|    |    |    |    |    |    |    |    |    |    |
+----+----+----+----+----+----+----+----+----+----+


Linked lists work totally differently. Each element, in addition to holding the actual data, also keeps track of a pointer to the element before it and the element after it. The elements can be stored ANYWHERE in memory, it does not have to be continuous.

The list itself only has one or maybe two pointers: one which points to the first element, and maybe one which points to the last element. It does not have pointers to every element -- because there is no way it can keep track of them all. It could keep an array or vector of pointers and access them that way... but that would defeat the entire point of using a list because now it has to maintain an ordered vector of pointers in addition to maintaining the list. If you're doing that you might as well just use a vector.

But anyway... here's a visual representation of what a 4-element list might look like:


   x   x+1  x+2  x+3  x+4  x+5  x+6  x+7  x+8  x+9   <- address
+----+----+----+----+----+----+----+----+----+----+
|    |    |    |    |    |    |    |    |    |    |
|  0 |  - |  - |  2 |  - |  1 |  - |  - |  4 |  - |  <- data
| x+5|    |    | x+8|    | x+3|    |    | ---|    |  <- pointer to next element
+----+----+----+----+----+----+----+----+----+----+
   v             ^ v       v ^            ^
   |             | |       | |            |
   |             +-+-------+ |            |
   |               |         |            |
   +---------------+---------+            |
                   |                      |
                   +----------------------+



I'm kind of disappointed that there isn't just a way to derefence the iterator (maybe this is why I confuse it for a pointer) to get the "index".


This can't be done because list elements don't keep track of which index they are. They could in theory, but it would be stupid to.

The whole point to using a list is so that elements can be added and removed from anywhere in the list very quickly. If you have a list with 1000 elements and you remove element #2, the list only has to modify like 2 pointers (ie: it makes elements 1 and 3 point to each other, rather than having them point to element 2).

If each element kept track of which index it was... removing element #2 would result in elements 3-999 having their index changed. Which means the list would now have to go through all elements and update their index. Sloooooow.
I guess my question then is, how exactly does the linked lists work. I understand that arrays allocate x blocks, as do vectors (allocating more each time it get's resized, and in a different memory location if there isn't sufficient memory blocks to add to the end) but since they have a chunk of memory set aside for modifying, that's how they can use the [] operator, since there is one pointer that points to the beginning, then [x] would be the pointer + x to return the value (since they'll all be sequential), correct?

But a list is different, the pieces can be all over the place, finding memory wherever it can to make it bigger. But if there is only a pointer to the first element, and the last, how does an iterator work? How does each element point to the next one? I'm getting confused when it comes to stuff like this because I see others are able to program it, but I don't understand the concept like I thought I did (I used to think that it contained a vector with pointers to each element).

I liked the examples of the memory blocks, but I don't understand how the list knows where each element is then. Does this get stored as a pair or something similar? I look at a variable as having memory and value, but yet I don't understand how a list can only have a list of variables but know where the next one goes.
Maybe a simple example will help?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <iostream>
struct Node
{
        int data;
        Node *prev, *next;
};
 
int main()
{
        Node a,b,c,d;
        a.data=5;
        b.data=3;
        c.data=8;
        d.data=1;
 
        a.next=&b;
        b.next=&c;
        c.next=&d;
        d.next=0;
 
        Node* p = &a;
        while(p != 0)
        {
                std::cout<<p->data<<'\n';
                p = p->next;
        }
}

Sorry for the names, I wrote this on the fly.
that's how they can use the [] operator, since there is one pointer that points to the beginning, then [x] would be the pointer + x to return the value (since they'll all be sequential), correct?

Technically, no: operator[] does whatever it was programmed to do, it's just a function. For example, std::deque has operator[], but it does not store its contents in a single chunk of memory (it's a bunch of such chunks). Because array, vector, and string are contiguous, you can do a T* ptr = &container[0]; and then access the elements of the container with ptr[idx]. You can't do that with deque. std::map has operator[] too, and it's obviously not contiguous either.

Why are these accessible through the index, but other containers aren't? Are the elements still not sequential?

It was just a design decision: "fast" access (constant-time for array, vector, string, deque, unordered_map and logarithmic time for map) was implemented with operator[]. Slow access (linear time for list and forward_list) was not, because it gives you no advantage over taking a begin iterator and incrementing it n times.
Last edited on
You can't do that with deque.

Technically you can, but only up to 128. place in the deque. That's how many places of contiguos memory there is. But, of course, it's not recommended.
I guess my question then is, how exactly does the linked lists work
[snip]
how does an iterator work? How does each element point to the next one?


When you have a vector<int>, each entry in the vector is a single int.
When you have a list<int>, each entry in the list is what's called a "node". In a doubly-linked list (like std::list), each node contains three things:

1) the actual data
2) a pointer to the next node in the list
3) a pointer to the previous node in the list.

An std::list::iterator is not really a pointer to the data, but rather is a pointer to the node. naraku's example illustrates this. But to get more in depth, it's something like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
struct Node
{
  int data;
  Node* next;
  Node* prev;
};

class list
{
  Node* first
  Node* last;
};

class iterator
{
  Node* currentnode;
};

//////////////////////////
//  given the above simple structures, let's say you do this:
list<int>::iterator i;
i = mylist.begin();  // <- i.currentnode = list.first  .. makes the iterator point to the first node in the list

cout << *i; // <- cout << i.currentnode->data  .. follows its pointer to the current node
  // then takes the data out of it

++i;  // <-  i.currentnode = i.currentnode->next  ..  each node has a pointer to the next element
// in the list.  When you increment the iterator, the iterator follows that pointer to move
// to the next one.

////////////////////////
//  when inserting a new element in the list... all you have to do is adjust the pointers

// insert an element after the node our 'i' iterator points to:
Node* toinsert = new Node;
Node* before = i.currentnode; // i.currentnode is the node before our new node
Node* after = i.currentnode->next;  // the node after 'toinsert' will be the node that *used to* be after i's current node


toinsert->prev = before;  // our previous node is i's current node
toinsert->next = after;  // our next node is whatever used to be immediately after i

before->next = toinsert; // make 'before's next pointer point to the new node, so that the
  // new node will follow 'before' as you step through the list
after->prev = toinsert;  // likewise, make 'after's prev pointer point to the new node,
  // so traversing backwards through the list will work appropriately 
Ok, you guys have just given me a ton of things to think about. And I don't know why, but every time I looked at the node definitions, I kept thinking that the node pointers were new nodes, not pointers. Now that we were talking about them, it makes a lot more sense. I'm still wondering why an indexing system couldn't work with this. if .begin() points to the first node, why can't [0] point to the first node, and [1] point to list->next? Maybe I still am not understanding everything yet.

Aside from that, when reading code, I still have a tough time reading pointers and visualizing how the code is supposed to work. Maybe it's because I've just learned about them, maybe it's because they're not meant to be easily understood and read, or maybe it's because I still don't understand them fully. I do understand each of your examples, it has helped understand not just pointers, but the lists as well.

As an afterthought, are lists the most complicated? maps? what? When I say complicated, I mean in terms of accessing elements. Since lists have been around for quite some time, I'd assume they're simple, but I can't imagine the other STL containers getting much more complicated aside from different member functions.

Thank you guys so much though, it means a lot to have you guys write this stuff out for me.
You could implement oprator[] for a linked list, but for each call you would need to iterate through the list until the desired node is found, which doesn't make sense since it is expected for [] to infer random access.

I found implementing linked lists and binary trees helped a lot with understanding pointers. I would suggest implement a singly linked list (just a next pointer), then adapt it to doubly linked (previous and next pointers), then make it a template. I'd say most of the stl containers are complicated in implementation, but not that complicated with usage once you get used to them.
I'm going to have to look into how the different member functions work as well. I think it'd be helpful to be able to design what is already available.
I wouldn't go to crazy implementing everything the stl containers do. The idea is to understand the data structure, not reinvent the wheel.
I'm still wondering why an indexing system couldn't work with this. if .begin() points to the first node, why can't [0] point to the first node, and [1] point to list->next? Maybe I still am not understanding everything yet.


If you have an array of pointers, then yes, you could access a list that way.

However, maintaining an array of pointers destroys all the advantages of using a list. The whole point of a list is so you can quickly insert and remove elements from anywhere. By adding an array to the picture, you now have to do all the normal work when adding/removing elements from the list plus you have to shuffle around the array to keep it organized.

That effectively would make a list more of a list/vector combination. With the weaknesses of vector added and the strengths of list removed... so really it's pointless. If you are going through the trouble to maintain an array of pointers... then the list is of no benefit to you and you are better off using a vector.


The underlying point here that hasn't really been mentioned is that you don't (or shouldn't) need to access a list with [] anyway. If you need to access with [], then list is a poor choice for a container.
I think he meant that the operator[] for a list would work something like this:
1
2
3
4
5
6
7
template<typename T>
T& list<T>::operator[](size_t i)
{
      iterator it=begin();
      for (int j=0; j<i; j++, it++);
      return *it;
      }
That would work, yeah. But that would be misleading, as you'd expect the [] operator to be fast, and linear traversal is about as slow as it gets.
I have been reading the stuff between working and sleep (seems to be all I do anymore) and you guys have been giving me a lot of things to think about, but something I'm curios about is if the [] operator is so slow, and there is no fast way to traverse through a linked list, how can you insert an object somewhere in the middle? Does it iterate through the list until it finds where to put it?

I'm still trying to get a grasp on pointers, like I just learned what -> does (never actually knew, just knew when I had to use it). Something else I learned is what passing by a const reference does (never made sense to me) and, obviously, how pointers can be incremented to iterate through an array. There is so much I never learned (I believe it was due to not majoring in computer science) but have been causing so much trouble.

Which is better in this scenario? and why?
1
2
int a = 5;
int b = new int(5);


Or here?
1
2
int *c = &a;
int d = a;

In this example, c and d have the same value, just c has the same memory block as a as well, correct?
Does it iterate through the list until it finds where to put it?
Yes, also note traversal of a list is slow compared to [] on an array because operator[] likely uses pointer arithmetic which is fast. Traversal is linear so it becomes increasingly slower as the list grows very large.

Which is better in this scenario? and why?
1
2
int a = 5;//usually better, allocate on stack unless absolutely necessary
int* b = new int(5);



Or here?
1
2
int *c = &a;
int d = a;


In this example, c and d have the same value, just c has the same memory block as a as well, correct?

c has its own memory, its value is the address where a is located in memory and d only has the same value as a.
Edit: Your syntax was a little off.
Last edited on
I was about to make b a pointer, but I wasn't sure. So when you "new" something, it returns a pointer?

In which situation is it better to "new" something instead of creating the object each time? As in:
1
2
myClass myObject(5);
myClass *myObject2 = new myClass(5);


I know that myObject2, when used, is just the memory location and must use the dereference operator to use the class methods. But, why is this more beneficial (I see new objects in "better" code more often) than doing it how I did with myObject?
I was about to make b a pointer, but I wasn't sure. So when you "new" something, it returns a pointer?
Yes, new returns a pointer to successfully allocated memory or null.

In which situation is it better to "new" something instead of creating the object each time?
It really depends on the situation. Off the top of my head the main uses of pointers (and this is by no means a complete list) is array class members, and polymorphism.

But, why is this more beneficial (I see new objects in "better" code more often) than doing it how I did with myObject?
I definitely wouldn't say use of pointers makes better code in any way. But as to beneficial, polymorphism.
Last edited on
Pages: 123