.end() question

Hello,

I am learning the iterators but cannot afford to understand something simple.

Here is the official code :
// vector::begin/end
#include <iostream>
#include <vector>

int main ()
{
  std::vector<int> myvector;
  for (int i=1; i<=5; i++) myvector.push_back(i);

  std::cout << "myvector contains:";
  for (std::vector<int>::iterator it = myvector.begin() ; it != myvector.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';

  return 0;
}


My question is: how does .end() work? How does the machine know that the end is really the end? I read that it is a null pointer.

Should I understand that every string has the length of the sentence + 1, with a null value appended?

Am I right ?

Thanks,

Larry
The elements of a vector are stored in a contiguous block of memory. You can get the address of this block with the member function data(). The block holds as many elements as the member function size() reports.
So the iterator returned by begin() usually holds the address of the start of the block, that is, the value returned by data(). And the iterator returned by end() usually holds the address just past the last element of the block, that is, data() + size().

In fact the loop with iterators could be rewritten the following way

for ( int *it = myvector.data(); it != myvector.data() + myvector.size(); ++it )
    std::cout << ' ' << *it;


This is a simplified description, because a vector iterator is not required to be a pointer. But the idea is correct.

To make this clearer, we can substitute an array for the vector.

int a[5] = { 1, 2, 3, 4, 5 };

for ( int *it = a; it != a + 5; ++it )
    std::cout << ' ' << *it;


So, as you can see, there is no extra last element with the value zero.

As for string literals, they are stored internally as a character array with a zero character appended. For example, the string literal

"Hello"

is stored in memory as a character array the following way

const char s[] = { 'H', 'e', 'l', 'l', 'o', '\0' };

But this has nothing to do with vectors.
> I am learning the iterators but cannot afford to understand something simple.

Iterators are quite simple to understand.

a. We have a sequence of elements, and we need to iterate over the sequence (access each element in the sequence, one by one).

b. There are many different possible sequences, and we would like to have a uniform way of iterating over any sequence.

c. To iterate over a sequence, we have to
    1. start with the first element of the sequence
    2. access the element (do something with it)
    3. move to the next element of the sequence
    4. if we have not moved beyond the last element, go back to step 2; 
        else we have finished accessing every element in the sequence


d. An iterator is an abstract entity that allows us to iterate over any kind of sequence.

e. An iterator 'points' to an element in the sequence. We can access the element pointed to by using the * operator. If iter is an iterator identifying an element, *iter is the element.

f. If we increment an iterator, the iterator moves to 'point' to the next element in the sequence. If iter is an iterator identifying a particular element, after ++iter it identifies the next element.

g. Two iterators compare equal if both identify the same element in the sequence, they compare unequal if they identify different elements.

h. A sequence can be abstracted away by a pair of iterators; begin() results in an iterator 'pointing' to the first element of the sequence, and end() results in an iterator 'pointing' to a non-existent element which is one beyond the last element.

i. To iterate over a sequence, we
    1. start with an iterator current = begin()
    2. use *current to access the element (do something with it)
    3. use ++current to move to the next element of the sequence
    4. if current != end(), we are not beyond the last element and we go back to step 2; 
       else current == end(), we have moved beyond the last element; we are done with accessing every element in the sequence


j. Different sequences implement iterators differently, and for the same sequence there can be many different ways in which an iterator can be implemented.

k. We do not have to know how an iterator is implemented to be able to start using it. Just as we do not have to know how a compiler is written before we start using it to compile our first "hello world" program.

#include <vector>
#include <deque>
#include <list>
#include <forward_list>
#include <array>
#include <iterator>
#include <iostream>

int main()
{
    {
        // an array of ten integers
        const int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
        auto current = std::begin(a) ; // a 'pointer' to the first element
        auto beyond = std::end(a) ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }

    {
        // a dynamically resizeable array of ten integers
        const std::vector<int> a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
        auto current = a.begin() ; // a 'pointer' to the first element
        auto beyond = a.end() ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }

    {
        // a double ended queue of ten integers
        const std::deque<int> a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
        auto current = a.begin() ; // a 'pointer' to the first element
        auto beyond = a.end() ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }

    {
        // a doubly linked list of ten integers
        const std::list<int> a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
        auto current = a.begin() ; // a 'pointer' to the first element
        auto beyond = a.end() ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }

    {
        // a singly linked list of ten integers
        const std::forward_list<int> a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
        auto current = a.begin() ; // a 'pointer' to the first element
        auto beyond = a.end() ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }

    {
        // a wrapper over an array of ten integers
        std::array< const int, 10 > a = { { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } } ;
        auto current = a.begin() ; // a 'pointer' to the first element
        auto beyond = a.end() ; // a 'pointer' to a non-existent one beyond the last element
        while( current != beyond ) // as long as we are not beyond the last element
        {
            std::cout << *current << ' ' ; // print out the element
            ++current ; // move current to 'point' to the next element
        }
        std::cout << '\n' ;
    }
}


http://liveworkspace.org/code/3afazd$1
Thanks Vlad,

Stop me if I am not right :

So, vectors are a kind of upgraded array, since they are dynamic; so they work like your array example.

Vectors, as containers, can be counted: with the end() function, C++ counts the number of elements the array has and returns it. Since we start at 0, it is right to say "< size()".

So in the case of vectors, it is indeed simple.

As for strings :
  string c = "hello you";
  cout << c.size() << endl;
  for (int it = 0 ; it < c.size(); it++)
    std::cout << c[it];


c.size() returns the right length. And not length+1.
I cannot see the appended \0, so strings work like arrays/vectors, since string is a shortcut over chars for building sentences.

So the length+1 only applies to chars, when we build the sentences manually? The coder has to remember to append \0 at the end of his input, like in yours?

 
const char s[] = { 'H', 'e', 'l', 'l', 'o', '\0' };


The string family handles that automatically for us, right?

Thanks,

Larry


Objects of type std::string are not the same as string literals.

"Hello" is a string literal

std::string s( "Hello" ) is an object of type std::string that is initialized by a string literal.

String literals are stored in memory as arrays of const char with a terminating zero character.

So when you write, for example,

const char *s = "Hello";

the compiler does two things. It stores the string literal as a character array as I showed in the previous post



const char NoName[] = { 'H', 'e', 'l', 'l', 'o', '\0' };

and the pointer s is set to the address of the first element of this array. The name NoName is used here only for demonstration.

When an object of type std::string is initialized by a string literal, it does not copy the terminating zero as an element of the string. However, according to the C++ standard, if you write

std::string s( "" );

where s is initialized by a string literal that contains only the terminating zero, the result of s.empty() will be true. That is, the std::string object is indeed empty. Yet you can still write the expression s[0], even though s[0] is not an element of the string.

In my opinion this only confuses users, and the Standard should say that objects of type std::string do contain a terminating zero.



A C-string, as stated, is an array of characters; the end of the string is marked by the '\0' byte.

Other containers, such as std::string or std::vector, maintain the size internally as a separate value; it does not depend on a particular end marker.

Example:
	std::string test    = "qwerty";
	std::cout << test << std::endl;

	test[3] = '\0';
	std::cout << test << std::endl;

Output:
qwerty
qwe ty


Notice that the '\0' does not signal the end of the string; it is written out like any other character (it is non-printing, shown here as a gap).
JL,

I didn't see your answer, might be writing at the same time :)

I get everything you wrote.

The obscure part was the word "beyond".

In my mind, when the computer allocates memory for 4 chars (with const char s[] = "blah"), I am OK with it storing 4+1 chars.
But I cannot understand why the compiler doesn't complain if I ask it for s[9], for example.

How is it that we are not able to read, say, our login password or whatever, since we are visiting other memory allocations?

How does the compiler know that s[9] should never yield anything other than... nothing???
That was the starting point of my question. I know it is weird to ask questions like that, but I really need to get it.

I think my question is really memory allocation oriented :)
Let's say we have an array: char a[4] = {0} ;

Storage is provided for four elements of type char (each initialized with 0); and by rules of the language five pointers are well defined.

char* first = a ; // first points to the first char in the array
char* second = first + 1 ; // second points to the second char in the array
char* third = first + 2 ; // third points to the third char in the array
char* last = first + 3 ; // last points to the last (fourth) char in the array
char* beyond = first + 4 ; // beyond points to one past the last char in the array 


*beyond will lead to undefined behaviour (there is no fifth element). However, beyond - first is well defined and yields 4, last + 1 is well defined and yields beyond, beyond - 1 is well defined and yields last, beyond > last is well defined and yields true.


> But I cannot understand why the compiler doesn't complain if ask it s[9] as an example

The subscript operator is a binary operator between a pointer and an integer; it is not an operation on an array.
char c = a[9] is equivalent to:
char* temp1 = a ; // the array decays to a pointer to its first element
char* temp2 = temp1 + 9 ; // undefined behaviour
char c = *temp2 ;


The compiler does not complain because adding 9 to a pointer can be a well defined operation - if the pointer points to an element in an array and there are at least 8 more elements in the array. Just as the compiler does not complain if we add 9 to an int - it is well defined if the int in question is smaller than std::numeric_limits<int>::max() - 8.
Got it !!!


Many thanks JL !

Larry