C++ for a thirteen year old

There is this exercise in the book:

std::string s;
std::cout << s[0] << std::endl;

They ask whether it is legal or not, and what it does.

My answer was that yes, it is legal, and it prints a blank character. Reason: uninitialized strings are given a default value of " ". There is one character in there, and it is blank, so the zeroth index works. I tried the same code, but with a three instead of a zero. According to me, it should not have worked and should have errored at compile time, as there is only one character and I'm trying to access the fourth character. But it worked! What am I getting wrong here?

LB (13399)

std::string defaults to being empty - that is, having zero characters.

GreatBlitz (47)

Then how does even index 0 work?

LB (13399)

The index does not exist. In this particular case, accessing it results in undefined behavior. This means anything could happen - the compiler/computer is not required to behave in any specific way. It could work, giving you some arbitrary value, or it could crash. It could also open your web browser to a web page, if it so wanted, but usually this is not the case (why would the compiler/computer have any reason to do this?) The best case scenario is a crash, if you're lucky.

See JLBorges' post below.

JLBorges (13770)

> Then how does even index 0 work?

std::string has the member function c_str() ;
This returns a pointer to a c-style string (a null terminated array of characters).
http://en.cppreference.com/w/cpp/string/basic_string/c_str

The characters in this array are the characters in the string with an extra null character (a character with a value of zero) immediately after the last character in the string.

C++11 requires that the complexity of invoking this function be constant (the function must execute in constant time, irrespective of the number of characters in the string).

Even though the IS (the International Standard for C++) does not explicitly state it, the only practical way to meet this complexity requirement would be by leaving space to store an extra null character at the end, immediately after the actual last character in the string. Many implementations may pre-assign a null character to this extra character at the end.

That is a technical answer; nevertheless, it does not make accessing the character at position 0 of an empty string logically correct.

> I'm trying to access the fourth character. But it worked!

What you have got is undefined behaviour.

operator[] does not check if the position is a valid position; it does not perform bounds checking
http://en.cppreference.com/w/cpp/string/basic_string/operator_at

To perform bounds checked access to characters in a string, use the member function at().
http://en.cppreference.com/w/cpp/string/basic_string/at

#include <iostream>
#include <string>

int main()
{
    std::string str ; // empty string
    std::cout << std::boolalpha << str.empty() << '\n' // true
              << str.size() << '\n' // 0
              << int( str[0] ) << '\n' ; // 0 (the null character after the last character has a value of zero)

    try { std::cout << str.at(0) << '\n' ; }
    catch( const std::exception& e ) { std::cerr << "out of range access: " << e.what() << '\n' ; }
}

http://coliru.stacked-crooked.com/a/2c25093feea3e0ff
http://rextester.com/XFHUC6796

GreatBlitz (47)

@JLBorges

Well I didn't understand a bit of what you said, but this is what I understood:

There is a function which returns an extra null character at the end of the string. However, the only way to insert this is by storing a space at the end. So, when you make an empty string, there is actually one empty character, left there to store the null character if needed. This means that accessing index 0 is fine. However, accessing the fourth character (index 3) is not okay because there is only one character, no more.

That is what you said. But LB says that it is undefined behavior I faced (talking about index 0 here btw), and nothing else.

Which one is correct?

fabtasticwill (218)

They're both correct. They were saying the same thing, just in different words.

JLBorges (13770)

string element access with operator[]:

const_reference operator[](size_type pos) const;
reference operator[](size_type pos);

Requires: pos <= size().

Returns: *(begin() + pos) if pos < size(), otherwise a reference to an object of type T with value charT();
the referenced value shall not be modified.

Throws: Nothing.

Complexity: constant time.

- IS

This means:

std::string empty_string ;
int i = empty_string[0] ; // fine; i is initialised with the value zero
empty_string[0] = 'a' ; // *** error: undefined behaviour
i = empty_string[1] ; // *** error: undefined behaviour

empty_string.at(0) ; // throws std::out_of_range

std::string non_empty_string = "hello world" ;
const auto sz = non_empty_string.size() ;
i = non_empty_string[sz] ; // fine; the value zero is assigned to i
non_empty_string[sz] = 'a' ; // *** error: undefined behaviour
i = non_empty_string[sz+1] ; // *** error: undefined behaviour

non_empty_string.at(sz) ; // throws std::out_of_range

GreatBlitz (47)

If you can assign a value to a variable using a subscript of an empty string, as shown here:

int i = empty_string[0];

then it must mean that there is something there, right? And if there is something, why can't you assign a value to it?

empty_string[0] = 'a'; //why is this undefined?

JLBorges (13770)

> then it must mean that there is something there, right?

Yes. There is an object of type char there; and its value is zero (the null character).

> And if there is something, why can't you assign a value to it?

We can assign a value to it.

> empty_string[0] = 'a'; // why is this undefined?

Undefined behaviour means that the behaviour of a program (as a whole) that does this is 'undefined'. The C++ standard has nothing to say about what the behaviour of such a program should be; it imposes absolutely no requirements on what a C++ implementation should do with such a program.

A program construct that is erroneous, but does not engender undefined behavior, is diagnosed - the compiler must generate an error while compiling the ill-formed program.

In contrast, undefined behavior need not be diagnosed, and the results could be completely unpredictable.

Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended. - C FAQ

To repeat: We certainly can assign the value 'a' to any character; even to the character that is at empty_string[0]; there is nothing that will prevent us from doing that. We certainly can increment any signed int; even a signed int that holds the value INT_MAX. However, if we modify the value of empty_string[0], or increment the value of a signed int beyond INT_MAX, our program as a whole has undefined behaviour.

To emphasise: it is not just the erroneous operation engendering undefined behaviour that has an unpredictable result. A construct that engenders undefined behaviour causes the entire program to behave in an unpredictable manner; it causes the entire program to be essentially meaningless (if we consider C++ to be portable at source code level).

Though they are not required to do so, in some cases, implementations may attempt to alert us to undefined behaviour when they detect it. http://coliru.stacked-crooked.com/a/453bbc2a74570d6b

In many cases, it is not possible for an implementation to diagnose the undefined behaviour (for instance, some violation of ODR across translation units). In probably even more cases, the implementation just doesn't care - it proceeds as if the program just can't have a construct that engenders undefined behaviour.

#include <iostream>
#include <string>
#include <cstring>

int main()
{
    std::string str = "~!@#$%^&*()_+" ;
    str = "ABCD" ;

    char c_copy_of_str[100] {} ;

    std::cout << int( str[ str.size() ] ) << '\n' ; // fine; not modified
    std::cout << "          str: " << str << '\n' ;
    std::strcpy( c_copy_of_str, str.c_str() ) ;
    std::cout << "c_copy_of_str: " << c_copy_of_str << '\n' ;

    std::cout << '\n' ;
    str[ str.size() ] = 'a' ; // this engenders undefined behaviour
    // the standard specifies that, in this case, 'the referenced value shall not be modified'
    // the implementation of the library can assume that this would never happen in a 'correct' program
    // it can proceed relying on the knowledge that, in a 'correct' program, the character at position
    // str.size() must still be a null character; if it is not a null character, undefined behaviour
    // has been engendered, and the C++ standard places no requirement on a conforming implementation
    // on how a program which has 'undefined behaviour' should behave

    std::cout << int( str[ str.size() ] ) << '\n' ;

    // don't complain if the program crashes when we do this; this program has undefined behaviour
    std::cout << "          str: " << str << '\n' ;

    // don't complain if the program crashes when we do this; this program has undefined behaviour
    std::strcpy( c_copy_of_str, str.c_str() ) ;

    std::cout << "c_copy_of_str: " << c_copy_of_str << '\n' ;
    // don't complain if c_copy_of_str has a different sequence of characters from what str holds,
    // this program has undefined behaviour
}

http://coliru.stacked-crooked.com/a/0f15bbe6354c3a5f
http://rextester.com/HDX12030

Undefined behavior exists in C++ because C++ is designed to be an extremely efficient programming language. Good C++ programmers are those who appreciate, need and are grateful for this design goal, and realise that as a consequence, the programmer is squarely responsible for ensuring that the program does not contain any construct that would engender undefined behaviour.

GreatBlitz (47)

So undefined behavior is totally unpredictable.

Anyways, sorry for the long break. Just finished the chapter, bitset was wayyyyy easier than I thought it would be.

But in the end, what are they even useful for?

JLBorges (13770)

> So undefined behavior is totally unpredictable.

Undefined behaviour is totally unpredictable behaviour in portable code.

undefined behavior
behavior for which this International Standard imposes no requirements

[Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data.

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. —end note ]

> But in the end, what are they even useful for?

We have talked about hardware memory concepts, such as bits, bytes, and words, before, but in general programming those are not the ones we think much about. Instead we think in terms of objects of specific types, such as double, string, Matrix, and Simple_window. Here, we will look at a level of programming where we have to be more aware of the realities of the underlying memory.
...
Why do we actually manipulate bits? Well, most of us prefer not to. “Bit fiddling” is low-level and error-prone, so when we have alternatives, we take them. However, bits are both fundamental and very useful, so many of us can’t just pretend they don’t exist. This may sound a bit negative and discouraging, but that’s deliberate.

Some people really love to play with bits and bytes, so it is worth remembering that bit fiddling is something you do when you must (quite possibly having some fun in the process), but bits shouldn’t be everywhere in your code. To quote John Bentley: “People who play with bits will be bitten” and “People who play with bytes will be bytten.”

So, when do we manipulate bits? Sometimes the natural objects of our application simply are bits, so that some of the natural operations in our application domain are bit operations. Examples of such domains are hardware indicators (“flags”), low-level communications (where we have to extract values of various types out of byte streams), graphics (where we have to compose pictures out of several levels of images), and encryption (see the next section).

-- Stroustrup in 'Programming: Principles and Practice Using C++ (2nd edition)'

LB (13399)

http://stackoverflow.com/a/7349767/1959975

Though, really, the point of such programming assignments is to help you to understand base conversions.

GreatBlitz (47)

Okay got it @LB @JLBorges.

So there is a question as such:
Write a program to compare two arrays for equality. Write a similar program to compare two vectors.

My code:

#include <iostream>
#include <cstddef>
#include <vector>

int main() {
    int arr_a[] = {0, 1, 2, 3, 4};
    int arr_b[5];

    for (size_t i = 0; i != 5; ++i){
        arr_b[i] = arr_a[i];
    }

    for (size_t x = 0; x != 5; ++x){
        std::cout << "Array A val: " << x << " " << arr_a[x] << "\n";
        std::cout << "Array B val: " << x << " " << arr_b[x] << "\n\n\n";
    }

    if (arr_a == arr_b) std::cout << "Arrays are equal.\n\n\n";

    std::vector<int> ivec;
    ivec.push_back(6);
    ivec.push_back(7);
    ivec.push_back(8);
    ivec.push_back(9);
    ivec.push_back(10);

    std::vector<int> ivec2(ivec);

    for (std::vector<int>::size_type i = 0; i != ivec.size(); ++i){
        std::cout << "Vector A val: " << i << " " << ivec[i] << "\n";
        std::cout << "Vector B val: " << i << " " << ivec2[i] << "\n\n\n";
    }

    if (ivec == ivec2) std::cout << "Vectors are equal.\n\n\n";

    return 0;
}

My output:

Array A val: 0 0
Array B val: 0 0


Array A val: 1 1
Array B val: 1 1


Array A val: 2 2
Array B val: 2 2


Array A val: 3 3
Array B val: 3 3


Array A val: 4 4
Array B val: 4 4


Vector A val: 0 6
Vector B val: 0 6


Vector A val: 1 7
Vector B val: 1 7


Vector A val: 2 8
Vector B val: 2 8


Vector A val: 3 9
Vector B val: 3 9


Vector A val: 4 10
Vector B val: 4 10


Vectors are equal.

It says vectors are equal. But why not the two arrays?

JLBorges (13770)

> It says vectors are equal.

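The standard library defines an == operator to compare two vectors of the same type for equality: the vectors compare equal if they have the same number of elements and each element compares equal to the element at the same position in the other vector. (The relational operators, such as <, compare lexicographically.)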
http://en.cppreference.com/w/cpp/container/vector/operator_cmp

> But why not the two arrays?

There is no built in == operator to compare two arrays for equality.
However, there is a built in == operator to compare two pointers for equality.
The arrays are implicitly converted to pointers (this yields the address of the element at position zero in the array), and the two pointers are then compared for equality.

#include <iostream>
#include <vector>

int main()
{
    {
        // uniform initialisation (C++11): http://www.stroustrup.com/C++11FAQ.html#uniform-init
        const int a[] { 0, 1, 2, 3, 4 } ;
        const int N = sizeof(a) / sizeof( a[0] ) ; // size of array divided by size of an element

        const int b[N] { 0, 1, 2, 3, 4 } ;  // N is a constant known at compile-time

        // compare arrays a and b for equality
        // compare element by element in sequence, in a loop
        bool equal_so_far = true ;
        for( int i = 0 ; ( i < N ) && equal_so_far ; ++i )
            if( a[i] != b[i] ) equal_so_far = false ;

        if( equal_so_far ) std::cout << "all elements compare equal\n" ;

        const int* pa = a ; // implicit conversion from array to pointer
        const int* pb = b ; // implicit conversion from array to pointer
        if( pa != pb )
            std::cout << "but pointers to the respective first elements do not compare equal\n" ;

        if( a != b ) // implicit conversion from array to pointer 
            std::cout << "but pointers to the respective first elements do not compare equal\n" ;
        // that ( a != b ) is known to be true at compile time (a and b are two different arrays); 
        // therefore, clang++ warns us about the tautology (did you seriously think that this may sometimes be not true?)   
    }

    std::cout << "--------------\n" ;

    {
        // initialiser list (C++11): http://www.stroustrup.com/C++11FAQ.html#init-list
        const std::vector<int> a { 0, 1, 2, 3, 4 } ;
        const std::vector<int> b = a ; // b is another vector which is a copy of a (copy initialisation)

        // use the overloaded operator== for vectors (provided by the library)
        // http://en.cppreference.com/w/cpp/container/vector/operator_cmp
        if( a == b ) std::cout << "all elements compare equal\n" ;

        if( &( a.front() ) != &( b.front() ) ) // also see std::addressof(): http://en.cppreference.com/w/cpp/memory/addressof
            std::cout << "but pointers to the respective first elements do not compare equal\n" ;
    }
}

http://coliru.stacked-crooked.com/a/f893f4dbb9fe79fb

GreatBlitz (47)

Alright that is great because the next section is all about pointers.
