Char array, extra garbage from buffer

I create a char array (my_c_array) of size 50, then I read exactly 50 chars into it from a buffer...

Now I wonder: why do I get extra garbage characters when I do the following?

string str = my_c_array + another_str;
cout << str;

The position of the garbage is after (my_c_array), between my_c_array and another_str.

(There is no garbage or extra characters when I cout ONLY "my_c_array" or ONLY "another_str". This only occurs when I print out "str".)


I can fix this by declaring my_c_array[50 + 1] instead of 50 and then writing '\0' to the last position of the array (index 50).

Now I wonder, why does this even need a null terminator when I create an array of size 50 and read exactly 50 chars from the buffer? There is no room for extra garbage to be printed out?
If you don't put a null character at the end the string functions will just go on and try to read whatever happens to be stored after the array in memory until it finds a null character (i.e. a zero byte).

There is a string constructor that takes a size as second argument that you can use.

 
string str = string(my_c_array, 50) + another_str;
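A minimal sketch of that constructor in use, assuming the buffer holds exactly `count` valid chars and no terminator. The helper name `join_fixed` is made up for illustration and is not part of the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `count` chars from `data` (which need
// not be null-terminated) into a std::string, then appends `suffix`.
std::string join_fixed(const char* data, std::size_t count,
                       const std::string& suffix) {
    // The (pointer, length) constructor never looks for '\0';
    // it reads exactly `count` chars and stops.
    return std::string(data, count) + suffix;
}
```

Because the length is passed explicitly, nothing past the end of the array is ever read, so the array can stay at size 50.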
I assume the type of another_str is std::string?

When you concatenate a C-style string with a std::string, the standard library expects the C-style string to be null-terminated. This is how it knows when to stop copying characters.


char arr[] = { 'H', 'E', 'L', 'L', 'O' };
memory layout:
[ ..., junk  junk   H     E    L    L    O    junk  junk  junk, ... ]
       0x1   0x2    0x3   0x4  0x5  0x6  0x7  0x8   0x9   0xA

After copying the O, it keeps going and copies the junk that follows.


char arr[] = { 'H', 'E', 'L', 'L', 'O', '\0' };
or
char arr[] = "HELLO";
memory layout:
[ ..., junk  junk   H     E    L    L    O    '\0'  junk  junk, ... ]
       0x1   0x2    0x3   0x4  0x5  0x6  0x7  0x8   0x9   0xA

This will stop copying right after the O, because the next byte is guaranteed to be zero.
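To check the difference between the two declarations without reading past the end of either array, one option is to search for a zero byte within the array's own size. This is only a sketch, and `has_terminator` is a hypothetical name, not a standard function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: reports whether a zero byte occurs within the first
// `capacity` bytes of `data`. It never reads beyond `capacity`, so it is
// safe to call on an array that has no terminator.
bool has_terminator(const char* data, std::size_t capacity) {
    return std::memchr(data, '\0', capacity) != nullptr;
}
```

Called with `sizeof arr` as the capacity, it should return false for `{ 'H', 'E', 'L', 'L', 'O' }` and true for `"HELLO"`, since only the second array contains the sixth, zero-valued element.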
Thank you both, really appreciate it.
But if there is no '\0', why does it eventually stop? What are the reasons behind that?
A char is one byte in size.
'\0' is represented by the value zero.
That means that it will stop as soon as it finds a byte with the value zero (if it doesn't crash before that).
That's not what I mean. Even if I do not use '\0', it will eventually stop. Why is that? It does not print garbage forever.
It will look at the first byte after your array, and if by chance that is zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

If that next byte is, by chance, zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

If that next byte is, by chance, zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

If that next byte is, by chance, zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

If that next byte is, by chance, zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

If that next byte is, by chance, zero, it will stop. If, by chance, that value is not zero, it will be used and then the next byte will be examined.

And so on.

It will stop when it finds a byte that happens to be zero. All memory has a value, all the time. Every byte has a value from zero to 255. If you look for long enough, you'll find a zero.
Strictly speaking, from the language's point of view, reading past the end of an array causes undefined behaviour, which means there are no guarantees about what will happen. What we describe here is just what happens in practice. It's also possible that you get a crash if the program tries to read from unmapped memory addresses or from memory that is restricted in some way (https://en.wikipedia.org/wiki/Segmentation_fault). This is, however, less likely to happen when the array is located on the stack than when it is on the heap.
So basically, I could skip adding a null terminator manually after I read in the buffer, IF I create an array of any size + 1 and initialize the array at the same moment I declare it (like this: myCharArray[size + 1] = {}). This would set every element to null and therefore save me from adding a null character at the end, right?

Or maybe this is a much heavier operation behind the scenes than just setting the last element to '\0' after I read in the buffer?
So basically, I could skip adding a null terminator manually after I read in the buffer, IF I create an array of any size + 1 and initialize the array at the same moment I declare it (like this: myCharArray[size + 1] = {}). This would set every element to null and therefore save me from adding a null character at the end, right?

Yes, that'll work.

Or maybe this is a much heavier operation behind the scenes than just setting the last element to '\0' after I read in the buffer?

Initializing the whole array is more work, but if you're reading data from a file it's probably not worth worrying about, because the vast majority of the time will be spent waiting for the data to be read in anyway.
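A rough sketch of the zero-initialized approach, assuming a payload size of 50 as in the original question. `kSize` and `read_block` are names invented for this example, and `std::memcpy` stands in for whatever actually fills the buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

constexpr std::size_t kSize = 50;  // payload size assumed from the question

// Hypothetical helper: copies at most kSize chars from `source` into a
// zero-initialized buffer of kSize + 1 chars and returns it as a std::string.
std::string read_block(const char* source, std::size_t count) {
    char buffer[kSize + 1] = {};      // every element starts out as '\0'
    if (count > kSize) {
        count = kSize;                // never overwrite the final '\0'
    }
    std::memcpy(buffer, source, count);
    return std::string(buffer);       // safe: buffer[kSize] is still '\0'
}
```

The one thing to watch is that the read never writes more than `size` chars; if it fills all `size + 1`, the terminator is overwritten and the original problem comes back.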
Thanks man, appreciate it