Machine-Level Representation of the Built-in Types

Hi,

I am going through C++ Primer (5th edition), and in chapter 2 I read the section "Machine-Level Representation of the Built-in Types". There's a part that I couldn't understand. Let me show you:

We can use an address to refer to any of several variously sized collections
of bits starting at that address. It is possible to speak of the word at address
736424 or the byte at address 736427. To give meaning to memory at a
given address, we must know the type of the value stored there. The type
determines how many bits are used and how to interpret those bits.

If the object at location 736424 has type float and if floats on this
machine are stored in 32 bits, then we know that the object at that address
spans the entire word. The value of that float depends on the details of
how the machine stores floating-point numbers. Alternatively, if the object at
location 736424 is an unsigned char on a machine using the ISO-Latin-1
character set, then the byte at that address represents a semicolon.


My questions are:
- How is type responsible for our interpretation (does he mean by interpretation that we can translate machine level language (010101) into human readable language?)
- How did he manage to find out what those bits represent (@last two sentences)?

I'm sorry if I'm asking too many questions, but I searched (hope I did it right) and couldn't find anything about this, and skipping is not an option.

Thanks, I appreciate it :)

Joseph
A type first and foremost specifies how much memory an object occupies. For example, in C++ on a typical x86 machine: a char takes up 1 byte, a short takes up 2 bytes (a word), and an int or a float takes up 4 bytes (a double word/dword). Only the size of char is fixed by the standard; the other sizes are the usual ones and can differ between platforms. Secondly, the type affects how the data in that block of memory is interpreted. Take this 16-bit (1 word) block of binary:

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

As an unsigned short, the computer would evaluate this value as 32768. As a signed short, the leading bit acts as the sign bit, so its value would be interpreted as -32768 on a two's complement machine, which is what practically every current machine uses (under ones' complement it would be -32767).
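
A minimal sketch of the same bit pattern read both ways, assuming a two's complement machine and the fixed-width types from <cstdint>. The names kPattern and as_signed are made up for illustration:

```cpp
#include <cstdint>
#include <cstring>

// The 16-bit pattern from above: a 1 followed by fifteen 0s.
constexpr std::uint16_t kPattern = 0x8000;

// Reinterpret the same 16 bits as a signed value by copying the raw bytes.
// No bit changes; only the type used to read them does.
std::int16_t as_signed(std::uint16_t bits) {
    std::int16_t value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Read as unsigned, kPattern is 32768; as_signed(kPattern) should give -32768 under the two's complement assumption.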

As for how he managed to derive a semicolon, I imagine there was a list of binary digits on the page somewhere, and he converted it to a decimal number and looked that value up in this symbol table: https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Codepage_layout (if I chose the correct one).

I probably did a horrible job of explaining this. Follow up questions are encouraged. XD
There was a page similar to this in my textbook regarding how many bits, etc. each type uses in memory and related value ranges:
http://en.cppreference.com/w/cpp/language/types

To me, it has only ever really mattered when you are trying to save space, because there are usually perfectly fine ways to get memory addresses of variables - you can use pointers or simply the & operator before a variable, and more. But pay attention to the range values. Also, there are always exceptions, such as that it could depend on the character set (as mentioned in your original post). And some data types, even though they may be smaller, don't always perform the same when used in similar situations.

I remember I had an assignment recently where I assigned float a to 2.4 or something like that, and said if (a == 2.4), and it failed every time. But double compared just fine... As a matter of fact, if you write this ((float)a == (float) 2.4), it also compares OK...
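
A likely explanation, sketched under the assumption that a was declared as a float: the literal 2.4 is a double, so in a == 2.4 the float is promoted to double, and 2.4 rounded to float is not the same number as 2.4 rounded to double. The function names below are hypothetical:

```cpp
// Compares a float against the double literal 2.4.
// The float is promoted to double first, and the two roundings of 2.4 differ.
bool float_equals_double_literal() {
    float a = 2.4f;
    return a == 2.4;
}

// Compares a float against a float literal: both sides are rounded the same way.
bool float_equals_float_literal() {
    float a = 2.4f;
    return a == 2.4f;
}

// Compares a double against the double literal: again the same rounding on both sides.
bool double_equals_double_literal() {
    double a = 2.4;
    return a == 2.4;
}
```

Only the first comparison mixes two different roundings of 2.4, which is why it is the one that fails.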

If you are even more confused now, I apologize.
I now kind of understand the first part... but as for the second question, it's still a little bit unclear

@Keene: You're right, there were actually binary digits on the page:
http://s8.postimg.org/dje0moggl/screenshot_34.png (if this is against the rules, I'm sorry)

The bits at address 736424 are 0 0 1 1 1 0 1 1. You kind of lost me where you said "converted it a decimal (...)".

How?

(I feel like banging my head on the desk right now)

thanks for your replies :)


EDIT: I converted the digits at address 736424 and they turn out to be 59, which is the semicolon in the link Keene posted.
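
For reference, the conversion by hand is 00111011 = 32 + 16 + 8 + 2 + 1 = 59. A small sketch of the same conversion using std::bitset, where byte_value is a made-up helper name and the character lookup assumes an ASCII-compatible character set such as ISO-Latin-1:

```cpp
#include <bitset>
#include <string>

// Convert a string of eight '0'/'1' characters (no spaces) to its unsigned value.
unsigned long byte_value(const std::string& bits) {
    return std::bitset<8>(bits).to_ulong();
}
```

Calling byte_value("00111011") gives 59, and static_cast<char>(59) is ';' on an ASCII-compatible system.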

Last question (it's kinda optional).. I tried converting all the bits on each address and they kinda don't make sense. Are they supposed to make sense or not? It came out ;ESCK something..

Thank you for your replies, I can see it clearly now.
Computer memory is just data (ones and zeroes). By just looking at the data you usually cannot tell what it means.

The address tells you where in memory a value starts.

The type tells you how many bytes the value takes up, how the bits should be interpreted to form a value, and how to carry out calculations with it.

Example:

Memory: 011000111010100000001100110000111000011000011

By just looking at this we don't know what it means.

The address gives us a (starting) position in memory where the value is stored.
011000111010100000001100110000111000011000011
        ^              
        address

The type tells us how big the value is. By using the address and the size we know which data belongs to the value. Let's assume the size of the value is 16 bits.
         size = 16 bits                          
011000111010100000001100110000111000011000011
        ^              
        address

By just looking at the bits 1010100000001100 we still don't know what it means. We need to know the type of the data to know what it means.

If the type is a short then we know that the type is signed (can be positive or negative). Signed values are usually stored in two's complement representation ( https://en.wikipedia.org/wiki/Two%27s_complement ), which gives us the value -22516. If the type is instead unsigned short then the value would be 43020.
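
The steps above can be sketched in code. This treats the example memory as a string of bits purely for illustration (real memory is addressed in bytes, not bits), and assumes two's complement. The names kMemory, read_unsigned and read_signed are invented for this sketch:

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// The example memory from above, written as a string of bits.
const std::string kMemory = "011000111010100000001100110000111000011000011";

// Read 16 bits starting at bit offset `address` and return them as an unsigned value.
std::uint16_t read_unsigned(const std::string& memory, std::size_t address) {
    return static_cast<std::uint16_t>(
        std::bitset<16>(memory.substr(address, 16)).to_ulong());
}

// Read the same 16 bits, but interpret them as a signed two's complement value.
std::int16_t read_signed(const std::string& memory, std::size_t address) {
    std::uint16_t raw = read_unsigned(memory, address);
    std::int16_t value;
    std::memcpy(&value, &raw, sizeof value);
    return value;
}
```

With the address at bit offset 8, read_unsigned(kMemory, 8) gives 43020 and read_signed(kMemory, 8) gives -22516: same address, same bits, different type.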

Note that this explanation is a bit simplified. In reality data are split up in chunks called bytes and you also have to take byte order (endianness) into account.
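
To see the byte order on a given machine, one option is to copy a 16-bit value into an array of bytes and look at which byte comes first. This is only a sketch, and bytes_in_memory is a hypothetical helper name:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Copy a 16-bit value into an array of its two bytes, in the order they sit in memory.
std::array<unsigned char, 2> bytes_in_memory(std::uint16_t value) {
    std::array<unsigned char, 2> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}
```

For the value 0xA80C (the 16 bits from the example), a little-endian machine such as x86 stores 0x0C first and 0xA8 second; a big-endian machine stores them the other way around.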
- How is type responsible for our interpretation (does he mean by interpretation that we can translate machine level language (010101) into human readable language?)

It means that the proper way to interpret the data depends on the data's type. Think of it this way: if a program writes a float at address 736424 then it doesn't make sense to interpret the data there as an integer. Put another way, if the program later reads an integer at address 736424, the integer value won't be very meaningful because the bits at that address represent a float, not an integer.

- How did he manage to find out what those bits represent (@last two sentences)?

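There is nothing in memory that says what the bits represent, so he didn't "find out" the representation. What he means is that if you *interpret* the bits as an unsigned char, you get a semicolon; if you interpret them as a float or whatever else, you get a different value from the same bits.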

The bottom line is this: if you write data of some type to a location in memory, then you should read the data with the assumption that it's the same type.
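
As a rough illustration of what happens when the types don't match, the sketch below writes a float and then reads the same bytes as an integer. It assumes a 32-bit IEEE 754 float, which is the common case but not guaranteed by the standard; float_bits_as_integer is a made-up name:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copy the bytes of a float into a 32-bit unsigned integer and return that integer.
// The bits are unchanged; only the type used to read them is different.
std::uint32_t float_bits_as_integer(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under the IEEE 754 assumption, float_bits_as_integer(1.0f) is 0x3F800000 (1065353216 in decimal), which has nothing obvious to do with the number 1.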

Hope this helps.
Thanks everyone, it's clear now.