When you subtract pointers to get the index...

Address(ptr) - Address(ptr) returns the index/pos of a value in an array. Now I wonder, does the system already know in advance which two positions/numbers to use in the subtraction to get the index?

Or does it have to loop through and count every index/value from adr1 to adr2, just to find the distance between them?


A function like string::find() returns the position of the value instead of a pointer, so if it is the pos/index I want, I don't need to do a calculation afterwards (address - address) to get the position/index of the value.

But I would rather use strstr() (because my sequence is already stored in a C-string and I don't want to convert it), but then I have to subtract to find the index/pos of the value. That could slow things down if I have a massive sequence (a lot of data) and it has to do some extra heavy work on it just to find a pos/index.

Maybe someone can enlighten me on how it works?


Best regards
Volang


It is a straightforward computation: one subtraction (as integers) and one division.

These two functions would generate the same code:

#include <cstddef>
#include <cstdint>

std::ptrdiff_t foo( const double* a, const double* b )
{ 
    return b-a ; 
}

std::ptrdiff_t bar( const double* a, const double* b )
{ 
    // subtract the integer values of the two pointers, 
    // divide result by size of the element in the array
    // (the size is cast to a signed type so the division stays signed when b < a)
    return ( std::intptr_t(b) - std::intptr_t(a) ) / std::intptr_t( sizeof(*a) ) ; 
}

https://gcc.godbolt.org/z/cB-wsb
Thanks for the website, it's going to be useful.

Why division? The example below is returning the right result:

#include <iostream>

int s[] = {7, 6, 3, 121, 246, 78};

int main()
{
    std::cout << &s[2] - s; // 2
}
When you do arithmetic on pointers (without purposeful casting like JLBorges' bar function does), the compiler automatically takes the size of the pointed-to type into account.

But if the arithmetic is done on the addresses as plain integers, you need to divide by the size of each element yourself to convert the computed address offset back to an index. Hope saying that didn't confuse you more :P

e.g. if the difference in addresses (interpreted as ints) between two elements is 64, but sizeof(element) is 4, then that is actually 64 / 4 = 16 logical elements.
#include <iostream>
#include <cstddef>
#include <cstdint>
#include <memory> // for std::addressof

int main()
{
    const int s[] = {7,6,3,121,246,78};

    const auto pa = s ;
    const auto pb = std::addressof( s[2] ) ;

    // integer values of the pointers
    const auto ia = std::intptr_t(pa) ;
    const auto ib = std::intptr_t(pb) ;

    // size of each element
    const std::intptr_t sz = sizeof( s[0] ) ;

    // (pb-pa) == ( (ib-ia) / sz )

              // subtract pointers
    std::cout << "       pb-pa == " << pb-pa << '\n' // 2

              // subtract their integer values
              << "       ib-ia == " << ib-ia << '\n' // (pb-pa)*sz

              << "          sz == " << sz << '\n'

              // subtract their integer values and divide by size of each element
              << "(ib-ia) / sz == " << (ib-ia) / sz << '\n' ; // 2 (same as pb-pa)
}

http://coliru.stacked-crooked.com/a/ce58909034bb5953
I got it.

You guys mean, in case I had it like this:

std::intptr_t b = (std::intptr_t) &s[2];
std::intptr_t a = (std::intptr_t) &s[4];


Then I would have to do it like this:
cout << (a - b) / (std::intptr_t) sizeof s[0];


So if the pointers are not cast to integers, the division is not necessary, because as you mentioned the compiler does it for you.


0x40301c - 0x40300c = 4


4 is the result, but which values are used in the calculation? How does 0x40301c - 0x40300c become 4? You said it was straightforward, so are there other values stored along with the addresses that are used in the subtraction instead of the addresses themselves?

Or what is happening?


I think "straightforward" was referring to the syntax; the work of determining the type was done for you.

Yes, if you are just given integers, then 0x40301c - 0x40300c = 16. But if the operands are pointers to int, the compiler knows that sizeof(int) == 4 (on this platform), so it converts the result back to units of "index", because the raw byte difference isn't usually what people want.
How does the translation work from 0x403010 to an int number like 4206612?
Not sure what you mean. There is no translation. Hex literals (0xF) and int literals (15) alike are handled at compile-time. Everything is binary when the actual program runs.
What you have above are two different ways to write a number in text form that humans may choose to use. So one way to answer the question is that the code that turns a value into human-readable text is different: one turns the value into base 16, the other into base 10.

Inside the machine, the number is neither. It is in binary or, to be more precise, voltages inside the electronics: a high voltage for a bit is a '1' and a low voltage is a '0'. It just so happens that base 16 and binary are very easy to convert back and forth, while base 10 is not. The reason is that each hex digit is exactly four bits (half a byte), so two hex digits make one byte, e.g. 0xFF is [15][15] or [1111][1111]. See how F and 1111 are easy to connect, but 15 is sort of random? F is the largest hex digit and 1111 is the largest 4-digit binary value, but 15 isn't related to either in any obvious way, just as 255 and 11111111 are difficult to relate, while FF and 11111111 are again the largest values possible for their digit counts.

You can convert hex or base 10 input to a value with string-to-number functions. You can write a value in either hex or base 10 (or a few other options as well) with print statements. But the computer works in bytes (it works in bits, but for hardware simplicity it handles them in groups of 8 rather than tracking 1 bit at a time everywhere). Hex is the most compact text form we commonly use.
255 // 3 bytes as text
FF // 2 bytes as text
11111111 // 8 bytes as text
True binary (e.g. a binary file) is even smaller; it is effectively base 256. The bigger the base, the more data a digit holds and the fewer digits you need to write. 255 takes up 1 byte there.
so 0x403010 is just another way of writing 4206612? (they both end up the same way when the program is compiled)
Actually, 0x403010 is just another way of writing 4206608. But, yeah, you get the idea.
If I understand what you are looking for, the zero-based index into a regular array for a particular value:

#include <iostream>
#include <cstddef>
#include <algorithm> // for std::find
#include <iterator>  // for std::distance

template <class T, std::size_t N>
int GetIndex(const T(&arr)[N], T value)
{
   // find the desired value in the array
   auto itr = std::find(std::begin(arr), std::end(arr), value);

   // if not found return -1 as an error
   if (itr == std::end(arr))
   {
      return -1;
   }

   // if found, return the distance....the index
   return std::distance(std::begin(arr), itr);
}

int main()
{
   // let's create a C-string
   char str[] { "Hello World!" };

   int dist { GetIndex(str, 'e') };

   std::cout << "The index # is: " << dist << '\n';

   dist = GetIndex(str, 'W');

   std::cout << "The index # is: " << dist << '\n';

   // let's create an int array
   int arr[] { 2, 4, 6, 8, 10, 12 };

   std::cout << "The index # is: " << GetIndex(arr, 6) << '\n';

   std::cout << "The index # is: " << GetIndex(arr, 7) << '\n';
}

The index # is: 1
The index # is: 6
The index # is: 2
The index # is: -1
The final return statement in GetIndex (line 19 of the listing) could also be written as: return itr - std::begin(arr);