Pointer to non-static member variable?

I was looking at a problem on Stack Overflow, and came across the following:

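#include <cstdio>

class A
{
public:
        // initializers listed in declaration order (t, a, b, c)
        A(int _a, char _b, char _c, char _t):t(_t), a(_a), b(_b), c(_c){}

private:
        char t;
        int a;
public:
        char b;
        char c;

        static void print(){

                int A::*pa = &A::a;
                char A::*pb = &A::b;
                char A::*pc = &A::c;
                char A::*pt = &A::t;
                // %p expects a void*; passing pointers to members here is
                // undefined behavior, so the output is compiler-specific
                std::printf("data member : %p, %p, %p, %p\n", pa, pb, pc, pt);
        }
};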


with results (g++ 4.7):
data member : 00000004, 00000008, 00000009, 00000000


I don't think I've come across this syntax before; what are pa, pb, pc, and pt? It seems like the print function outputs the offset of each member variable from the beginning of A, but pa outputs 0x00000004 when I would expect 0x00000001 (given that t is a char). Does anyone have a reference to a good description of this feature?
> what are pa,pb,pc, and pt?

They are pointers to members.

> It seems like the print function outputs the offset of each member variable from the beginning of A,

That is correct.

> when 0x00000001 would be expected (given t is a char).

That is not correct: the offset of a cannot be 1 because a is an int, which on your system (and almost everyone's) is aligned to a 4-byte boundary (its memory address must be divisible by 4).
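On the "what are pa, pb, pc, and pt?" part: a pointer to member designates a member of the class rather than an address inside one particular object, and it only yields a value once it is bound to an object with `.*` or `->*`. A minimal sketch, where `Point`, `bump` and `read` are hypothetical names invented for this illustration:

```cpp
// Hypothetical type used only for this illustration.
struct Point {
    int x;
    int y;
};

// Adds `delta` to whichever int member `field` designates.
void bump(Point& p, int Point::*field, int delta) {
    p.*field += delta;          // bind the member designator to an object
}

// Same idea through a pointer to the object, using ->*.
int read(const Point* p, int Point::*field) {
    return p->*field;
}
```

Calling `bump(p, &Point::y, 10)` changes only `p.y`; the same function works for `x` by passing `&Point::x` instead.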

I've never seen this syntax before either. But it looks like you are correct in your assessment of it outputting the offset of each member.

> but pa outputs 0x00000004, when 0x00000001 would be expected (given t is a char)


Structure padding. It's putting the int on the next 4-byte boundary. That is very typical.
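The padding can be checked without going through pointers to members at all. The sketch below uses `offsetof` and `alignof` on a hypothetical stand-in struct, `Layout`, which has the same member order as the class in the question but with every member public, so that it is standard-layout and `offsetof` is well defined. The helper names are also made up for this example.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical stand-in for the class in the question.
struct Layout {
    char t;
    int  a;
    char b;
    char c;
};

// True when member `a` starts at a multiple of int's alignment.
constexpr bool a_is_aligned() {
    return offsetof(Layout, a) % alignof(int) == 0;
}

// Number of padding bytes the compiler inserted between `t` and `a`.
constexpr std::size_t padding_before_a() {
    return offsetof(Layout, a) - (offsetof(Layout, t) + sizeof(char));
}
```

On a platform where int is 4 bytes with 4-byte alignment, this gives 3 bytes of padding after `t`, which is consistent with the offsets 0, 4, 8 and 9 seen in the question's output.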
Thanks guys - have a couple interviews coming up, trying to cram as much obscure knowledge as I can :)
OK, here are a couple of interview questions:

#include <cstdio>

struct A
{
    int i, j, k, l ;
};

int main()
{
    int A::*points_to_member_at_offset_0 = &A::i ;
    int A::*does_not_point = nullptr ;
    std::printf( "%ld\n", points_to_member_at_offset_0 ) ;
    std::printf( "%ld\n", does_not_point ) ;
}

void foo( A& a, int A::*pm, int v )
{
    if( pm != nullptr )
         a.*pm += v ;
}


1. If a particular implementation prints 0 for line 12 (the first printf, which prints points_to_member_at_offset_0), what could it print for line 13 (the second printf, which prints does_not_point)? How would this compiler evaluate the condition in the if statement on line 18?

2. If a particular implementation prints 0 for line 13, what could it print for line 12? How would it evaluate the dereference operator on line 19 (a.*pm += v)?
I'm pretty sure pointer-to-members with a null value are represented as -1, but your compiler tries to trick you:
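#include <cstdio>
#include <iostream>

class A
{
public:
	int a;
};

int main()
{
	// reading union_value after writing union_ptr is type punning:
	// formally undefined in C++, used here only to peek at the bits
	union
	{
		int A::*union_ptr;
		int union_value;
	}u;
	int A::*a_ptr = &A::a;
	int A::*A_null_ptr = nullptr;
	u.union_ptr = nullptr;
	std::cout<<u.union_value<<std::endl;
	// no operator<< takes a pointer to member, so these convert to bool
	std::cout<<A_null_ptr<<std::endl;
	std::cout<<a_ptr<<std::endl;
	std::getchar();
	return 0;
}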

My output:
a_ptr = 1.
A_null_ptr = 0.
union_value = -1.

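Both union_ptr and A_null_ptr hold the null pointer-to-member value, but only union_value shows the underlying bits, because the union forces them to be read as an int. Streaming a pointer to member directly converts it to bool, which is why a_ptr prints 1 and A_null_ptr prints 0.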

Pointers-to-members are trippy.
> trying to cram as much obscure knowledge as I can
That's plain stupid.

Edit: Learn COBOL then.
Still haven't seen a great article describing the rules of pointers to members, but it seems like they aren't implicitly convertible to long, so I would have to assume the lines

std::printf( "%ld\n", points_to_member_at_offset_0 ) ;
std::printf( "%ld\n", does_not_point ) ;


are going to give misleading results. If the first prints 0, though, I would expect the next to print -1 (as BlackSheep said), but this seems to be entirely compiler-dependent. Null could theoretically be represented as '5': the compiler would know that values 3 and 4 represent the 4th and 5th data members, while values 6, 7, etc. represent the 6th, 7th, etc. data members. Or, for your 2nd question, null could be 0 when viewed as an integer, with 1, 2, etc. representing the 1st, 2nd, etc. member variables.

The logic in foo() seems like it will work as expected regardless of the compiler. Whatever representation the conversion of nullptr to type int A::* produces in main, the same representation is produced when evaluating if( pm != nullptr ) (which I believe could be written simply as if( pm ), given that conversion to bool is defined and yields true for any non-null pointer to member, even if its integral value is 0). Likewise, regardless of the internal implementation of pointers to members, a.*pm += v ; should work, as this is the proper syntax for this construct.
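The reasoning about foo() can be checked with a small sketch. It reuses the struct and function from the interview question; `is_set` is a hypothetical helper added here only to expose the conversion to bool.

```cpp
struct A {
    int i, j, k, l;
};

// Same shape as foo() in the question.
void foo(A& a, int A::*pm, int v) {
    if (pm != nullptr)   // could equally be written: if (pm)
        a.*pm += v;
}

// A pointer to member converts to bool: false only for the null value,
// whatever bit pattern the implementation uses to store it.
bool is_set(int A::*pm) {
    return static_cast<bool>(pm);
}
```

Passing `nullptr` leaves the object untouched, and `is_set(&A::i)` is true even though `i` is the first member, because the test is against the null pointer-to-member value and not against a numeric zero.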
#include <cstdio>
#include <cstddef>

struct A
{
    int i, j, k, l ;
};

int main()
{
    // just being safe
    static_assert( sizeof( int A::* ) <= sizeof(long),
                   "casting pointers to member variables to a long loses information" ) ;

    std::printf( "%zu %zu\n", offsetof(A,i), offsetof(A,j) ) ; // 0 4 (say); offsetof yields a size_t

    int A::*pi = &A::i ;
    int A::*pj = &A::j ;
    int A::*p0 = 0 ;

    std::printf( "%ld %ld %ld\n", pi, pj, p0 ) ; // formally undefined behavior; only to peek at the representation

    // possible implementation 1:   0 4 -1  (assuming the above offsets)
    // --------------------------
    // current versions of many compilers (e.g. GNU, Microsoft)

    // a pointer to member variable holds the offset of the member,
    // and a null pointer holds an invalid offset (typically -1)

    // dereference: add the numeric value of the address of the object
    // to the numeric value of the pointer to member
    // to get the numeric value of the address of the bound member variable



    // possible implementation 2:   1 5 0 (assuming the above offsets)
    // --------------------------
    // Stroustrup/Lippman's original cfront, older versions of many compilers

    // the pointer holds the offset of the member plus a constant (typically 1),
    // and a null pointer holds a zero

    // dereference: add the numeric value of the address of the object
    // to the numeric value of the pointer to member minus the constant
    // to get the numeric value of the address of the bound member variable

    // The cfront implementation is described in Lippman's 'Inside the C++ Object Model'
}
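A way to look at the stored value without pushing a pointer to member through printf's varargs is to copy its object representation into an integer. This is only a sketch under a stated assumption: that a pointer to data member is exactly the size of std::ptrdiff_t on the platform at hand (the static_assert rejects the code otherwise). `raw_bits` is a hypothetical helper name.

```cpp
#include <cstddef>   // std::ptrdiff_t
#include <cstring>   // std::memcpy

struct A {
    int i, j, k, l;
};

// Copies the object representation of a pointer to data member into an
// integer. The numeric value that comes out is implementation-specific.
std::ptrdiff_t raw_bits(int A::*pm) {
    static_assert(sizeof(int A::*) == sizeof(std::ptrdiff_t),
                  "this sketch assumes a pointer to data member is one word");
    std::ptrdiff_t bits;
    std::memcpy(&bits, &pm, sizeof bits);
    return bits;
}
```

Under the Itanium C++ ABI used by GCC and Clang on Linux, the values should correspond to possible implementation 1 above: the member's offset for a non-null pointer and -1 for null. Whatever the scheme, `raw_bits(nullptr)` has to differ from `raw_bits(&A::i)`.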



> It could theoretically be implemented as '5'; the compiler would know that while values '3', and '4', represent the 4th and 5th data members ...

No. Because this is well defined:
struct B ; // declared, not defined
int B::*ptr_int_member = 0 ; // How many non-static member variables does B have? 



Far more interesting question: How could pointers to non-static member functions be implemented?

Hint: If the pointer points to a virtual function, it behaves as expected - polymorphically. The same pointer may be null, or may point to either virtual or non-virtual functions.

#include <iostream>

struct A
{
    virtual ~A() {}
    virtual void foo() const = 0 ;
    void bar() const { std::cout << "A::bar\n" ; }
};

void baz( const A& a, void (A::*pfn)() const ) { (a.*pfn)() ; }

int main()
{
    struct B : A
    {
        virtual void foo() const override { std::cout << "*** main::B::foo\n" ; }
        void bar() const { std::cout << "*** main::B::bar\n" ; }
    };

    void (A::*pfn)() const = 0 ;

    B b ;

    pfn = &A::foo ;
    baz( b, pfn ) ; // *** main::B::foo

    pfn = &A::bar ;
    baz( b, pfn ) ; // A::bar
}
> Far more interesting question: How could pointers to non-static member functions be implemented?


Would posting a link to the Fast Delegates webpage qualify as a spoiler?
Cubbi, I've read that in the past, definitely an interesting article, one I think I will review :)

JLBorges, int A::*p0 = 0 ; is well defined, but it doesn't seem to store 0 in p0; it stores -1 (the same as nullptr). If an implementation stored the null value as '5' in p0, it would be no different from storing it as -1, except that the dereference logic would have to be unnecessarily complicated and inefficient, such as:

if (mvp == 5)
{
    return 0;
}
else if (mvp > 5)
{
    return obj + mvp - 1;
}
else // mvp < 5
{
    return obj + mvp;
}
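The dereference logic above can be turned into something that compiles by modelling the stored value as a plain integer. In this sketch `kNull`, `encode` and `resolve` are hypothetical names; `encode` is not in the post and is added only so that the round trip can be checked.

```cpp
#include <cstddef>

// Hypothetical scheme from the post: the value 5 is reserved for null,
// so offsets of 5 and above are stored shifted up by one.
constexpr std::ptrdiff_t kNull = 5;

constexpr std::ptrdiff_t encode(std::ptrdiff_t offset) {
    return offset < kNull ? offset : offset + 1;
}

// Returns the member's address, or nullptr for the reserved null value.
inline char* resolve(char* obj, std::ptrdiff_t mvp) {
    if (mvp == kNull) return nullptr;
    return mvp > kNull ? obj + mvp - 1 : obj + mvp;
}
```

Every offset survives the round trip and none of them encodes to the reserved value, at the cost of an extra comparison on each dereference.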


Wouldn't this still be a (silly) standards-compliant implementation, or am I not understanding correctly?

Without rereading the Fast Delegates article, my recollection is roughly this: for non-virtual functions, the address (or offset from A) is stored in the function pointer. For virtual functions, it stores the index to look up in the vtable, along with the offset needed to adjust the 'this' pointer so that it refers to the appropriate type of object. IIRC, this can be a single union. Some implementations (in the past?) generated a thunk function to adjust the this pointer and then call the appropriate function. I believe this would make member function pointers the same size as non-member function pointers, but you'd potentially have a lot of compiler-generated functions.
> If an implementation stored the nullptr (0) value as '5' in p0, it would be no different than storing it as -1,
> except that the dereference logic would have to be unnecessarily complicated/inefficient, such as ---
> Wouldn't this still be a (silly) standards-compliant implementation

Yes it would be. I stand corrected. Thanks.

if (mvp == 5)
{
    // return 0;
    return -1 ; // the key point is that the null pointer must be
                // distinguishable from pointer to member at an offset of zero
  
}
else ...



> For virtual functions, the index to lookup in the vtable, along with ...

The key point again being that the null pointer must be distinguishable from the pointer to member function with an offset of zero in the vtable.
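To make that point concrete, here is a hand-rolled model of a pointer to member function that can be null, direct, or virtual. It is an illustration only: no real compiler is claimed to use this layout, the virtual table is simulated with an array of ordinary function pointers, and all names (`Object`, `Fn`, `MemFnRep`, `call`, and the three sample functions) are hypothetical. The adjustment of `this` mentioned earlier in the thread is left out to keep the sketch short.

```cpp
#include <cstddef>

struct Object;
using Fn = int (*)(const Object&);

struct Object {
    const Fn* vtable;   // simulated table of "virtual" functions
    int value;
};

struct MemFnRep {
    enum class Kind { Null, Direct, Virtual } kind;
    Fn direct;          // meaningful when kind == Direct
    std::size_t slot;   // meaningful when kind == Virtual
};

// Null is a separate state, so it can never be confused with slot 0.
inline bool is_null(const MemFnRep& p) {
    return p.kind == MemFnRep::Kind::Null;
}

// Direct targets are called as-is; virtual ones are looked up in the
// object's own table, so the result depends on the dynamic "type".
inline int call(const Object& o, const MemFnRep& p) {
    const Fn target =
        (p.kind == MemFnRep::Kind::Virtual) ? o.vtable[p.slot] : p.direct;
    return target(o);
}

inline int base_get(const Object& o)    { return o.value; }
inline int derived_get(const Object& o) { return o.value * 2; }
inline int plain(const Object&)         { return -7; }
```

The explicit `Kind` field is the simplest way to keep the three cases apart. A real implementation would more likely fold that tag into spare bits or reserved values of the stored word, which is exactly where the "null versus slot zero" question comes from.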

As a follow up, the studying paid off in the end, so finally gainfully employed again :) Thanks all! I did have one amusing situation though, where an interviewer asked me:

"If 2 binaries both link to the same shared library, are there 2 copies of the shared library in memory or just one used by both binaries?"

I fairly confidently answered "2 copies, one for each binary". He explained I was wrong and that, in fact, only 1 copy ever exists in memory, and proceeded to use the rest of his interview time building on and asking questions about behavior related to this shared-in-memory library. I even started to doubt myself by the end... Suffice it to say, I didn't take that job!
> "If 2 binaries both link to the same shared library,
> are there 2 copies of the shared library in memory or just one used by both binaries?"

>> I fairly confidently answers "2 copies, one for each binary".

I think the two of you used two different meanings of 'in memory' - in physical memory (pages in RAM/swap) or in virtual memory (virtual pages in a process address space).

Typically there is only one object in physical memory for each non-writable section (code,read only data).

If the implementation supports copy-on-write (COW), initially there is only one object in physical memory for each writable section; but shadow objects are created on a per-process basis as writes into memory take place.
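The per-process effect of writes can be observed with a POSIX-only sketch. It does not load a shared library; it uses fork() to get two processes that start out with the same writable data, which is the same situation as far as writable pages are concerned. Whether the pages are physically shared before the write cannot be seen from inside the program, so the code only shows the visible consequence. `shared_looking_value` and `write_stays_private` are hypothetical names.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int shared_looking_value = 1;   // lives in a writable data page

// Returns true when a write made by a child process is not visible in
// the parent, i.e. each process ends up with its own copy of the data.
bool write_stays_private() {
    const pid_t pid = fork();
    if (pid < 0) return false;          // fork failed
    if (pid == 0) {                     // child process
        shared_looking_value = 42;      // on a COW system the page is copied here
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);           // wait until the child has written
    return shared_looking_value == 1;   // parent still sees the old value
}
```

Read-only sections such as code never take this path, which is why a single physical copy can serve every process that maps the library.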


Hmm, maybe I should have done more investigation. The Stack Overflow questions I saw seemed to indicate each process received its own copy. I suppose I was wrong in this case after all!

http://www.linuxquestions.org/linux/articles/Technical/Understanding_memory_usage_on_Linux