Struct pointer conversion, possible undefined behavior?

I found some code doing a weird pointer cast, and I wanted to ask whether anyone knows if it is legal.

// Example program
#include <iostream>

struct Small {
    int data;
};

struct Big {
    Small small;
    int more_data;
};

int main()
{
    Big big { {42}, 100 };
    Small* small = (Small*)(&big);
    
    small->data = 43;
    
    std::cout << small << '\n'
              << &big.small << '\n'
              << small->data << '\n'
              << big.small.data << '\n';
}

0x7d0b336ecec0
0x7d0b336ecec0
43
43


Is the above example undefined behavior (specifically, casting the Big* to a Small* and then writing through it)?
My guess is that this is actually legal, because of "reinterpret_cast" shenanigans, but I'm not sure.

The second example is the reverse of this, casting a Small* to a Big*:

// Example program
#include <iostream>

struct Small {
    int data;
};

struct Big {
    Small small;
    int more_data;
};

int main()
{
    Small small { 42 };
    Big* big = (Big*)(&small);
    
    big->small.data = 43;
    
    std::cout << &small << '\n'
              << &big->small << '\n'
              << small.data << '\n'
              << big->small.data << '\n';
}

0x7bc5f88d45c0
0x7bc5f88d45c0
43
43


Is this legal, so long as big->more_data is not accessed?
Again, my guess is that it is legal, and this is why the address of the first member of a struct is the same as the address of the struct.

And does anyone know what this construct or method is called, regardless of its legality (if it has a "name")? Would you just call it "object slicing", despite not using inheritance?

PS: Obviously this is horrible code that is hard to read. I'm trying to make existing code better.
Additionally, I believe part of cppreference might be wrong, but I don't know enough to say how it should be fixed, because it invalidates what the whole example is saying. Maybe a cppreference editor will see this.
(edit: it appears cppreference is not wrong, the comment maybe could be a bit clearer)

On https://en.cppreference.com/w/cpp/language/reinterpret_cast, in the second code example under "Notes" there is a line of code that says:

S s = {};
auto p = reinterpret_cast<T*>(&s); // value of p is "pointer to s" 

Why would p not be a T*? Why is it saying that p is an S*?

This does not match what I get:
// Example program
#include <iostream>
#include <typeinfo>

struct S { int x; };
struct T { int x; int f(); };

int main()
{
    S s = {};
    auto p = reinterpret_cast<T*>(&s); // value of p is "pointer to s" [Edit: s is not S, see doug4]
    
    std::cout << typeid(p).name() << '\n';
    std::cout << typeid(&s).name() << '\n';
}

P1T
P1S

(The strings generated are implementation-defined, but the point is that one is "pointer to T" and the other one is "pointer to S".)
Careful. "Pointer to s" and "Pointer to S" mean different things in this context.

"Pointer to s" means a pointer to the object named 's'.

"Pointer to S" is a type which can point to an object of type S.

So, cppreference, while slightly confusing, is correct. p is pointing to s. It is a "pointer to type T" that is pointing to the object s, which happens to actually be of type S.

zapshe, it's hard to say whether those examples are applicable, because one is talking about conversions, and the other is about inheritance.

doug4, okay, yes I see that now, thanks.

But that's even more interesting, because it suggests that the code in my OP is in fact undefined behavior: the cppreference example goes on to say
auto i = p->x; // class member access expression is undefined behavior; s is not a T object
It's not senseless; it gives you a very simple form of polymorphism, and it can be done in C, which is probably what happened here (looks like something a C coder would do when pulled in to work in C++ without knowing it deeply).

I'll give you that C++ has a better way to do this. But what is being accomplished has uses; it's just the syntax that is questionable.
If one struct STARTS with the identical fields as contained by a shorter struct, it is acceptable to cast a pointer to an object of the longer struct to a pointer to the shorter struct. It is similar to inheritance, from a C perspective.

Casting the other direction is undefined behavior.

Rather than having Small be a member of Big (as in your example), consider common fields.

struct Small { int a; int b; };
struct Big { int a; int b; int c; };

int main()
{
    Big b;
    Small* s_ptr = (Small*)(&b);  // legal

    Small s;
    Big* b_ptr = (Big*)(&s);   // technically legal because you can cast anything you
                               // want with unsafe C-style casts
    b_ptr->c = 6;  // Undefined behavior: there is no Big object at that address,
                   // so you will be overwriting data you don't own.
}

Ganado wrote:
Is the above example undefined behavior (Specifically, casting the Big* to a Small*)?

Casting on its own doesn't do much; the more interesting question is whether the subsequent member access expression small->data is well-defined. In your case, it is, because Big is standard-layout and so a Big object is pointer-interconvertible with its first member, small.

On cppreference, this is covered under https://en.cppreference.com/w/cpp/language/static_cast: "if the original pointer value points to an object a, and there is an object b of the target type (ignoring cv-qualification) that is pointer-interconvertible (as defined below) with a, the result is a pointer to b." It is also covered on the https://en.cppreference.com/w/cpp/language/reinterpret_cast page you already linked.

(And yes, as jonnin correctly notes, this rule has its roots in supporting C programming idioms.)
Written here:

http://www.cplusplus.com/doc/oldtutorial/typecasting/


The only guarantee is that a pointer cast to an integer type large enough to fully contain it, is granted to be able to be cast back to a valid pointer...

The conversions that can be performed by reinterpret_cast but not by static_cast are low-level operations, whose interpretation results in code which is generally system-specific, and thus non-portable. For example:

class A {};
class B {};
A * a = new A;
B * b = reinterpret_cast<B*>(a);


This is valid C++ code, although it does not make much sense, since now we have a pointer that points to an object of an incompatible class, and thus dereferencing it is unsafe.



So while the cast itself is still valid C++, dereferencing the resulting pointer apparently brings undefined behavior.


I'd still say senseless, as there seems to be no real reason to do this. Polymorphism in C++ is cleaner, and it wouldn't even make sense to try and implement it with C.
Cubbi: Okay, that's starting to make more sense. I didn't realize the pointer-interconvertible note in the reinterpret_cast article applied here, but now I see that it does. Kind of confusing that clicking on the "pointer-interconvertible" link takes you to the static_cast page, but it is what it is.

I wasn't aware of the pointer-interconvertibility and "standard layout" rules, so I'll read up on them more. For now, things seem clear, thanks, so I'll mark this as solved unless something else comes up.

zapshe: Actually, after reading Cubbi's post/links, I think that code excerpt is fine (b is safe to dereference), but useless because class A and B don't have any members.
Two objects a and b are pointer-interconvertible if:
 • they are the same object, or
 • one is a union object and the other is a non-static data member of that object, or
 • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the
   object has no non-static data members, any base class subobject of that object, or
 • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.


jonnin wrote:
(looks like something a C coder would do when pulled in to work in c++ without knowing it deeply).
Yes, that's most likely exactly what it was.