Still learning new stuff about C++

Please don't get me started on mixing signed/unsigned. This is a gigantic rabbit hole that coders can get lost in for days and emerge a gibbering wreck requiring urgent medical care...
RE: mixing signed/unsigned.....just bash 'em on the head with casts and be done with the abuse.
The lifetime of the temporary is prolonged to the lifetime of s, i.e. it'll be destroyed when execution leaves main.
Is that really true? Can someone confirm this? Like, if I do something stupid like
 
const auto &foo = bar();
is it safe to access foo? I suspect it's not, but I've learned long ago to stay away from C++'s pointy bits, so I don't know.

In you guys' experience, have you been more often bitten in the ass by unsigned mistakes (e.g. accidental underflows into large values when you really intended to get negatives) or by signed mistakes (e.g. various forms of sign extension)? I'm personally of the opinion that one should default to unsigned types unless signedness is explicitly needed, because unsigned types have less undefined behavior and less surprising semantics.
Is that really true? Can someone confirm this?

Yeah, it's true, see [class.temporary]/5, 6
https://eel.is/c%2B%2Bdraft/class.temporary#5

It says that, usually, a temporary object bound to a reference has its lifetime extended to that of the reference.

The consequences of this rule are not very obvious, but the "weirdness" in the Stack Overflow thread is just a result of confusion about which object is actually bound to the reference. In the SO thread, the temporary object whose lifetime is extended is the result of an integral conversion followed by a temporary materialization conversion.

#include <cstdio>
struct b { /* virtual */ ~b() { std::puts("b::~b()"); } }; 
struct d : b { ~d() { std::puts("d::~d()"); } }; 

int main() 
{ 
  b const& r = d{};
  std::puts("x"); 
} // output: x d::~d() b::~b()  

The point of the example code is that d's destructor runs even though b has a nonvirtual destructor. This is because the lifetime of the temporary d was extended to match the lifetime of r.
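To make the SO confusion concrete, here's a minimal sketch of my own (not the exact SO code): binding a const int& to an unsigned int lvalue converts the value first, so the reference binds to the converted temporary rather than to the original object.

#include <iostream>

int main()
{
    unsigned int u = 42;
    const int& r = u;    // integral conversion yields a temporary int; r binds to (and extends) it

    u = 100;             // modifies u, not the temporary

    std::cout << r << '\n';   // prints 42, not 100
}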
helios wrote:
is it safe to access foo?
It is safe as long as bar() doesn't return a [dangling] reference.

Consider this legit-looking code:
#include <string>
#include <iostream>

const std::string& bar(const std::string& s)
{
    return s;
}


int main() 
{
    const std::string& foo = bar("a");

    std::cout << foo; // undefined behavior (likely crash): foo dangles; the temporary died at the end of the previous full-expression
}

However when you do this:
#include <string>
#include <iostream>

const std::string& bar(const std::string& s)
{
    return s;
}


int main() 
{
    const std::string foo = bar("a"); // & removed -> hence a copy

    std::cout << foo; // foo is valid
}
@helios,

In my admittedly limited experience as a self-taught C++ hobbyist, the more common signed/unsigned issue is expecting a large value to get larger when it instead wraps around to negative.

I do agree default integer usage should be unsigned; using signed should be a deliberate design choice.

Using int in a for loop, for instance: it should be size_t or unsigned for most purposes, such as "walking through" a container like a regular C-style array.
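As an aside, the unsigned loop shape that bites most people is counting down. A small sketch of the pitfall and a common fix (my own example):

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v {1, 2, 3};

    // Pitfall: i is unsigned, so i >= 0 is always true; when i reaches 0,
    // --i wraps around to a huge value and v[i] goes out of bounds:
    // for (std::size_t i = v.size() - 1; i >= 0; --i) ...

    // One common fix: test-and-decrement in the condition.
    for (std::size_t i = v.size(); i-- > 0; )
        std::cout << v[i] << ' ';
    std::cout << '\n';   // 3 2 1
}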
Use of unsigned types (like size_t) for loops involving container indices is probably a consequence of C++ not allowing arrays to have negative indices. But some other languages allow them: maybe C++ will, in future.

The link with pointers probably rules them out for C-style arrays, but I can't see any reason why std::vector shouldn't be revamped to cope with negative indices.
C/C++ do allow negative indices, as long as the resulting memory referenced is part of the allocated memory for that object. Not recommended, though.

#include <iostream>

int main() {
	const int arr[] {1, 2, 3, 4, 5, 6, 7};

	const auto* ap {arr + 5};   // ap points at arr[5]

	std::cout << ap[-3] << '\n';   // ap[-3] is arr[2]: prints 3
}

3

Therein lies the problem with signed-vs-unsigned: you can’t have it just one way or another. There was a time when I also thought to myself, “self, you should just use unsigned for everything”.

But that just doesn’t work. There will always be some point where it is needed or useful to treat what is normally useful as an unsigned value as signed, and vice versa.

The real goal, then, is to recognize limits: recognize when an integer is operating at or near its limits, and decide how it should behave under those specific conditions (even if the condition is just that no other specific condition applies).

There is a very good overview and a link to a Scott Meyers article at https://stackoverflow.com/questions/10168079/why-is-size-t-unsigned about choosing between signed and unsigned (and why C++ has unsigned types like size_t).

I agree with Meyers: unsigned is annoying. But it is still relevant, if for nothing more than dealing with bit patterns (where sign is not a thing) or with bytes (which in C and C++ we must access through char sequences).

The consequence, then, is to be pedantic when dealing with signedness: make sure to be specific about a type’s signedness as early as possible and straight-up C-cast or static_cast your basic integer types to same-sized signed/unsigned types.

For example, in C and C++, the signedness of char is implementation defined. So if you want to simply print a hexadecimal representation of a byte value, make sure to do the proper signedness casting first, at the deepest level:

std::cout << std::hex << (int)( (unsigned char)'A' ) << "\n";
//                              ------------------
// output: 41
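To see why the inner cast matters, take a byte with the high bit set. On a typical platform where char is signed and int is 32 bits (a sketch; the exact output is implementation-dependent):

#include <iostream>

int main()
{
    char c = '\xF0';   // -16 where char is signed

    std::cout << std::hex;
    std::cout << (int)c << '\n';                  // fffffff0: sign-extended first
    std::cout << (int)(unsigned char)c << '\n';   // f0: the byte value we wanted
}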

The tricky part about the original SO code snippet isn’t the signedness, but the fact that the casting creates a temporary object, and the reference is bound to the temporary.

This requires us to get rid of the temporary int by another means: go through a pointer temporary:
#include <iostream>

int main (void)
{
    unsigned int u = 42;

    const int& s = *(int*)(&u);  // ← redirect through a pointer temporary, toss the pointer

    std::cout << "u=" << u << " s=" << s << "\n";

    u = 6 * 9;

    std::cout << "u=" << u << " s=" << s << "\n";
}

By casting the signedness through the pointer we achieve our goal, because no temporary copy of u is created. (This particular pun is also well-defined: the aliasing rules explicitly allow accessing an object through an lvalue of the corresponding signed/unsigned type.)

Messy.
the more common signed/unsigned issue is expecting a large value to get larger when it instead wraps around to negative.
Keep in mind that overflowing a signed integer has undefined behavior. This has interesting consequences, such as
if (x > 0){
    x += k;        // suppose the compiler knows k > 0 (e.g. k is a positive constant)
    if (x < 0)
        foo();
}
std::cout << x;
The compiler is allowed to generate no code for the inner if, even if the value of x is printed as negative at the end: with x > 0 and a positive k, x can only become negative through signed overflow, and since that is undefined the compiler may assume it never happens.
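If you actually need the check, test against the limits before adding instead of relying on wraparound. A sketch of my own, valid for the x > 0 branch above and any k > 0:

#include <iostream>
#include <limits>

int main()
{
    int x = std::numeric_limits<int>::max() - 1;
    int k = 5;

    if (x > 0) {
        if (k > std::numeric_limits<int>::max() - x)   // x + k would overflow
            std::cout << "overflow avoided\n";
        else
            x += k;
    }
    std::cout << x << '\n';
}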


There's never any valid reason to access a negative offset of a pointer. Anyone who does that is being cute.
You are applying an always/never rule.

Valid reasons for negative offsets do exist and are useful in the real world with real algorithms and real data structures.

Heck, if you’re using Linux, your OS is doing it all the time behind the scenes.
Nah. If you're using a negative offset then that means one of two things. Either you incremented your pointer earlier than you should have, in which case you can delay that increment and increase the offsets between those two places by the same amount; or your function accepts a pointer while implicitly requiring that at least one more element before that pointer be accessible, in which case you can simply change the function's contract to accept a pointer to a lower address. I.e. there's no difference between
void foo(T *p){
    p[0] = p[-3];
}
//...
foo(array + k);
and
void foo(T *p){
    p[3] = p[0];
}
//...
foo(array + k - 3);

The only reason I would accept to keep the former version is if it would be prohibitively expensive or impossible to change all calls to foo(). It's a mistake people have to live with now, but that doesn't change the fact that the function should never have been designed like that to begin with.
The pointer bit is a red herring. I was after whole arrays with negative indices. The tying of arrays to pointers is unfortunate, and probably not necessary once you've wrapped them up in vectors.

You can do it in Fortran as below, and according to Wikipedia, also in languages such as Visual Basic, Ada, and, apparently, C# (though I couldn't personally verify any of those). I'd just like to be able to do the same in C++.

program stuff
   integer A(-10:10)                   ! array indices -10, -9, -8, ..., 8, 9, 10
   integer i

   A = [ ( 10 * i, i = -10, 10 ) ]     ! sets A(-10) = -100, A(-9) = -90, ... , A(10) = 100
   print "( *( i0, 1x ) )", A
end program stuff

-100 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60 70 80 90 100

I was going to suggest using std::span, but it seems its operator[]() takes a size_t, so it wouldn't work.

Eh. I think that's of rather limited use, and it breaks several conventions in C++, such as array[array.size() - 1] not necessarily being a valid element. You couldn't safely pass a container like that to a function template expecting a vector-like sequence. If your project really needs something like that, it takes just a few minutes to write a usable vector or deque wrapper.
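For what it's worth, here's a minimal sketch of such a wrapper (offset_vector is my own hypothetical name, not anything standard):

#include <cstddef>
#include <iostream>
#include <vector>

template <typename T>
class offset_vector {
    std::vector<T> data_;
    std::ptrdiff_t low_;
public:
    // Valid indices run from low to high inclusive.
    offset_vector(std::ptrdiff_t low, std::ptrdiff_t high)
        : data_(static_cast<std::size_t>(high - low + 1)), low_(low) {}

    T& operator[](std::ptrdiff_t i) { return data_[static_cast<std::size_t>(i - low_)]; }
};

int main()
{
    offset_vector<int> a(-10, 10);   // indices -10 ... 10, like the Fortran example
    for (std::ptrdiff_t i = -10; i <= 10; ++i)
        a[i] = static_cast<int>(10 * i);
    std::cout << a[-10] << ' ' << a[0] << ' ' << a[10] << '\n';   // -100 0 100
}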
You can do it in Pascal, where you can specify the type/range of the array index(es).

There's now also ssize_t (signed size_t) !
You can do it in C++.

There are valid reasons; consider a counting sort for, say, -100 to 100. Then you say:
int buckets[201];
int *offset = &buckets[100];
// for each value in the data:
offset[data]++; // offset[-100] is buckets[0], offset[100] is buckets[200], etc.

And the above works just fine on a fixed-size vector as well; a runnable version is sketched below. I cannot think of a lot of great uses for it, but it's there if you want. I suppose it would allow negative hash function results, if for some reason you really needed that (matching some other system with negative keys??). Generally, it's kind of weird and uncommon to do. This is just smoke and mirrors, of course, but it's valid.

You can make a container to support this, I guess... the hard part would just be efficiency. Anyone can make a doubly linked list and iterate from the middle "zero/center" of it in either direction... bleh. Doing something a bit smarter and faster would take a little creativity.
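A runnable filling-in of that counting sort (the sample data and the output loop are my own additions):

#include <iostream>

int main()
{
    const int data[] { -100, 42, -7, 0, 100, -7 };

    int buckets[201] {};            // counts for values -100 ... 100
    int* offset = &buckets[100];    // offset[-100] is buckets[0], offset[100] is buckets[200]

    for (int v : data)
        ++offset[v];

    for (int v = -100; v <= 100; ++v)      // emit in sorted order
        for (int n = 0; n < offset[v]; ++n)
            std::cout << v << ' ';
    std::cout << '\n';   // -100 -7 -7 0 42 100
}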
seeplus wrote:
There's now also ssize_t (signed size_t) !


I am having trouble finding std::ssize_t anywhere, either in cppreference or via an external search.

There are std::ssize and std::ptrdiff_t

There is a C ssize_t from POSIX

There is a literal suffix for a signed size_t in C++23, but I couldn't see a specific reference to ssize_t

And this: std::make_signed_t<std::size_t>

This page:
https://en.cppreference.com/w/cpp/language/integer_literal

has literals

cppreference wrote:

z or Z: the signed version of std::size_t (since C++23)
z or Z combined with u or U: std::size_t (since C++23)


but no specific reference to ssize_t

In the C++23 draft there is one footnote (273) on page 1428 referring to POSIX ssize_t

cpp23draft wrote:
Most places where streamsize is used would use size_t in ISO C, or ssize_t in POSIX

After all that, it does exist, because this works !!!!!

#include <cstdio>

int main()
{
    ssize_t a = -1; // maybe the POSIX version?
}


I guess one should prefer std::ptrdiff_t for standard C++?
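If you just want a signed size in portable C++, std::ssize (C++20, in <iterator>) already covers the common case without POSIX. A quick sketch:

#include <iostream>
#include <iterator>
#include <vector>

int main()
{
    std::vector<int> v {1, 2, 3};

    // std::ssize returns a signed type, so counting down is safe.
    for (auto i = std::ssize(v) - 1; i >= 0; --i)
        std::cout << v[i] << ' ';
    std::cout << '\n';   // 3 2 1

    // auto m = 42z;   // C++23 only: z suffix yields the signed version of std::size_t
}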
@TheIdeasMan,

Your code snippet doesn't work using Visual Studio. "identifier 'ssize_t' (or std::ssize_t) is undefined."

C or C++ code, no diff.

VS doesn't do POSIX apparently.