Correctly reading numerically into ANY integer variable

Just for perfectionism and for sport, I'm looking for a way to correctly read (numerically) into ANY integer variable, under ANY circumstances and on ANY platform that supports C++, as per the C++ Standard, and not as per some de-facto standards.

Of course, problems start as soon as this variable is of a char type. In that case, cin will write into the variable the character code of the first character from standard input, rather than the number itself. We can run into this, for example, when using the typedefs from <cstdint>. I found out that int_least8_t, for example, is very likely (if not outright guaranteed) to be an alias for a char type. Also, AFAIK there is at least a theoretical chance that, for example, uint_fast32_t binds to char32_t. I'd like my program to handle such cases properly.
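For illustration, a minimal demonstration of the problem (assuming, as is typical but not guaranteed, that int8_t is an alias for signed char and the character set is ASCII):

#include <iostream>
#include <cstdint>

int main()
{
    std::int8_t v ;
    std::cin >> v ;           // given the input "65", this typically stores
                              // the code of the character '6', i.e. 54
    std::cout << +v << '\n' ; // prints 54, not 65
}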

Therefore I have to convert my integer variable to a type that is guaranteed not to have such problems. The most general solution seems to be the unary + operator; AFAIK it is guaranteed to promote a char value to an integer type at least as wide, and to leave true integer types unchanged. So this is better than an explicit cast, which would require some more work. Unfortunately, since I'm not allowed to read into a temporary, I have to define a helper variable:

  auto hlp = +var;  // hlp has the promoted type, e.g. int when var is a char
  cin >> hlp;       // parses digits rather than reading a single character
  var = hlp;        // copy the result back (a possibly narrowing conversion)


This is ugly and adds the overhead of useless copying. Is there any way to make this better, without the overhead?

But that's not the end of the problems; not even close. The unary + may actually convert the variable to a wider type. For example, applying it to a plain char is likely (or even guaranteed) to produce an int, which is very likely (although not guaranteed) to be wider than a char. Normally cin does range checking, and if the input exceeds the variable's range it sets ios::failbit. Not here, however: cin writes to hlp, which is an int and has the limits of an int. So an out-of-range value can be silently narrowed when var is assigned the value of hlp.
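A minimal sketch of the failure mode (assuming char is narrower than int, which is typical but not guaranteed):

#include <iostream>

int main()
{
    signed char var = 0 ;
    auto hlp = +var ;           // hlp is an int
    std::cin >> hlp ;           // input "300" fits in an int: no failbit
    var = hlp ;                 // 300 does not fit in signed char; the result
                                // of this narrowing is implementation-defined
    std::cout << +var << '\n' ;
}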

To prevent this, we have to do the range checking by hand. This produces horrible code:

  auto hlp = +var;
  cin >> hlp;

  auto maxval = numeric_limits<decltype(var)>::max();
  auto minval = numeric_limits<decltype(var)>::lowest();

  if(!cin.fail() && (hlp > maxval || hlp < minval))
  {
    cin.setstate(ios::failbit);  // emulate the stream's own range check

    if(hlp > maxval)             // clamp, as operator>> does on overflow
      var = maxval;
    else
      var = minval;
  }
  else
    var = hlp;


And this is not only ugly, but also broken (from my perfectionist point of view). The problem is, I wanted to make sure the program behaves well for all cstdint typedefs. But here we use numeric_limits, which may only be specialized for fundamental types; and cstdint typedefs are not guaranteed to bind to fundamental types (AFAIK).

Is there any better way to do this? My only other idea is to check whether var is of one of these types: char, signed char, unsigned char, wchar_t, char16_t or char32_t, and if so, perform the promotion with the unary + and do the range checking with numeric_limits, which is guaranteed to work properly since numeric_limits is specialized for these types. But then again, are we guaranteed not to have any other implementation-defined char types? If we are not (which I'm afraid is the case), then even this idea breaks.

And what about the overhead? Is it going to be horrible? How can I make this more efficient?
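For concreteness, a sketch of roughly what I have in mind, with a hypothetical read_integer helper; note that it leans on numeric_limits being valid for T, which is exactly the guarantee I'm unsure about:

#include <iostream>
#include <limits>
#include <type_traits>

template < typename T >
std::istream& read_integer( std::istream& in, T& var )
{
    static_assert( std::is_integral<T>::value, "integer types only" ) ;

    decltype(+var) hlp ; // promoted type: parses digits even if T is a char type
    if( in >> hlp )
    {
        if( hlp >= std::numeric_limits<T>::lowest()
            && hlp <= std::numeric_limits<T>::max() )
            var = static_cast<T>(hlp) ;
        else
            in.setstate( std::ios::failbit ) ; // out of range for T
    }
    return in ;
}

int main()
{
    signed char c ;
    if( read_integer( std::cin, c ) ) std::cout << +c << '\n' ;
    else std::cout << "error in input\n" ;
}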

Still looking for a neat solution ;P
Something like this, perhaps:

#include <iostream>
#include <type_traits>
#include <cstdint>

template < typename T > struct is_char : std::false_type {} ;
template <> struct is_char<char> : std::true_type {} ;
template <> struct is_char<signed char> : std::true_type {} ;
template <> struct is_char<unsigned char> : std::true_type {} ;

// non-char integral types: the stream's own extraction does the range check
template < typename T >
typename std::enable_if< std::is_integral<T>::value && !is_char<T>::value && !std::is_enum<T>::value,
                         std::istream& >::type
read_int( std::istream& stm, T& v ) { return stm >> v ; }

// char types: extract via the promoted type, then check that the value fits
template < typename T >
typename std::enable_if< is_char<T>::value, std::istream& >::type
read_int( std::istream& stm, T& v )
{
    decltype(+v) i ;
    if( stm >> i && T(i) == i ) v = i ; // accept only if the value fits in T
    else { v = 0 ; stm.clear( std::ios_base::failbit ) ; }
    return stm ;
}

int main()
{
    std::int8_t c ;
    if( read_int( std::cin, c ) ) std::cout << +c << '\n' ;
    else std::cout << "error in input\n" ;

    std::cin.clear() ;

    std::uint16_t s ;
    if( read_int( std::cin, s ) ) std::cout << s << '\n' ;
    else std::cout << "error in input\n" ;
}
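For instance, given the input 300 42, the first call rejects 300 (it does not fit in std::int8_t, so the failbit is set and the error message is printed), and after the clear() the second call reads 42 normally.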
Seems neat. Thanks for this idea! But I have one question.

You wrote:

if( stm >> i && T(i) == i ) v = i ;


Here you compare a value of type T and a value of type decltype(i). The latter type might be larger than the former.

I am not sure whether this scenario can actually happen; perhaps it can't. But here is the idea I had...

Okay, so let's say that i exceeds the bounds of type T and therefore T(i) produces an overflow. The value of i needs, for example, four bytes in memory, but only one byte is taken into account; after the cast, the value of T(i) is interpreted as if it occupied only one byte, but the other bytes are not erased from memory. But since we do the comparison T(i) == i, T(i) gets promoted again to decltype(i), just for the comparison, and therefore the other three bytes are again taken into account. Since they were not erased, T(i) regains its original value that exceeds the bounds of T, and the comparison erroneously returns true.

Please tell me I'm mistaken here...

P.S. You check that the variable is not an enum. Honestly though, even purely theoretically, can cstdint typedefs bind to enums? It would be really weird, but there should exist some guarantee in the standard...
BTW I've recently learned that numeric_limits is in fact specialized for all cstdint typedefs, and therefore my first solution was not that broken ;P http://stackoverflow.com/questions/31527219/can-cstdint-typedefs-bind-to-some-implementation-specific-types-stdnumeric-lim/31527797
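A quick compile-time check of what the linked answer says:

#include <cstdint>
#include <limits>

static_assert( std::numeric_limits<std::int_least8_t>::is_specialized,
               "numeric_limits is specialized for this <cstdint> typedef" ) ;
static_assert( std::numeric_limits<std::uint_fast32_t>::is_specialized,
               "numeric_limits is specialized for this <cstdint> typedef" ) ;

int main() {}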
> But since we do the comparison T(i) == i, then T(i) gets promoted again to decltype(i), just for the comparison - and therefore, the other three bytes are again taken into account.

T(i) is a prvalue of type T; the value of T(i) is the value of i narrowed to T.

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type).

If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined. - IS

This means that T(i) == i will be true if and only if the value of i can be represented in the type T (that is, the value of i does not overflow the range of T).

#include <iostream>
#include <type_traits>
#include <limits>

template < typename CHAR > void test_it()
{
    CHAR c = 0 ;
    decltype(+c) i = std::numeric_limits<CHAR>::max() + 1 ; // one above CHAR's range
    std::cout << +CHAR(i) << " == " << i << " ? "
              << std::boolalpha << ( CHAR(i) == i ) << '\n' ;

    i = std::numeric_limits<CHAR>::min() - 1 ; // one below CHAR's range
    std::cout << +CHAR(i) << " == " << i << " ? "
              << std::boolalpha << ( CHAR(i) == i ) << '\n' ;
}

int main()
{
    std::cout << "\nchar\n------------\n" ; test_it<char>() ;
    std::cout << "\nsigned char\n------------\n" ; test_it< signed char >() ;
    std::cout << "\nunsigned char\n------------\n" ; test_it< unsigned char >() ;
}

http://coliru.stacked-crooked.com/a/07f2bdaa456f6907

> You do the check if our variable is not an enum.

Yes. Because the check for valid input of an enum should be more stringent than a simple range check on the underlying type of the enum.
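For instance, a sketch with a hypothetical colour enum; the check against the named enumerators is what a plain range check on the underlying type would miss:

#include <cstdint>
#include <iostream>

enum class colour : std::uint8_t { red = 0, green = 1, blue = 2 };

std::istream& read_enum( std::istream& stm, colour& c )
{
    unsigned int u ;
    // valid input must be one of the named enumerator values,
    // not merely any value that fits in std::uint8_t
    if( stm >> u && u <= static_cast<unsigned int>(colour::blue) )
        c = static_cast<colour>(u) ;
    else stm.setstate( std::ios_base::failbit ) ;
    return stm ;
}

int main()
{
    colour c ;
    if( read_enum( std::cin, c ) ) std::cout << "ok\n" ;
    else std::cout << "error in input\n" ;
}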