Why does argc need to be signed?

What does the standard say are the exact limits on the types of the parameters given to main? I know it's usually int,char** or int,char*[], but I thought I could do it in a way that makes more sense:
int main(unsigned nargs, char const *const *args)
{
}
Unfortunately, this only works with MSVC and gcc - with clang, it tells me the first parameter must be of type "int". (It's ok with the definition of the second parameter)

Is there some reason that argc could be negative? Does the standard require that argc be signed? If so why?

EDIT: I've looked up why it wasn't unsigned in the C standard (unsigned didn't exist then), but that's not my question. My question is why/if it is not acceptable for argc to be unsigned.
The only two valid signatures for main() are int main(int,char **) and int main(). Any other signature accepted by a compiler is an extension.
Note that both char *[] and char [][] are syntactic sugar for char **.

C conventionally used int for all integer parameters and return types (except for buffer sizes, for which it used size_t), even in contexts where negative values made no sense. For example, rand() returns an int, even though rand()>=0 is assured.
I can only guess that this is because "int" is shorter than "unsigned".
Seems to me that the posts in L B's link about iterating backwards (and avoiding underflow) are a very valid argument on their own, despite all the other arguments.

It's great to find out about things like this. Trivial & unimportant in the great scheme of things - but good to know nonetheless - Thanks L B !
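
To illustrate the backwards-iteration point, here is a minimal sketch. The function names and the use of std::vector are my own assumptions for the example, not something taken from the linked posts.

```cpp
#include <cstddef>
#include <vector>

// Sums the elements from last to first using a signed index.
// The loop ends when i becomes -1, which a signed type can represent.
int sum_reverse_signed(const std::vector<int>& v)
{
    int total = 0;
    for (int i = static_cast<int>(v.size()) - 1; i >= 0; --i)
        total += v[static_cast<std::size_t>(i)];
    return total;
}

// The same traversal with an unsigned index. Writing "i >= 0" here would
// always be true, because decrementing 0 wraps around to the largest value.
// Testing "i-- > 0" compares before the decrement and avoids that trap.
int sum_reverse_unsigned(const std::vector<int>& v)
{
    int total = 0;
    for (std::size_t i = v.size(); i-- > 0; )
        total += v[i];
    return total;
}
```

Both versions give the same result. The signed one is simply the form most people write first, which is why the naive unsigned loop is such a common bug.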
int because int is the most efficient integral type:
Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs.


int and not unsigned int because whether an integral type is signed or unsigned is a matter of interpretation, not one of representation.
Each signed integer type has the same object representation as its corresponding unsigned integer type. Likewise, ... the value representation of each corresponding signed/unsigned type shall be the same.

This is not good:
void foo( unsigned int arg ) // invariant: 0 <= arg <= 1000
{
    // how does foo assert the invariant?
    // ...
}


An int with a value less than zero can be passed to foo without compile-time or run-time error; foo will just interpret it (perhaps incorrectly) as unsigned.

int main()
{
    int i = -1 ;
    foo(i) ;
    unsigned int j = -1U ;
    foo(j) ;
}
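
A small sketch of what foo actually sees in the first call. The helper names below are hypothetical and exist only for this illustration.

```cpp
#include <limits>

// Hypothetical stand-in for foo: hands back the value it received.
unsigned int received_by_foo(unsigned int arg)
{
    return arg;
}

// True when a negative int argument arrives as the largest unsigned value.
bool minus_one_becomes_max()
{
    int i = -1;
    // The int is implicitly converted to unsigned int (modulo 2^N),
    // so no error is reported at compile time or at run time.
    return received_by_foo(i) == std::numeric_limits<unsigned int>::max();
}
```

Inside the function there is no way to tell whether the caller passed -1 or really meant the largest unsigned value.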


This is much better:
void bar( int arg ) // invariant: 0 <= arg  <= 1000
{
    // assert the invariant
    if( arg < 0 ) throw std::out_of_range( "bar::arg must be non-negative" ) ;
    // ...
}


int is the default integral type; another integral type is used if and only if an int will not do.
So for( int i = 0 ; i < 100 ; ++i ) { /* ... */ }
And not for( unsigned int i = 0 ; i < 100U ; ++i ) { /* ... */ }
And certainly not: for( unsigned char i = 0 ; i < (unsigned char)100U ; ++i ) { /* ... */ }
I am aware of the issues with using unsigned types. What I want to know is why it is actually illegal for argc to be unsigned.
closed account (o1vk4iN6)
void bar( int arg ) // invariant: 0 <= arg  <= 1000
{
    // assert the invariant
    if( arg < 0 ) throw std::out_of_range( "bar::arg must be non-negative" ) ;
    // ...
}


You have an upper bound set so why not:

void bar( unsigned arg ) // invariant: 0 <= arg  <= 1000
{
    // assert invariant
    if( arg > 1000 ) throw std::out_of_range( "bar::arg > 1000" );
}

bar( -1 ); // out of range
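
For anyone who wants to try this, a self-contained sketch of the unsigned variant follows. bar_rejects is a hypothetical helper added only so that the behaviour can be checked.

```cpp
#include <stdexcept>

// The unsigned variant: one comparison covers both ends of the range.
void bar(unsigned arg) // invariant: 0 <= arg <= 1000
{
    if (arg > 1000) throw std::out_of_range("bar::arg > 1000");
}

// Hypothetical helper: reports whether bar rejects the given int.
bool bar_rejects(int value)
{
    try {
        bar(value); // implicit int -> unsigned conversion happens here
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

This works because the upper bound is small: -1 converts to a huge unsigned value and fails the check. It would not help if every unsigned value were a legal argument.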
> why it is actually illegal for argc to be unsigned.

The type and linkage of main() are implementation-defined, even though an implementation is not allowed to predefine main().

The global main() is the only function that we can define without knowing either its type or its linkage. Though from within our program, address of main can't be taken, and main that we wrote can't be called, the implementations should still be able to transfer control to the main() that we write. Therefore, there are extra restrictions on what main() can be.
@xerzi

I noticed that too, but I ignored it because I knew what he meant.

When you're designing e.g. a vector class, you can't put a limit on the max number of elements that can be initialized - so how would you tell apart a negative value from an extremely large value that may be valid?

@JLBorges that doesn't explain the reason argc needs to be signed, at least not in an obvious manner. Is it in the standard or is the signed/unsigned requirement implementation-defined?
@helios (10257)
Note that both char *[] and char [][] are syntactic sugar for char **.


You are wrong. Type char[][N] can not be implicitly converted to char **.


@L B
Is there some reason that argc could be negative? Does the standard require that argc be signed? If so why?


There is a historical reason. Originally C had no unsigned int type; it had only int. So int argc is used to preserve compatibility with old code. Another reason is pointer arithmetic: the C and C++ standards specify that the difference between two pointers has a signed integral type.

vlad from moscow wrote:
You are wrong. Type char[][N] can not be implicitly converted to char **.
I don't think char[][N] and char[][] are the same.
vlad from moscow wrote:
There is a historical reason. Originally C had no unsigned int type; it had only int. So int argc is used to preserve compatibility with old code.
Allowing unsigned doesn't break reverse compatibility.
vlad from moscow wrote:
Another reason is pointer arithmetic: the C and C++ standards specify that the difference between two pointers has a signed integral type.
I don't understand where ptrdiff_t comes into play here?
> Is it in the standard or is the signed/unsigned requirement implementation-defined?

int argc is specified by the standard, though a specific implementation may allow other forms of main(), including one that has an unsigned int argc.

All implementations shall allow both of the following definitions of main:
int main() { /* ... */ }
and
int main( int argc, char* argv[] ) { /* ... */ }

... The value of argc shall be non-negative. The value of argv[argc] shall be 0. ...


The int argc version is the only one that is guaranteed to work portably.
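
As a sketch of what the quoted guarantee buys you: because argv[argc] is 0, the arguments can be counted without looking at argc at all. count_args is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>

// Counts arguments by walking argv until the terminating null pointer,
// relying on the guarantee that argv[argc] is 0.
std::size_t count_args(const char* const* argv)
{
    std::size_t n = 0;
    while (argv[n] != nullptr)
        ++n;
    return n;
}
```

Called as count_args(argv) from a standard main, the result should match argc, so code that prefers an unsigned count can get one without changing the signature of main.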
@L B
I don't think char[][N] and char[][] are the same.


There is no such type as char[][]. So I considered it as some valid type char [][N] where N is some constant expression.:)

void f( int );

and void f( unsigned int );

are two different functions.
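
A quick way to see this, as a sketch (here f returns an int tag, purely so the chosen overload can be observed):

```cpp
#include <type_traits>

int f(int)          { return 1; } // chosen for int arguments
int f(unsigned int) { return 2; } // chosen for unsigned int arguments

// The two function types differ, so these are separate overloads.
static_assert(!std::is_same<int(int), int(unsigned int)>::value,
              "signedness of a parameter is part of the function type");
```

f(0) picks the first overload and f(0u) picks the second.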

By the way, C allows main to be called recursively.

As for ptrdiff_t, it plays a role in some algorithms. Even to get the last iterator you write

argv + argc
@JLBorges thanks. I can live with both "The value of argc shall be non-negative" and "the int argc version is the only one that is guaranteed to work portably" at the same time, even if it bothers me.

vlad from moscow wrote:
By the way, C allows main to be called recursively.
It's 2013, and I'm using C++11 ;)
vlad from moscow wrote:
Even to get the last iterator you write

argv + argc
That would give you a pointer, not the difference between two pointers ;p

@L B
argv + argc
That would give you a pointer, not the difference between two pointers ;p


Look one step ahead. What happens if you apply an algorithm such as std::find and then try to get an index from the result? The index is a pointer difference, which is signed, so with an unsigned argc you would be comparing values of two different types.
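
A minimal sketch of that situation. index_of is a hypothetical helper, and std::find_if with strcmp is used here so that the strings are compared by content rather than by pointer value.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Returns the index of the first argument equal to `wanted`,
// or argc if there is none.
std::ptrdiff_t index_of(int argc, const char* const* argv, const char* wanted)
{
    const char* const* last = argv + argc;
    const char* const* it = std::find_if(argv, last,
        [wanted](const char* s) { return std::strcmp(s, wanted) == 0; });
    return it - argv; // pointer difference has the signed type std::ptrdiff_t
}
```

With int argc, a check such as index_of(argc, argv, "-v") < argc compares two signed values. With an unsigned argc the same comparison would mix signed and unsigned operands.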

@L B
It's 2013, and I'm using C++11 ;)


Even in 2013, C++ tries to keep compatibility with C. Consider, for example, the type long long, which was first introduced in C, or macros with a variable number of parameters.