main is not a function

When this program is compiled, how many functions will there be?
void f(){}

int g();

int main()
{
}

int g()
{
    return 0;
}
If my understanding is correct, the answer is 2 because main is (technically) not a function and cannot be called like normal functions. But, should this technicality be considered when answering this question? (I made up this question, I didn't come across it anywhere).
But main() is a function:

WG14 N1494 Committee Draft — June 25, 2010 ISO/IEC 9899:201x

5.1.2.2.1 Program startup
1. The function called at program startup is named main. The implementation declares no
prototype for this function.

It may be special, but it is a function nonetheless.

C++
3.6.1 Main function

A program shall contain a global function called main, which is the designated start of the program.
...
An implementation shall not predefine the main function. This function shall not be overloaded.
...
The function main shall not be used within a program
- ISO/IEC 14882:2011(E)



> how many functions will there be?
> should this technicality be considered when answering this question?

I don't have a strong opinion either way; I suppose it depends on what the questioner means by 'function'. The identifier ::main is not an 'lvalue that refers to a function'; so is it a function?
I would have to say that main is technically a function. The only arguments against that idea are that you can't call it and that it can't be overloaded*. Otherwise it follows every other restriction placed on functions as far as scope, object lifetime, etc. It seems to me that the restrictions put on main are there so that C/C++ binaries can meet the PE\COFF file format standards for what an entry point is.



*: Does that second point seem a bit redundant to anyone else in light of the first requirement?
> Does that second point seem a bit redundant to anyone else in light of the first requirement?
Nope, or this would be legal:
#include <cstdlib>

int main()
{
    return EXIT_SUCCESS;
}

int main(int argc, char** argv)
{
    return EXIT_FAILURE;
}
// What will be called?
But if it can't ever be called, then why would you ever need to overload it? After it's compiled any exported names are decorated, so nothing after that point would have an issue with ambiguity in the name resolution. Overloading is a convenience for the programmer while they are writing the code. It's not like a virtual function, where the call might have to be resolved at run time.

It's just a weird thought that I had, not really anything I can argue in depth about.
I always considered main to be a special thing that just re-uses existing syntax and semantics so as to not need to invent new syntax and semantics. If the standard calls it a function, I guess it is (I somehow got it in my head that it didn't).
It has got nothing to do with PE\COFF file formats for executable images; they do not require anything more than the address of an entry point.

IIRC, C allows a program to call main().
Presumably, in C, the implementation may have to do something under the hood to be able to distinguish the initial call to main from subsequent calls to main.
http://coliru.stacked-crooked.com/a/3719795e93385fee
> It has got nothing to do with PE\COFF file formats for executable images; they do not require anything more than the address of an entry point.

Right, and the standard says that the address placed in that field will be the address of main. My assumption was that this was the decision made in order to provide some standardization on where a human being should start reading the code. I suppose the standard could say that defining an entry point as an argument to the compiler is good enough, but they don't. I hadn't realized that about C though.

I had noticed that both standards made it a point to state that neither one is strictly necessary in an environment that would not require it (like stubs). Maybe that last idea is what led me to draw a false correlation? There has to be some reason, right?

Your link is defaulting to C++ for me by the way. Compiler settings for that site seem to be saved locally.

EDIT: Good catch LB.
Computergeek01 wrote:
Your link is defaulting to C++ for me by the way. Compiler settings for that site seem to be saved locally.
http://i.imgur.com/WvXlRgW.png
The command line is saved with the code.
well in gcc at any rate, when i do gcc -O0 test.c -S it generates an assembly file with main in it:
http://paste.ubuntu.com/8631338/

and then with g++ -O0 test.cpp -S it still has a main:
http://paste.ubuntu.com/8631360/

is my assembly rusty? is that not actually a routine? or does it just disappear when it's assembled into an ELF/COM/PE/MACH-O file?
How are we defining a function?
If it's as a set of instructions for a specific task, pretty sure main fits the bill.
@xkcdreference: that's just a label; there is no language-level support for procedures in assembly.

@NoXzema: my point is not to rethink the definition of a function, but instead to observe how main deviates in semantics from other functions.
@LB: ah yes i remember now... that would be a label. i guess my assembly is rusty after all.
LB wrote:
When this program is compiled, how many functions will there be?

It depends on the implementation you use. For example, the standard says main is the entry point, but with the GNU toolchain, it actually isn't. (Usually the standard allows such deviations when the programmer can't tell the difference.) The dynamic loader calls _start, _start calls _init (which calls the global constructors), then main, then _fini (which calls the global destructors), then _exit which calls syscall_exit. Other toolchains most likely do something similar. Also, since f and g are never called, the compiler will almost certainly remove them. So in all likelihood that program has about five functions.

LB wrote:
that's just a label; there is no language-level support for procedures in assembly.

It's no more "just a label" in assembly than in C/C++; don't forget that C/C++ will be compiled to assembly, and then your main function will be "just a label". Labels in C/C++ are less useful than labels in assembly languages because C and C++ only allow jumping within a stack frame (unless you use setjmp and longjmp or something similar).

Assembly languages usually do have special support for procedures. x86 has call and return, ARM has bl, etc. which are special cases of jump/branch instructions specifically to support procedures. You can't do procedures in x86 without call and return, because you can't otherwise read or write the instruction pointer.

Main is definitely a function/procedure/whatever, whether I write it in assembly or C.
@chrisname: we are not considering any implementation or machine, only an embedded environment as defined by the standard. For all we know the C++ could be interpreted as if it were a scripting language, no assembly involved.

To put some perspective on what I mean:
template<typename T>
T f(T t) // by value, so that literals such as 7 can be passed
{
    return t;
}

//currently 0 functions exist

int x = f(7);

//currently 1 function exists

int y = f<int>(6);

//currently 1 function exists

int z = f<short>(2);

//currently 2 functions exist 
At the end of compiling the program there are 2 functions, but if we removed x, y, and z, there would be 0 functions. In other words, you need to be able to take the address of it for it to be a function by my definition. Can you take the address of main? My definition probably conflicts with the standard's.
LB wrote:
At the end of compiling the program there are 2 functions, but if we removed x, y, and z, there would be 0 functions.

But that wouldn't be a valid program according to the standard because it has no main function. (A library isn't a valid program until it's linked, statically or dynamically, with something that is. A library is basically a special case of object files.)

> In other words, you need to be able to take the address of it for it to be a function by my definition.

Earlier you said that assembly labels aren't procedures, but you can take the address of them. In fact the assembler converts any label references to addresses (whether absolute or relative).

> Can you take the address of main?

Probably not according to the standard, but almost certainly in practice, although the compiler might warn you, and the optimiser is free to remove any undefined behaviour without issuing a warning.

Your definition of a function doesn't really make sense though. "Something you can take the address of" includes literally any object in a C++ program, and in functional languages, "Something that is callable" is ambiguous, because code and data are considered equal yet you wouldn't really talk about "callable data". Also, on a non-Von Neumann computer, it's possible that functions don't have addresses in the same way variables do. The Harvard architecture, for example, explicitly separates code and data. Harvard architecture machines are rare outside microcontrollers and DSPs, but the standard is written with them in mind - that's why converting a function pointer to void * is not guaranteed to work (although POSIX requires it, so a POSIX-compliant compiler has to support it).

The mathematical definition of a function is something like "something that maps values from one set [the domain] to another [the range]" but that doesn't really apply to languages that aren't functionally pure, since they don't have to return values, and they can modify global state.

Generally, for non-functional languages, a function is something that's callable, and for a purely functional one, it's something that maps values from its domain to its range. main is callable, just not by you, so I would say it's a function.
I am incapable of explaining what I mean. That's why I marked the topic as solved earlier. I guess I should not have tried once again to explain what I mean when I know I cannot.
The way to determine if some entity in C++ is a function or not is embarrassingly simple:

Look at its declaration in the C++ code.
Verbiage about anything that is not present in the source code is utterly irrelevant.

8.3.5 Functions

In a declaration T D where D has the form
D1 ( parameter-declaration-clause ) cv-qualifier-seq opt 
    ref-qualifier opt exception-specification opt attribute-specifier-seq opt
and ... <elided> ....

In a declaration T D where D has the form
D1 ( parameter-declaration-clause ) cv-qualifier-seq opt 
    ref-qualifier opt exception-specification opt attribute-specifier-seq opt trailing-return-type
and ... T shall be the single type-specifier auto ....

A type of either form is a function type.
@JLBporges: what is the difference between a function template and its instantiations?