Postfix Vs. Prefix

OK, so I understand that for built-in data types it doesn't really matter -- in terms of optimization -- whether you use the postfix or prefix operator, since the compiler probably optimizes either one the same way.

But for user-defined types, is prefix faster than postfix? Consider each version:

int myInteger::operator++() // Prefix ++x
{
   return ++intDataMember;
}


versus the postfix version:

int myInteger::operator++(int) //postfix x++
{
   int temp = intDataMember; // Copy the internal data member
   ++intDataMember; // Increment internal data member
   return temp; // return copy of internal data member
}


Notice that the postfix version uses several operations, whereas the prefix version uses only one. Of course for small amounts of data this is negligible, but for larger programs, will it make a noticeable difference?
Do you think that when you increment intDataMember on line 4 (the ++intDataMember statement) you'll be incrementing temp too?
No. The point of the postfix operator is to return the original value while internally updating the data member (in this case intDataMember), so the next time you use the value, it's already incremented.

Consider:

myInteger bigInt;
bigInt.intDataMember = 5;

std::cout << bigInt++ << std::endl; // Outputs 5; intDataMember is now 6.
std::cout << ++bigInt << std::endl; // Outputs 7; intDataMember is now 7.

So the temp variable is simply there to return the untouched value, while I'm actually incrementing the internal data member. Make sense?
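
For reference, here's a minimal, self-contained sketch of what the class might look like so the snippet above compiles as written (the public data member and its default value are assumptions; only the member functions and the usage were shown):

#include <iostream>

class myInteger
{
public:                      // public so bigInt.intDataMember = 5; works as written
    int intDataMember = 0;
    int operator++();        // prefix, defined as above
    int operator++(int);     // postfix, defined as above
};

int myInteger::operator++()    { return ++intDataMember; }
int myInteger::operator++(int) { int temp = intDataMember; ++intDataMember; return temp; }

int main()
{
    myInteger bigInt;
    bigInt.intDataMember = 5;
    std::cout << bigInt++ << std::endl; // prints 5; intDataMember is now 6
    std::cout << ++bigInt << std::endl; // prints 7; intDataMember is now 7
    return 0;
}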

Poor example. Your class behaves too much like a plain int, and the operators aren't canonical.

Think about any given class Foo.
Foo Foo::operator++ (int)
You have a Foo and get a Foo. No surprising conversions.

Now, what do we know about construction and destruction of Foo? Are they costly? If they are, then that temporary will make an impact. Naturally, calling something once won't make a dent, but an operator like increment is hardly a once-in-a-blue-moon call.
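
To make that concrete, here is a rough sketch of the canonical pair being described. Foo and its members are invented for illustration, and the assumption is that copying a Foo isn't free:

#include <cstddef>
#include <vector>

class Foo
{
    std::vector<int> data;    // imagine this holds a lot of state, so copies are costly
    std::size_t pos = 0;
public:
    Foo& operator++()         // prefix: just increments and returns *this by reference
    {
        ++pos;
        return *this;
    }

    Foo operator++(int)       // postfix: must copy the whole object to return the old value
    {
        Foo temp = *this;     // copy construction -- this is where the extra cost lives
        ++pos;
        return temp;
    }
};

The usual idiom is exactly that: copy, increment, return the copy. You take a Foo and get a Foo, as said above, but the copy is the part that hurts when construction and destruction are expensive.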
I'm kind of unsure what all of this means. Simply put -- even if the performance gain is minimal -- is it more expensive to use x++ over ++x?
is it more expensive to use x++ over ++x?

It could be. It depends on the compiler and how the ++ operators are implemented.

I like to think of ++x as being at least as fast as, and sometimes faster than, x++, so I always use ++x if I don't care about the returned value.
++x just changes the value and then uses it; x++ uses the value and then changes it.

++x could be faster since it simply changes the value; x++ may theoretically need a little more time.

But with the speed of processors these days I don't think anybody will notice :D unless you are writing millions of lines of code where it might actually affect things...
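
To put that ordering in concrete terms with a plain int (trivial illustration):

int x = 5;
int a = ++x;   // x is changed first, then used: a == 6, x == 6
int b = x++;   // x is used first, then changed: b == 6, x == 7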
Peter87 wrote:
I like to think of ++x as being at least as fast as, and sometimes faster than, x++, so I always use ++x if I don't care about the returned value.


This.

I use ++x simply because there is no reason not to. Unless you need the returned value from x++ for something -- but that's rare.
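
For instance, in a typical iterator loop the returned value is thrown away anyway, so the habit costs nothing (small sketch, not from the thread):

#include <iostream>
#include <list>

int main()
{
    std::list<int> values{1, 2, 3};
    // Prefix ++ just advances the iterator; postfix ++ would also copy it,
    // and that copy would be discarded here anyway.
    for (auto it = values.begin(); it != values.end(); ++it)
        std::cout << *it << '\n';
    return 0;
}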
So, I ran a simple test and found something interesting. Given this possibly invalid test (I'm not sure):

#include <omp.h>
#include <iostream>
#include <climits>

int main()
{
  
  unsigned int  i = 0;
  long total = 0;

  
  double start = omp_get_wtime();
  while(i < UINT_MAX)
  {
    total = (++i);
  }
  double end = omp_get_wtime();
  double delta = end - start;
  std::cout << "Took: " << delta << " seconds." << std::endl;

 
  i = 0;
  start = omp_get_wtime();
  while(i < UINT_MAX)
  {
    total = (i++);
  }
  end = omp_get_wtime();
  delta = end - start;
  std::cout << "Took: " << delta << " seconds." << std::endl;
  return 0;
}
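
(For anyone reproducing this: omp_get_wtime needs the OpenMP runtime, so with g++ the build would look something like the line below -- the file name is just a placeholder.)

g++ -fopenmp timing_test.cpp -o timing_test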


I got results:
Took: 8.0428 seconds.
Took: 6.83406 seconds.

Which means that ++i actually took about 1.2 seconds LONGER than i++. Regardless of how trivial the example is -- is that a valid test?
I originally had a post here about the timing issue that was in favor of postfix increment, but it turned out to be completely invalid. Why? Because I wasn't assigning the values from the increments to anything, and my compiler turned out to be not stupid.

I ran the code a few more times, and ultimately, the times ended up being inconsistent for me. Sometimes postfix ran faster, sometimes prefix. So I decided to have a look at the assembly instead.

I compiled this C program to assembly with all optimizations disabled:
#include <stdio.h>
#include <limits.h>
#include <time.h>
#include <stdint.h>

int main()
{
  uint_least32_t  i = 0;
  int start, end;
  uint_least32_t a, b;

  start = clock()*1000/CLOCKS_PER_SEC;
  while(i < UINT32_MAX)
	a=++i;
  end = clock()*1000/CLOCKS_PER_SEC;
  printf("Prefix took: %d ms.",end-start);

  i = 0;

  start = clock()*1000/CLOCKS_PER_SEC;
  while(i < UINT32_MAX)
	b=i++;
  end = clock()*1000/CLOCKS_PER_SEC;
  printf("Postfix took: %d ms.",end-start);
  return 0;
}


Why C? To reduce the amount of digging around I'd have to do for the loop implementations. EDIT: I went back and did this in C++, and the assembly for the loop bodies turned out to be the same.

##Prefix
movl	-8(%rbp), %eax
addl	$1, %eax
movl	%eax, -8(%rbp)
movl	%eax, -20(%rbp)
##Postfix
movl	-8(%rbp), %eax
movl	%eax, %ecx
addl	$1, %ecx
movl	%ecx, -8(%rbp)
movl	%eax, -24(%rbp)


As you can see, postfix increment calls for one extra mov operation and one extra register. This is generally negligible. For comparison, here's the assembly for both loop bodies when the value is discarded:
##Both
movl	-8(%rbp), %eax
addl	$1, %eax
movl	%eax, -8(%rbp)


For incrementing basic variables, with modern compilers, there probably isn't much of a difference (if any).

When you get to complex types (e.g. classes), or when you need to store the value of the increment operation, that's a (very) different story.
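
A rough way to see that difference would be to time both operators on a type whose copies are expensive. This is only a sketch -- the Heavy class, the vector size, and the loop count are all made up:

#include <chrono>
#include <iostream>
#include <vector>

struct Heavy
{
    std::vector<int> payload = std::vector<int>(1000);  // makes copies expensive
    int n = 0;
    Heavy& operator++()    { ++n; return *this; }                    // no copy
    Heavy  operator++(int) { Heavy temp = *this; ++n; return temp; } // copies payload
};

int main()
{
    Heavy h;
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < 100000; ++i) ++h;   // prefix: no temporaries
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 0; i < 100000; ++i) h++;   // postfix: one Heavy copy per iteration
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "prefix:  " << ms(t1 - t0).count() << " ms\n";
    std::cout << "postfix: " << ms(t2 - t1).count() << " ms\n";
    return 0;
}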

I'll still use prefix ++ as a matter of habit, and you probably should too.

-Albatross
Do you really care about the performance of unoptimized code?
Yes. The developer shouldn't assume that the compiler will get the code to optimal performance on its own.

Besides, had I enabled optimizations on this code, clang would have just reduced both loops to a single add instruction.

-Albatross
Cool! Thanks for your in-depth answer, TheRabbitologist! How did you get at the assembly code of the compiled program? I might do the same thing you've done for overloaded prefix and postfix operators on user-defined classes, since that's what this question was originally about; I understand the assembly will be a lot more involved for that, so I'll try it myself.

I just told the compiler to emit assembly instead of object code. How this is done varies per compiler:

g++: Add -S to command line options.
clang: Same.
cl (MS compiler): Add /FA to command line options.
Intel: ???
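
For example (file names are placeholders; -O0 and /Od just mirror the disabled optimizations used above):

g++   -S -O0 test.cpp -o test.s
clang -S -O0 test.c   -o test.s
cl    /FA /Od test.cpp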

-Albatross