type

Which type do most programmers usually use for floating-point numbers nowadays?
I think it depends on the language. For C++, it's double, since it gives you more bits of precision than float. If you need even more precision than that, you'll need to get some big-number package.
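To make the precision difference concrete, here is a minimal sketch. It assumes IEEE 754 float and double (checked by the static_asserts), and the helper names add_one_as_float and add_one_as_double are made up for illustration.

```cpp
#include <limits>

// Assumption: IEEE 754 formats, i.e. a 24-bit significand for float
// and a 53-bit significand for double. Compilation fails otherwise.
static_assert(std::numeric_limits<float>::digits == 24, "expected IEEE 754 binary32");
static_assert(std::numeric_limits<double>::digits == 53, "expected IEEE 754 binary64");

// Hypothetical helper: adds 1 and rounds the result to float.
// The volatile store forces the value out of any wider FPU register.
float add_one_as_float(float x) {
    volatile float result = x + 1.0f;
    return result;
}

// Hypothetical helper: the same addition, rounded to double.
double add_one_as_double(double x) {
    volatile double result = x + 1.0;
    return result;
}
```

With these assumptions, add_one_as_float(16777216.0f) returns 16777216.0f again, because 2^24 + 1 does not fit in a 24-bit significand, while add_one_as_double(16777216.0) returns 16777217.0.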
I'm not sure if it's still true, but 30 years ago, float was ironically slower than double in many cases.
- On Intel x87 FPUs, floating-point operations were done internally with an 80-bit value, so arithmetic was no faster with float.
- C (and presumably C++) rules would frequently promote arguments to double, so you were doing double arithmetic anyway and then wasting time converting back to float:
float f = 1.0f;
f = f * 2.617;  // 2.617 is a double, so this promotes f to double, multiplies, and converts back to float.


So my rule of thumb is to always use double unless I really need the space savings provided by float.
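The promotion described above can be checked at compile time. This is only a sketch: the helper names scale_via_double and scale_in_float are invented for the example, and 2.617 is just the constant from the snippet.

```cpp
#include <type_traits>

// float * double literal: the float operand is converted to double,
// so the whole expression has type double.
static_assert(std::is_same<decltype(1.0f * 2.617), double>::value,
              "float * double should yield double");

// float * float literal (note the f suffix): the expression stays float.
static_assert(std::is_same<decltype(1.0f * 2.617f), float>::value,
              "float * float should yield float");

// Hypothetical helper: multiplies in double, then converts back to float.
float scale_via_double(float f) {
    return static_cast<float>(f * 2.617);
}

// Hypothetical helper: the f suffix keeps the expression's type as float.
float scale_in_float(float f) {
    return f * 2.617f;
}
```

If the conversion back and forth is not wanted, writing the literal as 2.617f is the usual way to keep the expression in float.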
Pretty much: values get promoted from whatever type they are to the FPU's internal size.

If I am up to date on this, the FPU can be using 80-, 96-, or 128-bit variations on some hardware, and even that may not cover all the variations.

long double may be faster, since on some hardware it is the same size as the FPU register, which saves the conversion step.

You have to be careful using the full FPU size, though. The reason they made it bigger is that the last few bits (not sure how many) are there to protect against roundoff errors, and may be incorrect. If you rely on them, you are subject to the roundoff problems.
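Since the size of long double varies by platform, it is worth checking what your own compiler provides. A minimal sketch, where FloatInfo and describe are names made up for this example:

```cpp
#include <cstddef>
#include <limits>

// Hypothetical summary of a floating-point type on the current platform.
struct FloatInfo {
    std::size_t bytes;     // storage size, including any padding
    int significand_bits;  // bits of precision (std::numeric_limits<T>::digits)
};

// Reports storage size and precision for float, double, or long double.
template <typename T>
constexpr FloatInfo describe() {
    return FloatInfo{sizeof(T), std::numeric_limits<T>::digits};
}
```

For example, describe<long double>() typically reports 16 bytes and 64 significand bits with GCC on x86-64 Linux (the 80-bit x87 format plus padding), while with MSVC long double has the same representation as double. The storage size can therefore be larger than the number of bits that actually carry information.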