Fast float type

Is there an analog to std::int_fastN_t for float? E.g. float_fast32_t. It doesn't have to be in std, just a library that provides such a type for a given platform.
No. double will be the best choice for speed on all platforms.

Erg, quick Googling turned up http://articles.emptycrate.com/2012/02/11/double_vs_float_which_is_better.html

There are better articles out there, but I don’t remember where.
What is the fastest floating point type depends on the kind of operations that are being performed; it could vary from program to program.

For simple arithmetic operations like multiplication, when the data set is large and vectorisation is possible, the smaller float would be faster than double (see the sketch after the notes below).

http://coliru.stacked-crooked.com/a/c42051953ac73631
http://rextester.com/GBHKQ30060

Notes:
In the Microsoft implementation, the representation of double and long double is identical.
In the GNU/LLVM implementations, operations on long double are not (can't be) vectorised.
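
For illustration, here is a toy benchmark of my own (not the code behind the links above) showing the kind of loop where this matters; it assumes an auto-vectorising compiler, e.g. g++ -O3. A 256-bit SIMD register holds 8 floats but only 4 doubles, so the float loop can retire roughly twice as many multiplications per instruction.

// Toy micro-benchmark: element-wise multiply over a large array.
#include <chrono>
#include <iostream>
#include <vector>

template <typename Real>
double time_multiply(std::size_t n)
{
    std::vector<Real> a(n, Real(1.5)), b(n, Real(0.5)), c(n);
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        c[i] = a[i] * b[i];
    auto stop = std::chrono::steady_clock::now();
    std::cout << c[n / 2] << '\n'; // keep the result observable
    return std::chrono::duration<double>(stop - start).count();
}

int main()
{
    constexpr std::size_t n = 50'000'000;
    std::cout << "float:  " << time_multiply<float>(n)  << " s\n"
              << "double: " << time_multiply<double>(n) << " s\n";
}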
For the sake of simplicity let's assume that the operations in question are not vectorizable. After all, std::int_fastN_t may also not be the fastest type if the particular operation could be vectorized.

Duoas: But what if the hardware doesn't have double and it has to be emulated in software? I'm not aware of any such platform, but one could exist out there.
> For the sake of simplicity ...

There is no simple, one-size-fits-all-situations answer. What is the fastest floating point type depends on the kind of operations that are being performed; it could vary not only from implementation to implementation, but also from program to program.

Choosing the right precision for a problem where the choice matters requires significant understanding of floating-point computation. If you don’t have that understanding, get advice, take the time to learn, or use double and hope for the best. - Stroustrup in 'The C++ Programming Language' (Fourth Edition)
Maybe, maybe not. If one of the types has to be emulated in software, then the hardware-supported type will always be faster, for all operations, in all programs.
Just because it's not possible to give the best answer in every situation doesn't mean it's not possible to give reasonably good answers in most situations. For example, one could define one fast type for simple operations (addition, multiplication, etc.) and another for transcendental functions, perhaps splitting the latter group into exponential functions, trigonometric functions, etc. That seems like it could be reasonably easily automated on a given platform by timing a few benchmark programs.
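
Something like this, just as a sketch (the exp() workload and the timing harness are invented for illustration; a real version would cover more operations):

// Time a representative transcendental workload for float and double
// and report which one was faster on this platform.
#include <chrono>
#include <cmath>
#include <iostream>

template <typename Real>
double time_exp(int n)
{
    Real sum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        sum += std::exp(Real(i % 100) / 100);
    auto stop = std::chrono::steady_clock::now();
    std::cout << sum << '\n'; // defeat dead-code elimination
    return std::chrono::duration<double>(stop - start).count();
}

int main()
{
    const int n = 10'000'000;
    const double tf = time_exp<float>(n);
    const double td = time_exp<double>(n);
    std::cout << (tf < td ? "float" : "double")
              << " is faster for exp() on this platform\n";
}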
I have not dug into this in ages but ...
- consider the size of the thing if you have many of them. A billion doubles takes twice the memory of a billion floats, leading to cache-miss, bus-bandwidth, and other hardware bottlenecks.
- consider the cost of promotion. The FPU on a system used to (at least) work on only one size of floating point, and anything else had to be promoted/demoted to fit into it. I believe this is MOSTLY still the 10-byte (80-bit) value, but there are other versions out there.

Depending on the hardware, one or the other of the above will 'win out' in speed: either it costs more to deal with cache and bus-bandwidth issues, or it costs more to promote. I would bet that on modern machines promotion is extremely cheap, while bus/cache issues are still troublesome for large data sets.

Also, speaking of emulation... if the data will FIT into an integer, it MAY be faster to do integer maths. For example, when dealing with money, you can usually use a 64-bit int that represents pennies rather than deal with doubles for only two decimal places of need. The FPU tends to burn through ops fast, but you have (or had; again, I am a little behind on hardware) more integer parallelism on the chips. You can test this easily enough to see.
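
A minimal sketch of the pennies idea (the Money struct here is invented for illustration): addition and comparison become exact integer operations, with no floating-point rounding of amounts.

#include <cstdint>
#include <iostream>

struct Money {
    std::int64_t cents; // e.g. $12.34 is stored as 1234
};

Money operator+(Money a, Money b) { return { a.cents + b.cents }; }

int main()
{
    Money price   { 1999 }; // $19.99
    Money shipping{  499 }; // $4.99
    Money total = price + shipping;
    std::cout << "$" << total.cents / 100 << "."
              << (total.cents % 100 < 10 ? "0" : "")
              << total.cents % 100 << '\n'; // prints $24.98
}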

Some compilers used to have a fast float flag that could mangle precision in favor of speed. I have no idea what that was all about, or if it is still a thing.

Yeah, I've done fixed point arithmetic in the past. It's really fast, but it's seriously a pain to do multiplications if you care about not dropping too many bits. I had to carefully plan each operation I was doing.
And obviously computing transcendentals on integers is out of the question. Not without lookup tables, anyway.
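
To illustrate the kind of planning involved, here is a sketch assuming a Q16.16 format (16 integer bits, 16 fraction bits; the helpers are invented for illustration): the product of two 32-bit fixed-point values carries 32 fraction bits, so it needs a 64-bit intermediate before shifting back, or the high bits get dropped.

#include <cstdint>
#include <iostream>

using q16_16 = std::int32_t;
constexpr q16_16 from_double(double d) { return q16_16(d * 65536.0); }
constexpr double to_double(q16_16 q)   { return q / 65536.0; }

q16_16 mul(q16_16 a, q16_16 b)
{
    // widen first, multiply, then drop the extra 16 fraction bits
    return q16_16((std::int64_t(a) * b) >> 16);
}

int main()
{
    q16_16 x = from_double(3.25);
    q16_16 y = from_double(2.5);
    std::cout << to_double(mul(x, y)) << '\n'; // prints 8.125
}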