16 bit floating-point data type for C++
HalfFloat class that implements all the common arithmetic operations for a 16 bit
floating-point type (10 bits mantissa, 5 bits exponent and one sign bit) and can thus be used (almost)
interchangeably with regular
floats. Not all operations have efficent implementations (some just convert to
compute the result and convert back again) - if in doubt, check out the source code.
The implementation tries to adhere to IEEE 754 in that it supports NaN and Infinity, but fails in other points:
- no difference between qnan and snan
- no traps
- no well-defined rounding mode
We also supply a specialization for
half be usable in template code
dependent on type traits.
// get some halfs (half is a typedef for HalfFloat) half a = 1.0f; half b = 0.5f; // and have some FUN half c = (a+b) / (a-b); ++c; // now that we have a result in loosy precision, // convert it back to double precision. // if anybody asks, it's for the lulz. double result = c;
Credits to Chris Maiwald for the conversion code to
double and extensive testing.
3-clause BSD license: use it for anything, but give credit, don't blame us if your rocket crashes and don't advertise with it (who would).