We implement 64-bit floating point (double) to ASCII conversion using the Grisu-1 algorithm. This clever and very efficient algorithm was detailed in 2009 by Florian Loitsch, and is the standard reference for this particular process.
Encoding integers into text is pretty straightforward. But encoding doubles is a real P...A: the IEEE 754 standard is quite complex.
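To give an idea of what the conversion has to deal with, here is a minimal sketch in C (not the Pascal of our source code) that splits a double into its three IEEE 754 fields. The `double_fields` type and the `split_double()` function are hypothetical names used for illustration only, and the code assumes that `double` is a 64-bit IEEE 754 binary64 value, as it is on the Intel targets discussed here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type, for illustration only. */
typedef struct {
    int      sign;      /* 0 = positive, 1 = negative */
    int      exponent;  /* raw 11-bit biased exponent (bias is 1023) */
    uint64_t fraction;  /* raw 52-bit fraction field, without the implicit leading 1 */
} double_fields;

/* Assumes double is IEEE 754 binary64 (64 bits wide). */
static double_fields split_double(double d)
{
    uint64_t bits;
    double_fields f;

    memcpy(&bits, &d, sizeof bits);          /* copy the raw bits safely */
    f.sign     = (int)(bits >> 63);          /* top bit */
    f.exponent = (int)((bits >> 52) & 0x7FF);/* next 11 bits */
    f.fraction = bits & 0xFFFFFFFFFFFFFULL;  /* low 52 bits */
    return f;
}
```

For 0.1, this gives a sign of 0, a biased exponent of 1019 and a fraction of 0x999999999999A: a rounded binary approximation, not the exact decimal value. Turning such a binary value back into decimal digits is the hard part that the algorithm takes care of.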
We extracted a cut-down, double-to-ASCII-only version of the flt_core.inc, flt_conv.inc and flt_pack.inc files from the FPC RTL, which implement this algorithm.
As usual, we refactored the code heavily to reach the best performance, tuning the Intel target in particular, with some dedicated asm and rewritten code.
Some information and numbers extracted from the new source code comments:
With Delphi 10.3 on Win32: (no benefit)
100000 FloatToText in 38.11ms i.e. 2,623,570/s, aver. 0us, 47.5 MB/s
100000 str in 43.19ms i.e. 2,315,082/s, aver. 0us, 50.7 MB/s
100000 DoubleToShort in 45.50ms i.e. 2,197,367/s, aver. 0us, 43.8 MB/s
100000 DoubleToAscii in 42.44ms i.e. 2,356,045/s, aver. 0us, 47.8 MB/s
With Delphi 10.3 on Win64:
100000 FloatToText in 61.83ms i.e. 1,617,233/s, aver. 0us, 29.3 MB/s
100000 str in 53.20ms i.e. 1,879,663/s, aver. 0us, 41.2 MB/s
100000 DoubleToShort in 18.45ms i.e. 5,417,998/s, aver. 0us, 108 MB/s
100000 DoubleToAscii in 18.19ms i.e. 5,496,921/s, aver. 0us, 111.5 MB/s
With FPC on Win32:
100000 FloatToText in 115.62ms i.e. 864,842/s, aver. 1us, 15.6 MB/s
100000 str in 57.30ms i.e. 1,745,109/s, aver. 0us, 39.9 MB/s
100000 DoubleToShort in 23.88ms i.e. 4,187,078/s, aver. 0us, 83.5 MB/s
100000 DoubleToAscii in 23.34ms i.e. 4,284,490/s, aver. 0us, 86.9 MB/s
With FPC on Win64:
100000 FloatToText in 76.92ms i.e. 1,300,052/s, aver. 0us, 23.5 MB/s
100000 str in 27.70ms i.e. 3,609,456/s, aver. 0us, 82.6 MB/s
100000 DoubleToShort in 14.73ms i.e. 6,787,944/s, aver. 0us, 135.4 MB/s
100000 DoubleToAscii in 13.78ms i.e. 7,253,735/s, aver. 0us, 147.2 MB/s
With FPC on Linux x86_64:
100000 FloatToText in 98.47ms i.e. 1,015,465/s, aver. 0us, 18.4 MB/s
100000 str in 38.14ms i.e. 2,621,369/s, aver. 0us, 60 MB/s
100000 DoubleToShort in 14.77ms i.e. 6,766,357/s, aver. 0us, 134.9 MB/s
100000 DoubleToAscii in 13.79ms i.e. 7,248,477/s, aver. 0us, 147.1 MB/s
As you can see:
- Our rewrite is at least twice as fast as the original flt_conv.inc from the FPC RTL (str) on every FPC target
- Delphi Win32 has trouble with 64-bit computation, so our code brings no benefit there: its RTL already has well-optimized i87 asm (which is still slower than our code compiled with FPC/Win32)
- FPC is more efficient when compiling integer arithmetic; we avoided slow division by calling our Div100(), but Delphi Win64 is still far behind
- Delphi Win64 has very slow FloatToText and str() implementations (in pure pascal), so our new version is welcome there.
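The usual way to avoid a slow division by a constant like 100 is to multiply by a precomputed reciprocal and shift. The following is a minimal sketch of that general technique in C, not the actual body of our Div100(): the `div100_result` type and the `div100()` signature are assumptions made for illustration. Note that optimizing C compilers typically apply this transformation by themselves; it is written out here to make the trick visible.

```c
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of a division by 100. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} div100_result;

/* Divide a 32-bit unsigned value by 100 without any division instruction. */
static div100_result div100(uint32_t value)
{
    div100_result r;

    /* 0x51EB851F is ceil(2^37 / 100): the 64-bit product shifted right
       by 37 bits equals value / 100 for every 32-bit unsigned value. */
    r.quot = (uint32_t)(((uint64_t)value * 0x51EB851FULL) >> 37);

    /* The remainder only needs a multiplication and a subtraction. */
    r.rem = value - r.quot * 100u;
    return r;
}
```

Getting the quotient and the remainder in one call is convenient when writing decimal digits two at a time: for instance, `div100(12345)` yields a quotient of 123 and a remainder of 45.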
And any feedback is welcome in our forum, as usual!