<p>Synopse Open Source: Faster Double-To-Text Conversion</p>
<p>Published 2020-03-28, updated 2020-07-03, by AB4327-GANDI, in mORMot Framework. Tags: 64bit, blog, Delphi, double, FreePascal, JSON, mORMot, performance</p>
<p>On the server side, a lot of CPU time is spent converting values to or from text,
mainly JSON these days.</p>
<p><img src="https://inmyownterms.com/wp-content/uploads/2016/02/convert-button.jpg" alt="" /></p>
<p>In <em>mORMot</em>, we care a lot about performance, so we have
rewritten most conversion functions to be faster than what the Delphi or
FPC RTL can offer.<br />
Only float-to-text conversion was missing. And the performance of the RTL
str/FloatToText functions, at least under Delphi, is not consistent across platforms.<br />
So we just added a new set of Double-To-Text functions.</p>
<p>We implement 64-bit floating point (double) to ASCII conversion using
the efficient GRISU-1 algorithm. This clever algorithm
was detailed in 2009 by Florian Loitsch, and is a reference
for this particular task.</p>
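<p>To make the goal concrete, here is a minimal sketch in C (not the Pascal of mORMot, and not Grisu itself) of the property any double-to-text routine has to preserve: the text must parse back to exactly the same double. The helper name <code>shortest_roundtrip</code> is hypothetical. It brute-forces the precision with <code>snprintf</code> and <code>strtod</code>, which is the kind of slow, general-purpose path a dedicated algorithm avoids by working on integers only.</p>

```c
#include <stdio.h>
#include <stdlib.h>

/* Brute-force illustration, NOT Grisu: try 1..17 significant digits and
   stop at the first text that parses back to exactly the same double.
   Returns the text length, or -1 on failure (small buffer, or NaN). */
static int shortest_roundtrip(double value, char *buf, size_t size)
{
    for (int digits = 1; digits <= 17; digits++) {
        int len = snprintf(buf, size, "%.*g", digits, value);
        if (len < 0 || (size_t)len >= size)
            return -1;                 /* output did not fit in buf */
        if (strtod(buf, NULL) == value)
            return len;                /* text round-trips: done */
    }
    /* 17 significant digits are enough for any finite double, so this is
       only reached for NaN, which never compares equal to itself. */
    return -1;
}
```

<p>With a correctly rounding C library, 0.1 comes out as "0.1", while 0.1 + 0.2 needs all 17 digits.</p>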
<p>Encoding integers into text is pretty straightforward. But encoding doubles
is a real P...A - the <a href="https://en.wikipedia.org/wiki/Double-precision_floating-point_format#IEEE_754_double-precision_binary_floating-point_format:_binary64">
IEEE standard is quite complex</a>.</p>
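<p>As an illustration of what has to be unpacked before any digit can be produced, the following sketch in C splits a binary64 value into its sign, an integer mantissa and a power-of-two exponent. The type and function names are hypothetical, and infinities and NaN are deliberately left out.</p>

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    int      negative;  /* 1 if the sign bit is set */
    int      exponent;  /* power of two: value = mantissa * 2^exponent */
    uint64_t mantissa;  /* integer significand, hidden bit included */
} DoubleParts;

/* Decode a finite IEEE 754 binary64 value.
   Infinity and NaN (biased exponent 0x7FF) are not handled here. */
static DoubleParts decode_double(double value)
{
    uint64_t bits;
    DoubleParts p;

    memcpy(&bits, &value, sizeof bits);            /* safe type punning */
    uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL; /* low 52 bits */
    int biased = (int)((bits >> 52) & 0x7FF);      /* 11 exponent bits */

    p.negative = (int)(bits >> 63);
    if (biased == 0) {
        /* zero or subnormal: no hidden bit, fixed exponent */
        p.mantissa = fraction;
        p.exponent = -1074;
    } else {
        /* normal: restore the hidden leading 1 bit */
        p.mantissa = fraction | (1ULL << 52);
        p.exponent = biased - 1075;                /* bias 1023 + 52 fraction bits */
    }
    return p;
}
```

<p>For 1.0 this yields a mantissa of 2^52 and an exponent of -52. The hard part is converting that binary exponent into decimal digits without losing or inventing precision.</p>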
<p>We extracted a cut-down, double-to-ASCII-only version of the <em>flt_core.inc,
flt_conv.inc and flt_pack.inc</em> files from the FPC RTL, which implement this
algorithm.<br />
As usual, we refactored the code heavily to reach the best performance, especially
tuning the Intel target, with some dedicated asm and code rewrites.</p>
<p>Some information and numbers extracted from the new source code
comments:</p>
<pre>
With Delphi 10.3 on Win32: (no benefit)
 100000 FloatToText in 38.11ms i.e. 2,623,570/s, aver. 0us, 47.5 MB/s
 100000 str in 43.19ms i.e. 2,315,082/s, aver. 0us, 50.7 MB/s
 100000 DoubleToShort in 45.50ms i.e. 2,197,367/s, aver. 0us, 43.8 MB/s
 100000 DoubleToAscii in 42.44ms i.e. 2,356,045/s, aver. 0us, 47.8 MB/s
With Delphi 10.3 on Win64:
 100000 FloatToText in 61.83ms i.e. 1,617,233/s, aver. 0us, 29.3 MB/s
 100000 str in 53.20ms i.e. 1,879,663/s, aver. 0us, 41.2 MB/s
 100000 DoubleToShort in 18.45ms i.e. 5,417,998/s, aver. 0us, 108 MB/s
 100000 DoubleToAscii in 18.19ms i.e. 5,496,921/s, aver. 0us, 111.5 MB/s
With FPC on Win32:
 100000 FloatToText in 115.62ms i.e. 864,842/s, aver. 1us, 15.6 MB/s
 100000 str in 57.30ms i.e. 1,745,109/s, aver. 0us, 39.9 MB/s
 100000 DoubleToShort in 23.88ms i.e. 4,187,078/s, aver. 0us, 83.5 MB/s
 100000 DoubleToAscii in 23.34ms i.e. 4,284,490/s, aver. 0us, 86.9 MB/s
With FPC on Win64:
 100000 FloatToText in 76.92ms i.e. 1,300,052/s, aver. 0us, 23.5 MB/s
 100000 str in 27.70ms i.e. 3,609,456/s, aver. 0us, 82.6 MB/s
 100000 DoubleToShort in 14.73ms i.e. 6,787,944/s, aver. 0us, 135.4 MB/s
 100000 DoubleToAscii in 13.78ms i.e. 7,253,735/s, aver. 0us, 147.2 MB/s
With FPC on Linux x86_64:
 100000 FloatToText in 98.47ms i.e. 1,015,465/s, aver. 0us, 18.4 MB/s
 100000 str in 38.14ms i.e. 2,621,369/s, aver. 0us, 60 MB/s
 100000 DoubleToShort in 14.77ms i.e. 6,766,357/s, aver. 0us, 134.9 MB/s
 100000 DoubleToAscii in 13.79ms i.e. 7,248,477/s, aver. 0us, 147.1 MB/s
</pre>
<p>As you can see:</p>
<ul>
<li>Our rewrite is about twice as fast as the original flt_conv.inc from the FPC RTL
(str)</li>
<li>Delphi Win32 has trouble with 64-bit computation: there is no benefit there, since its
RTL has well-optimized x87 asm (which is still slower than our code with FPC/Win32)</li>
<li>FPC is more efficient when compiling integer arithmetic; we avoided slow
division by calling our Div100(), but Delphi Win64 is still far behind</li>
<li>Delphi Win64 has very slow FloatToText and str() implementations (in pure
pascal), so our new version is welcome.</li>
</ul>
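<p>The division trick mentioned in the list can be sketched as follows, in C for illustration. This is an assumption about the general technique (multiplying by a precomputed reciprocal, then emitting two digits at a time from a lookup table), not a copy of the mORMot <code>Div100()</code>; all names below are hypothetical.</p>

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t quot; uint32_t rem; } Div100Result;

/* Divide by 100 with one multiplication and one shift.
   1374389535 = ceil(2^37 / 100); the result is exact for every 32-bit input. */
static Div100Result div100(uint32_t value)
{
    Div100Result r;
    r.quot = (uint32_t)(((uint64_t)value * 1374389535u) >> 37);
    r.rem  = value - r.quot * 100u;
    return r;
}

/* "00".."99" stored back to back: two digits per lookup, no division by 10 */
static const char TWO_DIGITS[] =
    "00010203040506070809" "10111213141516171819"
    "20212223242526272829" "30313233343536373839"
    "40414243444546474849" "50515253545556575859"
    "60616263646566676869" "70717273747576777879"
    "80818283848586878889" "90919293949596979899";

/* Write value as decimal text ending at 'end' (where the NUL is stored).
   Digits are produced backwards; returns a pointer to the first digit.
   The caller must provide at least 10 writable bytes before 'end'. */
static char *u32_to_text(uint32_t value, char *end)
{
    *end = '\0';
    while (value >= 100) {
        Div100Result r = div100(value);
        end -= 2;
        memcpy(end, &TWO_DIGITS[r.rem * 2], 2);
        value = r.quot;
    }
    if (value >= 10) {
        end -= 2;
        memcpy(end, &TWO_DIGITS[value * 2], 2);
    } else {
        *--end = (char)('0' + value);
    }
    return end;
}
```

<p>Modern C compilers usually apply the same reciprocal trick on their own for a constant divisor; spelling it out matters mostly when a compiler emits a real division instruction instead.</p>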
<div>In a nutshell, this routine is now used on all platforms (even ARM and
AARCH64), with the exception of Delphi Win32, where the built-in x87 asm is
a bit faster, mainly because the Delphi compiler performs poorly when
handling 64-bit logical and arithmetic operations on the i386 CPU.</div>
<div>You can <a href="https://github.com/synopse/mORMot/blob/master/SynDoubleToText.inc">check the
source code</a> of our implementation of Grisu. You may find some nice
performance tricks.<br />
And any feedback is <a href="https://synopse.info/forum/viewtopic.php?id=5353">welcome in our forum, as
usual</a>!</div>