<p>Synopse Open Source - Tag - Lazarus<br />
mORMot MVC / SOA / ORM and friends</p>
<h2>End Of Life OpenSSL 1.1 vs Slow OpenSSL 3.0</h2>
<p>2023-09-08, Arnaud Bouchez. Category: Open Source. Tags: AES, Certificates, Delphi, FreePascal, GoodPractice, LateBinding, Lazarus, MaxOSX, mORMot, mORMot2, OpenSSL, performance, security, Source</p>
<p>You may have noticed that the OpenSSL 1.1.1 series will reach End of Life (EOL) next Monday...<br />
The most sensible option is to switch to 3.0 or 3.1 as soon as possible.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mormotSecurity.jpg" alt="mormotSecurity.jpg, Sep 2023" /></p>
<p>Of course, our <a href="https://github.com/synopse/mORMot2/blob/master/src/lib/mormot.lib.openssl11.pas"><em>mORMot 2</em> OpenSSL unit</a> runs on both the 1.1 and 3.x branches, and self-adapts at runtime to the various API incompatibilities between them.<br />
But we also discovered that switching to OpenSSL 3.0 could lead to big performance regressions... so which version should you use?</p> <h4>OpenSSL 1.1 End Of Life</h4>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/OpenSSL_logo.svg/320px-OpenSSL_logo.svg.png" alt="OpenSSL logo" /></p>
<p>The well known and well established OpenSSL 1.1.1 series will reach End of Life (EOL) on 11th September 2023. So next Monday! <img src="https://blog.synopse.info?pf=sad.svg" alt=":(" class="smiley" /> <br />
Users of OpenSSL 1.1.1 should consider their options and plan any actions they might need to take.</p>
<p>Note that Indy users are <a href="https://github.com/IndySockets/Indy/issues/183">still stuck on the OpenSSL 1.0 branch</a>: even 1.1 is not yet officially supported. Some <a href="https://github.com/IndySockets/Indy/pull/299">alternate IO handlers</a> are able to use the newest releases - to some extent.<br />
Indy users should rather move to a better supported library, like our little <em>mORMot</em>.</p>
<p>Also note that there are some API incompatibilities between 1.1 and 3.x. Functions have been renamed, or even removed; new context constructors appeared; some parameters types even changed!<br />
Our unit tries to address all those problems at runtime, and is tested against several versions of the OpenSSL library, to ensure you do not have to worry about those low-level issues.</p>
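<p>As a rough illustration of what adapting at runtime can involve, the sketch below decodes the packed number returned by <code>OpenSSL_version_num()</code>, so that a wrapper can choose between its 1.1 and 3.x code paths. This is a minimal C sketch and not the <em>mORMot</em> implementation: the <code>ossl_version</code> type and both helper names are hypothetical, and the bit layouts are the ones documented for <code>OPENSSL_VERSION_NUMBER</code>.</p>

```c
#include <assert.h>

/* Hypothetical helper type: part of neither OpenSSL nor mORMot. */
typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch; /* 3.x: patch number; 1.1.x: "fix" number (last digit of 1.1.1) */
} ossl_version;

/* Decode the value returned by OpenSSL_version_num().
   3.x layout:   0xMNN00PP0 (major, minor, patch)
   1.1.x layout: 0xMNNFFPPS (major, minor, fix, patch letter, status) */
static ossl_version ossl_decode_version(unsigned long v)
{
    ossl_version r;
    assert(v != 0); /* 0 would mean the library did not answer at all */
    r.major = (unsigned)((v >> 28) & 0x0F);
    r.minor = (unsigned)((v >> 20) & 0xFF);
    if (r.major >= 3)
        r.patch = (unsigned)((v >> 4) & 0xFF);
    else
        r.patch = (unsigned)((v >> 12) & 0xFF);
    return r;
}

/* Non-zero when the 3.x API (new constructors, renamed functions) should be used. */
static int ossl_is_3x(unsigned long v)
{
    return ossl_decode_version(v).major >= 3;
}
```

<p>A wrapper could call such a helper once after loading the library, then bind the function names that exist in the detected branch.</p>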
<h4>OpenSSL 3.x Benefits</h4>
<p>With OpenSSL 3.0, the developers did a huge refactoring of the library internals.<br />
To be fair, the 1.x source code of OpenSSL was kind of a mess, and difficult to maintain. The biggest IT companies even made their own forks or switched to other libraries. The best known is <a href="https://boringssl.googlesource.com/boringssl/">BoringSSL</a>, maintained by Google, and used e.g. in Chrome and Android.<br />
So it was time for a refactoring, especially for a library as critical as OpenSSL for so many projects.</p>
<p>With the new 3.x branch, a lot of low-level API functions have been deprecated.<br />
In practice, you no longer have direct access to the internal structures of the library, and should now always use the high-level API to access a context property, or execute the processing methods. For instance, the low-level <code>AES_encrypt</code> function is now deprecated: from now on, you should use the high-level <code>EVP_Encrypt*</code> API.<br />
The official <a href="https://www.openssl.org/docs/man3.0/man7/migration_guide.html">Migration Guide page</a> is huge, and worth reading if you want to prepare yourself for the coming years with OpenSSL.</p>
<h4>OpenSSL 3.0 Performance Regression</h4>
<p>The 3.0 branch new code may seem more beautiful and more maintainable, but it had its drawbacks. Newer is not always better.<br />
Most users of this new release <a href="https://github.com/openssl/openssl/issues/17064">observed a huge performance regression</a> when switching from 1.x to 3.0. It affected a lot of projects, in various languages, even scripting languages which were not exactly shining in performance to begin with. Slowdowns from 3x up to 10x were reported. On our side, X509 certificate manipulation was really slower than before - the worst being X509 stores.</p>
<p>Some slowdowns were expected and documented (like RSA key generation, which now uses 64 rounds). But the regression was much deeper.<br />
The culprit seems not to be the core cryptographic code, like AES buffer encoding (whose asm is claimed to have been optimized even further in the 3.x branch), but the OpenSSL context structures themselves. They were rewritten for future maintainability, without focusing on their actual performance.</p>
<h4>OpenSSL 3.1 Numbers</h4>
<p>The 3.1 branch claims to have addressed most of these problems.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/The_Tortoise_and_the_Hare_-_Project_Gutenberg_etext_19994.jpg/334px-The_Tortoise_and_the_Hare_-_Project_Gutenberg_etext_19994.jpg" alt="The Tortoise and the Hare" /></p>
<p>To be sure, we ran the <em>mORMot</em> cryptographic regression tests with several versions of OpenSSL. And in fact, OpenSSL 3.1 was much faster than OpenSSL 3.0, but still behind OpenSSL 1.1.<br />
Here are the numbers we observed for the whole <code>TTestCoreCrypto</code> method execution, executed on Win32:</p>
<ul>
<li>OpenSSL 1.1 = 15 sec</li>
<li>OpenSSL 3.0 = 33 sec</li>
<li>OpenSSL 3.1 = 18 sec</li>
</ul>
<p>There are several aspects to emphasize:</p>
<ul>
<li>Those tests also run the <em>mORMot</em> engine cryptography, so you don't only test OpenSSL: the "pure mORMot" tests take around 4.5 seconds in the above numbers;</li>
<li>Any serious project should consider compiling on Win64, and running a server on x86_64 Linux - on this platform, the regression still exists, and is only slightly smaller;</li>
<li>The slowdown affected <code>TTestCoreCrypto.Benchmark</code> (i.e. raw buffer encryption) less than <code>TTestCoreCrypto.Catalog</code> (i.e. certificate processing);</li>
<li>Our tests were mono-threaded, and worse slowdowns were reported on heavily threaded processes (up to x10).</li>
</ul>
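<p>To isolate the OpenSSL share of these timings, one can subtract the "pure mORMot" part before comparing versions. The small C sketch below does that arithmetic with the figures quoted above; the function names are hypothetical, and it assumes the 4.5 seconds share is the same in all three runs. Under that assumption, the OpenSSL-only time is about 10.5 s for 1.1, 28.5 s for 3.0 and 13.5 s for 3.1, i.e. roughly 2.7x and 1.3x slower than 1.1.</p>

```c
#include <assert.h>

/* Seconds attributable to OpenSSL once the "pure mORMot" share is removed. */
static double openssl_only_seconds(double total, double pure_mormot)
{
    assert(total > pure_mormot);
    return total - pure_mormot;
}

/* Slowdown factor of one run against a baseline run, OpenSSL share only. */
static double openssl_slowdown(double total, double baseline_total, double pure_mormot)
{
    return openssl_only_seconds(total, pure_mormot) /
           openssl_only_seconds(baseline_total, pure_mormot);
}
```

<p>For instance, <code>openssl_slowdown(33.0, 15.0, 4.5)</code> compares the 3.0 run against the 1.1 run.</p>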
<p>Within the <em>mORMot</em> OpenSSL wrapper, we try to cache as many contexts as possible. For instance, we don't look up the OpenSSL algorithm by name for each call, but we cache it at runtime to avoid any slowdown.<br />
But this does not seem to be enough with OpenSSL 3.0, which may affect your application performance.</p>
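<p>The caching idea can be sketched in C as a tiny name-to-handle table placed in front of an expensive lookup. This is only an illustration under assumptions, not the <em>mORMot</em> code: <code>algo_cache</code>, <code>algo_cache_get</code> and <code>demo_fetch</code> are hypothetical names, the table is not thread-safe, and in a real wrapper the callback would presumably wrap a by-name OpenSSL lookup such as <code>EVP_get_cipherbyname()</code>.</p>

```c
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 16

/* Stands in for an expensive by-name algorithm lookup. */
typedef const void *(*fetch_fn)(const char *name);

typedef struct {
    const char *name[CACHE_SLOTS];   /* caller keeps the name strings alive */
    const void *handle[CACHE_SLOTS];
    size_t count;
    fetch_fn fetch;
} algo_cache;

static void algo_cache_init(algo_cache *c, fetch_fn fetch)
{
    memset(c, 0, sizeof *c);
    c->fetch = fetch;
}

/* Return the cached handle for name, calling the slow lookup only on a miss. */
static const void *algo_cache_get(algo_cache *c, const char *name)
{
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->name[i], name) == 0)
            return c->handle[i];
    const void *h = c->fetch(name);
    if (h != NULL && c->count < CACHE_SLOTS) {
        c->name[c->count] = name;
        c->handle[c->count] = h;
        c->count++;
    }
    return h;
}

/* Hypothetical slow lookup used for demonstration: counts how often it runs. */
static int demo_fetch_calls = 0;

static const void *demo_fetch(const char *name)
{
    static const int dummy = 0;
    (void)name;
    demo_fetch_calls++;
    return &dummy;
}
```

<p>Asking twice for the same name, e.g. <code>"aes-128-ctr"</code>, triggers the slow lookup only once; when the table is full, the sketch simply falls back to the slow path.</p>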
<h4>To Support or Not Support</h4>
<p>So OpenSSL 3.1 seems to be the way to go.</p>
<p>On Linux (or other POSIX systems), you are likely to use the library shipped with the system.<br />
So you would not have to worry about which version to use. And, sadly, it is very likely that your distribution provides OpenSSL 3.0 and not OpenSSL 3.1.</p>
<p>On Windows (or Mac), you could (should?) use your "own" dll/so files, so you have to take into account the support level of the library.</p>
<ul>
<li>OpenSSL 3.0 is a Long Term Support (LTS) version, which will be maintained until 7th September 2026.</li>
<li>OpenSSL 3.1 will be supported only until 14th March 2025.</li>
</ul>
<p>These support end dates may appear counter-intuitive, but this is a usual practice in Open Source projects, the best known being perhaps <a href="https://ubuntu.com/blog/what-is-an-ubuntu-lts-release">Ubuntu LTS versions</a>.<br />
For more information about OpenSSL support lifetime, look at the <a href="https://www.openssl.org/source/">official OpenSSL Downloads page</a>.</p>
<p>So, for most projects, especially on Windows where you are likely to ship the OpenSSL dll with your own executable, switching to OpenSSL 3.1 is likely to be the way to go.<br />
If you need to gather some security certification for your product, you may consider using OpenSSL 3.0 LTS version, which may help your certification remain active for a longer period.</p>
<p>Any feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6697">welcome on our forum</a>, as usual!</p>
<h2>mORMot 2 on Ampere AArch64 CPU</h2>
<p>2021-08-17, Arnaud Bouchez. Category: mORMot Framework. Tags: 64bit, aarch64, AES, AES-GCM, AES-Ni, ampere, Android, asm, avx, blog, C, compression, crc, crc32c, FPC, FreePascal, Lazarus, Linux, Microservices, mORMot, mORMot2, multithread, oraclecloud, performance, REST, SOA, SQLite3</p>
<p>Over the last weeks, we have enhanced mORMot support for one of the most powerful AArch64 CPUs available: the <a href="https://amperecomputing.com/">Ampere Altra CPU</a>, as made available on the <a href="https://www.oracle.com/cloud/compute/arm/">Oracle Cloud Infrastructure</a>.</p>
<p><img src="https://blog.synopse.info/public/blog/AmpereCPU.jpg" alt="" /></p>
<p>Long story short, this is amazing hardware to run on the server side, with performance close to what Intel/AMD offers, but with <a href="https://www.oracle.com/cloud/compute/arm/why-arm-processors/">almost linear multi-core scalability</a>. The FPC compiler is able to generate good code for it, and our mORMot 2 library is able to use the hardware accelerated opcodes for AES, SHA2, and crc32/crc32c.</p> <h3>Always Free Ampere VM</h3>
<p>Back to the beginning. Tom, one mORMot user, reported on our forum that he successfully <a href="https://synopse.info/forum/viewtopic.php?id=5945">installed FPC and Lazarus on the Oracle Cloud platform</a>, and accessed it via SSH/XRDP:</p>
<blockquote><p>Just open account on Oracle Cloud and create new compute VM: 4 ARMv8.2 CPU 3GHz, 24GB Ram (yes 24GB).
This is always free VM (you can combine this 4 cores and 24GB Ram to 1 or many (4) VM).
Install Ubuntu 20.04 server, then install LXDE and XRdp for remote access.
Now I have nice speed workstation. Install fpcupdeluxe then fpc 3.2.2/laz 2.0.12, all OK. fpcup build is faster then my local pc build <img src="https://blog.synopse.info?pf=smile.svg" alt=":-)" class="smiley" />
This OCI VM can be great mormot application server for some projects. I don't have any connection to Oracle - just test their product.</p></blockquote>
<p>I did the same, and in fact, this platform is really easy to work with, once you have paid a 1€ credit card fee to validate your account. Then you will get an "Always Free VM", with 4 Ampere cores, and 24GB of RAM. Amazing. The Oracle people really want to break into the cloud market, and they make it wide open for developers, so that they consider their Cloud instead of Microsoft's or Amazon's.</p>
<h3>FPC and Lazarus on Linux/AArch64</h3>
<p>The Lazarus experience is very good on this platform, even remotely. The only issue is the debugger. Gdb was pretty unstable for me - almost as unstable as on Windows. But somewhat usable, until it crashes. <img src="https://blog.synopse.info?pf=sad.svg" alt=":(" class="smiley" /></p>
<p>Finally, and thanks to Alfred - our great friend behind <a href="https://github.com/LongDirtyAnimAlf/fpcupdeluxe">fpcupdeluxe</a> - we identified a problem with our asm stubs when calling mORMot interface-based services. It was in fact a FPC "feature" (it is documented as such in the compiler so it is not a bug), in how arguments are passed as result in the AARCH64 calling ABI. Once identified, we made an explicit exception to help circumvent the problem.</p>
<p>The FPC code quality seems good. At least at the level of the x86_64 Intel/AMD platform. Not as good as gcc for sure, but good enough for production code, and good speed. The only big limitation is that the inlined assembly is very limited: only a few AARCH64 opcodes are available - only what was mandatory for the basic FPC RTL needs.</p>
<h3>Tuning mORMot 2 for AArch64</h3>
<p>To enhance performance, we replaced the basic FPC RTL Move/FillChar functions with the libc memmove/memset. And performance is amazing:</p>
<pre>
FPC RTL
FillChar in 19.06ms, 20.3 GB/s
Move in 9.95ms, 1.5 GB/s
small Move in 15.85ms, 1.3 GB/s
big Move in 222.41ms, 1.7 GB/s
mORMot functions calling the Ubuntu Gnu libc:
FillCharFast in 8.84ms, 43.9 GB/s
MoveFast in 1.25ms, 12.4 GB/s
small MoveFast in 4.98ms, 4.3 GB/s
big MoveFast in 34.32ms, 11.3 GB/s
</pre>
<p>In comparison, here are the numbers on my Core i5 7200U CPU, of mORMot tuned x86_64 asm (faster than the FPC RTL), using SSE2 or AVX instructions:</p>
<pre>
FillCharFast [] in 21.43ms, 18.1 GB/s
MoveFast [] in 2.29ms, 6.8 GB/s
small MoveFast [] in 4.29ms, 5.1 GB/s
big MoveFast [] in 68.33ms, 5.7 GB/s
FillCharFast [cpuAVX] in 20.28ms, 19.1 GB/s
MoveFast [cpuAVX] in 2.26ms, 6.8 GB/s
small MoveFast [cpuAVX] in 4.25ms, 5.1 GB/s
big MoveFast [cpuAVX] in 69.93ms, 5.5 GB/s
</pre>
<p>So we can see that the Ampere CPU memory design is pretty efficient. It is up to twice as fast as a Core i5 7200U CPU.</p>
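<p>For readers who want to reproduce this kind of figure, here is a minimal C sketch of how a memory-move throughput could be measured around the libc <code>memmove</code>. It is not the mORMot benchmark: the helper names are hypothetical, timing uses the CPU time from <code>clock()</code>, and 1 GB is assumed to mean 2^30 bytes, which may differ from what the numbers above use.</p>

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Convert a byte count and a duration into GB/s (1 GB = 2^30 bytes here). */
static double gb_per_second(double bytes, double seconds)
{
    return seconds > 0.0 ? bytes / seconds / (1024.0 * 1024.0 * 1024.0) : 0.0;
}

/* Run memmove() over len bytes (len > 0) for a number of rounds,
   and return the elapsed CPU time in seconds. */
static double time_memmove(void *dst, const void *src, size_t len, int rounds)
{
    volatile unsigned char sink = 0; /* keeps the copies from being optimized away */
    clock_t start = clock();
    for (int i = 0; i < rounds; i++) {
        memmove(dst, src, len);
        sink ^= ((unsigned char *)dst)[len - 1];
    }
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

<p>The throughput is then <code>gb_per_second((double)len * rounds, elapsed)</code>; buffers much larger than the CPU caches and many rounds are needed before the result means anything.</p>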
<p>We had to go further, and have some fun with one bottleneck of every server operation: encryption and hashes. So we wrote some C code to be able to use the efficient HW acceleration we wanted for encryption and hashes. You can find the source code in <a href="https://github.com/synopse/mORMot2/tree/master/res/static/armv8">the /res/static/armv8 sub-folder of our repository</a>. Now we have tremendous performance for AES, GCM, SHA2 and CRC32/CRC32C computation.</p>
<pre>
2500 crc32c in 259us i.e. 9.2M/s or 20 GB/s
2500 xxhash32 in 1.47ms i.e. 1.6M/s or 3.5 GB/s
2500 crc32 in 259us i.e. 9.2M/s or 20 GB/s
2500 adler32 in 469us i.e. 5M/s or 11 GB/s
2500 hash32 in 584us i.e. 4M/s or 8.9 GB/s
2500 md5 in 12.12ms i.e. 201.3K/s or 438.7 MB/s
2500 sha1 in 21.75ms i.e. 112.2K/s or 244.5 MB/s
2500 hmacsha1 in 23.81ms i.e. 102.5K/s or 223.4 MB/s
2500 sha256 in 3.41ms i.e. 714.7K/s or 1.5 GB/s
2500 hmacsha256 in 4.12ms i.e. 591.2K/s or 1.2 GB/s
2500 sha384 in 27.71ms i.e. 88K/s or 191.9 MB/s
2500 hmacsha384 in 32.69ms i.e. 74.6K/s or 162.7 MB/s
2500 sha512 in 27.73ms i.e. 88K/s or 191.8 MB/s
2500 hmacsha512 in 32.77ms i.e. 74.4K/s or 162.3 MB/s
2500 sha3_256 in 35.82ms i.e. 68.1K/s or 148.5 MB/s
2500 sha3_512 in 65.48ms i.e. 37.2K/s or 81.2 MB/s
2500 rc4 in 12.98ms i.e. 188K/s or 409.8 MB/s
2500 mormot aes-128-cfb in 8.84ms i.e. 276.1K/s or 601.7 MB/s
2500 mormot aes-128-ofb in 3.78ms i.e. 645K/s or 1.3 GB/s
2500 mormot aes-128-c64 in 4.39ms i.e. 555.6K/s or 1.1 GB/s
2500 mormot aes-128-ctr in 4.52ms i.e. 539.7K/s or 1.1 GB/s
2500 mormot aes-128-cfc in 9.16ms i.e. 266.3K/s or 580.4 MB/s
2500 mormot aes-128-ofc in 5.25ms i.e. 465K/s or 0.9 GB/s
2500 mormot aes-128-ctc in 5.74ms i.e. 425.1K/s or 0.9 GB/s
2500 mormot aes-128-gcm in 7.52ms i.e. 324.5K/s or 707.2 MB/s
2500 mormot aes-256-cfb in 9.52ms i.e. 256.2K/s or 558.4 MB/s
2500 mormot aes-256-ofb in 4.71ms i.e. 517.6K/s or 1.1 GB/s
2500 mormot aes-256-c64 in 5.30ms i.e. 460.5K/s or 0.9 GB/s
2500 mormot aes-256-ctr in 5.33ms i.e. 457.5K/s or 0.9 GB/s
2500 mormot aes-256-cfc in 10.04ms i.e. 243K/s or 529.5 MB/s
2500 mormot aes-256-ofc in 6.11ms i.e. 399K/s or 869.6 MB/s
2500 mormot aes-256-ctc in 6.77ms i.e. 360.4K/s or 785.5 MB/s
2500 mormot aes-256-gcm in 8.38ms i.e. 291.1K/s or 634.4 MB/s
2500 openssl aes-128-cfb in 4.94ms i.e. 493.4K/s or 1 GB/s
2500 openssl aes-128-ofb in 4.12ms i.e. 591.2K/s or 1.2 GB/s
2500 openssl aes-128-ctr in 1.94ms i.e. 1.2M/s or 2.6 GB/s
2500 openssl aes-128-gcm in 3.18ms i.e. 767K/s or 1.6 GB/s
2500 openssl aes-256-cfb in 5.83ms i.e. 418.5K/s or 912.1 MB/s
2500 openssl aes-256-ofb in 5.04ms i.e. 484.1K/s or 1 GB/s
2500 openssl aes-256-ctr in 2.42ms i.e. 0.9M/s or 2.1 GB/s
2500 openssl aes-256-gcm in 3.66ms i.e. 667K/s or 1.4 GB/s
2500 shake128 in 29.63ms i.e. 82.3K/s or 179.5 MB/s
2500 shake256 in 35.07ms i.e. 69.5K/s or 151.6 MB/s
</pre>
<p>Here are the numbers on my Core i5 7200U CPU, with optimized asm, and the last OpenSSL calls:</p>
<pre>
2500 crc32c in 224us i.e. 10.6M/s or 23.1 GB/s
2500 xxhash32 in 817us i.e. 2.9M/s or 6.3 GB/s
2500 crc32 in 341us i.e. 6.9M/s or 15.2 GB/s
2500 adler32 in 241us i.e. 9.8M/s or 21.5 GB/s
2500 hash32 in 441us i.e. 5.4M/s or 11.7 GB/s
2500 aesnihash in 218us i.e. 10.9M/s or 23.8 GB/s
2500 md5 in 8.29ms i.e. 294.1K/s or 641.1 MB/s
2500 sha1 in 13.72ms i.e. 177.8K/s or 387.5 MB/s
2500 hmacsha1 in 15.05ms i.e. 162.1K/s or 353.3 MB/s
2500 sha256 in 17.40ms i.e. 140.2K/s or 305.6 MB/s
2500 hmacsha256 in 18.71ms i.e. 130.4K/s or 284.2 MB/s
2500 sha384 in 11.59ms i.e. 210.5K/s or 458.9 MB/s
2500 hmacsha384 in 13.84ms i.e. 176.3K/s or 384.2 MB/s
2500 sha512 in 11.59ms i.e. 210.5K/s or 458.8 MB/s
2500 hmacsha512 in 13.89ms i.e. 175.7K/s or 382.9 MB/s
2500 sha3_256 in 26.66ms i.e. 91.5K/s or 199.5 MB/s
2500 sha3_512 in 47.96ms i.e. 50.9K/s or 110.9 MB/s
2500 rc4 in 14.05ms i.e. 173.7K/s or 378.6 MB/s
2500 mormot aes-128-cfb in 4.59ms i.e. 530.9K/s or 1.1 GB/s
2500 mormot aes-128-ofb in 4.52ms i.e. 539.4K/s or 1.1 GB/s
2500 mormot aes-128-c64 in 6.23ms i.e. 391.7K/s or 853.7 MB/s
2500 mormot aes-128-ctr in 1.40ms i.e. 1.6M/s or 3.6 GB/s
2500 mormot aes-128-cfc in 4.75ms i.e. 513.2K/s or 1 GB/s
2500 mormot aes-128-ofc in 5.22ms i.e. 467.7K/s or 0.9 GB/s
2500 mormot aes-128-ctc in 1.72ms i.e. 1.3M/s or 3 GB/s
2500 mormot aes-128-gcm in 2.28ms i.e. 1M/s or 2.2 GB/s
2500 mormot aes-256-cfb in 6.12ms i.e. 398.4K/s or 868.3 MB/s
2500 mormot aes-256-ofb in 6.10ms i.e. 400K/s or 871.7 MB/s
2500 mormot aes-256-c64 in 7.86ms i.e. 310.6K/s or 676.9 MB/s
2500 mormot aes-256-ctr in 1.82ms i.e. 1.3M/s or 2.8 GB/s
2500 mormot aes-256-cfc in 6.36ms i.e. 383.5K/s or 835.9 MB/s
2500 mormot aes-256-ofc in 6.77ms i.e. 360.1K/s or 784.8 MB/s
2500 mormot aes-256-ctc in 2.02ms i.e. 1.1M/s or 2.5 GB/s
2500 mormot aes-256-gcm in 2.68ms i.e. 909.2K/s or 1.9 GB/s
2500 openssl aes-128-cfb in 7.11ms i.e. 342.9K/s or 747.3 MB/s
2500 openssl aes-128-ofb in 5.21ms i.e. 468K/s or 1 GB/s
2500 openssl aes-128-ctr in 1.54ms i.e. 1.5M/s or 3.3 GB/s
2500 openssl aes-128-gcm in 1.85ms i.e. 1.2M/s or 2.8 GB/s
2500 openssl aes-256-cfb in 8.65ms i.e. 282.2K/s or 615 MB/s
2500 openssl aes-256-ofb in 6.82ms i.e. 357.6K/s or 779.3 MB/s
2500 openssl aes-256-ctr in 1.93ms i.e. 1.2M/s or 2.6 GB/s
2500 openssl aes-256-gcm in 2.27ms i.e. 1M/s or 2.2 GB/s
2500 shake128 in 23.47ms i.e. 104K/s or 226.6 MB/s
2500 shake256 in 29.64ms i.e. 82.3K/s or 179.5 MB/s
</pre>
<p>The mORMot plain pascal code is used for MD5, SHA1, or shake/SHA3. So it is slower than our optimized asm for Intel/AMD. But not so slow. And those algorithms are either deprecated or not widely used - therefore they are not a bottleneck. OpenSSL numbers are pretty good too on this platform. As a result, AES, GCM, SHA-2 and crc32/crc32c performance is comparable between AARCH64 and Intel/AMD. With amazing SHA-2 numbers.</p>
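<p>To give an idea of what such hardware-accelerated C code can look like, here is a minimal crc32c sketch. It is not the code from the mORMot repository: it assumes a little-endian target, uses the ARMv8 CRC32 intrinsics from <code>arm_acle.h</code> when the compiler defines <code>__ARM_FEATURE_CRC32</code> (e.g. with <code>-march=armv8-a+crc</code>), and otherwise falls back to a slow bit-by-bit loop so that it still builds on other CPUs.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#endif

/* CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
   Pass 0 as crc to start, or a previous result to continue a stream. */
static uint32_t crc32c(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = (const uint8_t *)buf;
    crc = ~crc;
#if defined(__ARM_FEATURE_CRC32)
    while (len >= 8) {            /* one hardware opcode per 8 bytes */
        uint64_t v;
        memcpy(&v, p, 8);         /* safe unaligned load */
        crc = __crc32cd(crc, v);
        p += 8;
        len -= 8;
    }
    while (len--)                 /* remaining tail bytes */
        crc = __crc32cb(crc, *p++);
#else
    while (len--) {               /* portable fallback, one bit at a time */
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
#endif
    return ~crc;
}
```

<p>Both paths should return the standard check value <code>0xE3069283</code> for the ASCII string <code>123456789</code>.</p>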
<p>Then, we compiled the latest SQLite3, Lizard and libdeflate as static libraries, so that you could use them with your executable with no external dependency. Performance is very good:</p>
<pre>
TAlgoSynLZ 3.8 MB->2 MB: comp 287:151MB/s decomp 215:409MB/s
TAlgoLizard 3.8 MB->1.9 MB: comp 18:9MB/s decomp 857:1667MB/s
TAlgoLizardFast 3.8 MB->2.3 MB: comp 193:116MB/s decomp 1282:2135MB/s
TAlgoLizardHuffman 3.8 MB->1.8 MB: comp 84:40MB/s decomp 394:827MB/s
TAlgoDeflate 3.8 MB->1.5 MB: comp 30:12MB/s decomp 78:196MB/s
TAlgoDeflateFast 3.8 MB->1.6 MB: comp 48:20MB/s decomp 73:174MB/s
</pre>
<p>I was a bit surprised by how well the pure pascal version of the SynLZ algorithm was running, once compiled with FPC 3.2, on AARCH64. Deflate compression also gets a small advantage from our statically linked libdeflate, compared to plain zlib. But the very good news is that Lizard is really fast on AARCH64: even though it is written in plain C with no manual SIMD/asm code, it performs very well on non Intel/AMD platforms. More than 2GB/s for decompression is very high. I was told that Lizard may be a bit behind ZStandard on Intel/AMD, but its code is simpler, and much more CPU agnostic.</p>
<pre>
2.4. Sqlite file memory map:
- Database direct access: 22,264 assertions passed 55.40ms
- Virtual table direct access: 12 assertions passed 347us
- TOrmTableJson: 144,083 assertions passed 60.25ms
- TRestClientDB: 608,196 assertions passed 783.02ms
- Regexp function: 6,015 assertions passed 11.07ms
- TRecordVersion: 20,060 assertions passed 51.28ms
Total failed: 0 / 800,630 - Sqlite file memory map PASSED 961.45ms
</pre>
<p>Here SQLite3 numbers are similar to what I have on Intel/AMD. So I guess we could really consider using this database as storage back-end for mORMot MicroServices with their stand-alone persistence layer.</p>
<h3>Ampere and Beyond - Apple M1?</h3>
<p>We also tried to support ARM/AARCH64 CPUs as much as possible with mORMot 2. So now we detect the CPU type and the HW platform it runs on, especially on Linux or Android - which is also an AARCH64 platform. Here is what our regression tests report when they finish:</p>
<pre>
Ubuntu 20.04.2 LTS - Linux 5.8.0-1037-oracle (cp utf8)
2 x ARM Neoverse-N1 (aarch64)
on QEMU KVM Virtual Machine virt-4.2
Using mORMot 2.0.1
TSqlite3LibraryStatic 3.36.0 with internal MM
Generated with: Free Pascal 3.2 64 bit Linux compiler
Time elapsed for all tests: 44.38s
Performed 2021-08-17 13:44:09 by ubuntu on lxde
Total assertions failed for all test suits: 0 / 66,050,607
</pre>
<p>As you can see, the CPU was properly identified as <a href="https://www.arm.com/products/silicon-ip-cpu/neoverse/neoverse-n1">ARM Neoverse-N1</a>.</p>
<p>We could reasonably consider using <em>mORMot</em> code on an Apple M1/M1X/M2 CPU, thanks to the FPC (cross-)compiler, if we had access to this hardware. Any feedback is welcome.</p>
<h3>Server Process Performance</h3>
<p>All regression tests pass all green, with pretty consistent performance across their various tasks. JSON process, ORM, SOA or encryption: everything flies on the Ampere CPU. You can check <a href="https://gist.github.com/synopse/0e7275684a2e2bbd2206940c3827055c">the detailed regression tests console output</a>.</p>
<p>Here are some numbers about UTF-8 or JSON process:</p>
<pre>
StrLen() in 1.43ms, 13.3 GB/s
IsValidUtf8(RawUtf8) in 11.75ms, 1.6 GB/s
IsValidUtf8(PUtf8Char) in 13.08ms, 1.4 GB/s
IsValidJson(RawUtf8) in 22.84ms, 858.2 MB/s
IsValidJson(PUtf8Char) in 22.93ms, 854.7 MB/s
JsonArrayCount(P) in 22.97ms, 853.1 MB/s
JsonArrayCount(P,PMax) in 22.89ms, 856.4 MB/s
JsonObjectPropCount() in 11.90ms, 0.9 GB/s
jsonUnquotedPropNameCompact in 72.35ms, 240.6 MB/s
jsonHumanReadable in 119.06ms, 209.4 MB/s
TDocVariant in 245.99ms, 79.7 MB/s
TDocVariant no guess in 260.57ms, 75.2 MB/s
TDocVariant dvoInternNames in 247.56ms, 79.1 MB/s
TOrmTableJson GetJsonValues in 34.88ms, 247.1 MB/s
TOrmTableJson expanded in 42.70ms, 459 MB/s
TOrmTableJson not expanded in 21.42ms, 402.4 MB/s
DynArrayLoadJson in 87.96ms, 222.8 MB/s
TOrmPeopleObjArray in 131.10ms, 149.5 MB/s
fpjson in 115.09ms, 17 MB/s
</pre>
<p>It is nice to see that our pascal code, which has been deeply tuned to let FPC generate the best x86_64 assembly possible, is also able to give very good performance on AARCH64. No need to write some dedicated code, and pollute the source with plenty of <em>$ifdef/$endif</em>: x86_64 is already some kind of RISC-like architecture, with a bigger number of registers, and 64-bit efficient processing. No need to rewrite everything. Optimized pascal code, with tuned pointer arithmetic is platform neutral. I like the quote of SQLite3 author saying that <a href="https://www.sqlite.org/whyc.html">C is a "portable assembly"</a>, and that we could also use tuned pascal code, as we try to do in the mORMot core units, to leverage modern CPU hardware, without the need of fighting against any hype/versatile language.</p>
<h3>Asm is Fun Again</h3>
<p>So we are pretty excited to see how this platform will go in the future. mORMot has invested a lot of time, refactoring and asm tuning to leverage the Intel/AMD platform, focusing on the server side performance. But this AARCH64 technology is really promising, and I can tell you that its RISC instruction set was very cleverly designed. It is very rich and powerful, almost perfect in its balance between power and expressiveness, in respect to the x86_64 platform, which has a lot of inconsistencies and seems outdated when you compare both asm. After decades playing with i386 or x86_64 asm, I had fun again with the ARM v8 assembly. It tastes like "assembly as it should be" (tm). Linking some static C code is a good balance between leveraging the hardware when needed, and keeping platform-independent pascal source. And FPC, as a compiler, is amazing by being open and well done on so many CPUs and platforms. Open Source rocks!</p>
<p>As usual, <a href="https://synopse.info/forum/viewtopic.php?pid=35602#p35602">feedback is welcome on our forum</a>.</p>
<h2>Job Offer: FPC mORMot 2 and WAPT</h2>
<p>2021-07-08, Arnaud Bouchez. Category: Pascal Programming. Tags: FPC, Job, Lazarus, mORMot, mORMot2, WAPT</p>
<p>Good news!<br />
The French company I work for, Tranquil IT, is hiring FPC / Lazarus / mORMot developers. Remote work possible.</p>
<p><img src="https://blog.synopse.info/public/blog/logo_Tranquil_IT.png" alt="" /></p>
<p>I share below the Job Offer from my boss Vincent.<br />
We look forward to working with you on this great mORMot-powered project!</p>
<p><a href="https://www.tranquil.it/en/who-are-we/join-us/">https://www.tranquil.it/en/who-are-we/join-us</a></p> <p><em>If you dream to work on great projects with a team of talented developers and with technologies you love and believe in, Tranquil IT wants to hire you too.</em></p>
<p><em>Tranquil IT is based in Nantes, on the French Atlantic coast. If you have the right skills and you are self-driven, you can work in Nantes or remotely (as Arnaud does).</em></p>
<p><em>Our team is fluent in English, mORMot, Lazarus, FreePascal, Python and system administration. Tranquil IT is best known for developing one of the most useful tools to help private and public organizations prevent cyberattacks, <a href="https://www.wapt.fr/en/doc">WAPT Deployment software</a>. Tranquil IT is also known for its work with <a href="https://samba.tranquil.it/doc/en">Samba Active Directory</a>.</em></p>
<p><em>We have a lot of ideas to improve WAPT, so join us, bring ideas of your own, and become part of the mORMot / Tranquil IT adventure!</em></p>
<p><em>To apply, contact us at rh (at) tranquil (dot) it.</em><br />
Vincent CARDON, Président<br />
TRANQUIL IT</p>
<p><em>PS: You can send your résumé to this email address, preferably with links to code you proudly wrote (we enjoy reading nice code!). Indicate whether you are interested in working on the mORMot framework, on improving Lazarus/FPC components, or on the end-user software WAPT.</em></p>
<h2>Status of mORMot ORM SOA MVC with FPC</h2>
<p>2018-02-07, AB4327-GANDI. Category: mORMot Framework. Tags: 64bit, asm, blog, BSD, crc32c, CrossPlatform, Delphi, ECC, exception, FreePascal, Lazarus, Linux, MaxOSX, mORMot, ORM, performance, Rest, RTTI, SOA, Source, SQLite3, sse42</p>
<p>In the last weeks/months, we worked a lot with FPC.<br />
Delphi is still our main IDE, due to its better debugging experience under
Windows, but we aim to provide premium support for FPC, on all platforms,
especially Linux.</p>
<p><img src="https://blog.synopse.info?post/public/ScreenShots/lazarusaboutbox.png" alt="" title="Lazarus FPC About, Feb 2018" /></p>
<p>The new Delphi Linux compiler is out of scope, since it is heavily priced,
its performance is not so good, and ARC broke memory management so would need a
deep review/rewrite of our source code, which we can't afford - since we have
FPC which is, <a href="https://synopse.info/forum/viewtopic.php?pid=25984#p25984">in our
opinion</a>, a much better compiler for Linux.<br />
Of course, you can create clients for Delphi Linux and FMX, as usual, using
the <a href="https://synopse.info/files/html/Synopse%20mORMot%20Framework%20SAD%201.18.html#TITL_86">cross-platform
client parts of mORMot</a>. But for server side, this compiler is not
supported, and will probably never be.</p> <p>First of all, since FPC - and Lazarus, its sibling IDE - are Open Source and
free, we can focus on supporting mainly a single version of the compiler.<br />
Since some missing RTTI for interfaces was recently merged into the trunk, we
start from the current FPC trunk as our main version. Easier than maintaining
Delphi 5 - 10.2 compatibility!</p>
<p>To install it, we usually use the <a href="https://github.com/newpascal/fpcupdeluxe">fpcupdeluxe tool</a>: you <a href="https://github.com/newpascal/fpcupdeluxe/releases">download a single binary
for your platform</a>, then you run the executable, pick the compiler (or
cross-compiler) versions you need, and everything is downloaded and compiled
from git/svn on your own computer. Then click on the desktop link, and the IDE
launches in seconds. Nice fresh air in respect to Delphi setup experience!</p>
<p style="margin-top: 0;">As I wrote, in the last weeks/months, we worked a lot
to improve FPC support.</p>
<p style="margin-top: 0;">Just a few commits:</p>
<ul>
<li><a href="https://github.com/synopse/mORMot/commit/7bf5951e322f0164d8e3852f98f9f48b1bd7a6d2">
crc32c 2x/4x speedup by using SSE4.2+pclmulqdq opcodes on x64</a> - speed is now
around 21GB/s</li>
<li><a href="https://github.com/synopse/mORMot/commit/5b569db11e47fd65e2e0f792e1383895e50352b6">
TSynDaemon fork/run support on Linux/Posix</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/4e019e1b41f6759487e7479c194fcb28c25314c9">
fixed vtQWord proper support for FPC</a> - this doesn't exist in Delphi,
but it should!</li>
<li><a href="https://github.com/synopse/mORMot/commit/bc405cffb7bfb79646b46a6afc13968a0d804f32">
added pure pascal version of SynECC.pas</a> to run on all FPC platforms,
including ARM</li>
<li>BSD/OSX <a href="https://github.com/synopse/mORMot/commit/af8b33720d0da12a8476cd5bc34eeeab417cb096">
enhanced</a> <a href="https://github.com/synopse/mORMot/commit/16332915f87f5dcc7de881c9b72b09d05b609eba">
support</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/a5352309f41d8ef70561d819e57732ac0aaf43e3">
added scripts to use fpcupdeluxe's gcc cross-compilers for FPC static linking
on Windows, Linux, BSD and OSX</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/eaddf84b04b7eaf647adb598be140276b9b42046">
updated SQLite3 engine to latest version 3.22.0</a> - statically linked under
FPC Linux (no external dependency to the system libsqlite3.so)</li>
<li><a href="https://github.com/synopse/mORMot/commit/7c6d09df6caf3d8865e488d9147b5340c1a15fca">
HTTP server enhancements and fixes for high performance and stability under
Linux behind a local nginx proxy for production servers</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/f40a379cbd3fb0e5076ac2cf664acb8b03ab86e2">
better Linux compatibility</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/ec4755a19722e744029244e7a2d1bfa985887f3e">
TSQLRecord QWord property fix</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/4b7786c3be0d1aec5abaad26f9de90e58edcce16">
tuned/enhanced logging content</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/d9fe9a5bf9305a0eb05c9f5a80d97a1b49f3bcb8">
deep refactoring of FPC RTTI access to have the same level than Delphi</a></li>
<li><a href="https://github.com/synopse/mORMot/commit/95c5d56edbcb641f716c36d78792853eae67c689">
implemented Exceptions interception and logging for FPC</a><br />
with call stack trace (if available) - includes source code lines if compiled
using -g or -gl switches - tested on Win32, Win64, Linux i386 and x86_64 but
should work on other OS <img src="https://blog.synopse.info?pf=smile.svg" alt=":)" class="smiley" /></li>
<li>SyNode JavaScript engine support under FPC / Linux x86_64 by <a href="https://github.com/synopse/mORMot/commits?author=ssoftpro">ssoftpro</a> and
<a href="https://github.com/synopse/mORMot/commits?author=pavelmash">pavelmash</a> -
including a lot of fixes and tuning for this platform <a href="https://synopse.info/forum/viewtopic.php?pid=25985#p25985">to be heavily used
on production</a></li>
<li>and a lot of smaller enhancements (just search for FPC <a href="https://synopse.info/fossil/timeline?n=500&y=ci&t=&ms=exact">in
the commit timeline</a>), especially <a href="https://github.com/synopse/mORMot/commit/c5ad9b1d1f57177a8fe686271370cb12fc29d3d9">
tuning</a> the pascal code to better compile and execute under FPC, which can
generate very efficient assembly!</li>
</ul>
<div>As you can see, exciting times!</div>
<div>To be honest, the more we work with FPC as a compiler, the more we like
it.<br />
Stay tuned, and we encourage you to discover FPC/Lazarus!</div>
<h2>SynTaskDialog.pas version for Lazarus</h2>
<p>2015-03-09, AB4327-GANDI. Category: Open Source libraries. Tags: blog, CrossPlatform, Delphi, FreePascal, Lazarus, LCL, Linux, MaxOSX, TaskDialog</p>
<p>Just to share a commit of some interest to FPC/Lazarus users.</p>
<p>Ondrej Pokorny (aka "reddwarf" in our forums) sent us a nice
implementation of our <code>SynTaskDialog.pas</code> unit, compatible with
Lazarus.</p>
<p>Since it is incompatible with the current state of the other <em>mORMot</em>
UI units (which are still VCL-based), we have included the source in the
<a href="https://github.com/synopse/mORMot/tree/master/SQLite3/Samples/ThirdPartyDemos/Ondrej/SynTaskDialog4Lazarus">
Third Party subfolder</a> of our source code repository.<br />
Direct link is <a href="https://github.com/synopse/mORMot/tree/master/SQLite3/Samples/ThirdPartyDemos/Ondrej/SynTaskDialog4Lazarus">https://github.com/.../SynTaskDialog4Lazarus</a></p>
<p>The resulting unit is cross-platform, as shown by the following
screenshots:</p>
<p><a href="https://blog.synopse.info?post/public/ScreenShots/SynTaskDialogMacOSX.png"><img src="https://blog.synopse.info?post/public/ScreenShots/.SynTaskDialogMacOSX_m.jpg" alt="" title="SynTaskDialog for Lazarus on Mac OSX, Mar 2015" /></a></p>
<p><a href="https://blog.synopse.info?post/public/ScreenShots/SynTaskDialogLinux.png"><img src="https://blog.synopse.info?post/public/ScreenShots/.SynTaskDialogLinux_m.jpg" alt="" title="SynTaskDialog for Lazarus on Linux, Mar 2015" /></a></p>
<p>Feedback is <a href="http://synopse.info/forum/viewtopic.php?id=2410">welcome on our forum, as
usual</a>.</p>