<p><strong>Synopse Open Source</strong> - mORMot MVC / SOA / ORM and friends</p>
<p><strong>IDocList/IDocDict JSON for Delphi and FPC</strong><br />
Arnaud Bouchez - published 2024-02-01, updated 2024-02-02 - mORMot Framework<br />
Tags: Delphi, FreePascal, GoodPractice, IDocDict, IDocList, JSON, mORMot, mORMot2, performance, TDocVariant</p>
<p>For years, our Open Source <em>mORMot</em> framework has offered several ways to work with any combination of array/object documents defined at runtime, e.g. via JSON, with a lot of features and very high performance.</p>
<p><img src="https://blog.synopse.info?post/public/blog/json.png" alt="" /></p>
<p>Our <code>TDocVariant</code> custom variant type is a powerful way of working with such schema-less data, but some users found it confusing.<br />
So we developed a new set of <code>interface</code> definitions around it, to ease its usage without sacrificing its power. We modeled them on Python <a href="https://www.w3schools.com/python/python_lists.asp">Lists</a> and <a href="https://www.w3schools.com/python/python_dictionaries.asp">Dictionaries</a>, which are proven ground - with some extensions of course.</p> <h4>TDocVariant Pros and Cons</h4>
<p>For years, our <code>TDocVariant</code> has been able to store any JSON/BSON document-based content, i.e. either:</p>
<ul>
<li>Name/value pairs, for object-oriented documents - internally identified as <code>dvObject</code> sub-type;</li>
<li>An array of values (including nested documents), for array-oriented documents - internally identified as <code>dvArray</code> sub-type;</li>
<li>Any combination of the two, by nesting <code>TDocVariant</code> instances.</li>
</ul>
<p>Every <code>TDocVariant</code> instance is also a custom <code>variant</code> type:</p>
<ul>
<li>So you can just store or convert it to or from <code>variant</code> variables;</li>
<li>You can use <em>late binding</em> to access its object properties, which is some kind of magic in the rigid world of modern pascal;</li>
<li>The Delphi IDE (and Lazarus 3.x) debuggers have native support for it, so they can display its <code>variant</code> content as JSON;</li>
<li>If you define <code>variant</code> types in any class or record, our framework will recognize <code>TDocVariant</code> content and (un)serialize it as JSON, e.g. in its ORM, SOA or Mustache/MVC parts.</li>
</ul>
<p>Several drawbacks also come with this power:</p>
<ul>
<li>Switching between <code>variant</code> and its <code>TDocVariantData</code> record may be tricky, and it sometimes requires some confusing pointer references;</li>
<li>Each <code>TDocVariant</code> instance could be used as a weak reference to other data, or maintain its own content - in some corner cases, incorrect use may leak memory or trigger GPF issues;</li>
<li>A <code>TDocVariant</code> could be either an object/dictionary or an array/list, so finding the right methods may be difficult, and the wrong ones may raise exceptions at runtime;</li>
<li>It evolved from a simple store to a full in-memory engine, so the advanced features are usually underestimated;</li>
<li>The <code>TDocVariantData</code> record is far away from the class system most Delphi/FPC developers are used to;</li>
<li>By default, <code>double</code> values are not parsed - only <code>currency</code> - which makes sense if you don't want to lose any precision, but has been found confusing.</li>
</ul>
<p>Enough complaining. <br />
Let's just make it better.</p>
<h4>Entering IDocList and IDocDict Interfaces</h4>
<p>We introduced two high-level wrapper <code>interface</code> types:</p>
<ul>
<li><code>IDocList</code> (or its alias <code>IDocArray</code>) to store a list of elements;</li>
<li><code>IDocDict</code> (or its alias <code>IDocObject</code>) to store a dictionary of key:value pairs.</li>
</ul>
<p>The <code>interface</code> methods and naming follow the usual Python Lists and Dictionaries, and wrap their own <code>TDocVariant</code> storage inside the safe and dedicated <code>IDocList</code> and <code>IDocDict</code> types.</p>
<p>On modern Delphi, you can write:</p>
<pre>
var
list: IDocList;
dict: IDocDict;
v: variant;
i: integer;
begin
// creating a new list/array from items
list := DocList([1, 2, 3, 'four', 1.0594631]); // double are allowed by default
// iterating over the list
for v in list do
Listbox1.Items.Add(v); // convert from variant to string
// or a sub-range of the list (with Python-like negative indexes)
for i in list.Range(0, -3) do
Listbox2.Items.Add(IntToStr(i)); // [1, 2] as integer
// search for the existence of some elements
assert(list.Exists(2));
assert(list.Exists('four'));
// a list of objects, from JSON, with an intruder
list := DocList('[{"a":0,"b":20},{"a":1,"b":21},"to be ignored",{"a":2,"b":22}]');
// enumerate all objects/dictionaries, ignoring non-objects elements
for dict in list.Objects do
begin
if dict.Exists('b') then
ListBox2.Items.Add(dict['b']);
if dict.Get('a', i) then
ListBox3.Items.Add(IntToStr(i));
end;
// delete one element
list.Del(1);
assert(list.Json = '[{"a":0,"b":20},"to be ignored",{"a":2,"b":22}]');
// extract one element
if list.PopItem(v, 1) then
assert(v = 'to be ignored');
// convert to a JSON string
Label1.Caption := list.ToString;
// display '[{"a":0,"b":20},{"a":2,"b":22}]'
end;
</pre>
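<p>The negative indexes of <code>Range(0, -3)</code> follow the Python slice convention. As a minimal sketch of that convention only (the <code>NormalizeIndex</code> helper is hypothetical and is not part of the <em>mORMot</em> API), a negative position counts from the end of the list, and the upper bound is exclusive:</p>
<pre>
program NegativeIndexSketch;
{$ifdef FPC}{$mode objfpc}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}
{$C+} // enable Assert()

// hypothetical helper: map a Python-style index to a 0-based position
// (negative values count from the end of a list of Len items)
function NormalizeIndex(Index, Len: integer): integer;
begin
  if Index < 0 then
    Result := Index + Len
  else
    Result := Index;
  // clamp to the 0..Len interval, as Python slices do
  if Result < 0 then
    Result := 0
  else if Result > Len then
    Result := Len;
end;

var
  start, stop: integer;
begin
  // on a 5-item list, Range(0, -3) covers positions 0 and 1
  start := NormalizeIndex(0, 5);
  stop := NormalizeIndex(-3, 5); // exclusive upper bound
  Assert(start = 0);
  Assert(stop = 2);
  WriteLn('positions ', start, '..', stop - 1);
end.
</pre>
<p>With the five-item list of the sample, this gives positions 0 and 1, which matches the <code>[1, 2]</code> result mentioned in the code comment.</p>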
<p>and even more advanced features, like sorting, searching, and expression filtering:</p>
<pre>
var
v: variant;
f: TDocDictFields;
list, list2: IDocList;
dict: IDocDict;
begin
list := DocList('[{"a":10,"b":20},{"a":1,"b":21},{"a":11,"b":20}]');
// sort a list/array by the nested objects field(s)
list.SortByKeyValue(['b', 'a']);
assert(list.Json = '[{"a":10,"b":20},{"a":11,"b":20},{"a":1,"b":21}]');
// enumerate a list/array with a conditional expression :)
for dict in list.Objects('b<21') do
assert(dict.I['b'] < 21);
// another enumeration with a variable as conditional expression
for dict in list.Objects('a=', 10) do
assert(dict.I['a'] = 10);
// create a new IDocList from a conditional expression
list2 := list.Filter('b =', 20);
assert(list2.Json = '[{"a":10,"b":20},{"a":11,"b":20}]');
// direct access to the internal TDocVariantData storage
assert(list.Value^.Count = 3);
assert(list.Value^.Kind = dvArray);
assert(dict.Value^.Kind = dvObject);
// TDocVariantData from a variant intermediary
v := list.AsVariant;
assert(_Safe(v)^.Count = 3);
v := dict.AsVariant;
assert(_Safe(v)^.Count = 2);
// high-level Python-like methods
if list.Len > 0 then
while list.PopItem(v) do
begin
assert(list.Count(v) = 0); // count the number of appearances
assert(not list.Exists(v));
Listbox1.Items.Add(v.a); // late binding
dict := DocDictFrom(v); // transtyping from variant to IDocDict
assert(dict.Exists('a') and dict.Exists('b'));
// enumerate the key:value elements of this dictionary
for f in dict do
begin
Listbox2.Items.Add(f.Key);
Listbox3.Items.Add(f.Value);
end;
end;
// create from any complex "compact" JSON
// (note the key names are not "quoted")
list := DocList('[{ab:1,cd:{ef:"two"}}]');
// we still have the late binding magic working
assert(list[0].ab = 1);
assert(list[0].cd.ef = 'two');
// create a dictionary from key:value pairs supplied from code
dict := DocDict(['one', 1, 'two', 2, 'three', _Arr([5, 6, 7, 'huit'])]);
assert(dict.Len = 3); // one dictionary with 3 elements
assert(dict.Json = '{"one":1,"two":2,"three":[5,6,7,"huit"]}');
// convert to JSON with nice formatting (line feeds and spaces)
Memo1.Text := dict.ToString(jsonHumanReadable);
// sort by key names
dict.Sort;
assert(dict.Json = '{"one":1,"three":[5,6,7,"huit"],"two":2}');
// note that it will ensure faster O(log(n)) key lookup after Sort:
// (beneficial for performance on objects with a high number of keys)
assert(dict['two'] = 2); // default lookup as variant value
assert(dict.I['two'] = 2); // explicit conversion to integer
end;
</pre>
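<p>The O(log(n)) lookup mentioned after <code>dict.Sort</code> relies on a classic property: once the key names are sorted, a binary search can discard half of the remaining keys at each step. Here is a minimal stand-alone sketch of the idea, using only the RTL. The <code>FindSortedKey</code> helper is hypothetical, assumes a case-sensitive ordinal comparison, and is not the framework's internal code:</p>
<pre>
program SortedKeyLookupSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}
{$C+} // enable Assert()

uses
  SysUtils;

// hypothetical helper: binary search of Key within the ascending sorted Keys[]
// returns the matching index, or -1 if Key is not found
function FindSortedKey(const Keys: array of string; const Key: string): integer;
var
  left, right, mid, cmp: integer;
begin
  left := 0;
  right := High(Keys);
  while left <= right do
  begin
    mid := (left + right) div 2;
    cmp := CompareStr(Keys[mid], Key);
    if cmp = 0 then
    begin
      Result := mid;
      exit;
    end;
    if cmp < 0 then
      left := mid + 1   // Key can only be in the upper half
    else
      right := mid - 1; // Key can only be in the lower half
  end;
  Result := -1;
end;

begin
  // same key order as the sorted dictionary above
  Assert(FindSortedKey(['one', 'three', 'two'], 'two') = 2);
  Assert(FindSortedKey(['one', 'three', 'two'], 'one') = 0);
  Assert(FindSortedKey(['one', 'three', 'two'], 'four') = -1);
  WriteLn('ok');
end.
</pre>
<p>With only three keys the gain is negligible, but on objects with hundreds of keys each lookup needs about ten comparisons instead of a full scan.</p>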
<p>Since the high-level instances are <code>interface</code> and the internal content is <code>variant</code>, their lifetime is both safe and conventional - and you don't need to write any <code>try..finally list.Free</code> code.</p>
<p>And performance is still high, because e.g. a huge JSON array would have a single <code>IDocList</code> allocated, and all the nested nodes will be held as efficient dynamic arrays of variants.</p>
<p>Two final one-liners may show how our <em>mORMot</em> library is quite unique in the forest/jungle of JSON libraries for Delphi and FPC:</p>
<pre>
assert(DocList('[{ab:1,cd:{ef:"two"}}]')[0].cd.ef = 'two');
</pre>
<pre>
assert(DocList('[{ab:1,cd:{ef:"two"}}]').First('ab<>0').cd.ef = 'two');
</pre>
<p>If you compare e.g. to how the standard Delphi JSON library works, with all its per-node classes, you may find quite a difference!<br />
Note that both of those lines compile and run with the antique Delphi 7 compiler - who said the pascal language was not expressive, even back in the day?</p>
<p>We hope we succeeded in forging a new way to work with JSON documents, so that you may consider it for your projects on Delphi or FPC.<br />
Any feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6805">welcome in our forum</a>, as usual!</p>
<p>BTW, do you know why I picked up this 1.0594631 number in the code?<br />
Hint: this is something I used when I was a kid programming music on a Z80 CPU... and I still remember this constant. :D</p>
<p><strong>Happy New Year 2024 and Welcome MGET</strong><br />
Arnaud Bouchez - published 2024-01-01 - mORMot Framework<br />
Tags: blog, Delphi, FreePascal, HTTP, HTTPS, mget, mORMot, mORMot2, network, peer2peer, PeerCache, performance, REST, Rest, web</p>
<p>Last year 2023 was perhaps not the best ever, and, just after Christmas, our thoughts go to all the people we know who are still at war or in distress.<br />
But in the small <em>mORMot</em> world, 2023 was a fine vintage: a lot of exciting features, a pretty good rank in benchmarks, and proof that the framework is ready for the next decade.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mormot2024.jpg" alt="" /></p>
<p>For this new year, we would like to introduce you to a new <em>mORMot</em> baby: the <strong>mget command line tool</strong>, a HTTP/HTTPS web client with peer-to-peer caching.<br />
It is just a wrapper around the new <em>PeerCache</em> feature, built into the framework web client class - so you can use it in your own projects if you need to.</p> <h4>The "mORMot GET" (mget) Cross-Platform Downloading Tool</h4>
<p>The <code>mget</code> command-line tool can retrieve files using HTTP or HTTPS, similar to the well-known homonymous GNU WGet tool, but with some unique features, like optional hash computation or peer-to-peer cache downloading.</p>
<p>It is based on the <code>THttpClientSocket.WGet()</code> process from <code><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.client.pas">mormot.net.client</a></code>, and the optional peer-to-peer cache process as implemented by <code>THttpPeerCache</code> from <code><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.server.pas">mormot.net.server</a></code>. So everything you get with this tool is also directly available from your own projects using our framework.</p>
<p>Tested on Windows, Linux, and macOS.</p>
<h4>Resume Downloads</h4>
<p>First of all, if a first download attempt failed (e.g. the network was interrupted), <code>mget</code> can resume this aborted download, using <code>RANGE</code> headers. So only the remaining data will be retrieved, which may be a real time saver when getting huge files over a weak connection. The partially downloaded file has a <code>.part</code> file name extension on disk.</p>
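<p>To illustrate the principle, a resumed download only has to tell the server the offset of the first missing byte, which is the current size of the <code>.part</code> file. The sketch below builds such a standard HTTP <code>Range</code> header. The <code>ResumeRangeHeader</code> helper is hypothetical and only shows the header syntax, not the actual <code>WGet()</code> implementation:</p>
<pre>
program ResumeRangeSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}
{$C+} // enable Assert()

uses
  SysUtils;

// hypothetical helper: HTTP header asking only for the bytes which are
// still missing, given the size of the existing .part file
function ResumeRangeHeader(PartSize: Int64): string;
begin
  if PartSize > 0 then
    // byte offsets are 0-based: PartSize is the first missing byte
    Result := 'Range: bytes=' + IntToStr(PartSize) + '-'
  else
    Result := ''; // nothing on disk yet: plain full download
end;

begin
  // 1000 bytes are already stored in the .part file
  Assert(ResumeRangeHeader(1000) = 'Range: bytes=1000-');
  Assert(ResumeRangeHeader(0) = '');
  WriteLn(ResumeRangeHeader(1000));
end.
</pre>
<p>A server which supports range requests answers such a request with a <code>206 Partial Content</code> status, and the received bytes can be appended to the partial file.</p>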
<h4>Hash Verification</h4>
<p>A cryptographic hash (typically MD5, SHA1 or SHA256) can be retrieved from the server before getting the file itself, to be checked at the end of the download. On recent 64-bit Intel/AMD, SHA-NI opcodes will be used for fast SHA1 and SHA256 calculation (2GB/s on my PC for instance).</p>
<p>You could also supply the hash at the command line level, if you know its value, e.g. from a public web site article.</p>
<h4>Peer-To-Peer Download</h4>
<p><img src="https://blog.synopse.info?post/public/blog/Peer2Peer.png" alt="" /></p>
<p>On corporate networks, one performance and usability issue is often the need to download content from the main corporate servers, via a VPN, over the Internet. In some countries, or due to some technical limitations, the bandwidth to the main servers may be limited, and become a bottleneck.</p>
<p>Our tool is able to maintain a local cache of already downloaded files (stored by their hash), and ask its peers on the local network whether some content is already in their cache. If the file is found, it will be downloaded locally, without using the main server except for a quick HEAD to ensure the file still exists on the main server (with the expected size).</p>
<p>Under the hood, a request will be broadcast over UDP, to discover the presence of a file hash. If nothing is found, the main server will be requested with a GET, as usual. But if some peers do have the requested file, then the best peer will be selected and asked for a local download (over HTTP), with very good performance.</p>
<h4>Security Notes</h4>
<p>This <em>PeerCache</em> mechanism has been designed to be as secure as possible, even with its default settings.<br />
In a nutshell, its internal process expects a "secret" phrase to match on all peers for any communication to happen.</p>
<p>Here is some additional information:</p>
<ul>
<li>A global shared secret key is used to cipher and authenticate UDP frames and HTTP requests among all peers. This key should be strong enough and private, and can be provided via <code>--peerSecret</code> or <code>--peerSecretHexa</code>. It is derived internally using SHA-256 to generate secrets for encryption/authentication over both UDP and HTTP.</li>
<li>UDP frames are quickly signed with a secret-derived CRC before AES-GCM-128 encoding, so most attacks would be immediately detected.</li>
<li>HTTP requests on the local TCP port are also authenticated with a similar AES-GCM-128 bearer.</li>
<li>Peers which sent invalid requests over UDP or TCP will have their IP banned for a few minutes, to avoid fuzzing or denial of service attacks.</li>
<li>HTTP content is not encrypted on the wire by default, because it does not seem mandatory on a local network, but the <code>SelfSignedHttps</code> option can enable HTTPS if needed.</li>
<li>Tampering is avoided by using cryptographic hashes for the requests, the local storage and finally in WGet, which would discard any invalid data.</li>
<li>The client caches only the content that it has requested itself, to reduce any information disclosure.</li>
<li>Local cache folders should have the proper ACL file permissions defined.</li>
<li>Local cached files are not encrypted, so if data leakage is a concern, consider enabling file system encryption (e.g. BitLocker or LUKS).</li>
<li>Resulting safety is similar to what Microsoft BranchCache offers, with no need of additional servers.</li>
</ul>
<h4>In Practice</h4>
<p>Run <code>mget</code> to get the minimal set of information:</p>
<pre>
ab@dev:~/mget$ ./mget
mget 2.1.6576: retrieve files - and more
proudly using mORMot 2 - synopse.info
Usage: mget <http://uri> [options] [params]
mget --help to display full usage description
</pre>
<p>Run the <code>mget /help</code> (on Windows) or <code>./mget --help</code> (on POSIX) command to get the list of all options available as command line switches. Note that the switches naming is case-sensitive on all platforms.</p>
<p>The main typical usages are the following (use <code>/</code> on Windows, or <code>--</code> on POSIX for the switches):</p>
<ul>
<li><code>mget https://someuri/some/file.ext</code> to retrieve a file from an URI, with optional resume if the download was aborted;</li>
<li><code>mget 4544b3...68ba7a@http://someuri/some/file.ext</code> to retrieve a file from an URI, with the specified hash value (md5/sha1/sha256 algo will be guessed from the hexadecimal hash length);</li>
<li><code>mget https://someuri/some/file.ext --hashAlgo sha256</code> to retrieve a file from an URI, first retrieving its hash value from <code>https://someuri/some/file.ext.sha256</code>;</li>
<li><code>mget --prompt</code> to ask for an URI or hash@URI from a prompt - terminates on error or on an empty entry;</li>
<li><code>mget --prompt</code><code> --peer</code> to ask for an URI or hash@URI from a prompt, with <em>PeerCache</em> enabled.</li>
</ul>
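<p>Guessing the algorithm of the <code>hash@uri</code> form is possible because each digest has a fixed size, written as two hexadecimal characters per byte: 32 characters for MD5, 40 for SHA-1 and 64 for SHA-256. A minimal sketch of such a guess could be the following, where <code>GuessHashAlgo</code> is a hypothetical helper and not the function actually used by <code>mget</code>:</p>
<pre>
program GuessHashAlgoSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}
{$C+} // enable Assert()

// hypothetical helper: guess the algorithm from the hexadecimal text length
// (each digest byte is written as two hexadecimal characters)
function GuessHashAlgo(const Hex: string): string;
begin
  case Length(Hex) of
    32: Result := 'md5';    // 16 bytes = 128-bit digest
    40: Result := 'sha1';   // 20 bytes = 160-bit digest
    64: Result := 'sha256'; // 32 bytes = 256-bit digest
  else
    Result := ''; // unknown length: no algorithm can be guessed
  end;
end;

begin
  // dummy hexadecimal strings, only their length matters here
  Assert(GuessHashAlgo(StringOfChar('0', 32)) = 'md5');
  Assert(GuessHashAlgo(StringOfChar('0', 40)) = 'sha1');
  Assert(GuessHashAlgo(StringOfChar('0', 64)) = 'sha256');
  Assert(GuessHashAlgo('4544b3') = '');
  WriteLn(GuessHashAlgo(StringOfChar('0', 64)));
end.
</pre>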
<p>By design, <em>PeerCache</em> processing needs some UDP and TCP servers to run in the background. This is the point of the <code>--prompt</code> command line kind of process: you can ask for files to download, but peers can also ask for your own cached files during the prompt wait.</p>
<p>A lot of additional features or options are available, e.g. use a local cache folder, limit the bandwidth usage during the download, define HTTPS certificate validation, or tune any <code>--peer</code> setting, like <code>--peerSecret</code> or <code>--peerPort</code>.</p>
<h4>Show Me the Source</h4>
<p>Just go to <a href="https://github.com/synopse/mORMot2/tree/master/src/tools/mget">https://github.com/synopse/mORMot2/tree/master/src/tools/mget</a> and check out the source code of this tool, and its associated documentation.</p>
<p>Note that this tool was just released, and needs additional testing. So your feedback is very much welcome!</p>
<p>Happy new year 2024!</p>
<p><strong>Native X.509, RSA and HSM Support</strong><br />
Arnaud Bouchez - published 2023-12-09, updated 2023-12-10 - mORMot Framework<br />
Tags: Asymmetric, CrossPlatform, CSPRNG, Delphi, ECC, ed25519, forensic, FPC, FreePascal, GoodPractice, interface, mORMot, mORMot2, OpenSource, OpenSSL, performance, PKCS11, RSA, security, X509</p>
<p>Today, almost all computer security relies on asymmetric cryptography and X.509 certificates, stored as files or in hardware modules.<br />
And the RSA algorithm is still used to sign the vast majority of those certificates. Even if there are better options (like ECC-256), RSA-2048 seems to be the current standard, and should still be allowed for a few years.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mormotSecurity.jpg" alt="" /></p>
<p>So we added pure pascal RSA cryptography and X.509 certificates support in <em>mORMot</em>.<br />
Last but not least, we also added Hardware Security Modules support via the PKCS#11 standard.<br />
Until now, we were mostly relying on OpenSSL, but a native embedded solution would be smaller in code size, better for reducing dependencies, and easier to work with (especially for HSM). The main idea is to offer only safe algorithms and methods, so that you can write reliable software, even if you are not a cryptography expert. :)</p> <h4>Rivest-Shamir-Adleman (RSA) Public-Key Cryptography</h4>
<p>The RSA public-key algorithm was designed back in 1977, and is still the most widely used. In order to fully implement it, we need to generate new key pairs (public and private keys), then sign and verify (or encrypt or decrypt) data with the key. For instance, a private key is kept secret, and used for an Authority to sign a certificate, and the public key is published, and able to verify a certificate. It is based on large prime numbers, so we needed to develop a Big Integer library, which is not part of Delphi or FPC RTL.</p>
<p><img src="https://blog.synopse.info?post/public/blog/RSAalgo.png" alt="" /></p>
<p>Here are some notes about our implementation in <a href="https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.rsa.pas">mormot.crypt.rsa.pas</a>:</p>
<ul>
<li>new pure pascal OOP design of BigInt computation optimized for RSA process;</li>
<li>dedicated x86_64/i386 asm for core computation routines (noticeable speedup);</li>
<li>use half-registers (HalfUInt) for efficient computation on all CPUs (arm and aarch64 numbers are good);</li>
<li>slower than OpenSSL, but likely to be the fastest FPC or Delphi native RSA library, thanks to our optimized asm: for instance, you can generate a new RSA-2048 keypair in less than a second;</li>
<li>internal garbage collection of BigInt instances, to minimize heap pressure during computation, and ensure all values are wiped once used during the process - as proven anti-forensic measure;</li>
<li>includes FIPS-level RSA keypair validation and generation, using a safe random source, with efficient prime number detection, and minimal code size;</li>
<li>features both RSASSA-PKCS1-v1_5 and RSASSA-PSS signature schemes;</li>
<li>started as a fcl-hash fork, but became a full rewrite inspired by the Mbed TLS source, because this initial code was slow and incomplete;</li>
<li>references: we followed <a href="https://github.com/Mbed-TLS/mbedtls">the Mbed TLS</a> implementation (which is much easier to follow than OpenSSL), and the well known <a href="https://cacr.uwaterloo.ca/hac/about/chap4.pdf">Handbook of Applied Cryptography (HAC)</a> recommendations;</li>
<li>includes full coverage of unit tests to avoid any regression, validated against the OpenSSL library as audited reference;</li>
<li>this unit will register as <code>Asym</code> 'RS256','RS384','RS512' algorithms (if not overridden by the faster <code>mormot.crypt.openssl</code>), keeping 'RS256-int' and 'PS256-int' available to use our unit;</li>
<li>as used by <code>mormot.crypt.x509</code> (see below) to handle RSA signatures of its X.509 Certificates.</li>
</ul>
<p>For instance, if you want to access a <code>TCryptAsym</code> digital signature instance with RSA-2048 and SHA-256 hashing, you can just use the <code>CryptAsym</code> global variable with <code>caaRS256</code> algorithm as factory.<br />
If you need just public/private key support, you can use <code>CryptPublicKey</code> or <code>CryptPrivateKey</code> factories with <code>ckaRsa</code> algorithm.</p>
<p>About RSA security:</p>
<ul>
<li>RSA-512 or RSA-1024 are considered unsafe and should not be used.</li>
<li>RSA-2048 confers 112-bit of security, and is the usual choice today when this algorithm is to be used.</li>
<li>RSA-3072 could confer 128-bit of security, at the expense of being slower and 50% bigger - so switching to ECC-256 may be a better option, for the same level of security.</li>
<li>RSA-4096 is not worth it in respect to RSA-3072, and RSA-7680 is very big and slow, and only gives 192-bit of security, so should be avoided.</li>
</ul>
<p>Anyway, our library is able to support all those key sizes, up to RSA-7680 if you really need it.<br />
See <a href="https://stackoverflow.com/a/589850/458259">this SO response</a> as reference about RSA keysizes.</p>
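<p>The figures quoted above can be summarized as a small lookup, together with the size arithmetic behind the "50% bigger" remark. This is only a sketch: both helpers are hypothetical, and it assumes that "bigger" refers to the signature length, which for RSA equals the key modulus length:</p>
<pre>
program RsaKeySizeSketch;
{$ifdef FPC}{$mode objfpc}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}
{$C+} // enable Assert()

// hypothetical helper: security level in bits, for the key sizes listed above
function RsaSecurityBits(KeyBits: integer): integer;
begin
  case KeyBits of
    2048: Result := 112;
    3072: Result := 128;
    7680: Result := 192;
  else
    Result := 0; // key size not covered by the list above
  end;
end;

// hypothetical helper: an RSA signature is as long as the key modulus
function RsaSignatureBytes(KeyBits: integer): integer;
begin
  Result := KeyBits div 8;
end;

begin
  Assert(RsaSecurityBits(2048) = 112);
  Assert(RsaSecurityBits(3072) = 128);
  Assert(RsaSignatureBytes(2048) = 256);
  Assert(RsaSignatureBytes(3072) = 384);
  // 384 bytes is 50% more than 256 bytes
  Assert(RsaSignatureBytes(3072) * 2 = RsaSignatureBytes(2048) * 3);
  WriteLn('RSA-3072 signature: ', RsaSignatureBytes(3072), ' bytes');
end.
</pre>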
<h4>X.509 Certificates</h4>
<p>As we wrote in introduction, X.509 certificates are the base of most computer security.<br />
The whole TLS/HTTPS stack makes use of it, and the whole Internet would collapse without it.</p>
<p>We developed our <a href="https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.x509.pas">mormot.crypt.x509.pas</a> unit from scratch, featuring:</p>
<ul>
<li>X.509 Certificates Fields Logic (e.g. X.501 Type Names);</li>
<li>X.509 Certificates and Certificate Signing Request (CSR);</li>
<li>X.509 Certificate Revocation List (CRL);</li>
<li>X.509 Public Key Infrastructure (PKI);</li>
<li>Registration of our X.509 Engine to the <code>TCryptCert</code>/<code>TCryptStore</code> Factories.</li>
</ul>
<p><img src="https://blog.synopse.info?post/public/blog/X509certificate.png" alt="" /></p>
<p>The raw binary encoding is using the (weird) ASN.1 syntax, which is now implemented as part of the <a href="https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.secure.pas">mormot.crypt.secure.pas</a> unit.<br />
We followed the RFC 5280 specifications, and mapped the latest X.509 Certificates / CSR / CRL extensions, with some low-level but very readable pascal code using classes, records and enumerations. It features perfect compatibility with our <code>ICryptCert</code> high-level interface wrappers, ready to be used in a very convenient way. We support all basic functions, but also advanced features like open/seal or peer information as human-readable text.<br />
When using our unit, your end-user code should not be lost within the complex details and notions of the X.509 format (like OIDs, versions or extensions), but use high-level pascal code, with no possibility to use a weak or invalid configuration.</p>
<p>Of course, it can support not only our new RSA keys, but also ECC-256 as implemented by our native <a href="https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.ecc.pas">mormot.crypt.ecc.pas</a>, or any other algorithm, e.g. available from OpenSSL.</p>
<h4>X.509 Public Key Infrastructure (PKI)</h4>
<p>Our unit features a full Public Key Infrastructure (PKI) implementation.<br />
In fact, X.509 certificates are only as strong as the PKI they are used in. You can have strong certificates, but a weak verification pattern. In end-user applications, it is typical to see all the security being lost by a poor (e.g. naive) implementation of the keys interaction.</p>
<p><img src="https://blog.synopse.info?post/public/blog/PKI.png" alt="" /></p>
<p>This is why our unit publishes a 'x509-pki' <code>ICryptStore</code> as a full featured PKI:</p>
<ul>
<li>using our <code>TX509</code> and <code>TX509Crl</code> classes for actual certificates process;</li>
<li>clean verification of the chain of trust, with customized depth and proper root Certificate Authority (CA) support, following the RFC 5280 section 6 requirements of a clean "Certification Path Validation";</li>
<li>maintaining a cache of <code>ICryptCert</code> instances, which makes a huge performance benefit in the context of a PKI (e.g. you don't need to parse the X.509 binary, or verify the chain of trust each time).</li>
</ul>
<p>We tried to bring performance and usability up to the highest possible standards, to let you focus on your business logic, and keep the hard cryptography work done in the <em>mORMot</em> library code.</p>
<h4>Hardware Security Modules (HSM) via PKCS#11</h4>
<p>The PKCS#11 standard is a way to define some software access to Hardware Security Modules, via a set of defined API calls.<br />
We just published the <a href="https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.pkcs11.pas">mormot.crypt.pkcs11.pas</a> unit to interface those devices with the other <em>mORMot</em> PKI.</p>
<p><img src="https://blog.synopse.info?post/public/blog/HSM.png" alt="" /></p>
<p>Once you have loaded the library of your actual hardware (typically a <code>.dll</code> or <code>.so</code>) using a <code>TCryptCertAlgoPkcs11</code> instance, you can see all stored certificates and keys, as high-level regular <code>ICryptCert</code> instances, and sign or verify any kind of data (some binary or some other certificates), using the private key safely stored in the hardware device.<br />
This is usually slower than a pure software verification, but it is much safer, because the private key is sealed within the hardware token, and never leaves it. So it can't be intercepted and stolen.</p>
<h4>You are Welcome!</h4>
<p>With those <em>mORMot</em> cryptography units, you now have everything at hand to use standard and proven public-key cryptography in your applications, on both Delphi and FPC, with no external dll deployment issue, and minimal code size increase.<br />
Many thanks to <a href="https://www.tranquil.it/en">my employer</a> for needing those nice features, and therefore letting me work on them.<br />
Open Source rocks! :)</p>
<p><strong>Pascal In The Race: TFB Challenge Benchmarks</strong><br />
Arnaud Bouchez - published 2023-10-31 - mORMot Framework<br />
Tags: blog, Delphi, EKON, FreePascal, GoodPractice, mORMot2, multithread, performance, PostgreSQL, TFB</p>
<p>Round 22 of the TechEmpower Framework Benchmarks has just finished.<br />
And this time, there was a pascal framework in the race: our little <em>mORMot</em>!</p>
<p>Numbers are quite good, because we are ranked #12 among 302 frameworks over 791 runs of several configurations.</p>
<p><img src="https://blog.synopse.info?post/public/blog/TFBround22.png" alt="" /></p> <p>The TFB challenge is a performance comparison of many web app platforms, exercising JSON, database, ORM, HTML templates, all over HTTP. It compares the best frameworks written in C++, Rust, Go, JS, Java, C#… and now Pascal – thanks to <em>mORMot</em>.</p>
<h4>Use The Source, Luke!</h4>
<p>The source code of the <em>mORMot</em> server involved is available online:<br />
<a href="https://github.com/synopse/mORMot2/tree/master/ex/techempower-bench">https://github.com/synopse/mORMot2/tree/master/ex/techempower-bench</a></p>
<p>The supplied server program publishes three families of endpoints, defined as such in the "display_name" of the benchmark configuration:</p>
<ul>
<li><em>mormot orm</em> using the ORM layer;</li>
<li><em>mormot direct</em> using the direct DB layer - mapping /raw* endpoints;</li>
<li><em>mormot async</em> using the asynchronous DB layer (only available on PostgreSQL for now) - mapping /async* endpoints.</li>
</ul>
<p>Yes, we added an asynchronous/non-blocking DB interface, for the purpose of the TFB challenge. Pascal can do wonders even without "async" keyword support in the language itself - which we would still like to see, anyway.</p>
<h4>Gimme Numbers</h4>
<p>As we wrote above, <em>mORMot</em> is within the top 12 frameworks, and among the first frameworks with a full ORM. Note that some libraries listed as "ORM" (like drogon) are not full ORM frameworks. They are C++ template engines, with pre-generated code. So they don't use RTTI or a separate Mustache template as <em>mORMot</em> does. To my understanding, an ORM is not just a set of templates with manual SQL statements, but is about automation and minimal code writing. You define the class, and the framework should do the work for you.</p>
<p>Note that the ranking may change with the final publication of Round 22, because the weighting of each test may be changed.<br />
We will see what the future offers...</p>
<p>But anyway, we are above some well-known and proven solutions like "asp.net core", so we can be proud.<br />
Actual numbers can be found <a href="https://tfb-status.techempower.com/results/66d86090-b6d0-46b3-9752-5aa4913b2e33">at the https://tfb-status.techempower.com website</a>.</p>
<p>Also note that those tests use <em>PostgreSQL</em>. In most <em>mORMot</em> configurations, a typical MicroService would rather use its own embedded <em>SQLite3</em> database. And here, the numbers are twice as high. So in fact, a production-ready <em>mORMot</em> service with its stand-alone database is likely to blow away any other framework using a separate <em>PostgreSQL</em> database. As such, the cached-queries test, which returns some items using the DB/ORM cache, is a typical workload on a production system, and <em>mORMot</em> shines in this test. It is usually the first full ORM listed in the "cached query" test.<br />
But <em>PostgreSQL</em> is for sure the fastest and most stable database around, and we have a native and direct access layer to it in <em>mORMot</em>. The <a href="https://github.com/synopse/mORMot2/blob/master/src/db/mormot.db.sql.postgres.pas">mormot.db.sql.postgres.pas unit</a> has been written from scratch, and heavily tuned to maximize the performance of this amazing Open Source database - thanks a lot Pavlo for your hard work!</p>
<p>During the <a href="https://blog.synopse.info?post/post/2023/09/06/Meet-at-EKON-27">upcoming EKON 27 Conference in Dusseldorf</a> next week (already!), <a href="https://entwickler-konferenz.de/speaker/arnaud-bouchez/">I will give two sessions about it</a>.<br />
Here is a quick description of them.</p>
<h4>Frameworks Expressiveness</h4>
<p>In this first session, we will look at and compare the source code of some framework samples, to distinguish their typical philosophy. We will see how modern Object Pascal is still relevant, and discuss/propose some ideas for the future of the pascal language.</p>
<h4>Frameworks Tuning</h4>
<p>As a follow-up of the previous session about TFB, we will detail what kind of tuning was made to the <em>mORMot</em> library, and its associated TFB sample implementation, to reach the top scores in charts.</p>
<p>How can a pure Pascal project reach 7 million HTTP requests per second? How to scale and measure on high-end hardware? Are ORM frameworks doomed to slow down everything? How to circumvent the lack of “async” programming at language level? How realistic is such a benchmark?</p>
<h4>More To Come</h4>
<p>When I return from Germany, I will write here some more information.<br />
But in short, server performance is about proper multi-thread coding, minimizing the number of syscalls and lock contention, and reusing as much context as possible. And the most important is certainly actual measurement (not guessing), and the ability to have several brains investigating the bottlenecks: thanks a lot Pavel for your insight and ideas to optimize our numbers!<br />
Stay tuned!</p>End Of Live OpenSSL 1.1 vs Slow OpenSSL 3.0urn:md5:f20e1a3a1c96e8f65f1fc8ef5a04498c2023-09-08T11:59:00+01:002023-09-08T14:13:49+01:00Arnaud BouchezOpen SourceAESCertificatesDelphiFreePascalGoodPracticeLateBindingLazarusMaxOSXmORMotmORMot2OpenSSLperformancesecuritySource<p>You may have noticed that the OpenSSL 1.1.1 series will reach End of Life (EOL) next Monday...<br />
Most sensible options are to switch to 3.0 or 3.1 as soon as possible.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mormotSecurity.jpg" alt="mormotSecurity.jpg, Sep 2023" /></p>
<p>Of course, our <a href="https://github.com/synopse/mORMot2/blob/master/src/lib/mormot.lib.openssl11.pas"><em>mORMot 2</em> OpenSSL unit</a> runs on the 1.1 and 3.x branches, and self-adapts at runtime to the various API incompatibilities existing between each branch.<br />
But we also discovered that switching to OpenSSL 3.0 could lead to big performance regressions... so which version do you need to use?</p> <h4>OpenSSL 1.1 End Of Life</h4>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/OpenSSL_logo.svg/320px-OpenSSL_logo.svg.png" alt="OpenSSL logo" /></p>
<p>The well known and well established OpenSSL 1.1.1 series will reach End of Life (EOL) on 11th September 2023. So next Monday! <img src="https://blog.synopse.info?pf=sad.svg" alt=":(" class="smiley" /> <br />
Users of OpenSSL 1.1.1 should consider their options and plan any actions they might need to take.</p>
<p>Note that Indy users are <a href="https://github.com/IndySockets/Indy/issues/183">still stuck on the OpenSSL 1.0 branch</a>: even 1.1 is not yet officially supported. Some <a href="https://github.com/IndySockets/Indy/pull/299">alternate IO handlers</a> are able to use the newest releases - to some extent.<br />
Indy users should rather move to a better supported library, like our little <em>mORMot</em>.</p>
<p>Also note that there are some API incompatibilities between 1.1 and 3.x. Functions have been renamed, or even removed; new context constructors appeared; some parameter types even changed!<br />
Our unit tries to address all those problems at runtime, and is tested against several versions of the OpenSSL library, to ensure you do not have to worry about those low-level issues.</p>
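<p>To give an idea of what "adapting at runtime" can look like, here is a minimal Python sketch of the general technique: try each candidate name of a renamed function, and keep the first one the loaded library exports. This is not the <em>mORMot</em> code. The <code>resolve_first</code> helper and the two <code>FakeLib</code> classes are hypothetical stand-ins for a dynamically loaded library. The <code>EVP_PKEY_size</code> / <code>EVP_PKEY_get_size</code> pair is used as an example of a function renamed in 3.0, as far as we know.</p>
<pre>
def resolve_first(lib, names):
    """Return (name, symbol) for the first name that the library exports."""
    for name in names:
        symbol = getattr(lib, name, None)
        if symbol is not None:
            return name, symbol
    raise AttributeError("none of %s is exported" % ", ".join(names))


class FakeLib11:
    """Hypothetical stand-in for a loaded OpenSSL 1.1 library."""
    @staticmethod
    def EVP_PKEY_size(pkey):
        return 256


class FakeLib3:
    """Hypothetical stand-in for a loaded OpenSSL 3.x library."""
    @staticmethod
    def EVP_PKEY_get_size(pkey):
        return 256


# try the 3.x name first, then fall back to the 1.1 name
CANDIDATES = ["EVP_PKEY_get_size", "EVP_PKEY_size"]
print(resolve_first(FakeLib11, CANDIDATES)[0])
print(resolve_first(FakeLib3, CANDIDATES)[0])
</pre>
<p>The same helper would work on a real library handle (e.g. a <code>ctypes.CDLL</code> instance), since a missing symbol is reported as a missing attribute.</p>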
<h4>OpenSSL 3.x Benefits</h4>
<p>With OpenSSL 3.0, the developers did a huge refactoring of the library internals.<br />
To be fair, the 1.x source code of OpenSSL was kind of a mess, and difficult to maintain. The biggest IT companies even made their own forks or switched to other libraries. The best known is <a href="https://boringssl.googlesource.com/boringssl/">BoringSSL</a>, maintained by Google, and used e.g. in Chrome and Android.<br />
So it was time for a refactoring, especially for a library as critical as OpenSSL for so many projects.</p>
<p>With the new 3.x branch, a lot of low-level API functions have been deprecated.<br />
In practice, you don't have direct access any more to the internal structures of the library, and should now always use the high-level API to access a context property, or execute the processing methods. For instance, the low-level <code>AES_encrypt</code> function is now deprecated: from now on, you should use the high-level <code>EVP_Encrypt*</code> API.<br />
The official <a href="https://www.openssl.org/docs/man3.0/man7/migration_guide.html">Migration Guide page</a> is clearly huge, and worth reading if you want to prepare yourself for the upcoming years with OpenSSL.</p>
<h4>OpenSSL 3.0 Performance Regression</h4>
<p>The 3.0 branch new code may seem more beautiful and more maintainable, but it had its drawbacks. Newer is not always better.<br />
Most users of this new release <a href="https://github.com/openssl/openssl/issues/17064">observed a huge performance regression</a> when switching from 1.x to 3.0. It affected a lot of projects, in various languages, even scripting languages which were not shining in performance to begin with. Slowdowns from 3x up to 10x were reported. On our side, X509 certificate manipulation was really slower than before - the worst being about X509 stores.</p>
<p>Some slowdowns were expected and documented (like RSA key generation, which now uses 64 rounds). But the regression was much deeper.<br />
The culprit seems not to be the core cryptographic code, like AES buffer encoding (whose asm is claimed to have been optimized even further on the 3.x branch), but the OpenSSL context structures themselves. They were rewritten for future maintainability, but without focusing on their actual performance.</p>
<h4>OpenSSL 3.1 Numbers</h4>
<p>The 3.1 branch claims to have addressed most of these problems.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ea/The_Tortoise_and_the_Hare_-_Project_Gutenberg_etext_19994.jpg/334px-The_Tortoise_and_the_Hare_-_Project_Gutenberg_etext_19994.jpg" alt="The Tortoise and the Hare" /></p>
<p>To be sure, we ran the <em>mORMot</em> cryptographic regression tests with several versions of OpenSSL. And in fact, OpenSSL 3.1 was much faster than OpenSSL 3.0, but still behind OpenSSL 1.1.<br />
Here are the numbers we observed for the whole <code>TTestCoreCrypto</code> method execution, executed on Win32:</p>
<ul>
<li>OpenSSL 1.1 = 15 sec</li>
<li>OpenSSL 3.0 = 33 sec</li>
<li>OpenSSL 3.1 = 18 sec</li>
</ul>
<p>There are several aspects to emphasize:</p>
<ul>
<li>Those tests also run the <em>mORMot</em> engine cryptography, so you don't only test OpenSSL: the "pure mORMot" tests take around 4.5 seconds in the above numbers;</li>
<li>Any serious project should consider compiling on Win64, and running a server on a x86_64 Linux - on such 64-bit platforms, the regression still exists, and the numbers are only slightly better;</li>
<li>The slowdown affected <code>TTestCoreCrypto.Benchmark</code> (i.e. raw buffer encryption) less than <code>TTestCoreCrypto.Catalog</code> (i.e. certificates process);</li>
<li>Our tests were mono-threaded, and worse slowdowns were reported on heavily threaded processes (up to 10x).</li>
</ul>
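<p>Since around 4.5 seconds of each total are spent in "pure mORMot" code, the OpenSSL-only slowdown is larger than the raw totals suggest. A quick back-of-the-envelope computation, assuming this 4.5 seconds share is the same in the three runs (which is an approximation):</p>
<pre>
PURE_MORMOT_SEC = 4.5                       # "pure mORMot" share of each run
TOTAL_SEC = {"1.1": 15.0, "3.0": 33.0, "3.1": 18.0}


def openssl_only(total):
    """Time left once the mORMot engine share is subtracted."""
    return total - PURE_MORMOT_SEC


def slowdown_vs_11(version):
    """Ratio of OpenSSL-only time against the 1.1 baseline."""
    return openssl_only(TOTAL_SEC[version]) / openssl_only(TOTAL_SEC["1.1"])


for version in ("3.0", "3.1"):
    print("OpenSSL %s: x%.2f" % (version, slowdown_vs_11(version)))
# OpenSSL 3.0: x2.71
# OpenSSL 3.1: x1.29
</pre>
<p>So on this Win32 run, the OpenSSL part is roughly 2.7 times slower with 3.0, and about 1.3 times slower with 3.1, whereas the raw totals would only show x2.2 and x1.2.</p>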
<p>Within the <em>mORMot</em> OpenSSL wrapper, we try to cache as many contexts as possible. For instance, we don't look up the OpenSSL algorithm by name for each call, but we cache it at runtime to avoid any slowdown.<br />
But this does not seem to be enough with OpenSSL 3.0, which may still affect your application performance.</p>
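<p>The caching idea itself is simple. Here is a minimal Python sketch of it, not the actual wrapper code: the <code>AlgoCache</code> class and the <code>slow_lookup</code> function are hypothetical names, the latter standing for whatever by-name resolution the library performs.</p>
<pre>
class AlgoCache:
    """Memoize an expensive by-name lookup, so each name is resolved once."""

    def __init__(self, lookup):
        self._lookup = lookup      # the slow resolver, called on cache miss only
        self._known = {}
        self.misses = 0

    def get(self, name):
        try:
            return self._known[name]
        except KeyError:
            self.misses += 1
            found = self._known[name] = self._lookup(name)
            return found


def slow_lookup(name):
    # hypothetical stand-in for the library resolving an algorithm by name
    return ("algorithm", name.lower())


cache = AlgoCache(slow_lookup)
for _ in range(1000):
    cache.get("aes-256-gcm")
print(cache.misses)   # 1: resolved once, then served from the dictionary
</pre>
<p>This sketch is not thread-safe: a real server would protect the dictionary with a lock, or fill the cache once at startup.</p>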
<h4>To Support or Not Support</h4>
<p>So OpenSSL 3.1 seems to be the way to go.</p>
<p>On Linux (or other POSIX systems), you are likely to use the library shipped with the system.<br />
So you would not have to worry about which version to use. And, sadly, it is very likely that your distribution provides OpenSSL 3.0 and not OpenSSL 3.1.</p>
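<p>If Python is installed on the machine, a quick way to see which branch is around is to ask its <code>ssl</code> module. This is only a hint: it reports the library Python itself was linked against, which we assume here to be the system one (it may differ, or even be another TLS library). The <code>openssl_branch</code> helper is just an illustration.</p>
<pre>
import ssl


def openssl_branch(version_info):
    """Map an OpenSSL version tuple to the branches discussed in this article."""
    major, minor = version_info[0], version_info[1]
    if major == 1:
        return "1.%d" % minor
    if (major, minor) == (3, 0):
        return "3.0"
    return "3.1 or later"


print(ssl.OPENSSL_VERSION)
print(openssl_branch(ssl.OPENSSL_VERSION_INFO))
</pre>
<p>From a shell, <code>openssl version</code> gives the same kind of information for the command line tool.</p>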
<p>On Windows (or Mac), you could (should?) use your "own" dll/so files, so you have to take into account the support level of the library.<br /></p>
<ul>
<li>OpenSSL 3.0 is a Long Term Support (LTS) version, which will be maintained until 7th September 2026.<br /></li>
<li>OpenSSL 3.1 will be supported only until 14th March 2025.</li>
</ul>
<p>These support end dates could appear counter-intuitive, but this is a usual practice in Open Source projects, the best known being perhaps <a href="https://ubuntu.com/blog/what-is-an-ubuntu-lts-release">Ubuntu LTS versions</a>.<br />
For more information about OpenSSL support lifetime, look at the <a href="https://www.openssl.org/source/">official OpenSSL Downloads page</a>.</p>
<p>So, for most projects, especially on Windows where you are likely to ship the OpenSSL dll with your own executable, switching to OpenSSL 3.1 is likely to be the way to go.<br />
If you need to gather some security certification for your product, you may consider using OpenSSL 3.0 LTS version, which may help your certification remain active for a longer period.</p>
<p>Any feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6697">welcome on our forum</a>, as usual!</p>Meet at EKON 27urn:md5:3b2b4d1ce07707b55749943bcf2e0ffe2023-09-06T14:32:00+01:002023-09-06T13:51:36+01:00Arnaud Bouchez<p>There is still a bit more than one day left for the "<a href="https://entwickler-konferenz.de/tickets-de/">very early birds</a>" offer for the EKON 27 conference in Germany, where you can <a href="https://entwickler-konferenz.de/speaker/arnaud-bouchez/">meet us for 3 sessions (including a half-day training/introduction to mORMot 2)</a>!</p>
<p><img src="https://blog.synopse.info?post/public/blog/EKON27.png" alt="EKON27.png, Sep 2023" /></p>
<p>Join us on the 6-8th of November in Düsseldorf!</p> <p>Those sessions will illustrate some of the latest news about the framework, focusing on the <a href="https://synopse.info/forum/viewtopic.php?id=6443">TechEmpower Framework Benchmark (aka TFB)</a> challenge, and what it changed for the framework - and potentially for modern Object Pascal:</p>
<ul>
<li><a href="https://entwickler-konferenz.de/frameworks-and-tools/frameworks-expressiveness">Frameworks Expressiveness</a></li>
<li><a href="https://entwickler-konferenz.de/frameworks-and-tools/frameworks-tuning">Frameworks Tuning</a></li>
<li><a href="https://entwickler-konferenz.de/agile-devops-001/embrace-posix-servers">Embracing mORMot 2.1</a> (full morning workshop)</li>
</ul>
<p>We will see how Object Pascal is still in the race, and could compete with the best performing frameworks and languages around.<br />
Thanks to our little mORMot, of course. <img src="https://blog.synopse.info?pf=wink.svg" alt=";)" class="smiley" /></p>
<p>You <a href="https://synopse.info/forum/misc.php?email=2">can contact me</a> if you want to join, so that I may give you an additional discount password!</p>
<p>Hope we could meet face to face, discuss and perhaps enjoy (a few) beers!</p>mORMot 2.1 Releasedurn:md5:4b0b47efd0e18fe5bdbea7a1e836a7662023-08-24T00:39:00+01:002023-08-29T08:53:25+01:00Arnaud BouchezmORMot Framework7zipAES-NiangelizeauthenticationCommandLineLDAPLUTImORMotmORMot2OpenSSLperformancereleaseshaSQLite3TFB<p>We are pleased to announce the release of <strong>mORMot 2.1</strong>.<br />
The <a href="https://github.com/synopse/mORMot2/releases/tag/2.1.stable">download link is available on github</a>.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mORMot21.jpg" alt="" /></p>
<p>The <em>mORMot</em> family is growing up. <img src="https://blog.synopse.info?pf=smile.svg" alt=":)" class="smiley" /></p> <h4>The Release Cycle</h4>
<p>Like any living form, even silicon-based, the <em>mORMot</em> has its own cycle of life, following the coding seasons and commits history.</p>
<p>It was time to publish a new stable version of our Open Source framework for Delphi and FPC.<br />
We indeed want to publish official releases on a regular basis, to ease integration into end user projects.</p>
<p>Here is an extract of the release notes:</p>
<h4>Added</h4>
<ul>
<li>(C)LDAP, DNS, (S)NTP clients</li>
<li>Command Line Parser</li>
<li>Native digest/basic HTTP server authentication</li>
<li>Angelize services/daemons manager</li>
<li>TTunnelLocal TCP port forwarding</li>
<li>SHA-1/SHA-256 HW opcodes asm</li>
<li>7Zip dll wrapper</li>
<li>OpenSSL CSR support</li>
<li>PostgreSQL async DB with HTTP async backend (for TFB)</li>
<li>LUTI continuous integration cross-platform farm</li>
</ul>
<h4>Changed</h4>
<ul>
<li>Upgraded SQLite3 to 3.42.0</li>
<li>Stabilized Mac x86_64/aarch64 platforms</li>
<li>Lots of bug fixes and enhancements</li>
</ul>
<h4>Feedback Welcome</h4>
<p>Some of the new features were already discussed in this blog, but <a href="https://synopse.info/forum/viewtopic.php?id=6681">you are free to ask for more on our forum, as usual</a>.</p>
<p>There were also a lot of small but meaningful changes and enhancements in the source code tree.<br />
Especially in the HTTP/REST part, <a href="https://synopse.info/forum/viewtopic.php?id=6443">to reach the top of the TechEmpower Framework Benchmark ranks</a>.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mORMot21Release.jpeg" alt="" /></p>The LUTI and the mORMoturn:md5:25a4abb86aef5e445dca306f3a44273a2023-07-20T10:14:00+01:002023-07-20T10:36:35+01:00Arnaud BouchezmORMot FrameworkblogContinuousIntegrationCrossPlatformDelphiFPCFreePascalGoodPracticeLUTImORMotperformanceTestingTranquilITWAPT<p>Since its earliest days, our <em>mORMot</em> framework did offer extensive regression tests. In fact, it is fully test-driven, and almost 78 million individual tests are performed to cover all its abilities:</p>
<p><img src="https://blog.synopse.info?post/public/blog/RegressTests.png" alt="RegressTests.png, Jul 2023" /></p>
<p>We just integrated those tests into the <a href="https://www.tranquil.it/">TranquilIT</a> build farm, and its great LUTI tool. So we now have continuous integration tests over several versions of Windows, Linux, and even Mac!<br />
LUTI is the <em>mORMot</em>'s best friend these days. <img src="https://blog.synopse.info?pf=smile.svg" alt=":)" class="smiley" /></p> <h4>Discover the LUTI</h4>
<p>LUTI is a <em>TranquilIT</em> internal tool that can automatically create, test, track and update WAPT packages for the WAPT Store.<br />
Those WAPT packages are the core software deployment archives used by the great WAPT software solution of the French <em>TranquilIT</em> company.</p>
<p>More information is available at
<a href="https://www.tranquil.it/en/luti-creation-testing-and-automatic-tracking-of-wapt-packages/">https://www.tranquil.it/en/luti-creation-testing-and-automatic-tracking-of-wapt-packages/</a></p>
<h4>Our Rodent's Best Friend</h4>
<p>Yes, of course, <a href="https://www.tranquil.it/en/mormot/">WAPT does use <em>mORMot</em></a>, so it did make perfect sense to integrate both projects.</p>
<p>In practice, we built and integrated the tests on several versions of Windows, several Linux distributions, and even Mac Intel and Mac M1 virtual machines.<br />
The source code is cloned from our <a href="https://github.com/synopse/mORMot2">official GitHub repository</a>, built using FPC, then distributed over the needed virtual machines.<br />
The WAPT agent is in fact deployed on all VMs, and a dedicated package/script is installed and run via this Agent to trigger the actual tests.</p>
<p>A typical WAPT package script extract looks like the following:</p>
<pre>
def install():
    if iswin64():
        arch = 'x86_64-win64'
    else:
        arch = 'i386-win32'
    try:
        run(r'testmormot\%s\mormot2tests.exe /noenter' % arch)
        run(r'testmormot\%s\mormot2tests.exe /noenter --test TTestCoreProcess.JSONBenchmark' % arch)
        run(r'testmormot\%s\mormot2tests.exe /noenter --dns sambaad.lan --test TNetworkProtocols.DNSAndLDAP' % arch)
        run(r'testmormot\%s\mormot2tests.exe /noenter --dns msad.lan --test TNetworkProtocols.DNSAndLDAP' % arch)
    except:
        ...
</pre>
<p>Nothing complex here, just some Python code executed on the target machine.</p>
<p>As you can see, the mormot2tests project has now optional <a href="https://blog.synopse.info/?post/2023/04/19/New-Command-Line-Parser-in-mORMot-2">command line options</a> to trigger dedicated tests. For instance the JSON benchmark is run after a main default pass, because it uses some JSON content generated by the ORM, so a second pass is needed. Or we can validate two kinds of <a href="https://blog.synopse.info/?post/2023/04/19/New-DNS-and-%28C%29LDAP-Clients-for-Delphi-and-FPC-in-mORMot-2">local DNS and LDAP servers</a>.</p>
<h4>All Green</h4>
<p>Now, let's see the result of a typical test run. Note that one such run is triggered every night with the latest <em>mORMot</em> sources available, for continuous delivery.</p>
<p>Several versions of Windows are validated:
<img src="https://blog.synopse.info?post/public/blog/LutiWin.png" alt="LutiWin.png, Jul 2023" /></p>
<p>Then several Linux distributions:
<img src="https://blog.synopse.info?post/public/blog/LutiLinux.png" alt="LutiLinux.png, Jul 2023" />
Note that the <em>bullseye_arm64</em> is in fact a Mac M1 virtual machine running Debian, and <em>buster_armhf</em> is a <a href="https://www.raspberrypi.com/">good tiny Raspberry Pi</a> running on our network.</p>
<p>And even Mac Intel and Mac M1 systems:
<img src="https://blog.synopse.info?post/public/blog/LutiMac.png" alt="LutiMac.png, Jul 2023" /></p>
<p>With a quick calculation, we can guess that 1.7 billion individual tests are done during each pass, throughout the 22 machines involved...</p>
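<p>The quick calculation is just the per-run test count multiplied by the number of machines, assuming every machine runs the whole set of almost 78 million tests (the network-specific passes are ignored here):</p>
<pre>
TESTS_PER_RUN = 78_000_000   # "almost 78 million" individual tests per run
MACHINES = 22                # machines involved in each nightly pass

total = TESTS_PER_RUN * MACHINES
print("%.2f billion" % (total / 1e9))   # 1.72 billion
</pre>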
<h4>Good Benefits</h4>
<p>This is a good showcase of the FPC and <em>mORMot</em> abilities to work cross-platform and cross-architecture. It also includes OpenSSL validation, and LDAP/DNS testing on local Samba or MSAD servers.<br />
We discovered and fixed some corner-case issues during the integration of those tests. But this is what tests are about, isn't it? To show what is wrong.<br />
Some issues were in fact very nasty, especially on Mac, where <a href="https://github.com/synopse/mORMot2/commit/2352a485952ed19354eab8340ec69fd8ecaaecfd">Apple can't do as everyone does</a>.</p>
<p>We were also able to compare performance between targets, and in fact, we were pleased to see that <a href="https://blog.synopse.info/?post/2021/08/17/mORMot-2-on-Ampere-AARM64-CPU">aarch64 platforms should work fast enough</a>, even if x86_64 is better supported by <em>mORMot</em>, especially thanks to a lot of manually-tuned assembly code using the latest Intel/AMD SIMD instructions, and our <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.fpcx64mm.pas">dedicated memory manager</a>. Even the Raspberry Pi can sustain all this JSON, cryptography, ORM, REST, HTTPS, network testing... at its own pace, of course. :)</p>
<p>Thanks a lot anyway to <a href="https://www.tranquil.it/">TranquilIT</a> for supporting our little <em>mORMot</em> and offering such a great product like <a href="https://www.tranquil.it/en/wapt/managing-your-it-assets/">WAPT</a>!</p>
<p>You can <a href="https://synopse.info/forum/viewtopic.php?id=6645">discuss about this blog entry on our forum</a>, as usual.</p>New DNS and (C)LDAP Clients for Delphi and FPC in mORMot 2urn:md5:9b7d1f59f1f9c0dd25b70ad069127d332023-05-05T07:18:00+01:002023-05-05T06:37:59+01:00Arnaud BouchezmORMot FrameworkCLDAPCrossPlatformDNSLDAPmORMot2<p>DNS and LDAP are the two protocols on which the Internet and the Intranet are built.<br />
Most of the time, you don't have to care about them. But sometimes, you need to access them directly, especially in a corporate environment.</p>
<p><img src="https://blog.synopse.info?post/public/blog/NetworkProtocol.png" alt="" /></p>
<p>We just introduced in our Open Source mORMot 2 framework two client units to access DNS and LDAP/CLDAP servers.<br />
You can resolve IP addresses and services using DNS, and ask for information about your IT infrastructure using LDAP.</p> <h4>The DNS Protocol</h4>
<p>The Domain Name System (DNS) is a hierarchical and distributed naming system for the Internet.<br />
You can connect to a DNS server, and for instance resolve a computer name (e.g. 'synopse.info') and retrieve its IP (e.g. 62.210.254.173). Or you can ask for some known services, e.g. "which DC server do you know?".</p>
<p>To implement this, a client should send some message to one server, which works as a cache toward some other reference servers. Those messages are usually transmitted as UDP packets, but could also use TCP, TLS or some other protocols.</p>
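<p>To make this more concrete, here is a minimal Python sketch of how such a query message is laid out on the wire, following the RFC 1035 format: a 12-byte header, the name as length-prefixed labels, then the query type and class. It is not the <em>mORMot</em> implementation. The <code>encode_name</code> and <code>build_query</code> helpers and the fixed <code>query_id</code> are illustrative choices, and labels longer than 63 bytes are not checked.</p>
<pre>
import struct

QTYPE_A = 1      # IPv4 host address (RFC 1035)
QTYPE_SRV = 33   # service location (RFC 2782)
QCLASS_IN = 1    # Internet class


def encode_name(name):
    """Encode 'synopse.info' as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=QTYPE_A, query_id=0x1234):
    """Build a single-question DNS query message, ready to be sent over UDP."""
    # header: id, flags (only "recursion desired" set), 1 question, no records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)


packet = build_query("synopse.info")
print(len(packet), packet.hex())
</pre>
<p>Such a packet would then be sent to port 53 of the DNS server, and the response parsed with the same conventions. Asking for a service instead of a host is a matter of using <code>QTYPE_SRV</code> with a <code>_service._proto.domain</code> name.</p>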
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Example_of_an_iterative_DNS_resolver.svg/400px-Example_of_an_iterative_DNS_resolver.svg.png" alt="" /></p>
<p>A lot of information is available at <a href="https://en.wikipedia.org/wiki/Domain_Name_System">https://en.wikipedia.org/wiki/Domain_Name_System</a>.<br />
As you can imagine, the premises are quite simple, but the actual implementation could be pretty complex, because it is really the cornerstone of the whole Internet.</p>
<h4>The LDAP / CLDAP Protocols</h4>
<p>The Lightweight Directory Access Protocol (LDAP) is a standard application protocol for accessing and maintaining distributed directories.<br />
Each LDAP server maintains a hierarchical database of information about the computers, users, services and protocols available in a given IT infrastructure.</p>
<p><img src="https://blog.synopse.info?post/public/blog/LdapServices.png" alt="" /></p>
<p>Usually, some binary messages (using ASN.1 encoding) are sent and received over TCP (and TLS), but you may have to use UDP in the discovery phase. You can authenticate using username/password credentials, or Kerberos (i.e. the Windows authentication system). And the objects are stored in a tree of name/value attributes.</p>
<p><a href="https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol">https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol</a> gives high level information about the protocol, whereas <a href="https://ldap.com/ldap-reference-materials/">https://ldap.com</a> offers a detailed and didactic presentation of how it works.</p>
<h4>DNS and LDAP Servers</h4>
<p>Since DNS is needed at the local level (there is usually a DNS server in most computer systems), there are plenty of DNS servers around. <br />
Some are local DNS servers (you typically have one in your Internet box), others are distributed DNS servers offered as a service (e.g. GoogleDNS, CloudFlare, OpenDNS...).</p>
<p>The most well-known proprietary LDAP server is part of the <a href="https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/active-directory-domain-services-overview">ActiveDirectory product</a> from Microsoft.<br />
<a href="https://www.openldap.org/">OpenLDAP</a> is the most well known and used Open Source LDAP server around.<br />
But of course, the two systems are not fully compatible. For instance, the AD authentication is incompatible with OpenLDAP. Here comes Samba to the rescue.</p>
<p>The <a href="https://www.samba.org/">Samba project</a> is an amazing Open Source implementation of the Active Directory services. Samba is an important component to seamlessly integrate Linux/Unix Servers and Desktops into Active Directory environments. It can function both as a domain controller (server) or as a regular domain member (client).</p>
<p><img src="https://www.samba.org/samba/style/2010/grey/bgHeader.png" alt="" /></p>
<p>Samba is a growing alternative to Microsoft for most companies, especially for those wanting to keep their IT information local - not everyone wants to share everything with Microsoft, which pushes everyone to migrate to its Azure cloud services. Samba could be an almost turn-key solution to manage your IT infrastructure in-house with a proven and audited Open Source software. Disclaimer: the company I currently work with <a href="https://www.tranquil.it/en/securing-accesses/discover-samba-active-directory/">offers support for Samba setup or migration</a>, and invests in Samba development. Perhaps because we are French, we like to be free as a bird and as a glass of wine with saucisson. <img src="https://blog.synopse.info?pf=wink.svg" alt=";)" class="smiley" /></p>
<h4>DNS and LDAP Clients</h4>
<p>A basic DNS client is not so difficult to implement, thanks to its simple but proven protocol, based on an efficient binary encoding.<br />
Usually, you would rely on the Operating System to resolve a host name into an IP address. But if you want more than this basic feature, like locating some services addresses, you would need a proper DNS client.</p>
<p>The LDAP protocol is more complex. It is based on the <a href="https://en.wikipedia.org/wiki/ASN.1">ASN.1 encoding</a> and some conventions.<br />
Even if it is documented (there are several RFCs to consider), the actual implementation by Microsoft does not follow the specs. Apart from the <a href="https://www.openldap.org/">libldap source code</a>, there are very few LDAP client libraries (all languages considered) which properly work today on most Microsoft servers. For Delphi/FPC, you have an <a href="https://sourceforge.net/projects/synalist/">old library from Synapse</a> - not Synopse <img src="https://blog.synopse.info?pf=smile.svg" alt=":)" class="smiley" /> - but it does not support Kerberos authentication or LDAP signing/sealing, which is mandatory on modern AD setups, and it has not been maintained for years.</p>
<h4>mORMot 2 to the Rescue</h4>
<p><img src="https://blog.synopse.info?post/public/blog/marmotteingrass.jpg" alt="" /></p>
<p>We introduced the <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.dns.pas">mormot.net.dns.pas</a> unit.<br />
It implements the low-level DNS protocol (over UDP or TCP if the messages are too big), and offers some high-level wrappers, for DNS hostname (reverse) resolution, or service location.</p>
<p>It could also be used as alternate DNS resolver, instead of the Operating System one, for our <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.sock.pas">mormot.net.sock.pas</a> core network unit. You can therefore let your application bypass the Internet or corporate IP resolver, and access other networks without messing with the Operating System or VPN settings.</p>
<p>Something more complex - and harder to stabilize - was the <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.ldap.pas">mormot.net.ldap.pas</a> unit.<br />
We needed to implement basic ASN.1 encoding/decoding functions, then all LDAP messages generation and parsing, and finally implement a client able to connect to both Microsoft AD or Samba servers, with the best security possible by default - of course.</p>
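<p>To give a taste of what "ASN.1 encoding" means in practice, here is a minimal Python sketch of the tag/length/value (BER) layout, used to build the simplest LDAP message there is: an anonymous simple bind, as defined in RFC 4511. This is only an illustration of the wire format, not the code of our unit, and the helper names are ours. Kerberos binding, signing and sealing are of course far more involved.</p>
<pre>
def ber_length(n):
    """Encode a BER definite length (short form below 128, long form otherwise)."""
    if n in range(128):
        return bytes([n])
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(raw)]) + raw


def tlv(tag, content):
    """Assemble one tag/length/value element."""
    return bytes([tag]) + ber_length(len(content)) + content


def ber_integer(value):
    """INTEGER as big-endian two's complement (non-negative values only here)."""
    size = value.bit_length() // 8 + 1
    return tlv(0x02, value.to_bytes(size, "big"))


def ber_octet_string(data):
    return tlv(0x04, data)


def ldap_simple_bind(message_id, dn=b"", password=b""):
    """LDAPMessage wrapping a BindRequest with simple authentication."""
    bind = tlv(0x60,                     # [APPLICATION 0] BindRequest
               ber_integer(3)            # LDAP protocol version 3
               + ber_octet_string(dn)    # name (empty = anonymous)
               + tlv(0x80, password))    # [0] simple authentication
    return tlv(0x30, ber_integer(message_id) + bind)   # outer SEQUENCE


print(ldap_simple_bind(1).hex())   # 300c020101600702010304008000
</pre>
<p>Those 14 bytes are what goes over the TCP socket for an anonymous bind. Every other LDAP message follows the same nesting logic, with more fields.</p>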
<p>The easy and safe solution - used by some tools like <a href="http://www.ldapadmin.org/">LdapAdmin</a> - is to call the Windows API. But we could not, because we wanted our client to be truly cross-platform. So we re-invented the wheel, starting from the Synapse library, reverse engineering both OpenLDAP and actual MS ADs (thank you <a href="https://www.wireshark.org/">WireShark</a>!), and wrote and debugged the current code. We will certainly find some nasty limitations in the next months, but at least we have a good code base to work with. The latest addition was to be able to use DNS and CLDAP over UDP to locate the proper LDAP server to use with no pre-configuration.<br />
Now you have in Delphi or FPC an LDAP client able to connect - and even auto-connect with the automated best configuration after service discovery through DNS and CLDAP - to most Microsoft AD or Samba servers. Working and actively tested on both Windows and Linux, over a multitude of actual LDAP servers, thanks to <a href="https://www.tranquil.it/en/manage-it-equipment/discover-wapt/">the main product</a> using our framework.<br />
I don't know many libraries featuring CLDAP server discovery, as our unit does. None in the Delphi/FPC world last time I checked. Feel free to point some libraries in other languages properly implementing CLDAP, if you know any.</p>
<h4>First Steps</h4>
<p>If you are part of a corporate network, the simplest code to use is the following:</p>
<pre>
with TLdapClient.Create do
try
  if BindSaslKerberos then
    writeln(' authenticated via Kerberos');
  Search(WellKnownObjects.Computers, false, '', []);
  writeln(SearchResult.Dump);
finally
  Free;
end;
</pre>
<p>It will locate and connect to the best LDAP server for you, then use your current Windows credentials to login, then list all known computers of the network with full information. And if you are on a Linux (or even Mac) machine which is part of the domain... the very same code will also work with no further configuration, using GSSAPI instead of the Windows SSPI calls.<br />
This is a clean demonstration of what <em>mORMot</em> DNS and LDAP client classes could do for you.</p>
<h4>Real Work</h4>
<p>Most of the time, you would like to avoid direct LDAP filter searches.<br />
So our <code>TLdapClient</code> class provides some ready-to-use high-level methods, mainly:</p>
<ul>
<li><code>GetGroups()</code> and <code>GetUsers()</code> to retrieve the list of group or user stored names;</li>
<li><code>GetGroupInfo()</code> and <code>GetUserInfo()</code> to retrieve the main information about a given user or group;</li>
<li><code>GetIsMemberOf()</code> to test membership of a user.</li>
</ul>
<p>Those methods feature some high level <code>TGroupType</code> and <code>TUserAccountControl</code> enumerates and sets, which make it easy to get the users or groups from their attributes:</p>
<pre>
disabled := GetUsers([uacAccountDisable]); // all disabled user names
mygroups := GetGroups([gtGlobal], 'grp*'); // all global groups whose name starts with grp
</pre>
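<p>For the curious, such a set maps to a bit field in the directory. The following Python sketch shows how the Active Directory <code>userAccountControl</code> attribute can be decoded, and what a raw bitwise LDAP filter on it looks like. It is an illustration of the underlying convention only: we do not claim this is the exact filter that <code>GetUsers()</code> sends, and the Python names are ours.</p>
<pre>
from enum import IntFlag


class UserAccountControl(IntFlag):
    """A few of the documented userAccountControl bits."""
    ACCOUNTDISABLE = 0x0002
    NORMAL_ACCOUNT = 0x0200
    DONT_EXPIRE_PASSWORD = 0x10000


# LDAP_MATCHING_RULE_BIT_AND: matches when all the given bits are set
BIT_AND_RULE = "1.2.840.113556.1.4.803"


def uac_filter(flags):
    """Raw LDAP filter selecting entries which have all those bits set."""
    return "(userAccountControl:%s:=%d)" % (BIT_AND_RULE, int(flags))


value = UserAccountControl(514)          # 512 + 2, as stored in the directory
print(UserAccountControl.ACCOUNTDISABLE in value)
print(uac_filter(UserAccountControl.ACCOUNTDISABLE))
</pre>
<p>This is exactly the kind of low-level detail the high-level methods are meant to hide.</p>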
<p>To be fair, testing membership of a user over LDAP is a very complex task, and <code>GetIsMemberOf()</code> may not cover all cases.</p>
<p>So we even defined an inherited <code>TLdapCheckMember</code> class which can check if a user name is actually a member of one or several groups, maintaining a single connection to the LDAP server for the task, with an internal in-memory cache, featuring <a href="https://www.gabescode.com/active-directory/2018/06/07/what-makes-a-member.html">all the twisted aspects of user membership</a> (e.g. primary groups). This is the purpose of its <code>TLdapCheckMember.Authorize()</code> method:</p>
<pre>
var
  usr: RawUtf8;
  grps: TRawUtf8DynArray;
...
with TLdapCheckMember.Create do
try
  // no CLDAP discovery, but specify the LDAP address and its Kerberos DN
  Settings.TargetUri := 'ldaps://dc-siteone.ad.corporate.it/ad.corporate.it';
  // we may also set Settings.UserName/Password on a service with no logged user
  if BindSaslKerberos('', @usr) then
    writeln('Authenticated as ', usr, ' via Kerberos to ', Settings.TargetUri);
  // expect the users to be part of any of two groups (specified as sAMAccountName)
  AllowGroupAN('grp_one,grp_two');
  // loop to ask for user sAMAccountName or userPrincipalName
  repeat
    readln(usr);
    if usr = '' then
      break;
    if Authorize(usr, @grps) then
      writeln('Authorized with groups = ', RawUtf8DynArrayToCsv(grps))
    else
      writeln('Rejected');
  until false;
finally
  Free;
end;
</pre>
<p>This last class is used e.g. by both <code>TBasicAuthServerKerberos</code> and <code>TBasicAuthServerLdap</code> for user group authorization before the actual Kerberos or LDAP password authentication, to implement the "search and bind" authorization/authentication pattern on client and/or server sides, from a computer part of the domain or not.</p>
<h4>Better Safe than Sorry</h4>
<p>Perhaps you don't need to use DNS or LDAP directly in your projects now.</p>
<p>But it may be good to know you could with <em>mORMot</em>, if you need to. Those protocols are clearly the <em>lingua franca</em> of all modern computer infrastructures.</p>
<p>As usual, feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6580">welcome on our forum</a>! :)</p>New Command Line Parser in mORMot 2urn:md5:371069b8539ca9cfd9d3cfbcb6163f282023-04-19T00:55:00+01:002023-04-19T12:17:27+01:00Arnaud BouchezmORMot FrameworkblogCommandLineCrossPlatformDelphiFPCFreePascalGoodPracticemORMot2ParamCountParamStr<p><img src="https://blog.synopse.info?post/public/blog/CommandLine.png" alt="" /></p>
<p>For most projects, we want to be able to pass some custom values when starting them.<br />
The command line is then used to add this additional information.</p>
<p>We have <code>ParamStr</code> and <code>ParamCount</code> global functions, enough to retrieve the information. You may also use <code>FindCmdLineSwitch</code> for something easier to work with.<br />
The Lazarus RTL offers some additional methods like <code>hasOption</code> or <code>getOptionValue</code> or <code>checkOptions</code> in its <code>TCustomApplication</code> class. They are better, but not so easy to use, and not available on Delphi.</p>
<p>We just committed a new command line parser to our Open Source <em>mORMot 2</em> framework, which works on both Delphi and FPC, follows both Windows and POSIX/Linux conventions, and has many more features (like automated generation of the help message), in an innovative and easy workflow.</p> <p>The simplest code may be the following (extracted from the documentation):</p>
<pre>
var
  verbose: boolean;
  threads: integer;
...
  with Executable.Command do
  begin
    ExeDescription := 'An executable to test mORMot Execute.Command';
    verbose := Option(['v', 'verbose'], 'generate verbose output');
    Get(['t', 'threads'], threads, '#number of threads to run', 5);
    ConsoleWrite(FullDescription);
  end;
</pre>
<p>This code will fill <code>verbose</code> and <code>threads</code> local variables from the command line (with some optional default value), and output on Linux:</p>
<pre>
An executable to test mORMot Execute.Command
Usage: mormot2tests [options] [params]
Options:
-v, --verbose generate verbose output
Params:
-t, --threads <number> (default 5)
number of threads to run
</pre>
<p>So you can not only parse the command line and retrieve values, but also add some description text, and let the parser generate an accurate help message when needed.</p>
<p>You can note that the <code>#</code> character is used to mark the keyword to be used as value name for a given parameter, to make the text more meaningful.<br />
For instance, <code>'#number of threads to run'</code> will generate a nice <code> -t, --threads <number></code> text for the parameter description.</p>
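<p>To make the <code>#</code> convention more concrete, here is a minimal stand-alone sketch of how such a marker could be split from a description text. It is not the <em>mORMot</em> implementation: the <code>SplitValueName</code> helper and its parameters are hypothetical names, and only the RTL is used.</p>
<pre>
program ValueNameSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

// Hypothetical helper (an assumption, not part of mORMot): extract the word
// following '#' as value name, and return the text without the marker.
procedure SplitValueName(const Desc: string; out ValueName, CleanDesc: string);
var
  p, e: integer;
begin
  ValueName := '';
  CleanDesc := Desc;
  p := Pos('#', Desc);
  if p = 0 then
    exit; // no marker: keep the description as it is
  e := p + 1;
  while (e <= Length(Desc)) and (Desc[e] > ' ') do
    inc(e); // the value name ends at the first space
  ValueName := Copy(Desc, p + 1, e - p - 1);
  Delete(CleanDesc, p, 1); // drop the '#' from the displayed text
end;

var
  valName, txt: string;
begin
  SplitValueName('per-server thread pool #size', valName, txt);
  writeln('<', valName, '> ', txt); // <size> per-server thread pool size
end.
</pre>
<p>The marker can appear anywhere in the text, which is why both <code>'#number of threads to run'</code> and <code>'per-server thread pool #size'</code> produce a readable help line.</p>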
<p>For a more typical use case, you may look at our <a href="https://github.com/synopse/mORMot2/blob/master/ex/techempower-bench/raw.pas#L665">TFB Benchmarking Sample source code</a>:</p>
<pre>
  // parse command line parameters
  with Executable.Command do
  begin
    ExeDescription := 'TFB Server using mORMot 2';
    if Option(['p', 'pin'], 'pin each server to a CPU') then
      pinServers2Cores := true;
    if Option('nopin', 'disable the CPU pinning') then
      pinServers2Cores := false; // no option would keep the default boolean
    Get(['s', 'servers'], servers, '#count of servers (listener sockets)', servers);
    Get(['t', 'threads'], threads, 'per-server thread pool #size', threads);
    if Option(['?', 'help'], 'display this message') then
    begin
      ConsoleWrite(FullDescription);
      exit;
    end;
    if ConsoleWriteUnknown then
      exit;
  end;
</pre>
<p>This would generate such a description:</p>
<pre>
d:\dev\lib2\ex\techempower-bench\exe>raw /?
TFB Server using mORMot 2
Usage: raw [options] [params]
Options:
/p, /pin pin each server to a CPU
/nopin disable the CPU pinning
/?, /help display this message
Params:
/s, /servers <count> (default 1)
count of servers (listener sockets)
/t, /threads <size> (default 8)
per-server thread pool size
</pre>
<p>It will accept commands like this on Windows:</p>
<pre>
raw /p /t=10
raw /t 10 /s 2 /pin
raw /servers=2 /threads=8 /nopin
raw /servers 2 /threads 8 /nopin
</pre>
<p>And, on Linux/POSIX, you could write as usual:</p>
<pre>
./raw -p -t=10
./raw -t 10 -s 2 --pin
./raw --servers=2 --threads=8 --nopin
./raw --servers 2 --threads 8 --nopin
</pre>
<p>Note that both <code>-t 10</code> and <code>-t=10</code> syntaxes are accepted.</p>
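<p>As an illustration of what accepting both forms involves, here is a small RTL-only sketch. The <code>FindValue</code> function is a hypothetical name introduced for this example, and the real parser certainly handles more cases (Windows <code>/</code> switches, long and short names, defaults):</p>
<pre>
program SwitchSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

// Hypothetical sketch: retrieve the value of a switch written either as
// two arguments ("-t 10") or as a single one ("-t=10").
function FindValue(const Args: array of string; const Switch: string;
  out Value: string): boolean;
var
  i, n: integer;
begin
  result := true;
  n := Length(Switch);
  for i := 0 to High(Args) do
    if Args[i] = Switch then
    begin
      // "-t 10" form: the value is the next argument, if any
      if i < High(Args) then
      begin
        Value := Args[i + 1];
        exit;
      end;
    end
    else if Copy(Args[i], 1, n + 1) = Switch + '=' then
    begin
      // "-t=10" form: the value follows the '=' sign
      Value := Copy(Args[i], n + 2, MaxInt);
      exit;
    end;
  Value := '';
  result := false;
end;

var
  v: string;
begin
  if FindValue(['-p', '-t=10'], '-t', v) then
    writeln(v); // 10
  if FindValue(['-t', '10', '-s', '2'], '-t', v) then
    writeln(v); // 10
end.
</pre>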
<p>As you may have guessed, <code>ConsoleWriteUnknown</code> is able to notify the user that a wrong switch has been used - and display the help message.</p>
<p>This function is available in the base <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.os.pas">mormot.core.os.pas</a> unit of the framework.</p>
<p>And feedback is <a href="https://synopse.info/forum/viewtopic.php?pid=39579#p39579">welcome in our forum</a>, as usual!</p>
<p>We hope you find it useful!
:)</p>mORMot 2 Release Candidateurn:md5:5a07fafd62077467420766d41d575dde2023-01-10T09:35:00+00:002023-01-10T09:45:21+00:00Arnaud BouchezmORMot FrameworkDelphiForumsFreePascalmORMot2releaseRoadMap<p>The <em>mORMot 2</em> framework is about to be released as its first 2.0 stable version.</p>
<p><img src="https://blog.synopse.info?post/public/blog/mormotontopmountain.jpg" alt="" /></p>
<p>The framework feature set should now be considered as sealed for this release.<br />
No reported issue is still <a href="https://github.com/synopse/mORMot2/issues">open on GitHub</a> or in the forum.</p>
<p>Please test it, and give here some feedback to fix any problem before the actual release!<br />
We enter a framework code-freeze phase until then.<br />
<img src="https://blog.synopse.info?pf=smile.svg" alt=":-)" class="smiley" /></p> <p><a href="https://synopse.info/forum/viewtopic.php?id=6442">You can use this forum thread</a> to report any issue or modification to be added before the release.<br />
Thanks for your input!</p>
<p><img src="https://blog.synopse.info?post/public/blog/mORMot2-small.png" alt="" /></p>
<p>I am currently working on preliminary documentation.<br />
Some first draft <a href="https://synopse.info/files/doc/mORMot2.html">is available here online</a> (very preliminary draft). <br />
The idea is to make the <em>mORMot 2</em> documentation less verbose, more like a "quick start guide" than mORMot 1. Too much material was killing the documentation, so users reported...<br />
At least we can use <a href="https://github.com/synopse/SynProject">SynProject</a> to generate it. All API documentation is extracted from source, so it should be accurate.</p>Efficient Routing for Christmasurn:md5:a0b8f306951aec1c0a62023f50b1d8d22022-12-28T13:52:00+00:002022-12-28T13:58:58+00:00Arnaud BouchezmORMot FrameworkblogGoodPracticeHTTPmORMotmORMot2ORMperformanceRadixTreeRestroutingSOA<p>This is perhaps the last new feature of <em>mORMot 2</em> before its first stable release: a very efficient custom URI routing for our HTTP/HTTPS servers.</p>
<p><img src="https://blog.synopse.info?post/public/blog/SantaRouting.jpeg" alt="" /></p>
<p>At ORM and SOA level, there is by-convention routing of the URI, depending on the ORM table, SOA interface and method, and <code>TOrmModel.Root</code> value. Even for our MVC web part, we rely on a <code>/root/</code> URI prefix, which may not be always needed.<br />
Relying on convention is perfect between <em>mORMot</em> clients and servers, but in some cases, it may be handy to have something smoother, e.g. to publish a truly REST scheme.</p>
<p>We introduced two routing abilities to <em>mORMot 2</em>, with amazing performance (6-12 million parsings per CPU core), via a new <em><code>THttpServerGeneric.Route</code></em> property:</p>
<ul>
<li>Internal URI rewrite, to redirect internally from a human/REST-friendly request e.g. to a SOA <code>/root/interface.method</code> layout, or to a MVC web page;</li>
<li>Direct callback execution, with optional parameter parsing.</li>
</ul>
<p><strong>Article edited on 28th December:</strong><br />
Fixed performance numbers (much higher than reported), and introduced latest source changes.</p> <h4>The New TUriRouter Class</h4>
<p><img src="https://blog.synopse.info?post/public/blog/Routing.jpg" alt="" /></p>
<p>From <code>THttpServerGeneric.Route</code>, or from <code>TRestHttpServer.Route</code>, you can access a new <code>TUriRouter</code> class.<br />
It is the class responsible for the core registration process of all custom URI parsing.</p>
<p>By default, it is disabled. The classical <em>mORMot</em> routing applies.<br />
But once you call <code>THttpServerGeneric.Route</code> or <code>TRestHttpServer.Route</code>, you can register URIs and how the HTTP server should process them.</p>
<h4>Internal URI rewrite</h4>
<p>Here, we are not talking about HTTP redirection, i.e. returning a <code>30x</code> HTTP status code to let the client make a new request to another URI.<br />
We allow URI rewriting on the fly, within the server, just before the incoming request is identified to the ORM, MVC or SOA <em>mORMot</em> router.</p>
<p>It offers for instance an alternative URI path to the method-based services or the interface based services, at HTTP/HTTPS server level.</p>
<p>Now, we could write:</p>
<pre>
Server.Route.Get('/info', 'root/timestamp/info');
</pre>
<p>So that any GET on <code>/info</code> will redirect to the internal <code>TRestServer.TimeStamp</code> method-based service, and its hidden <code>/info</code> sub-method, which displays some general statistics about the server.</p>
<p>Or we could write:</p>
<pre>
Server.Route.Get('/user/<id>', '/root/userservice/new?id=<id>', urmPost);
</pre>
<p>to rewrite internally e.g. the GET <code>'/user/1234'</code> URI into a POST at <code>'/root/userservice/new?id=1234'</code>, as published by an <code>IUserService.New(id: Int64)</code> interface-based service method.</p>
<p>As such, you could have the best of both worlds.</p>
<p>This URI redirection may also have a very high benefit for a <em>mORMot</em> MVC web application. You could easily redirect some human-friendly URIs into the MVC routing convention, as expected by the MVC interface definition.</p>
<p>Last but not least, if the redirected URI is an integer in range 200..599, then the server will abort the request immediately with an HTTP status error matching the integer:</p>
<pre>
Server.Route.Get('/admin.php', '403');
</pre>
<p>This could help to avoid calling the main REST engine, or write a callback, just to return an error code.</p>
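<p>The <code><id></code> substitution used by the rewrite rules can be pictured with a naive stand-alone sketch. This is only an assumption about the general idea, not the <code>TUriRouter</code> code: <code>RewriteUri</code> is a hypothetical function, which allocates strings where the framework works with pointers.</p>
<pre>
program RewriteSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

uses
  SysUtils;

// Hypothetical sketch: match Uri against Pattern, then substitute each
// captured <name> placeholder into Target.
function RewriteUri(const Pattern, Target, Uri: string;
  out Rewritten: string): boolean;
var
  p, u, e: integer;
  key, value: string;
begin
  result := false;
  Rewritten := Target;
  p := 1;
  u := 1;
  while p <= Length(Pattern) do
    if Pattern[p] = '<' then
    begin
      // read the placeholder up to its closing '>'
      e := p + 1;
      while (e <= Length(Pattern)) and (Pattern[e] <> '>') do
        inc(e);
      key := Copy(Pattern, p, e - p + 1); // includes the '<' and '>'
      p := e + 1;
      // capture the URI characters up to the next '/' or the end
      e := u;
      while (e <= Length(Uri)) and (Uri[e] <> '/') do
        inc(e);
      if e = u then
        exit; // an empty parameter does not match
      value := Copy(Uri, u, e - u);
      u := e;
      Rewritten := StringReplace(Rewritten, key, value, [rfReplaceAll]);
    end
    else if (u <= Length(Uri)) and (Uri[u] = Pattern[p]) then
    begin
      inc(p); // static text must match character by character
      inc(u);
    end
    else
      exit;
  result := u > Length(Uri); // the whole URI must have been consumed
end;

var
  res: string;
begin
  if RewriteUri('/user/<id>', '/root/userservice/new?id=<id>',
       '/user/1234', res) then
    writeln(res); // /root/userservice/new?id=1234
end.
</pre>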
<h4>Direct Callbacks Execution</h4>
<p>As an alternative, you can assign a <code>TOnHttpServerRequest</code> callback to a given URI, optionally with <code><parameters></code>:</p>
<pre>
TOnHttpServerRequest = function(Ctxt: THttpServerRequestAbstract): cardinal of object;
</pre>
<p>The <code>Ctxt</code> instance is the low-level structure holding the HTTP/HTTPS request, with all input and output context.<br />
It even has a property to retrieve the named parameters within the URI, i.e. an <code><id></code> placeholder in the URI registration will be recognized, and available within the callback from the <code>Ctxt['id']</code> property.</p>
<p>For instance, it could be used to publish a standard REST process as:</p>
<pre>
// retrieve a list of picture IDs
Server.Route.Get('/user/<user>/pic', DoUserPic);
// support CRUD access of a given picture by ID
Server.Route.Run([urmGet, urmPost, urmPut, urmDelete], '/user/<user>/pic/<id>', DoUserPic);
</pre>
<p>Then the callback could be something like this:</p>
<pre>
function TMyClass.DoUserPic(Ctxt: THttpServerRequestAbstract): cardinal;
var
  user: RawUtf8;
  id: Int64;
  ids: TInt64DynArray;
begin
  user := Ctxt['user'];
  if Ctxt.RouteInt64('id', id) then
  begin
    // manage /user/<user>/pic/<id>
    if CRUDUserPictureFromDatabase(Ctxt, user, id) then
      result := HTTP_SUCCESS
    else
      result := HTTP_NOTFOUND;
  end
  else if RetrieveUserPictureIDListFromDatabase(Ctxt, user, ids) then
    // returned /user/<user>/pic
    result := HTTP_SUCCESS
  else
    result := HTTP_NOTFOUND;
end;
</pre>
<p>Of course, URI redirection to an interface-based service may be more convenient, but if you want to reuse some existing code, and have the best performance possible, you could follow this pattern.</p>
<p>It also may be handy for some low-level tasks of the HTTP server, like proxying to internal sub-servers, or quickly return some 30x redirection, or generate some HTML pages.</p>
<p>If the callback returns a <code>result</code> of 0, then execution will continue as usual, but you can change some <code>Ctxt</code> fields. It may allow for very efficient and tuned client authorization, even before you execute some regular interface-based services.</p>
<p>You could also redirect all published methods of a class instance using RTTI, via <code>TUriRouter.RunMethods()</code>.<br />
Just like <code>TRestServer</code> method-based services, but here at HTTP server level, with even higher performance, since the <em>mORMot</em> REST engine is not involved.</p>
<p>Another use may be to handle some very tuned HTTP OPTIONS requests, if the default CORS feature is too broad for your case.<br />
Or quickly return a standard response for HTTP HEAD requests, without any Content-Length header as it is allowed, to relieve the server process.</p>
<h4>Why Not Make It Fast ?</h4>
<p><img src="https://blog.synopse.info?post/public/blog/marmotblackwhite.png" alt="" /></p>
<p>About performance, the <code>TUriRouter.Process()</code> method runs with no memory allocation for a static route, using a <a href="https://en.wikipedia.org/wiki/Radix_tree">very efficient Radix Tree algorithm</a> for path lookup, over a thread-safe non-blocking URI parsing with value extraction for rewrite or execution.</p>
<p>Here are some numbers from <code>TNetworkProtocols._TUriTree</code> on my old Core i5 laptop, on a single thread/core:</p>
<pre>
1000 URI lookups in 37us i.e. 25.7M/s, aver. 37ns
1000 URI static rewrites in 80us i.e. 11.9M/s, aver. 80ns
1000 URI parametrized rewrites in 117us i.e. 8.1M/s, aver. 117ns
1000 URI static execute in 91us i.e. 10.4M/s, aver. 91ns
1000 URI parametrized execute in 162us i.e. 5.8M/s, aver. 162ns
</pre>
<p>As you can see, this routing won't be the bottleneck in your server process. It has a non blocking O(1) complexity, with no unneeded memory allocation during its process.<br />
I know no other Delphi or FPC web or REST framework using custom Radix Tree data structures. Other libraries parse the URI as parts, then check the parts against registered routes (using hash maps if possible). It is clearly less efficient, and has some known disadvantages we will discuss.</p>
<h4>How Does It Work?</h4>
<p>If we run the following code (from our regression tests):</p>
<pre>
router.Get('/plaintext', DoPlainText);
router.Get('/', DoRequestRoot);
router.Get('/do/<one>/pic/<two>', DoRequest0);
router.Get('/do/<one>', DoRequest1);
router.Get('/do/<one>/pic', DoRequest2);
router.Get('/do/<one>/pic/<two>/', DoRequest3);
router.Get('/da/<one>/<two>/<three>/<four>/', DoRequest4);
writeln(router.Tree[urmGet].ToText);
</pre>
<p>We get the following output on the console:</p>
<pre>
/
  d
    a/
      <one>
        /
          <two>
            /
              <three>
                /
                  <four>
                    /
    o/
      <one>
        /pic
          /
            <two>
              /
  plaintext
<p>The Radix Tree levels are expanded as spaces. The lines above make it easy to understand how the URI parsing is done: from the top <code>/</code> node down to the last exactly matching node.<br />
A node can be some static text like <code>/pic</code> or a parameter like <code><one></code>. The registered routes (either URI rewrite, or a callback) are assigned to one node.</p>
<p>This data structure has several benefits, for our parametrized routing scheme.</p>
<p>1. The nodes form a memory structure which makes it very easy to parse a URI, one character at a time, even with thousands of registered routes.</p>
<p>2. Unlike hash-maps, a tree structure also allows us to use dynamic parts like the <code><one></code> parameter, since we actually match against the routing patterns instead of just comparing hashes. Hashes map static values, while we need to map URI parameters as dynamic values. Our tree is perfect for this purpose.</p>
<p>3. It greatly reduces the classical routing problem of regular registration based on lists and maps, which can suffer from unexpected behavior, just due to the order of the URI registration calls. With our data structure, we know that there is a single node per path, before even starting the look-up in the prefix-tree. Thanks to the tree structure, we know by design that nodes won't overlap.</p>
<p>4. For even better scalability, the child nodes on each tree level are sorted by depth/priority/usage after each registration. The depth/priority/usage is just the number of sub nodes: children, grandchildren, and so on. Nodes which are part of the most routing paths are evaluated first. This helps to make as many routes as possible reachable as fast as possible. It is also some sort of cost compensation. The longest reachable path (highest cost) can always be evaluated first. You can see this sorting in the above tree sample.</p>
<p>5. As you can see, there is one Radix Tree per supported HTTP method, which are GET, POST, PUT, DELETE, HEAD and OPTIONS. For one thing it is more efficient than holding a per-method registration in every single node, for another thing it greatly reduces any routing problem on path overlapping between methods.</p>
<p>In practice, it is very fast and efficient: from 6 to 12 million URIs parsed per second and per CPU core, depending on the process (static URI or parametrized URI).</p>
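<p>To give a feeling of how such a tree is walked, here is a deliberately simplified sketch. It is an uncompressed prefix tree (one node per static character, one node per parameter), not the compressed Radix Tree of the framework, and every name in it (<code>TNode</code>, <code>AddRoute</code>, <code>Lookup</code>, <code>PARAM</code>) is an assumption made for the example:</p>
<pre>
program TreeSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

const
  PARAM = #1; // marker character used for parameter nodes

type
  // one node per static character, or per <parameter>
  TNode = class
  public
    Ch: char;
    Route: string; // name of the registered route, '' if none ends here
    Children: array of TNode;
    destructor Destroy; override;
    function Child(c: char; CreateIt: boolean): TNode;
  end;

destructor TNode.Destroy;
var
  i: integer;
begin
  for i := 0 to High(Children) do
    Children[i].Free;
  inherited Destroy;
end;

function TNode.Child(c: char; CreateIt: boolean): TNode;
var
  i: integer;
begin
  for i := 0 to High(Children) do
    if Children[i].Ch = c then
    begin
      result := Children[i];
      exit;
    end;
  result := nil;
  if not CreateIt then
    exit;
  result := TNode.Create;
  result.Ch := c;
  SetLength(Children, Length(Children) + 1);
  Children[High(Children)] := result;
end;

procedure AddRoute(Root: TNode; const Pattern, Route: string);
var
  i: integer;
  n: TNode;
begin
  n := Root;
  i := 1;
  while i <= Length(Pattern) do
  begin
    if Pattern[i] = '<' then
    begin
      n := n.Child(PARAM, true); // a whole <name> is a single node
      while (i <= Length(Pattern)) and (Pattern[i] <> '>') do
        inc(i);
    end
    else
      n := n.Child(Pattern[i], true);
    inc(i);
  end;
  n.Route := Route;
end;

function Lookup(Root: TNode; const Uri: string): string;
var
  i: integer;
  n, next: TNode;
begin
  result := '';
  n := Root;
  i := 1;
  while i <= Length(Uri) do
  begin
    next := n.Child(Uri[i], false);
    if next <> nil then
      inc(i) // static text: match one character
    else
    begin
      next := n.Child(PARAM, false);
      if (next = nil) or (Uri[i] = '/') then
        exit; // unknown URI, or empty parameter value
      while (i <= Length(Uri)) and (Uri[i] <> '/') do
        inc(i); // a parameter consumes the text up to the next '/'
    end;
    n := next;
  end;
  result := n.Route;
end;

var
  tree: TNode;
begin
  tree := TNode.Create;
  try
    AddRoute(tree, '/plaintext', 'DoPlainText');
    AddRoute(tree, '/do/<one>', 'DoRequest1');
    AddRoute(tree, '/do/<one>/pic', 'DoRequest2');
    writeln(Lookup(tree, '/do/123/pic')); // DoRequest2
    writeln(Lookup(tree, '/do/123'));     // DoRequest1
  finally
    tree.Free;
  end;
end.
</pre>
<p>This sketch skips what makes the real implementation fast: it does not merge static characters into shared text nodes, does not sort the children by usage, never backtracks from a static branch to a parameter one, and keeps a single tree instead of one per HTTP method.</p>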
<p>If you are curious, you could look at the source code in our repository:<br />
<a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.server.pas">https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.server.pas</a><br />
Some of the code may be a bit difficult to follow, since we use low-level pointers over chars, to avoid intermediate <code>string</code> allocations, and to offer the best performance. Even the values are parsed and stored as integer indexes and lengths, not as pre-allocated <code>string</code> instances... But you could see the class hierarchy, and how the registration is done - registration is less performance sensitive, so the code is more high-level here. ;-)</p>
<h4>What Does The Marmot Say?</h4>
<p><img src="https://blog.synopse.info?post/public/blog/marmotinflowers.jpg" alt="marmotinflowers.jpg, Dec 2022" /></p>
<p><strong>Merry Christmas again, and enjoy!</strong></p>Modern Pascal is Still in the Raceurn:md5:af5e15cc3f4bde55561318a1bb5dd3bb2022-11-26T10:05:00+00:002022-11-29T13:21:57+00:00Arnaud BouchezPascal ProgrammingblogcollectionsCrossPlatformDatabaseDelphiFPCGarbageCollectorgenericsGoGoodPracticeMetaProgrammingmORMotmORMot2performanceRTTIRust<p>A recent poll <a href="https://forum.lazarus.freepascal.org/index.php/topic,61276.0.html">on the Lazarus/FPC forum</a> highlighted a fact: pascal coders are older than most coders. Usually, at our age, we should be managers, not developers. But we like coding in pascal. It is still fun after decades!<br />
But does it mean that you should not use pascal for any new project? Are the language/compilers/libraries outdated?<br />
In the company I currently work for, we have young coders, just out-of-school or still-in-school, who joined the team and write great code!</p>
<p><img src="https://blog.synopse.info?post/public/blog/performance.jpg" alt="" /></p>
<p>And a recent thread <a href="https://forum.lazarus.freepascal.org/index.php/topic,61035.0.html">in this very same forum</a> was about comparing languages to implement a REST server, in C#, Go, Scala, TypeScript, Elixir and Rust.<br />
Several pascal versions are about to be contributed, one in which <em>mORMot</em> shines.</p> <h4>The Challenge and the Algorithms</h4>
<p>The original challenge is available at <a href="https://github.com/losvedir/transit-lang-cmp">transit-lang-cmp</a> with the original source code, of all those fancy languages and libraries.</p>
<p>In practice, the goal of this test program is to load two big CSVs into memory (80MB + 2MB), then serve over HTTP some JSON generated by route identifiers, joining both CSVs.<br />
The resulting JSON could be of 30KB up to 2MB. And all data is generated on the fly from the CSV in memory.</p>
<p>To be fair, a regular/business coder would have used a database for this. Not silly memory structures. And asked for money to setup a huge set of cloud machines with load balancing. <img src="https://blog.synopse.info?pf=smile.svg" alt=":-)" class="smiley" /></p>
<h4>Reference Implementations in Today's Languages</h4>
<p>The "modern" / "school" approach, as implemented in the reference project in Go/Rust/C#/... is using two lists for the CSVs data, then two maps/dictionaries between route ID and lists indexes.</p>
<ul>
<li>The <a href="https://github.com/losvedir/transit-lang-cmp/blob/main/trogsit/app.go">Golang version</a> has a good expressiveness, and is nice to read, even if you don't know the language.</li>
<li>The <a href="https://github.com/losvedir/transit-lang-cmp/tree/main/Trannet">C# version</a> is also readable, but making a webserver is still confusing because it is not built from code, but from config files.</li>
<li><a href="https://github.com/losvedir/transit-lang-cmp/tree/main/trexit">Elixir</a> is a bit over-complicated to my taste.</li>
<li>The <a href="https://github.com/losvedir/transit-lang-cmp/tree/main/trala">Scala</a> and <a href="https://github.com/losvedir/transit-lang-cmp/tree/main/trypsit">TypeScript/Deno</a> versions are fine to read, but really slow. You had better use a database instead.</li>
<li>Just for fun, check <a href="https://github.com/losvedir/transit-lang-cmp/blob/main/trustit/src/main.rs">the Rust version</a> - do you think Rust is good for big maintainable projects with junior developers?</li>
</ul>
<p>There was a first attempt to write a FPC version of it, by Leledumbo.<br />
His <a href="https://github.com/leledumbo/transit-lang-cmp/blob/main/trascal/app.pas">Source Code repository</a> is a nice pascal conversion of above code. But performance was disappointing. Especially because the standard JSON library can not work directly with high level structures like collections or arrays.</p>
<p>So is Pascal out of the race?<br />
Let's call the <em>mORMot</em> to the rescue!</p>
<h4>Following the mORMot Way</h4>
<p>For the <em>mORMot</em> version in FPC, I used another approach, with two different techniques:</p>
<ul>
<li>I ensured the lists were sorted in memory, then made an O(log(n)) binary lookup in them;</li>
<li>All stored strings were "interned", i.e. the same text was sharing a single string instance, and FPC reference counting did its magic.</li>
</ul>
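<p>The sorted list with binary lookup can be sketched as follows. This is a plain RTL illustration under assumptions, not an extract of <code>LangCmp.dpr</code>: the <code>TTrip</code> record, its fields and the <code>FirstTripOfRoute</code> function are hypothetical names.</p>
<pre>
program SortedLookupSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

uses
  SysUtils;

type
  // hypothetical record: field names are illustrative only
  TTrip = record
    RouteID: string;
    TripID: string;
  end;
  TTrips = array of TTrip;

// return the index of the first trip with this RouteID, or -1 if none;
// Trips must be sorted by RouteID, so the lookup is O(log(n))
function FirstTripOfRoute(const Trips: TTrips; const RouteID: string): integer;
var
  left, right, mid: integer;
begin
  left := 0;
  right := Length(Trips); // search within the half-open range [left, right)
  while left < right do
  begin
    mid := (left + right) shr 1;
    if CompareStr(Trips[mid].RouteID, RouteID) < 0 then
      left := mid + 1
    else
      right := mid;
  end;
  if (left < Length(Trips)) and (Trips[left].RouteID = RouteID) then
    result := left
  else
    result := -1;
end;

var
  data: TTrips;
  i: integer;
begin
  // sample data, already sorted by RouteID
  SetLength(data, 4);
  data[0].RouteID := 'Blue';  data[0].TripID := 'b1';
  data[1].RouteID := 'Green'; data[1].TripID := 'g1';
  data[2].RouteID := 'Green'; data[2].TripID := 'g2';
  data[3].RouteID := 'Red';   data[3].TripID := 'r1';
  i := FirstTripOfRoute(data, 'Green');
  // all trips of a route are contiguous: scan forward from the first match
  while (i >= 0) and (i < Length(data)) and (data[i].RouteID = 'Green') do
  begin
    writeln(data[i].TripID); // g1 then g2
    inc(i);
  end;
end.
</pre>
<p>Compared to a map from route ID to list indexes, nothing is stored besides the sorted array itself, which is consistent with the low memory usage reported below.</p>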
<p>There are no low-level tricks like generating the JSON by hand or using complex data structures - data structures are still high-level, with readable field names and such. The logic and the intent are clearly readable.<br />
We just leveraged the pascal language, and <em>mORMot</em> features. For instance, string interning is part of the framework, if needed.</p>
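<p>String interning itself can be pictured with a few lines of RTL code. The sketch below assumes a simple sorted <code>TStringList</code> as pool, which is certainly not how the framework implements it (a hash table would be the usual choice), and <code>Intern</code> is a hypothetical name:</p>
<pre>
program InternSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

uses
  SysUtils, Classes;

var
  pool: TStringList; // sorted list of all unique strings seen so far

// return the pooled instance of s, so that equal texts share one heap string
function Intern(const s: string): string;
var
  i: integer;
begin
  if pool.Find(s, i) then
    result := pool[i] // reuse: only the reference count is incremented
  else
  begin
    pool.Add(s);
    result := s;
  end;
end;

var
  a, b: string;
begin
  pool := TStringList.Create;
  try
    pool.CaseSensitive := true; // 'A' and 'a' must stay distinct
    pool.Sorted := true;        // Find() requires a sorted list
    // build the texts at runtime, so that they start as distinct instances
    a := Intern('route-' + IntToStr(42));
    b := Intern('route-' + IntToStr(42));
    writeln(pointer(a) = pointer(b)); // TRUE: both share the same memory
  finally
    pool.Free;
  end;
end.
</pre>
<p>With millions of CSV rows repeating the same identifiers, each distinct text is then allocated only once.</p>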
<p>Please <a href="https://github.com/synopse/mORMot2/tree/master/ex/lang-cmp/LangCmp.dpr">check the source code in our repository</a>.</p>
<p>As a result:</p>
<ul>
<li>Code is still readable, short and efficient (most of the process is done by <em>mORMot</em>, i.e. CSV, searching, JSON);</li>
<li>It uses much less memory - 10 times less memory than Go when holding the data, 5 times less memory than Go when serving the data;</li>
<li>Performance is on par with Go and its highly tuned/optimized compiler and RTL.</li>
</ul>
<p><img src="https://blog.synopse.info?post/public/blog/mORMot2-small.png" alt="" /></p>
<h4>Algorithms Matter</h4>
<p>The main idea was to let the algorithms match the input data and the expected resultset.<br />
As programmers do when programming games. Not as coders do when pissing out business software. <img src="https://blog.synopse.info?pf=wink.svg" alt=";-)" class="smiley" /></p>
<ul>
<li>The source code is still pretty readable, thanks to using <em>mORMot</em> efficient <code>TDynArray</code> to map the dynamic array storage, and its CSV and JSON abilities.</li>
<li>I guess source is still understandable for out-of-school programmers - much more readable than Rust for instance.</li>
</ul>
<p>To be fair, I used typed pointers in <code>TScheduler.BuildTripResponse</code> but it is not so hard to get their purpose, and FPC compiles this function into very efficient assembly. I could have used regular dynamic array access with indexes: it would have been slightly slower, but not really easier to follow, nor safer (if we compile with no range checking).</p>
<p>Worth noting that we did not make any specific tuning, like pre-allocating the results with constants, as other frameworks did. We just specified the data, then let <em>mORMot</em> play with it - that's all.<br />
The <em>mORMot</em> RTTI level matches what we expect for modern frameworks: not only some classes to store JSON, but convenient serialization/unserialization using structures like class or record.<br />
Using modern Pascal dynamic arrays and records to define the data structures lets the compiler manage the memory for us, with no need to write any <code>try..finally..Free</code> blocks, and use interfaces. "Manual memory management" with Pascal is not mandatory and can easily be bypassed. Only for the WebServer, we have a <code>Free</code>, which is expected to close it.</p>
<h4>Give Me Some Numbers</h4>
<p>Here is a performance comparison with Go (FPC on the left, Go on the right):</p>
<pre>
parsed 1790905 stop times in 968.43ms | parsed 1790905 stop times in 3.245251432s
parsed 71091 trips in 39.54ms | parsed 71091 trips in 85.747852ms
running (0m33.4s), 00/50 VUs, 348 complete and 0 interrupted | running (0m32.3s), 00/50 VUs, 320 complete and 0 interrupted
default ✓ [======================================] 50 VUs 30 default ✓ [======================================] 50 VUs 30
data_received..................: 31 GB 933 MB/s | data_received..................: 31 GB 971 MB/s
data_sent......................: 3.2 MB 97 kB/s | data_sent......................: 3.0 MB 92 kB/s
http_req_blocked...............: avg=9µs min=1.09µs | http_req_blocked...............: avg=6.77µs min=1.09µs
http_req_connecting............: avg=2.95µs min=0s | http_req_connecting............: avg=1.73µs min=0s
http_req_duration..............: avg=47.59ms min=97.28µs | http_req_duration..............: avg=49.02ms min=123.81µ
{ expected_response:true }...: avg=47.59ms min=97.28µs | { expected_response:true }...: avg=49.02ms min=123.81µ
http_req_failed................: 0.00% ✓ 0 ✗ | http_req_failed................: 0.00% ✓ 0 ✗ 3
http_req_receiving.............: avg=9.66ms min=15.35µs | http_req_receiving.............: avg=5.92ms min=14.76µs
http_req_sending...............: avg=87.24µs min=5.2µs | http_req_sending...............: avg=70.71µs min=5.2µs
http_req_tls_handshaking.......: avg=0s min=0s | http_req_tls_handshaking.......: avg=0s min=0s
http_req_waiting...............: avg=37.83ms min=54.74µs | http_req_waiting...............: avg=43.02ms min=91.84µs
http_reqs......................: 34452 1032.205528/s | http_reqs......................: 31680 981.949476/s
iteration_duration.............: avg=4.72s min=3.54s | iteration_duration.............: avg=4.86s min=2.19s
iterations.....................: 348 10.426318/s | iterations.....................: 320 9.918682/s
vus............................: 30 min=30 ma | vus............................: 15 min=15 max
vus_max........................: 50 min=50 ma | vus_max........................: 50 min=50 max
</pre>
<p>So CSV loading was much faster, then the HTTP server performance was almost the same.</p>
<h4>No Alzheimer</h4>
<p>Here are some numbers about memory consumption:</p>
<blockquote><p>Upon finished loading the CSV, mORMot only eats 80MB, heck so little. Sounds a bit magical. But during load test, it fluctuates between 250-350MB, upon which it returns to 80MB at the end.
The Go version eats 925MB upon finished loading the CSV. During load test, it tops at 1.5GB, returning to 925MB afterwards.</p></blockquote>
<p>Nice to read. :)</p>
<h4>Pascal has a Modern and Capable Ecosystem</h4>
<p>This article was not only about Pascal, but about algorithms and libraries.<br />
The challenge was initially about comparing them. Not only as unrealistic micro-benchmarks, or "computer language benchmark games", but as data processing abilities on a real use case.</p>
<p><strong>And... Pascal is still in the race for sure!</strong><br />
Not only for "old" people like me - I just turned 50. ;-)</p>
<p>The more we spread this kind of information, the fewer people will make jokes about pascal programmers.<br />
Delphi and FPC are as old as Java, so it is time to get the big picture, not following marketing trends.</p>New Client for MongoDB 5.1/6 Supporturn:md5:41f28034cfacffa2bf7fde35e8e64b432022-08-12T20:12:00+01:002022-08-12T20:12:00+01:00Arnaud BouchezmORMot FrameworkCrossPlatformDatabaseDelphiFPCFreePascalJSONMongoDBmORMot2NoSQLODMORMperformanceRESTSource<p>Starting with its version 5.1, <em>MongoDB</em> disabled the legacy protocol used for communication since its beginning.<br />
As a consequence, our <em>mORMot</em> client was not able to communicate any more with the latest versions of <em>MongoDB</em> instances.</p>
<p><img src="https://blog.synopse.info?post/public/mongodb.png" alt="" /></p>
<p>Last week, we made a deep rewrite of <a href="https://github.com/synopse/mORMot2/blob/master/src/db/mormot.db.nosql.mongodb.pas">mormot.db.nosql.mongodb.pas</a>, which changed the default protocol to use the new layout on the wire. Now messages use regular <a href="https://www.mongodb.com/docs/current/reference/command/">MongoDB Database Commands</a>, with automated compression if needed.</p>
<p>No change is needed in your end-user <em>MongoDB</em> or ORM/ODM code. The upgrade is as simple as updating your <em>mORMot 2</em> source, then recompiling.</p> <h4>The Mongo Wire Protocol</h4>
<p>Since its beginning, <em>MongoDB</em> has used a simple protocol over TCP, via several binary opcodes and messages, for CRUD operations.</p>
<p>A new alternative protocol was <a href="https://emptysqua.re/blog/driver-features-for-mongodb-3-6/">introduced in version 3.6,</a> and the former protocol was marked as deprecated.<br />
Two new opcodes were introduced, OP_MSG and OP_COMPRESSED, to replace all other frames. They just encapsulate, with or without compression, some abstract BSON content.<br />
The official documentation details those changes <a href="https://www.mongodb.com/docs/manual/reference/mongodb-wire-protocol/">in this web page</a>.</p>
<p>In short (picture extracted from the blog above), the protocol went from this:</p>
<p><img src="https://blog.synopse.info?post/public/ye-olde-wire-protocol.png" alt="" /></p>
<p>to this:</p>
<p><img src="https://blog.synopse.info?post/public/op-msg.png" alt="" /></p>
<p>The main benefit is that the commands and answers are just conventional BSON, so the protocol can change at logical/BSON/JSON level by adding or changing some members, with no need of dealing with low-level binary structures.</p>
<p>With the version 5.1 of <em>MongoDB</em>, the previous protocol was not just deprecated, but disabled.<br />
So we had to update the <em>mORMot 2</em> client code! (yes, the <em>mORMot 1</em> code has not been updated - it may become a good reason to upgrade)</p>
<h4>Deep Rewrite</h4>
<p>In fact, the official MongoDB documentation is somewhat vague. And the official drivers are a bit difficult to reverse-engineer, due to the verbose nature of C, Java or C#. The native/node driver was easiest to dissect, and we used it as reference.<br />
Luckily enough, there is a <a href="https://github.com/mongodb/specifications/blob/master/source/message/OP_MSG.rst">specification document available too</a>, which offers some additional valuable clarifications.</p>
<p>After some testing, we managed to replace the previous OP_QUERY and its siblings with the new OP_MSG frame, which is, as documented in the specification, "One opcode to rule them all". <img src="https://blog.synopse.info?pf=wink.svg" alt=";)" class="smiley" /></p>
<p>Once we had the commands working, we needed to rewrite all CRUD operations using commands, and not opcodes.<br />
Queries are now made with <code><a href="https://www.mongodb.com/docs/current/reference/command/find/">find</a></code> and <code><a href="https://www.mongodb.com/docs/current/reference/command/aggregate/">aggregate</a></code> commands. Their results are now located in a <code>"cursor": {"firstBatch": ...}</code> BSON array within the response. And a new <code><a href="https://www.mongodb.com/docs/current/reference/command/getMore/">getMore</a></code> command is to be used to retrieve the next values within a <code>"cursor": {"nextBatch": ...}</code> resultset.<br />
For writing, <code><a href="https://www.mongodb.com/docs/current/reference/command/insert">insert</a></code>, <code><a href="https://www.mongodb.com/docs/current/reference/command/update">update</a></code> and <code><a href="https://www.mongodb.com/docs/current/reference/command/delete">delete</a></code> commands are called, with their appropriate BSON content.</p>
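<p>The query flow described above can be sketched as follows, in Python for illustration only. The <code>run_command</code> callable is a hypothetical stand-in for "send one command, get the reply document back", and <code>make_fake_server</code> is an in-memory fake that ignores the filter, so that the sketch runs without a database. The reply fields (<code>cursor.id</code>, <code>firstBatch</code>, <code>nextBatch</code>) follow the <em>MongoDB</em> command documentation, where a cursor id of 0 means that no more data is pending.</p>

```python
def iterate_cursor(run_command, collection, query, batch_size=2):
    """Yield every document of a query: one find, then getMore until the cursor is exhausted."""
    reply = run_command({"find": collection, "filter": query, "batchSize": batch_size})
    cursor = reply["cursor"]
    yield from cursor["firstBatch"]
    while cursor["id"] != 0:  # 0 means the server has closed the cursor
        reply = run_command({"getMore": cursor["id"], "collection": collection,
                             "batchSize": batch_size})
        cursor = reply["cursor"]
        yield from cursor["nextBatch"]


def make_fake_server(documents):
    """Hypothetical in-memory server: returns the documents batch by batch."""
    state = {"pos": 0}

    def run_command(command):
        size = command["batchSize"]
        batch = documents[state["pos"]:state["pos"] + size]
        state["pos"] += len(batch)
        cursor_id = 0 if state["pos"] >= len(documents) else 42
        key = "firstBatch" if "find" in command else "nextBatch"
        return {"cursor": {"id": cursor_id, key: batch}, "ok": 1}

    return run_command


rows = list(iterate_cursor(make_fake_server([{"n": i} for i in range(5)]), "coll", {}))
print(rows)
```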
<p>During the refactoring, we optimized the BSON processing, and also improved the logs and the execution efficiency. The <em>mORMot</em> client side should not be a bottleneck. And it is not, even with this NoSQL database.</p>
<p>Don't expect any performance enhancement, or new features. It is just some low-level protocol change at TCP level.<br />
But if you used the "non acknowledged write mode" of the former protocol, which was unsafe but very fast, you will get lower performance with the new protocol, because it always acknowledges the commands it receives. This slowdown only affects such very specific configurations.</p>
<h4>Backward Compatibility</h4>
<p>All those changes were encapsulated in our revised <a href="https://github.com/synopse/mORMot2/blob/master/src/db/mormot.db.nosql.mongodb.pas">mormot.db.nosql.mongodb.pas</a> unit.</p>
<p>If you have a very old <em>MongoDB</em> instance, and don't want to upgrade, you can just compile your project with the <code>MONGO_OLDPROTOCOL</code> conditional, to use the deprecated opcodes.<br />
Even if the <em>MongoDB</em> team does not care much about backward compatibility (they could have kept the previous protocol for sure, since they still maintain it for the handshake message if needed), we do care about not breaking too many things with <em>mORMot</em>, so we kept the previous code, and tested/validated it too, for legacy systems.</p>
<h4>New Sample</h4>
<p>We translated the <em>MongoDB</em> benchmark sample and introduced it into the <em>mORMot 2</em> code base.</p>
<p>You can find it, and run it, from <a href="https://github.com/synopse/mORMot2/tree/master/ex/mongodb">our source code repository</a>.</p>
<p>This code is a good entry point for what is possible with this unit in our framework, for both direct access and ORM/ODM access.<br />
And it lets you estimate the performance numbers you may achieve with your project.</p>
<p>Running a <em>MongoDB</em> database in a container is as easy as executing the following command:</p>
<pre>
sudo docker run --name mongodb -d -p 27017:27017 mongo:latest
</pre>
<p>Then you will have a <em>MongoDB</em> server instance accessible on <code>localhost:27017</code>, so you could run the sample straight away.</p>
<h4>Delphi/FPC Open Source Rocks</h4>
<p>We hope you will find the change painless and transparent. We did not modify the high-level client methods, nor break the ORM/ODM: you can still write some complex SELECT statements, and our ORM will translate them into <em>MongoDB</em> aggregate commands.</p>
<p>To my knowledge, there is <a href="https://github.com/stijnsanders/TMongoWire/commit/7f12a64f571e476704bdcb737e1fc087ef792f59">only a single other Delphi/FPC client library</a> which has made the upgrade to the new protocol, as of today. Once we made our own changes, we notified other library authors, and Stijn very quickly made the needed changes. Congrats! Maybe our code could be used as a reference by other library maintainers, because the protocol needs some small tweaks sometimes.<br />
It is important that the libraries you use are maintained. And our little <em>mORMot</em> is still on the cutting edge: thanks to FPC, it runs very well on Linux and BSD, which makes it perfect for professional services running in the long term! :)</p>
<p>Your feedback is welcome <a href="https://synopse.info/forum/viewtopic.php?id=6318">in the forum thread which initiated these modifications</a>, as usual!<br />
Don't hesitate to notify us of any missing or broken feature.<br />
Thanks Daniel for your report and support!</p>
<h2>Native TLS Support for mORMot 2 REST or WebSockets Servers</h2>
<p>Published 2022-07-09T11:08:00+01:00, updated 2022-07-10T06:39:19+01:00, by Arnaud Bouchez. Category: mORMot Framework. Tags: Asymmetric, FreePascal, HTTP, http.sys, HTTPS, Linux, mORMot, mORMot2, OpenSource, OpenSSL, performance, PublicKey, REST, SChannel, TLS, WebSockets.</p>
<p>Since the beginning, we delegated the TLS encryption support to a reverse proxy server, mainly <a href="https://nginx.org">Nginx</a>. Under Windows, you could set up the http.sys HTTPS layer as usual, as a native, if a bit complicated, solution.<br />
Nginx has several advantages, the first being a proven and efficient technology, with plenty of documentation and configuration tips. It interfaces nicely with Let's Encrypt, and is very good for any regular website, using static content and PHP. This very blog and the <a href="https://synopse.info">Synopse</a> web site are hosted via Nginx on a small Linux server.</p>
<p><img src="https://blog.synopse.info?post/public/blog/TlsServer.png" alt="" /></p>
<p>But in mORMot 2, we introduced a new set of <a href="https://blog.synopse.info/?post/2022/05/21/New-Async-HTTP/WebSocket-Server-on-mORMot-2">asynchronous web server classes</a>. So stability and performance are not a problem any more. Some benchmarks even consider this server to <a href="https://synopse.info/forum/viewtopic.php?pid=36546#p36546">be faster than nginx</a> (the stability issue mentioned in this post has been fixed in-between).<br />
We just introduced TLS support to our socket-based servers, for both the blocking and asynchronous classes. It uses OpenSSL if available, or the SChannel API layer of Windows. Serving HTTPS or WSS with a self-signed certificate is just a matter of a single parameter now, and performance seems pretty good, especially with OpenSSL.</p>
<h4>From HTTP to HTTPS</h4>
<p>Here is how you publish a <code>TRestServer</code> instance over HTTP, on port 8888, and with 16 threads for the thread pool:</p>
<pre>
Server := TRestHttpServer.Create([RestServer], '8888', 16);
</pre>
<p>Note the new constructor, easier to use than before, if you just want the default asynchronous server.</p>
<p>And to publish over HTTPS, on the very same port, with a self-signed certificate:</p>
<pre>
Server := TRestHttpServer.Create([RestServer], '8888', 16, secTLSSelfSigned);
</pre>
<p>For a <em>mORMot</em> client, you should also specify that you expect TLS support, and ignore the fact that this self-signed certificate is unknown by the system:</p>
<pre>
Client := TRestHttpClientSocket.Create('127.0.0.1', '8888', Model, {https=}true);
Client.IgnoreTlsCertificateErrors := true;
</pre>
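<p>For readers less familiar with what "ignoring certificate errors" actually disables, here is a small illustration using the Python standard <code>ssl</code> module. It is not the <em>mORMot</em> implementation, and the two helper names are hypothetical. The relaxed context still encrypts the traffic, but no longer authenticates the server, so it only makes sense against a self-signed test server such as the one above.</p>

```python
import ssl


def insecure_client_context():
    """TLS context accepting any certificate: encryption without server authentication."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before relaxing verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # accept self-signed or unknown certificates
    return ctx


def strict_client_context(ca_file=None):
    """Default behaviour: the server certificate must chain to a trusted authority."""
    return ssl.create_default_context(cafile=ca_file)


relaxed = insecure_client_context()
strict = strict_client_context()
print(relaxed.verify_mode, strict.verify_mode)
```

<p>In production, a safer option is to keep verification enabled and add your own authority or self-signed certificate to the trusted store, here through the assumed <code>ca_file</code> parameter.</p>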
<p>Then, at least with OpenSSL, you could serve TLS 1.3 content from now on, with a safe cipher negotiation by default (which could be tuned if needed).<br />
Nice and easy!</p>
<h4>Give Me the Keys</h4>
<p>And if you generated your own public/private key pair, you can specify it:</p>
<pre>
Server := TRestHttpServer.Create([RestServer], '8888', 16, secTLS,
HTTPSERVER_DEFAULT_OPTIONS, 'mycert.pem', 'mypriv.pem', 'privpassw');
</pre>
<p>And don't forget to keep your private key... private. :)</p>
<h4>Keys Rule</h4>
<p>Where do those certificates come from? Do I need to read endless and complex OpenSSL command line samples, and mess with files and passwords?<br />
Our framework makes it easy. You can now use the new <em>mORMot 2</em> high-level cryptography interfaces to generate the keys you want in simple pascal code.</p>
<p>Here is how the framework generates the self-signed server certificate on OpenSSL, or uses a pre-computed one for SChannel:</p>
<pre>
procedure InitNetTlsContextSelfSignedServer(var TLS: TNetTlsContext;
  Algo: TCryptAsymAlgo);
var
  cert: ICryptCert;
  certfile, keyfile: TFileName;
  keypass: RawUtf8;
begin
  certfile := TemporaryFileName;
  if CryptCertAlgoOpenSsl[Algo] = nil then
  begin
    FileFromString(PrivKeyCertPfx, certfile); // use pre-computed key
    keypass := 'pass';
  end
  else
  begin
    keyfile := TemporaryFileName;
    keypass := CardinalToHexLower(Random32);
    cert := CryptCertAlgoOpenSsl[Algo].
              Generate(CU_TLS_SERVER, '127.0.0.1', {authority=}nil, 3650);
    cert.SaveToFile(certfile, cccCertOnly, '', ccfPem);
    cert.SaveToFile(keyfile, cccPrivateKeyOnly, keypass, ccfPem);
    //writeln(BinToSource('PRIVKEY_PFX', '',
    //  cert.Save(cccCertWithPrivateKey, 'pass', ccfBinary)));
  end;
  InitNetTlsContext(TLS, {server=}true, certfile, keyfile, keypass);
end;
</pre>
<p>As you can see, the <code>ICryptCert</code> interface is very simple to use, and hides all the complexity of X509 and OpenSSL. We provided <code>nil</code> as authority, but you could specify an <code>ICryptCert</code> instance to sign your certificate, if needed.<br />
In the commented-out lines of the above source, you can see how to export the key pair as a PKCS#12 certificate, ready to be used for SChannel.</p>
<h4>TLS Everywhere</h4>
<p>Offering TLS as part of your software solution could be a game-changer for serious business, even over a corporate network, with self-signed certificates. It would help your IT and management people trust your <em>mORMot</em> / pascal solution in a heterogeneous and complex mesh of services. Modern object pascal is still on track for the next decades! :)</p>
<p>Feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6291">welcome on our forum</a>, as usual. :)</p>
<h2>New Async HTTP/WebSocket Server on mORMot 2</h2>
<p>Published 2022-05-21T13:35:00+01:00, updated 2022-05-21T18:05:45+01:00, by Arnaud Bouchez. Category: mORMot Framework. Tags: Delphi, FPC, fpcx64mm, HTTP, http.sys, HTTPS, Linux, mORMot, mORMot2, multithread, performance, REST, Rest, security, SOA.</p>
<p>The HTTP server is one main part of any SOA/REST service, by design.<br />
It is the main entry point of all incoming requests. So it had better be stable and efficient. And it should be able to scale in the future, if needed.</p>
<p><img src="https://blog.synopse.info?post/public/blog/server.jpg" alt="" /></p>
<p>There have always been several HTTP servers in <em>mORMot</em>. You can use the HTTP server class you need.<br />
In <em>mORMot</em> 2, we added two new server classes, one for publishing over HTTP, another able to upgrade to WebSockets. The main difference is that they are fully event-driven, so their thread pool is able to scale with thousands of concurrent connections, with a fixed number of threads. They are a response to the limitations of our previous socket server.</p> <h4>HTTP Is Not REST</h4>
<p>In <em>mORMot</em>, the HTTP server is not the same thing as the REST server. They are two concepts, with distinct units, classes and even... source code folders.<br />
The HTTP server can publish its own process, using a callback, or can leverage one or several REST servers. Or the REST server could run with no communication at all, e.g. within the service process and the current thread, executing ORM or SOA requests without any HTTP or WebSockets involved. No difference in the user code: just some interface methods to call, and they will return their answer, whether it is over HTTP, locally or remotely, on the server side or the client side, in a thread pool or in-process... pascal code just runs its magic for you.</p>
<p>For instance, take a look at <a href="https://github.com/synopse/mORMot2/blob/master/ex/http-server-raw/httpServerRaw.dpr">httpServerRaw</a> as a low-level HTTP server sample, using a callback for every incoming request.<br />
Here you don't have any automatic routing: you just parse the input URI, then return the proper HTTP response, with its status code, headers and body.<br />
Those low-level HTTP servers are implemented in the <em>server</em> units of folder <a href="https://github.com/synopse/mORMot2/tree/master/src/net">src/net</a> - as we will detail below.</p>
<p>Then, the REST process is abstracted from HTTP. Even if both HTTP and REST historically share the <a href="https://en.wikipedia.org/wiki/Roy_Fielding">same father/author/initiator</a>, you could have a REST approach without HTTP. For instance, you could use WebSockets, or direct in-process call.<br />
So in <em>mORMot</em>, the REST process is implemented in the <em>server</em> units of folder <a href="https://github.com/synopse/mORMot2/tree/master/src/rest">src/rest</a>, abstracted from any communication protocol. It does not know nor assume anything about TCP, WebSockets or TLS. It only knows about URI, headers and body texts, and follows the routing as defined.</p>
<p>Two folders, two uncoupled feature sets. Perhaps a bit confusing when you first discover it. But it is for the best maintainability and code design.</p>
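<p>The decoupling can be pictured with a short sketch, written in Python for brevity. The <code>RestRouter</code> class and the <code>http_adapter</code> function are hypothetical names, not the framework classes: the router only sees a method, a URI and a body, and the HTTP layer is a thin adapter which could be replaced by WebSockets or by a direct in-process call.</p>

```python
class RestRouter:
    """Transport-agnostic REST core: knows about method, URI and body only."""

    def __init__(self):
        self._routes = {}

    def register(self, method, uri, handler):
        self._routes[(method, uri)] = handler

    def execute(self, method, uri, body=""):
        handler = self._routes.get((method, uri))
        if handler is None:
            return 404, "not found"
        return 200, handler(body)


REASONS = {200: "OK", 404: "Not Found"}


def http_adapter(router, raw_request):
    """Parse a minimal HTTP/1.x request text, then delegate to the router."""
    head, _, body = raw_request.partition("\r\n\r\n")
    method, uri, _version = head.split("\r\n")[0].split(" ")
    status, answer = router.execute(method, uri, body)
    size = len(answer.encode("utf-8"))
    return f"HTTP/1.1 {status} {REASONS[status]}\r\nContent-Length: {size}\r\n\r\n{answer}"


router = RestRouter()
router.register("GET", "/hello", lambda body: "hello world")

in_process = router.execute("GET", "/hello")  # no network involved
over_http = http_adapter(router, "GET /hello HTTP/1.1\r\nHost: example\r\n\r\n")
print(in_process, over_http.splitlines()[0])
```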
<h4>Several Servers To Rule Them All</h4>
<p><em>mORMot</em> 2 adopted all HTTP server classes from the <em>mORMot</em> 1 source code. Then it added some new "asynchronous" servers.<br />
They all inherit from a <code>THttpServerGeneric</code> parent class, so you can follow the Liskov Substitution Principle, and change the class at runtime or at compile time, as needed, without altering your actual logic.</p>
<p>HTTP servers are implemented in several units:</p>
<ul>
<li><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.server.pas">mormot.net.server.pas</a> offers the <code>THttpServerSocket</code>/<code>THttpServer</code> HTTP/1.1 server, the <code>THttpApiServer</code> HTTP/1.1 server over the Windows http.sys module, and <code>THttpApiWebSocketServer</code> over the Windows http.sys module;</li>
<li><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.ws.server.pas">mormot.net.ws.server.pas</a> offers the <code>TWebSocketServerRest</code> server, which uses WebSockets as a means of transmission, but enables a REST-like blocking request/answer protocol on top of it, with optional bi-directional notifications, using a one-thread-per-connection server;</li>
<li><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.async.pas">mormot.net.async.pas</a> offers the new <code>THttpAsyncServer</code> event-driven HTTP server;</li>
<li><a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.ws.async.pas">mormot.net.ws.async.pas</a> offers the new <code>TWebSocketAsyncServerRest</code> server, which uses WebSockets as a means of transmission, but enables a REST-like blocking request/answer protocol on top of it, with optional bi-directional notifications, using an event-driven server.</li>
</ul>
<p>On Windows, the <a href="https://docs.microsoft.com/en-us/iis/get-started/introduction-to-iis/introduction-to-iis-architecture#hypertext-transfer-protocol-stack-httpsys">http.sys</a> module gives you very good stability, and uses the same Windows-centric way of publishing servers as used by IIS and DotNet. You could even share the same port between several services, if needed.</p>
<p>Our socket-based servers are cross-platform, and compile and run on both Windows and POSIX (Linux, BSD, macOS). They use a thread pool for HTTP/1.0 short-lived requests, and one thread per connection on HTTP/1.1. So they are meant to be used behind a reverse proxy like nginx, which could transmit over HTTP/1.0 with <em>mORMot</em>, but keep efficient HTTP/1.1 or HTTP/2.0 to communicate with the clients.</p>
<p>Both <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.async.pas">mormot.net.async.pas</a> and <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.ws.async.pas">mormot.net.ws.async.pas</a> are new to <em>mORMot</em> 2. They use an event-driven model, i.e. the opened connections are tracked using a fast API (like epoll on Linux), and the thread pool is used only when there is actually new data pending.</p>
<h4>Events Forever</h4>
<p>Asynchronous socket access and event loops are the key to the best server scalability. Compared to our regular <code>THttpServerSocket</code> class, which uses one thread per HTTP/1.1 or WebSockets connection, our asynchronous classes (e.g. <code>THttpAsyncServer</code>) can have thousands of concurrent clients, with minimal CPU and RAM resource consumption.</p>
<p>Here is a typical event-driven socket access:</p>
<p><img src="https://blog.synopse.info?post/public/blog/epoll.png" alt="" /></p>
<p>In the <em>mORMot</em> 2 network core, i.e. in unit <a href="https://github.com/synopse/mORMot2/blob/master/src/net/mormot.net.sock.pas">mormot.net.sock.pas</a>, we define an abstract event-driven class:</p>
<pre>
/// implements efficient polling of multiple sockets
// - will maintain a pool of TPollSocketAbstract instances, to monitor
// incoming data or outgoing availability for a set of active connections
// - call Subscribe/Unsubscribe to setup the monitored sockets
// - call GetOne from a main thread, optionally GetOnePending from sub-threads
TPollSockets = class(TPollAbstract)
  ...
  /// initialize the sockets polling
  constructor Create(aPollClass: TPollSocketClass = nil);
  /// finalize the sockets polling, and release all used memory
  destructor Destroy; override;
  /// track modifications on one specified TSocket and tag
  function Subscribe(socket: TNetSocket; events: TPollSocketEvents;
    tag: TPollSocketTag): boolean; override;
  /// stop status modifications tracking on one specified TSocket and tag
  procedure Unsubscribe(socket: TNetSocket; tag: TPollSocketTag); virtual;
  /// retrieve the next pending notification, or let the poll wait for new
  function GetOne(timeoutMS: integer; const call: RawUtf8;
    out notif: TPollSocketResult): boolean; virtual;
  /// retrieve the next pending notification
  function GetOnePending(out notif: TPollSocketResult; const call: RawUtf8): boolean;
  /// let the poll check for pending events and append them to fPending results
  function PollForPendingEvents(timeoutMS: integer): integer; virtual;
  /// manually append one event to the pending notifications
  procedure AddOnePending(aTag: TPollSocketTag; aEvents: TPollSocketEvents;
    aNoSearch: boolean);
  /// notify any GetOne waiting method to stop its polling loop
  procedure Terminate; override;
  ...
</pre>
<p>Depending on the operating system, it will mimic the <a href="https://man7.org/linux/man-pages/man7/epoll.7.html">epoll api</a> with the underlying low-level system calls.</p>
<p>The events are very abstract, and are in fact just the basic R/W operations on each connection, associated with a "tag", which is likely to be a class pointer associated with a socket/connection:</p>
<pre>
TPollSocketEvent = (
  pseRead,
  pseWrite,
  pseError,
  pseClosed);
TPollSocketEvents = set of TPollSocketEvent;
TPollSocketResult = record
  tag: TPollSocketTag;
  events: TPollSocketEvents;
end;
TPollSocketResults = record
  Events: array of TPollSocketResult;
  Count: PtrInt;
end;
</pre>
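<p>The subscribe / poll / tag pattern is not specific to pascal. As a rough analogy only, here is how it looks with the Python standard <code>selectors</code> module, which picks epoll, kqueue or select depending on the operating system. The <code>poll_once</code> helper is a hypothetical name, and the "tag" is simply the <code>data</code> value attached at registration time.</p>

```python
import selectors
import socket


def poll_once(selector, timeout=1.0):
    """Return (tag, events) pairs for the sockets with pending activity."""
    results = []
    for key, mask in selector.select(timeout):
        events = set()
        if mask & selectors.EVENT_READ:
            events.add("read")
        if mask & selectors.EVENT_WRITE:
            events.add("write")
        results.append((key.data, events))  # key.data is the tag given at registration
    return results


selector = selectors.DefaultSelector()
left, right = socket.socketpair()  # two connected sockets, no real network needed

# "Subscribe": track read events on one socket, tagged with its connection name
selector.register(right, selectors.EVENT_READ, data="connection-1")

idle = poll_once(selector, timeout=0)     # nothing was sent yet
left.sendall(b"ping")
ready = poll_once(selector, timeout=1.0)  # the tagged connection is now readable
payload = right.recv(16)

selector.unregister(right)                # "Unsubscribe"
selector.close()
left.close()
right.close()
print(idle, ready, payload)
```

<p>No thread is blocked on a given connection: a single poll call watches all registered sockets, and work is only scheduled for the tags which have something pending.</p>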
<p>Then entirely new HTTP/1.0, HTTP/1.1 and WebSockets stacks were written on top of those basic socket-driven events. Instead of blocking threads, they use internal state machines, which are much lighter than a thread, and even lighter than a coroutine/goroutine. Each connection is just a class instance, which maintains the state of each client/server communication, and accesses its own socket.<br />
The <em>mORMot</em> asynchronous TCP server has by default one thread to accept the connections, one thread to poll for pending events (calling the <code>GetOne</code> method), then a dedicated number of threads to consume the Read/Write/Close events (via the <code>GetOnePending</code> method). We used as many non-blocking structures as possible, we minimized memory allocation by reusing the same buffers e.g. for the headers or small responses, and we picked the <a href="https://blog.synopse.info?post/2022/05/21/New-Async-HTTP/post/2022/01/22/Three-Locks-To-Rule-Them-All">best locks possible</a> in each case, so that this server scales nicely and smoothly. And it is simple to use, because handling a new protocol is as easy as inheriting and writing a new connection class.</p>
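<p>To illustrate the "one state machine per connection" idea, here is a simplified sketch in Python, not taken from the framework. The hypothetical <code>HttpRequestState</code> class is fed with whatever bytes a read event delivers, remembers where it stopped, and reports when a full request is available. It assumes a request with an optional <code>Content-Length</code> body and ignores chunked encoding.</p>

```python
class HttpRequestState:
    """Incremental HTTP request parser: one small object per connection, no blocked thread."""

    def __init__(self):
        self.state = "headers"
        self.buffer = b""
        self.request_line = ""
        self.headers = {}
        self.body = b""
        self._expected = 0

    def feed(self, chunk):
        """Consume bytes as they arrive; return True once the request is complete."""
        self.buffer += chunk
        if self.state == "headers" and b"\r\n\r\n" in self.buffer:
            head, _, self.buffer = self.buffer.partition(b"\r\n\r\n")
            lines = head.decode("latin-1").split("\r\n")
            self.request_line = lines[0]
            for line in lines[1:]:
                name, _, value = line.partition(":")
                self.headers[name.strip().lower()] = value.strip()
            self._expected = int(self.headers.get("content-length", "0"))
            self.state = "body"
        if self.state == "body" and len(self.buffer) >= self._expected:
            self.body = self.buffer[:self._expected]
            self.state = "done"
        return self.state == "done"


connection = HttpRequestState()
# the request arrives in three arbitrary fragments, as it could over a real socket
steps = [connection.feed(part) for part in
         (b"POST /echo HT", b"TP/1.1\r\nContent-Length: 5\r\n\r\nhe", b"llo")]
print(steps, connection.request_line, connection.body)
```

<p>Between two fragments, the only cost of an idle connection is this small object, which is why thousands of them can be kept open with a fixed number of threads.</p>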
<p>Our asynchronous server classes now seem stable, and fast (reported to be twice as fast as nginx and six times faster than nodejs!).<br />
But of course, as with any new complex code, there may be caveats. And some of them have already been identified and fixed - as <a href="https://synopse.info/forum/viewtopic.php?pid=36546#p36546">reported in our forum</a>. Therefore, feedback is welcome, and a nginx, haproxy or caddy reverse proxy frontend is always a good idea in production.</p>
<p>Writing those servers took more time than expected, and was sometimes painful, because debugging multi-threaded processes is not easy, especially on several operating systems. There are some subtle differences between the OS, which could lead to unexpected blocking or degraded performance. But we are proud of the result, which compares to the best-in-class servers. Still in modern pascal code, and Open Source software.</p>
<p>Don't hesitate to take a look at the source, and try some samples.<br />
Feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6253">welcome in our forum</a>, as usual.</p>
<h2>mORMot 2 ORM Performance</h2>
<p>Published 2022-02-15T13:24:00+00:00, updated 2022-02-15T13:24:00+00:00, by Arnaud Bouchez. Category: mORMot Framework. Tags: 64bit, AES-CTR, asm, blog, Database, Delphi, FPC, fpcx64mm, FreePascal, JSON, Microservices, mORMot, mORMot2, ORM, performance, SQL, SQLite3.</p>
<p>The official release of <em>mORMot 2</em> is around the corner.
It may be the occasion to show some data persistence performance numbers, compared to <em>mORMot 1</em>.</p>
<p><img src="https://blog.synopse.info?post/public/blog/marmotrunningsnow.jpg" alt="" /></p>
<p>For version 2 of our framework, the ORM feature has been enhanced and tuned in several aspects: REST routing optimization, ORM/JSON serialization, and in-memory and SQL engines tuning.
The numbers speak for themselves. You could compare with any other solution, and compile and run the tests by yourself for both frameworks, and see how it goes on your own computer or server.<br />
In a nutshell, we <em>almost reach 1 million inserts per second on SQLite3</em>, and are above the million inserts in our in-memory engine. Reading speed is 1.2 million and 1.7 million respectively. From the object to the storage, and back. And forcing AES-CTR encryption on disk changes almost nothing. Now we are talking. <img src="https://blog.synopse.info?pf=wink.svg" alt=";)" class="smiley" /></p>
<h3>Platform Used</h3>
<p>Those numbers were taken from the "external database" sample, which is available on both versions of the framework.<br />
This is the very same benchmark as used in previous benchmarks on this blog or our documentation. So you could compare the numbers.</p>
<p>But we ran the tests on a new computer, featuring an Intel(R) Core(TM) i5-7300U CPU, from a good old Thinkpad T470 notebook, with a SATA SSD.<br />
So on more modern hardware, like a high-end AMD or Xeon server, the million inserts is easily passed. And even on a slow VM, you will get pretty good numbers.</p>
<p>We ran the tests on Linux x86_64 (Debian 11), compiled with FPC 3.2 stable - which is our target platform for performance.<br />
In fact, performance matters mainly on the server side - clients are usually fast enough to run whatever process they need, and the ORM/database/persistence process is likely to be located on the server side. So in <em>mORMot 2</em>, we focused on a x86_64 Linux server for performance, which is the cheapest and safest solution around.</p>
<p>Both frameworks used our <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.fpcx64mm.pas">in-memory heap manager for FPC</a>, written in x86_64 assembly. It has a noticeable performance benefit, especially on multi-threaded processes (not shown here, but during the main regression tests).</p>
<h3>Insertion Speed</h3>
<p>Here are the <em>mORMot 2</em> insertion numbers:</p>
<p>Running tests using Synopse mORMot framework 2.0.1, compiled with Free Pascal 3.2 64 bit, against SQLite 3.37.2, on Debian GNU/Linux 11 (bullseye) - Linux 5.10.0-10-amd64, at 2022-02-14 21:09:37.</p>
<p><table><tbody><tr align="center"><td> </td><td><strong>Direct</strong></td><td><strong>Batch</strong></td><td><strong>Trans</strong></td><td><strong>Batch Trans</strong></td></tr>
<tr align="center"><td><strong>Sqlite file full</strong></td><td>98</td><td>5908</td><td>74089</td><td>242072</td></tr>
<tr align="center"><td><strong>Sqlite file off</strong></td><td>13534</td><td>474428</td><td>151315</td><td>919624</td></tr>
<tr align="center"><td><strong>Sqlite file off exc</strong></td><td>42961</td><td>691037</td><td>153374</td><td>929281</td></tr>
<tr align="center"><td><strong>Sqlite file off exc aes</strong></td><td>26882</td><td>533788</td><td>152795</td><td>874814</td></tr>
<tr align="center"><td><strong>Sqlite in memory</strong></td><td>114664</td><td>969743</td><td>152190</td><td>972478</td></tr>
<tr align="center"><td><strong>In memory static</strong></td><td>411895</td><td>1086956</td><td>428724</td><td>1301236</td></tr>
<tr align="center"><td><strong>In memory virtual</strong></td><td>385445</td><td>1200480</td><td>412762</td><td>1219660</td></tr>
<tr align="center"><td><strong>External sqlite file full</strong></td><td>107</td><td>5957</td><td>83531</td><td>111043</td></tr>
<tr align="center"><td><strong>External sqlite file off</strong></td><td>16509</td><td>291528</td><td>151890</td><td>390502</td></tr>
<tr align="center"><td><strong>External sqlite file off exc</strong></td><td>58922</td><td>354924</td><td>150179</td><td>392649</td></tr>
<tr align="center"><td><strong>External sqlite in memory</strong></td><td>114476</td><td>991080</td><td>154564</td><td>991375</td></tr>
<tr align="center"><td><strong>Remote sqlite socket</strong></td><td>19789</td><td>155763</td><td>19439</td><td>236406</td></tr>
</tbody></table></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Insertion+speed+%28rows%2Fsecond%29&chxl=1:|Batch+Trans|Trans|Batch|Direct&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1301236&chds=0,1301236,0,1301236,0,1301236,0,1301236,0,1301236&chd=t:98,5908,74089,242072|13534,474428,151315,919624|42961,691037,153374,929281|26882,533788,152795,874814|114664,969743,152190,972478|411895,1086956,428724,1301236|385445,1200480,412762,1219660|107,5957,83531,111043|16509,291528,151890,390502|58922,354924,150179,392649|114476,991080,154564,991375|19789,155763,19439,236406&chdl=Sqlite+file+full|Sqlite+file+off|Sqlite+file+off+exc|Sqlite+file+off+exc+aes|Sqlite+in+memory|In+memory+static|In+memory+virtual|External+sqlite+file+full|External+sqlite+file+off|External+sqlite+file+off+exc|External+sqlite+in+memory|Remote+sqlite+socket" /></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Insertion+speed+%28rows%2Fsecond%29&chxl=1:|Remote+sqlite+socket|External+sqlite+in+memory|External+sqlite+file+off+exc|External+sqlite+file+off|External+sqlite+file+full|In+memory+virtual|In+memory+static|Sqlite+in+memory|Sqlite+file+off+exc+aes|Sqlite+file+off+exc|Sqlite+file+off|Sqlite+file+full&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1301236&chds=0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236,0,1301236&chd=t:98,13534,42961,26882,114664,411895,385445,107,16509,58922,114476,19789|5908,474428,691037,533788,969743,1086956,1200480,5957,291528,354924,991080,155763|74089,151315,153374,152795,152190,428724,412762,83531,151890,150179,154564,19439|242072,919624,929281,874814,972478,1301236,1219660,111043,390502,392649,991375,236406&chdl=Direct|Batch|Trans|Batch+Trans" /></p>
<p>In comparison, here are the <em>mORMot 1</em> performance numbers - which were already ahead of most other solutions - on the same machine:</p>
<p>Running tests using Synopse mORMot framework 1.18.6365, compiled with Free Pascal 3.2 64 bit, against SQLite 3.37.2, on Debian GNU/Linux 11 (bullseye) - Linux 5.10.0-10-amd64, at 2022-02-15 10:27:28.</p>
<p><table><tbody><tr align="center"><td> </td><td><strong>Direct</strong></td><td><strong>Batch</strong></td><td><strong>Trans</strong></td><td><strong>Batch Trans</strong></td></tr>
<tr align="center"><td><strong>Sqlite file full</strong></td><td>97</td><td>5034</td><td>45603</td><td>90704</td></tr>
<tr align="center"><td><strong>Sqlite file off</strong></td><td>12504</td><td>212350</td><td>96605</td><td>272910</td></tr>
<tr align="center"><td><strong>Sqlite file off exc</strong></td><td>36189</td><td>244702</td><td>94754</td><td>273687</td></tr>
<tr align="center"><td><strong>Sqlite file off exc aes</strong></td><td>22933</td><td>219857</td><td>96513</td><td>267881</td></tr>
<tr align="center"><td><strong>Sqlite in memory</strong></td><td>79953</td><td>275269</td><td>97776</td><td>269963</td></tr>
<tr align="center"><td><strong>In memory static</strong></td><td>208125</td><td>473126</td><td>227821</td><td>505254</td></tr>
<tr align="center"><td><strong>In memory virtual</strong></td><td>202683</td><td>467595</td><td>220031</td><td>472545</td></tr>
<tr align="center"><td><strong>External sqlite file full</strong></td><td>100</td><td>2809</td><td>49603</td><td>102247</td></tr>
<tr align="center"><td><strong>External sqlite file off</strong></td><td>15329</td><td>190701</td><td>109346</td><td>301768</td></tr>
<tr align="center"><td><strong>External sqlite file off exc</strong></td><td>48818</td><td>264760</td><td>109767</td><td>304710</td></tr>
<tr align="center"><td><strong>External sqlite in memory</strong></td><td>93820</td><td>303766</td><td>111437</td><td>311779</td></tr>
<tr align="center"><td><strong>Remote sqlite socket</strong></td><td>17952</td><td>68360</td><td>14915</td><td>97096</td></tr>
</tbody></table></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Insertion+speed+%28rows%2Fsecond%29&chxl=1:|Batch+Trans|Trans|Batch|Direct&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,505254&chds=0,505254,0,505254,0,505254,0,505254,0,505254&chd=t:97,5034,45603,90704|12504,212350,96605,272910|36189,244702,94754,273687|22933,219857,96513,267881|79953,275269,97776,269963|208125,473126,227821,505254|202683,467595,220031,472545|100,2809,49603,102247|15329,190701,109346,301768|48818,264760,109767,304710|93820,303766,111437,311779|17952,68360,14915,97096&chdl=Sqlite+file+full|Sqlite+file+off|Sqlite+file+off+exc|Sqlite+file+off+exc+aes|Sqlite+in+memory|In+memory+static|In+memory+virtual|External+sqlite+file+full|External+sqlite+file+off|External+sqlite+file+off+exc|External+sqlite+in+memory|Remote+sqlite+socket" /></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Insertion+speed+%28rows%2Fsecond%29&chxl=1:|Remote+sqlite+socket|External+sqlite+in+memory|External+sqlite+file+off+exc|External+sqlite+file+off|External+sqlite+file+full|In+memory+virtual|In+memory+static|Sqlite+in+memory|Sqlite+file+off+exc+aes|Sqlite+file+off+exc|Sqlite+file+off|Sqlite+file+full&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,505254&chds=0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254,0,505254&chd=t:97,12504,36189,22933,79953,208125,202683,100,15329,48818,93820,17952|5034,212350,244702,219857,275269,473126,467595,2809,190701,264760,303766,68360|45603,96605,94754,96513,97776,227821,220031,49603,109346,109767,111437,14915|90704,272910,273687,267881,269963,505254,472545,102247,301768,304710,311779,97096&chdl=Direct|Batch|Trans|Batch+Trans" /></p>
<p>As you can see, the performance benefits are noticeable. For a <em>MicroService</em>, an embedded SQLite3 storage may give pretty amazing scalability of your SOA processing.</p>
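<p>To put a figure on "noticeable", the short script below divides some <em>mORMot 2</em> insertion rates by their <em>mORMot 1</em> counterparts. The numbers are the "Batch Trans" values copied from the two tables above for four of the twelve rows; the dictionary and function names are only illustrative.</p>

```python
# "Batch Trans" insertion rates (rows/second), copied from the two tables above
MORMOT2 = {
    "Sqlite file full": 242072,
    "Sqlite file off": 919624,
    "In memory static": 1301236,
    "Remote sqlite socket": 236406,
}
MORMOT1 = {
    "Sqlite file full": 90704,
    "Sqlite file off": 272910,
    "In memory static": 505254,
    "Remote sqlite socket": 97096,
}


def speedups(new, old):
    """Ratio of the new rate over the old rate, for each engine."""
    return {name: new[name] / old[name] for name in new}


ratios = speedups(MORMOT2, MORMOT1)
for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.2f}")
```

<p>On these four rows, batch insertion within a transaction is between roughly 2.4 and 3.4 times faster with <em>mORMot 2</em> on the same machine.</p>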
<h3>Reading Speed</h3>
<p>Here are the <em>mORMot 2</em> reading numbers:</p>
<p><table><tbody><tr align="center"><td> </td><td><strong>By one</strong></td><td><strong>All Virtual</strong></td><td><strong>All Direct</strong></td></tr>
<tr align="center"><td><strong>Sqlite file full</strong></td><td>115775</td><td>1151012</td><td>1121956</td></tr>
<tr align="center"><td><strong>Sqlite file off</strong></td><td>123198</td><td>1157407</td><td>1162925</td></tr>
<tr align="center"><td><strong>Sqlite file off exc</strong></td><td>270504</td><td>1162790</td><td>1174122</td></tr>
<tr align="center"><td><strong>Sqlite file off exc aes</strong></td><td>269978</td><td>1160227</td><td>1171920</td></tr>
<tr align="center"><td><strong>Sqlite in memory</strong></td><td>273950</td><td>1154201</td><td>1150350</td></tr>
<tr align="center"><td><strong>In memory static</strong></td><td>467551</td><td>1803751</td><td>1743071</td></tr>
<tr align="center"><td><strong>In memory virtual</strong></td><td>464209</td><td>771188</td><td>777786</td></tr>
<tr align="center"><td><strong>External sqlite file full</strong></td><td>184836</td><td>522247</td><td>1142204</td></tr>
<tr align="center"><td><strong>External sqlite file off</strong></td><td>180900</td><td>519237</td><td>1155134</td></tr>
<tr align="center"><td><strong>External sqlite file off exc</strong></td><td>186046</td><td>512583</td><td>1153003</td></tr>
<tr align="center"><td><strong>External sqlite in memory</strong></td><td>274559</td><td>1168770</td><td>1182872</td></tr>
<tr align="center"><td><strong>Remote sqlite socket</strong></td><td>22246</td><td>444820</td><td>873133</td></tr>
</tbody></table></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Read+speed+%28rows%2Fsecond%29&chxl=1:|All+Direct|All+Virtual|By+one&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1803751&chds=0,1803751,0,1803751,0,1803751&chd=t:115775,1151012,1121956|123198,1157407,1162925|270504,1162790,1174122|269978,1160227,1171920|273950,1154201,1150350|467551,1803751,1743071|464209,771188,777786|184836,522247,1142204|180900,519237,1155134|186046,512583,1153003|274559,1168770,1182872|22246,444820,873133&chdl=Sqlite+file+full|Sqlite+file+off|Sqlite+file+off+exc|Sqlite+file+off+exc+aes|Sqlite+in+memory|In+memory+static|In+memory+virtual|External+sqlite+file+full|External+sqlite+file+off|External+sqlite+file+off+exc|External+sqlite+in+memory|Remote+sqlite+socket" /></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Read+speed+%28rows%2Fsecond%29&chxl=1:|Remote+sqlite+socket|External+sqlite+in+memory|External+sqlite+file+off+exc|External+sqlite+file+off|External+sqlite+file+full|In+memory+virtual|In+memory+static|Sqlite+in+memory|Sqlite+file+off+exc+aes|Sqlite+file+off+exc|Sqlite+file+off|Sqlite+file+full&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1803751&chds=0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751,0,1803751&chd=t:115775,123198,270504,269978,273950,467551,464209,184836,180900,186046,274559,22246|1151012,1157407,1162790,1160227,1154201,1803751,771188,522247,519237,512583,1168770,444820|1121956,1162925,1174122,1171920,1150350,1743071,777786,1142204,1155134,1153003,1182872,873133&chdl=By+one|All+Virtual|All+Direct" /></p>
<p>In comparison, here are the <em>mORMot 1</em> numbers - which were already ahead of most other solutions - on the same machine:</p>
<p><table><tbody><tr align="center"><td> </td><td><strong>By one</strong></td><td><strong>All Virtual</strong></td><td><strong>All Direct</strong></td></tr>
<tr align="center"><td><strong>Sqlite file full</strong></td><td>72439</td><td>847888</td><td>848176</td></tr>
<tr align="center"><td><strong>Sqlite file off</strong></td><td>73248</td><td>837100</td><td>858811</td></tr>
<tr align="center"><td><strong>Sqlite file off exc</strong></td><td>111037</td><td>845737</td><td>848032</td></tr>
<tr align="center"><td><strong>Sqlite file off exc aes</strong></td><td>112973</td><td>863557</td><td>869716</td></tr>
<tr align="center"><td><strong>Sqlite in memory</strong></td><td>111766</td><td>864154</td><td>879043</td></tr>
<tr align="center"><td><strong>In memory static</strong></td><td>229074</td><td>1395868</td><td>1401738</td></tr>
<tr align="center"><td><strong>In memory virtual</strong></td><td>228081</td><td>625782</td><td>626095</td></tr>
<tr align="center"><td><strong>External sqlite file full</strong></td><td>107587</td><td>412745</td><td>840618</td></tr>
<tr align="center"><td><strong>External sqlite file off</strong></td><td>132890</td><td>393948</td><td>805023</td></tr>
<tr align="center"><td><strong>External sqlite file off exc</strong></td><td>133347</td><td>411082</td><td>821422</td></tr>
<tr align="center"><td><strong>External sqlite in memory</strong></td><td>135197</td><td>411658</td><td>820075</td></tr>
<tr align="center"><td><strong>Remote sqlite socket</strong></td><td>19991</td><td>367161</td><td>643832</td></tr>
</tbody></table></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Read+speed+%28rows%2Fsecond%29&chxl=1:|All+Direct|All+Virtual|By+one&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1401738&chds=0,1401738,0,1401738,0,1401738&chd=t:72439,847888,848176|73248,837100,858811|111037,845737,848032|112973,863557,869716|111766,864154,879043|229074,1395868,1401738|228081,625782,626095|107587,412745,840618|132890,393948,805023|133347,411082,821422|135197,411658,820075|19991,367161,643832&chdl=Sqlite+file+full|Sqlite+file+off|Sqlite+file+off+exc|Sqlite+file+off+exc+aes|Sqlite+in+memory|In+memory+static|In+memory+virtual|External+sqlite+file+full|External+sqlite+file+off|External+sqlite+file+off+exc|External+sqlite+in+memory|Remote+sqlite+socket" /></p>
<p><img src="http://chart.apis.google.com/chart?chtt=Read+speed+%28rows%2Fsecond%29&chxl=1:|Remote+sqlite+socket|External+sqlite+in+memory|External+sqlite+file+off+exc|External+sqlite+file+off|External+sqlite+file+full|In+memory+virtual|In+memory+static|Sqlite+in+memory|Sqlite+file+off+exc+aes|Sqlite+file+off+exc|Sqlite+file+off|Sqlite+file+full&chxt=x,y&chbh=a&chs=600x500&cht=bhg&chco=3D7930,3D8930,309F30,40C355&chxr=0,0,1401738&chds=0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738,0,1401738&chd=t:72439,73248,111037,112973,111766,229074,228081,107587,132890,133347,135197,19991|847888,837100,845737,863557,864154,1395868,625782,412745,393948,411082,411658,367161|848176,858811,848032,869716,879043,1401738,626095,840618,805023,821422,820075,643832&chdl=By+one|All+Virtual|All+Direct" /></p>
<h3>Feedback Welcome</h3>
<p>We encourage you to download the full source of both frameworks, from:</p>
<ul>
<li><a href="https://github.com/synopse/mORMot">https://github.com/synopse/mORMot</a></li>
<li><a href="https://github.com/synopse/mORMot2">https://github.com/synopse/mORMot2</a></li>
</ul>
<p>Then you can compile the "15 - External DB performance" sample on <em>mORMot 1</em>, and "extdb-bench" example on <em>mORMot 2</em>.</p>
<p>Feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6143">welcome in our forum</a>, as usual!</p>
<h2>Three Locks To Rule Them All</h2>
<p>Published 2022-01-22T12:56:00+00:00, updated 2022-01-22T15:55:41+00:00 - Arnaud Bouchez - mORMot Framework - Tags: CriticalSection, CrossPlatform, Delphi, FPC, FreePascal, lock, mORMot, multithread, mutex, performance</p>
<p>To ensure thread-safety, especially on server side, we usually protect code with critical sections, or locks. In recent Delphi revisions, we have the <code>TMonitor</code> feature, but I would rather trust the OS for locks, which are implemented using Windows Critical Sections, or POSIX futex/mutex.</p>
<p><img src="https://blog.synopse.info?post/public/blog/flyinglock.png" alt="" /></p>
<p>But not all locks are born equal. Most of the time, the overhead of a Critical Section WinAPI or the <code>pthread</code> library is not needed.<br />
So, in <em>mORMot 2</em>, we introduced several native locks in addition to those OS locks, with multi-read/single-write abilities, or re-entrancy.</p> <h4>Thread Safety - The Hard Way</h4>
<p>For a regular RAD/Client application, a single thread is usually enough. Using messages, and/or a <code>TTimer</code>, allows some simple cooperative multi-tasking in the application, good enough for most uses.</p>
<p>But on server side, scalability requires the business code to be thread-safe. Thread safety is hard, harder than parallel computing in my experience.</p>
<p>Note that multi-threaded programming is not easy, and sometimes very difficult to debug, because the problems are hard to reproduce - it is easy to get a <a href="https://en.wikipedia.org/wiki/Heisenbug">Heisenbug</a>.<br />
So ensure you first read some general material about thread safety, and about how modern CPUs order memory accesses and execute operations. I just found <a href="https://preshing.com/20120612/an-introduction-to-lock-free-programming/">this series of blog articles</a>, which details some caveats which may appear in border cases... which may happen to you as they did to me!</p>
<h4>Saved By The Lock</h4>
<p>To ensure thread safety, the most convenient feature we have is the lock, which prevents a code section from being executed by several threads at once.</p>
<p>To be more accurate, we don't protect code, we protect resources. The code itself is thread-safe. But the data requires attention, when several threads access it. If we only read the data, it is fine. But once the data is changed by one thread, then other threads are likely to break - imagine that you add an item to a list, then the list storage is reallocated in memory, then you get some random GPF due to invalid pointers. Or two threads add items at the <em>same time</em> - then the counter or the storage may become pretty wrong. We need to lock the data access to prevent such issues.</p>
<p>Here is how the POSIX <code>libpthread</code> library offers this lock - similar to a Windows Critical Section:</p>
<p><img src="https://blog.synopse.info?post/public/blog/pthreadlock.png" alt="" /></p>
<p>All the memory operations in between are contained inside a nice little barrier sandwich, preventing any undesirable memory reordering across the boundaries. So you write your thread-unsafe code as the ham in your sandwich, and you ensure that only a single thread will execute it at once.</p>
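<p>As an illustration of this sandwich, here is a minimal sketch which uses the <code>TCriticalSection</code> class of the RTL <code>SyncObjs</code> unit instead of direct <code>libpthread</code> calls. The <code>TAdder</code> thread class and the <code>Shared</code> list are hypothetical names made up for this example, not part of <em>mORMot</em>.</p>
<pre>
program LockSandwich;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  {$ifdef UNIX} cthreads, {$endif} // FPC on POSIX needs a thread manager
  Classes,
  SyncObjs;

type
  // hypothetical worker, appending items to a shared list
  TAdder = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Lock: TCriticalSection; // protects Shared
  Shared: TList;          // the resource to protect - not the code

procedure TAdder.Execute;
var
  i: integer;
begin
  for i := 1 to 10000 do
  begin
    Lock.Enter;        // top slice of the sandwich
    try
      Shared.Add(nil); // the thread-unsafe "ham"
    finally
      Lock.Leave;      // bottom slice of the sandwich
    end;
  end;
end;

var
  t: array[0..3] of TAdder;
  i: integer;
begin
  Lock := TCriticalSection.Create;
  Shared := TList.Create;
  for i := 0 to high(t) do
    t[i] := TAdder.Create({suspended=}false);
  for i := 0 to high(t) do
  begin
    t[i].WaitFor;
    t[i].Free;
  end;
  writeln(Shared.Count); // 4 threads x 10000 items = 40000
  Shared.Free;
  Lock.Free;
end.
</pre>
<p>Without the <code>Enter</code> / <code>Leave</code> pair, the final count would likely be wrong, or the program may crash during a list reallocation.</p>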
<h4>Locks Are Not Expensive, Contention Is</h4>
<p>The main rule about using locks is that they should be as small as possible.<br />
Why?</p>
<p>Acquiring an unlocked mutex, or releasing a mutex, is almost free: it is usually a single atomic assembly instruction. On Intel/AMD, atomic instructions have the <code>lock</code> prefix, e.g. <code>lock cmpxchg</code>, or are implicitly locked, e.g. the <code>xchg</code> operation with a memory operand. On ARM, you usually need to write a small loop, or at least several instructions.<br />
In <code>mormot.core.base.pas</code> we provide some cross-platform and cross-compiler functions for atomic operations, written in tuned assembly or calling the RTL:</p>
<pre>
procedure LockedInc32(int32: PInteger);
procedure LockedDec32(int32: PInteger);
procedure LockedInc64(int64: PInt64);
function InterlockedIncrement(var I: integer): integer;
function InterlockedDecrement(var I: integer): integer;
function RefCntDecFree(var refcnt: TRefCnt): boolean;
function LockedExc(var Target: PtrUInt; NewValue, Comperand: PtrUInt): boolean;
procedure LockedAdd(var Target: PtrUInt; Increment: PtrUInt);
procedure LockedAdd32(var Target: cardinal; Increment: cardinal);
procedure LockedDec(var Target: PtrUInt; Decrement: PtrUInt);
</pre>
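<p>For a single integer, such an atomic function can replace a lock altogether. The following sketch relies on the <code>LockedInc32()</code> declaration listed above; the <code>TCounter</code> thread class and the <code>Events</code> variable are hypothetical names made up for this example.</p>
<pre>
program AtomicCounter;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  {$ifdef UNIX} cthreads, {$endif} // FPC on POSIX needs a thread manager
  Classes,
  mormot.core.base; // declares LockedInc32()

type
  // hypothetical worker, counting some events
  TCounter = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Events: integer; // shared by all threads, with no lock at all

procedure TCounter.Execute;
var
  i: integer;
begin
  for i := 1 to 100000 do
    // a plain inc(Events) is a read/modify/write sequence, so concurrent
    // threads could lose some increments: the atomic version can not
    LockedInc32(@Events);
end;

var
  t: array[0..3] of TCounter;
  i: integer;
begin
  for i := 0 to high(t) do
    t[i] := TCounter.Create({suspended=}false);
  for i := 0 to high(t) do
  begin
    t[i].WaitFor;
    t[i].Free;
  end;
  writeln(Events);
end.
</pre>
<p>This only works for one isolated value: as soon as two variables must stay consistent with each other, a lock is needed again.</p>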
<p>But if two (or more) threads fight to acquire a lock, then only one will get it. So the other threads will have to wait. Waiting is usually done by first <em>spinning</em> (i.e. running an empty loop), and trying to acquire the lock again. Eventually, an OS kernel call could take place, to release the CPU core, and let it execute some pending code from another thread.</p>
<p><img src="https://blog.synopse.info?post/public/blog/lockcontention.png" alt="" /></p>
<p>This lock contention, spinning or switching to another thread, is what really degrades the whole process performance. You are really wasting time and energy just for accessing a shared resource.</p>
<p>Therefore, in practice, I would advise following some simple rules.</p>
<h5>Make it work, then make it fast</h5>
<p>You may first use a giant Critical Section for a whole method. Most of the time, it would be fine.</p>
<p>Don't guess, run actual benchmarks on a multi-core CPU (not a single core VM!), trying to reproduce the worst case which may happen.<br />
Have detailed and thread-aware logs, to properly debug production code - the Heisenbugs are likely to appear not on your development PC, but under real world load.</p>
<p>Once you have identified a real bottleneck, try to split the logic code into small pieces:</p>
<ul>
<li>Ensure you have a multi-thread regression testing code for this method, to validate your modifications are actually still correct and ... faster;</li>
<li>Some part of the code may be thread-safe by itself (e.g. the error checking or result logging): no need to protect it with the lock;</li>
<li>Isolate the processing code into some private/protected methods, depending on the resources shared, with proper locking.</li>
</ul>
<h5>The Less The Better</h5>
<p>Eventually, to achieve the best performance:</p>
<ul>
<li>Keep your locks as short as possible;</li>
<li>Prefer more locks on small data than some giant locks;</li>
<li>Use a lock per list or queue, not per process or business logic method;</li>
<li>Make a private copy of the data (e.g. on a local stack variable) within the lock, then process it outside the lock;</li>
<li>Avoid calling other methods within a lock: focus on the shared data, and keep in mind that the functions you call may not be thread-safe;</li>
<li>Try to avoid memory allocations.</li>
</ul>
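<p>The "private copy" rule of this list could be sketched as follows. The <code>TPendingLines</code> class is a hypothetical example, and it uses the RTL <code>TCriticalSection</code> for simplicity: the slow console output is done on a local copy, outside of the lock.</p>
<pre>
program CopyThenProcess;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  Classes,
  SyncObjs;

type
  TLines = array of string;

  // hypothetical queue of pending text lines, which may be filled
  // from several threads
  TPendingLines = class
  private
    fSafe: TCriticalSection;
    fLines: TLines;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(const aLine: string);
    procedure Flush;
  end;

constructor TPendingLines.Create;
begin
  inherited Create;
  fSafe := TCriticalSection.Create;
end;

destructor TPendingLines.Destroy;
begin
  fSafe.Free;
  inherited Destroy;
end;

procedure TPendingLines.Add(const aLine: string);
var
  n: integer;
begin
  fSafe.Enter;
  try
    n := length(fLines);
    SetLength(fLines, n + 1);
    fLines[n] := aLine;
  finally
    fSafe.Leave;
  end;
end;

procedure TPendingLines.Flush;
var
  local: TLines;
  i: integer;
begin
  fSafe.Enter;
  try
    local := fLines; // private copy: only a reference-counted pointer
    fLines := nil;   // next Add() will start a new array
  finally
    fSafe.Leave;
  end;
  for i := 0 to high(local) do
    writeln(local[i]); // slow I/O is done outside of the lock
end;

var
  q: TPendingLines;
begin
  q := TPendingLines.Create;
  try
    q.Add('first');
    q.Add('second');
    q.Flush;
  finally
    q.Free;
  end;
end.
</pre>
<p>Note that <code>Add</code> still allocates memory within the lock: a real implementation may pre-allocate its storage with some capacity to follow the last rule of the list.</p>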
<h5>Pick The Right Lock</h5>
<p>Generally speaking, the regular <code>TRTLCriticalSection</code> is fine, and should be preferred.<br />
Our <em>mormot.core.os.pas</em> unit wraps it in a cross-platform way, across FPC/Delphi compilers and operating systems. It tries to call the OS directly, with proper inlining if possible.</p>
<p>But if you follow "The Less The Better" rule above, your code may be something very small like this:</p>
<pre>
procedure TAsyncConnections.AddGC(aConnection: TPollAsyncConnection);
begin
  if Terminated then
    exit;
  (aConnection as TAsyncConnection).fLastOperation := fLastOperationMS; // in ms
  fGCSafe.Lock;
  ObjArrayAddCount(fGC, aConnection, fGCCount);
  fGCSafe.UnLock;
end;
</pre>
<p>Here you can see that the lock is very small, and setting <code>fLastOperation</code> has been done outside of the lock, since this operation is thread-safe by design: this connection will be freed only once, whereas the <code>fGC/fGCCount</code> list may be accessed from several threads. Also note that <code>ObjArrayAddCount()</code> is a well defined function which should not have its behavior changed, nor raise any exception, so it is safe to be used... and we didn't even put any <code>try..finally fGCSafe.UnLock;</code> statement here, because a <code>try..finally</code> has a cost on some platforms (e.g. FPC Linux generates several RTL calls even if no exception is raised).</p>
<p>Of course, we could use our <code>TSynLock</code> for <code>fGCSafe</code> - which encapsulates a <code>TRTLCriticalSection</code> in an object-oriented manner.<br />
But since here we know that the lock will be very small, no need to have the whole overhead of a Critical Section or a mutex/futex, which always has a cost at least in resources.</p>
<h4>Several Locks To Rule Them All</h4>
<p>In addition to the <code>TSynLock</code> wrapper, <em>mormot.core.os.pas</em> defines several kinds of locks:</p>
<pre>
type
  // a lightweight exclusive non-reentrant lock, stored in a PtrUInt value
  // - calls SwitchToThread after some spinning, but doesn't use any R/W OS API
  // - warning: methods are non reentrant, i.e. calling Lock twice in a row would
  // deadlock: use TRWLock or TSynLocker/TRTLCriticalSection for reentrant methods
  // - light locks are expected to be kept a very small amount of time: use
  // TSynLocker or TRTLCriticalSection if the lock may block too long
  // - several lightlocks, each protecting a few variables (e.g. a list), may
  // be more efficient than a more global TRTLCriticalSection/TRWLock
  // - only consume 4 bytes on CPU32, 8 bytes on CPU64
  TLightLock = record
    procedure Lock;
    function TryLock: boolean;
    procedure UnLock;
  end;

  // a lightweight multiple Reads / exclusive Write non-upgradable lock
  // - calls SwitchToThread after some spinning, but doesn't use any R/W OS API
  // - warning: ReadLocks are reentrant and allow concurrent access, but calling
  // WriteLock within a ReadLock, or within another WriteLock, would deadlock
  // - consider TRWLock if you need an upgradable lock
  // - light locks are expected to be kept a very small amount of time: use
  // TSynLocker or TRTLCriticalSection if the lock may block too long
  // - several lightlocks, each protecting a few variables (e.g. a list), may
  // be more efficient than a more global TRTLCriticalSection/TRWLock
  // - only consume 4 bytes on CPU32, 8 bytes on CPU64
  TRWLightLock = record
    procedure ReadLock;
    function TryReadLock: boolean;
    procedure ReadUnLock;
    procedure WriteLock;
    function TryWriteLock: boolean;
    procedure WriteUnLock;
  end;

  TRWLockContext = (
    cReadOnly, cReadWrite, cWrite);

  // a lightweight multiple Reads / exclusive Write reentrant lock
  // - calls SwitchToThread after some spinning, but doesn't use any R/W OS API
  // - locks are expected to be kept a very small amount of time: use TSynLocker
  // or TRTLCriticalSection if the lock may block too long
  // - warning: all methods are reentrant, but WriteLock/ReadWriteLock would
  // deadlock if called after a ReadOnlyLock
  TRWLock = record
    procedure ReadOnlyLock;
    procedure ReadOnlyUnLock;
    procedure ReadWriteLock;
    procedure ReadWriteUnLock;
    procedure WriteLock;
    procedure WriteUnlock;
    procedure Lock(context: TRWLockContext {$ifndef PUREMORMOT2} = cWrite {$endif});
    procedure UnLock(context: TRWLockContext {$ifndef PUREMORMOT2} = cWrite {$endif});
  end;
</pre>
<p><code>TLightLock</code> is the simplest lock.<br />
It will acquire a lock, then spin or sleep on contention. But be aware that it is not reentrant: if you call <code>Lock</code> twice in a row from the same thread, the second <code>Lock</code> would wait forever. So you must ensure that your code doesn't call any other method which may also call <code>Lock</code> during its process, otherwise your thread would "deadlock". Such deadlocks are relatively easy to identify: the thread will always block, whatever the conditions are. To fix it, don't call other methods which run <code>Lock</code>: for instance, you may define some private/protected <code>LockedDoSomething</code> methods, which won't have any lock but expect to be called within a lock.</p>
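<p>Here is a minimal sketch of this <code>LockedDoSomething</code> pattern, using the <code>TLightLock</code> methods declared above. The <code>TCounters</code> class and its methods are hypothetical names made up for this example, and the sketch assumes that a <code>TLightLock</code> field is ready to use once zero-initialized, as any class field is.</p>
<pre>
program LightLockSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  mormot.core.os; // declares TLightLock

type
  // hypothetical statistics, protected by a non-reentrant TLightLock
  TCounters = class
  private
    fSafe: TLightLock; // assumed to be usable when filled with zeros
    fHits: integer;
    fBytes: Int64;
    // no lock inside: expects fSafe to be already locked by the caller
    procedure LockedAddHit(aBytes: integer);
  public
    procedure AddHit(aBytes: integer);
    procedure AddHits(aCount, aBytes: integer);
    property Hits: integer read fHits;
  end;

procedure TCounters.LockedAddHit(aBytes: integer);
begin
  inc(fHits);
  inc(fBytes, aBytes);
end;

procedure TCounters.AddHit(aBytes: integer);
begin
  fSafe.Lock;
  LockedAddHit(aBytes);
  fSafe.UnLock;
end;

procedure TCounters.AddHits(aCount, aBytes: integer);
var
  i: integer;
begin
  fSafe.Lock;
  for i := 1 to aCount do
    LockedAddHit(aBytes); // calling AddHit() here would wait forever
  fSafe.UnLock;
end;

var
  c: TCounters;
begin
  c := TCounters.Create;
  try
    c.AddHit(100);
    c.AddHits(3, 50);
    writeln(c.Hits);
  finally
    c.Free;
  end;
end.
</pre>
<p>The public methods own the lock, and the private <code>Locked*</code> method owns the logic, so the lock is never acquired twice by the same thread.</p>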
<p><code>TRWLightLock</code> and <code>TRWLock</code> are <em>multiple Reads / exclusive Write locks</em>.<br />
This is a feature missing in the regular Critical Section. It is very likely that your shared resource will be often read, and seldom modified. Since reads are thread-safe by design, there is no need to prevent other threads from reading the resource. Only writing/updating the data should be exclusive and protected from other threads. This is the purpose of <code>ReadLock</code> / <code>ReadOnlyLock</code> and <code>WriteLock</code>.<br />
<code>TRWLock</code> goes one step further, and allows a read lock to be upgraded into a write lock, using <code>ReadWriteLock</code> instead of <code>ReadOnlyLock</code>. <code>ReadWriteLock</code> could be followed by a <code>WriteLock</code>, whereas <code>ReadOnlyLock</code> should always be followed by <code>ReadOnlyUnlock</code>, but never by a <code>WriteLock</code> which would deadlock.<br />
Last but not least, <code>ReadOnlyLock</code> / <code>ReadOnlyUnLock</code> are re-entrant (you can call them nested), because they are implemented using a counter. And <code>TRWLock.WriteLock</code> is re-entrant, because it keeps track of the locking thread ID, so detects nested calls - as a <code>TRtlCriticalSection</code> does.</p>
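<p>The upgrade from a read lock to a write lock could be sketched as follows, with the <code>TRWLock</code> methods declared above. The <code>TNames</code> class is a hypothetical example, and the sketch assumes that a zero-initialized <code>TRWLock</code> class field is ready to use.</p>
<pre>
program RWLockSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  mormot.core.os; // declares TRWLock

type
  // hypothetical list of unique names: often searched, seldom modified
  TNames = class
  private
    fSafe: TRWLock; // assumed to be usable when filled with zeros
    fNames: array of string;
    // no lock inside: expects fSafe to be already locked by the caller
    function LockedIndexOf(const aName: string): integer;
  public
    function Exists(const aName: string): boolean;
    function AddOnce(const aName: string): boolean;
  end;

function TNames.LockedIndexOf(const aName: string): integer;
var
  i: integer;
begin
  result := -1;
  for i := 0 to high(fNames) do
    if fNames[i] = aName then
    begin
      result := i;
      exit;
    end;
end;

function TNames.Exists(const aName: string): boolean;
begin
  fSafe.ReadOnlyLock; // never followed by a WriteLock
  try
    result := LockedIndexOf(aName) >= 0;
  finally
    fSafe.ReadOnlyUnLock;
  end;
end;

function TNames.AddOnce(const aName: string): boolean;
var
  n: integer;
begin
  fSafe.ReadWriteLock; // a read lock which may be upgraded
  try
    result := LockedIndexOf(aName) < 0;
    if result then
    begin
      fSafe.WriteLock; // upgrade: exclusive access for the update
      try
        n := length(fNames);
        SetLength(fNames, n + 1);
        fNames[n] := aName;
      finally
        fSafe.WriteUnlock;
      end;
    end;
  finally
    fSafe.ReadWriteUnLock;
  end;
end;

var
  names: TNames;
begin
  names := TNames.Create;
  try
    writeln(names.AddOnce('mORMot')); // a new name is added
    writeln(names.AddOnce('mORMot')); // a duplicate is rejected
    writeln(names.Exists('mORMot'));
  finally
    names.Free;
  end;
end.
</pre>
<p>The search and the insertion happen within the same <code>ReadWriteLock</code>, so no other thread can add the same name in between.</p>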
<h4>Low Level Stuff</h4>
<p>Just for fun, take a look at the source code:</p>
<pre>
procedure TLightLock.LockSpin;
var
  spin: PtrUInt;
begin
  spin := SPIN_COUNT;
  repeat
    spin := DoSpin(spin);
  until LockedExc(Flags, 1, 0);
end;

procedure TLightLock.Lock;
begin
  // we tried a dedicated asm but it was slower: inlining is preferred
  if not LockedExc(Flags, 1, 0) then
    LockSpin;
end;

function TLightLock.TryLock: boolean;
begin
  result := LockedExc(Flags, 1, 0);
end;

procedure TLightLock.UnLock;
begin
  Flags := 0; // non reentrant locks need no additional thread safety
end;
</pre>
<p><code>TLightLock</code> is pretty straightforward, using a simple CAS compare & exchange <code>LockedExc()</code> atomic function, but <code>TRWLightLock</code> and <code>TRWLock</code> are slightly more complex.</p>
<p>In the <em>mORMot 2</em> code base, we tried to use the best lock possible: <code>TRtlCriticalSection</code> / <code>TSynLock</code> when the locks are likely to be contended for some time (more than a microsecond), and the other locks, with <em>multiple Reads / exclusive Write</em> methods if possible, to protect very small tuned code.<br />
Of course, thread safety is tested during the regression tests, with dozens of concurrent threads trying to break the locks logic. I can tell you that we found some nasty problems in the initial code of our <code>TAsyncServer</code>, but after days of debugging and logging, it seems stable now - but that is a matter for another article! :)</p>
<p><a href="https://synopse.info/forum/viewtopic.php?id=6119">Feedback is welcome in our forum</a>, as usual!</p>
<h2>mORMot 2 Generics and Collections</h2>
<p>Published 2021-12-19T19:11:00+00:00, updated 2021-12-20T09:44:35+00:00 - Arnaud Bouchez - mORMot Framework - Tags: collections, Delphi, FreePascal, generics, GoodPractice, mORMot, mORMot2, performance, RTL</p>
<p>Generics are a clever way of <a href="https://docwiki.embarcadero.com/RADStudio/en/Overview_of_Generics">writing some code once, then reusing it for several types</a>.<br />
They are like templates, or compile-time shortcuts for type definitions.</p>
<p><img src="https://blog.synopse.info?post/public/blog/generics-war.png" alt="" /></p>
<p>In the last weeks, we added a new <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.collections.pas">mormot.core.collections.pas unit</a>, which features:</p>
<ul>
<li>JSON-aware <code>IList<></code> List Storage;</li>
<li>JSON-aware <code>IKeyValue<></code> Dictionary Storage.</li>
</ul>
<p>Compared to the Delphi or FPC RTL <code>generics.collections</code>, this unit uses interfaces as variable holders, and leverages them to reduce the generated code as much as possible, as the <em>Spring4D 2.0</em> framework does, but for both Delphi and FPC. It publishes <code>TDynArray</code> and <code>TSynDictionary</code> high-level features like indexing, sorting, JSON/binary serialization or thread safety, with the strong typing of Generics.</p>
<p>Resulting performance is great, especially for its enumerators, and your resulting executable size won't blow up as with the regular RTL unit.</p> <h4>Delphi or FPC Generics</h4>
<p>Delphi has had its own generics support since Delphi 2009. FPC 2.6 did follow, with a more template-like approach, but almost the same behavior in "Delphi mode".<br />
Both compilers are still buggy about generics compilation. It is pretty easy to generate internal errors. Using generics on early Delphi revisions was almost impossible. Advanced generics work is not really possible before Delphi XE8. FPC seems more stable, but it also has limits and could be broken on some edge cases, and the Lazarus parser is still confused by them.</p>
<p>But from generics come great expressiveness, and you can do wonders with them, <a href="https://github.com/grijjy/JustAddCode/blob/master/TypeIndices/SimpleTypeIndex.dpr">like this per-type integer trick</a>.<br />
And it could induce some strong typing in your regular code, so they are worth considering.</p>
<h4>Generics Collections</h4>
<p>There are several collection libraries using generics, in the modern object pascal world, i.e. Delphi and FPC.</p>
<p>The one from the Delphi RTL was the first to exist, and has been enhanced a lot during the last decade. It uses compile time intrinsics like <code>IsManagedType() GetTypeKind() SizeOf()</code> to compile efficiently, and has a very verbose unrolled code for most type sizes.</p>
<p>The one from the FPC RTL is compatible with most of the Delphi RTL, and has been contributed by our friend Maciej (from DaThoX). Its somewhat huge code base has a lot of bells and whistles, like several hashing or memory expansion algorithms. Performance was not its main point, but it was a good proof that FPC generics were stable and usable.</p>
<p><a href="https://bitbucket.org/sglienke/spring4d">Spring4D</a> is a well known and very well crafted library. It has a lot of very clever features, and is very close to what you could have in modern C# collection libraries.</p>
<p>All those generics collections tend to generate huge executable size. The Delphi compiler tries to reduce the redundant code, but the pre-compiled units are still huge (the Delphi .dcu files for instance), and the compile time and resource consumption suffer from it.</p>
<h4>Spring4D Version 2</h4>
<p>The upcoming Spring4D version 2 tries to resolve the binary bloat, and also the Spring4D 1.x performance issues - performance was not the main goal at that time, expressiveness and usefulness were.</p>
<p><a href="https://delphisorcery.blogspot.com/2021/06/spring4d-20-sneak-peek-evolution-of.html">This blog article</a> is worth a read.<br />
Here is an extract:</p>
<blockquote><p>- as you know generics can cause quite a huge chunk of binary code because for every specialization the code is basically duplicated even for binary identical types including their RTTI. In 2.0 all RTTI for the implementing classes which you will never touch anyway is turned off and with some trickery, many generic specializations are folded and forced into the Spring.Collections.dcu to not pollute each and every dcu you compile a list or dictionary into. And it does not only shrink your binary, but also speeds up your compilation as the compiler simply has less work to do - no generating code (RTTI and executable code) into dcu which the linker later has to go through and eliminate duplicates.</p></blockquote>
<p>Nice work Stefan!</p>
<h4>Entering The mORMot Zone</h4>
<p>Since the beginning, we have some very powerful data structures in <em>mORMot</em>.<br />
For instance, the <code>TDynArray</code> and <code>TDynArrayHashed</code> wrappers have a lot of features, and are very efficient. They are used everywhere in the framework core. Also our <code>TSynDictionary</code> is well crafted, thread-safe, and has all the basic features you expect in your daily work on any kind of key/value efficient storage. Even with persistence!</p>
<p>Our new <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.collections.pas">mormot.core.collections.pas unit</a> publishes those two data structure algorithms as two sets of generic interfaces.</p>
<p>The <code>IList<></code> type holds a (thread-safe) list of items, with the most useful methods:</p>
<pre>
IList<T> = interface
  function Add(const value: T): PtrInt;
  procedure Insert(ndx: PtrInt; const value: T);
  function Delete(ndx: PtrInt): boolean;
  function Remove(const value: T): boolean;
  function Pop(var dest: T; opt: TListPop = []): boolean;
  procedure Clear;
  procedure Reverse;
  procedure Sort(customcompare: TDynArraySortCompare = nil); overload;
  procedure Sort(start, stop: integer;
    customcompare: TDynArraySortCompare = nil); overload;
  procedure Sort(var indexes: TIntegerDynArray;
    customcompare: TDynArraySortCompare = nil); overload;
  procedure Sort(const customcompare: TOnDynArraySortCompare;
    descending: boolean = false); overload;
  function AddSorted(const value: T; wasadded: PBoolean = nil): integer;
  function Sorted: boolean;
  function IndexOf(const value: T): PtrInt;
  function Find(const value: T; customcompare: TDynArraySortCompare = nil): PtrInt;
  function GetEnumerator: TSynEnumerator<T>;
  function Range(Offset: PtrInt = 0; Limit: PtrInt = 0): TSynEnumerator<T>;
  function First: pointer;
  function AsArray(Offset: PtrInt = 0; Limit: PtrInt = 0): TArray<T>;
  property Items[ndx: PtrInt]: T; default;
  property Count: PtrInt;
  property Capacity: PtrInt;
  property Comparer: TDynArraySortCompare;
  function Safe: PRWLock;
  function Data: PDynArray;
end;
</pre>
<p>Not a lot of methods, but there is access to the underlying <code>Data: PDynArray</code> instance e.g. for JSON or binary serialization (unique among all libraries), built-in multi-read/exclusive-write thread-safety (<code>Safe</code>), stack-like or queue-like behavior (<code>Pop</code>), fast search with optional case insensitivity and hashed or binary (sorted) indexes (<code>IndexOf() Find()</code>).</p>
<p>Most collections libraries tend to multiply classes and types. There is a <code>class<T></code> for a sorted list, another <code>class<T></code> for a queue, another <code>class<T></code> for a stack, another <code>class<T></code> for hashed list, another <code>class<T></code> for thread-oriented list, another <code>class<T></code> for serializable list... in <em>mORMot 2</em>, you have one <code>IList<></code> which features all, just by setting some options at factory level.<br />
As often in <em>mORMot</em>, we started from the use cases and actual needs for your projects, not from how the collections are implemented.</p>
<p>The <code>IKeyValue<></code> dictionary type is even smaller:</p>
<pre>
IKeyValue<TKey, TValue> = interface
  procedure Add(const key: TKey; const value: TValue);
  function TryAdd(const key: TKey; const value: TValue): boolean;
  function TryGetValue(const key: TKey; var value: TValue): boolean;
  function GetValueOrDefault(const key: TKey; const defaultValue: TValue): TValue;
  function Remove(const key: TKey): boolean;
  function Extract(const key: TKey; var value: TValue): boolean;
  function ContainsKey(const key: TKey): boolean;
  function ContainsValue(const value: TValue): boolean;
  function DeleteDeprecated: integer;
  procedure Clear; overload;
  function Count: integer;
  property Items[const key: TKey]: TValue; default;
  property Capacity: integer;
  property TimeOutSeconds: cardinal;
  function Data: TSynDictionary;
end;
</pre>
<p>Here the processing is fully thread-safe, and has some unique features from <code>TSynDictionary</code> like binary or JSON serialization, key lookup using an internal hash table, thread-safety (with several kinds of locks, or no lock), and an optional <code>TimeOutSeconds</code> property to delete deprecated items after a while - typical process when caching data.<br />
If those high-level thread-safe functions are not enough, you have access to the internal <code>Data: TSynDictionary</code> instance, and here you could work on the data with its own <code>Safe</code> lock, and run index-based methods, or more complex key/value processing via callbacks.</p>
<p>Our <code>Collections.NewList<T></code> and <code>Collections.NewKeyValue<TKey, TValue></code> factories leverage the latest <code>IsManagedType() GetTypeKind() SizeOf()</code> compiler intrinsics for efficiency. For instance, a huge <code>case GetTypeKind(...) of</code> will be reduced at compile time into a single call to a specialized shared factory method - e.g. reusing <code>TSynListSpecialized<integer></code> for all compatible types.</p>
<p>To be fair, both <code>IList<></code> and <code>IKeyValue<></code> are pretty basic, in comparison to RTL collections or Spring4D libraries.<br />
But they do what they do: store values, or key/value pairs, in a straightforward and efficient way.<br />
If you need to have some cascaded filters, or build a fluent interface, use Spring4D. But I guess that for most usual work, our little interfaces are enough.</p>
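<p>As a quick illustration of the dictionary side, here is a minimal sketch which only uses the <code>IKeyValue<></code> methods listed above and the <code>Collections.NewKeyValue<></code> factory. It assumes a compiler recent enough for this unit, and the <code>ages</code> variable and its content are made up for the example.</p>
<pre>
program KeyValueSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

uses
  mormot.core.base,        // declares RawUtf8
  mormot.core.collections; // declares Collections and IKeyValue<>

var
  ages: IKeyValue<RawUtf8, integer>;
  age: integer;
begin
  // no class is instantiated directly: the factory returns an interface
  ages := Collections.NewKeyValue<RawUtf8, integer>;
  ages.Add('Alice', 30);
  // TryAdd() returns a boolean, so a duplicated key can be detected
  if not ages.TryAdd('Alice', 31) then
    writeln('Alice was already stored');
  if ages.TryGetValue('Alice', age) then
    writeln(age);
  // a missing key falls back to the supplied default value
  writeln(ages.GetValueOrDefault('Bob', -1));
  writeln(ages.Count);
  // no Free and no try..finally: the dictionary is released when the
  // interface variable goes out of scope
end.
</pre>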
<h3>Face Your Interface</h3>
<p>The first thing you will notice about our types, in respect to the RTL collections, is that they are defined as generic <code>interface</code>, not <code>class</code>.</p>
<p>One obvious benefit is that it may ease memory management. When the <code>interface</code> reference count reaches 0, the whole list or dictionary will be released and all its stored items will be freed - or not, depending on your needs. Nice and easy. No more <code>try/finally</code> to write.</p>
<p>But another benefit is that, as <em>Spring4D 2.0</em> does, we will be able to reuse the VMT of each <code>interface</code>.</p>
<p>With a regular class, if you have a <code>TObjectList<TObject1></code> and another <code>TObjectList<TObject2></code> then the very same code will be generated twice, once for <code>TObject1</code> and another time for <code>TObject2</code>. Even if both objects are in fact just... pointers in the type definition, and in the generated asm. This means a lot of duplicated code, which the compiler or linker tries to identify and remove from the executable, but this does not always work. This is the "binary bloat" Stephan was talking about in his article.</p>
<p>Using interfaces and factories allows reusing the very same <code>interface</code> for all similar types.<br />
For instance, on Win32, if you use an <code>IList<integer></code> or an <code>IList<pointer></code> or an <code>IList<TObject1></code> or an <code>IList<TObject2></code>, they will in fact use a single <code>interface</code> - an <code>IList<integer></code> to be precise, which is the so-called "specialized" 32-bit ordinal/pointer processing class. Only the <code>interface</code> definition is duplicated/specific. But a single implementation <code>class</code> will be reused for all those types, sharing the very same <code>IList<></code> VMT.<br />
And, in contrast to plain RTL collections, this implementation class will stay within <code>mormot.core.collections.pas</code>, and will not be expanded/defined/generated in each unit using the <code>IList<></code> definitions. Only the <code>interface</code> and its associated enumerators are generated in each unit. This reduces the <code>.dcu .ppu</code> size a lot, and also eases the compiler work and resource consumption.</p>
<p>To let this magic happen, you do not instantiate any class directly - like <code>TSynListSpecialized<T></code> - but you use some factories:</p>
<pre>
var
  i: integer;
  li: IList<integer>;
begin
  li := Collections.NewList<integer>;
  li.Capacity := MAX + 1;        // faster Add() thanks to pre-allocation
  for i := 0 to MAX do           // populate with some data
    Check(li.Add(i) = i);
  for i := 0 to li.Count - 1 do  // regular Items[] access
    Check(li[i] = i);
  for i in li do                 // use an enumerator - safe and clean
    Check(cardinal(i) <= MAX);
  for i in li.Range(-5) do       // use an enumerator for the last 5 items
    Check(i > MAX - 5);
  for i in li do
    Check(li.IndexOf(i) = i);    // O(n) brute force search using SSE2 asm
end; // no need to set li := nil or write any try..finally Free end; block
</pre>
<p>And if you used instead:</p>
<pre>
li := Collections.NewList<integer>([loCreateUniqueIndex]);
</pre>
<p>then it will use a hash table for the <code>IList<integer>.Find()</code> searches:</p>
<pre>
for i in li do
  Check(li.Find(i) = i); // O(1) hash table search
</pre>
<p>To create a dictionary, it is as simple as using the <code>Collections.NewKeyValue<TKey, TValue></code> factory method.</p>
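<p>For instance, a minimal sketch could look like the following. It assumes that <code>IKeyValue<></code> exposes <code>Add()</code>, <code>TryGetValue()</code> and <code>Count</code> as in the current <code>mormot.core.collections.pas</code> unit - please check the unit source or the regression tests for the exact signatures of your version.</p>
<pre>
program DictSketch;

{$ifdef FPC} {$mode delphi} {$endif}
{$APPTYPE CONSOLE}

uses
  mormot.core.base,        // RawUtf8
  mormot.core.collections; // IKeyValue<> and the Collections factory

var
  dict: IKeyValue<integer, RawUtf8>;
  s: RawUtf8;
begin
  dict := Collections.NewKeyValue<integer, RawUtf8>;
  dict.Add(1, 'one');          // store some key/value pairs
  dict.Add(2, 'two');
  if dict.TryGetValue(2, s) then // lookup via the internal hash table
    writeln(s);
  writeln(dict.Count);         // number of stored pairs
end. // no Free needed: dict is released with its last reference
</pre>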
<p>Check <a href="https://github.com/synopse/mORMot2/blob/9cdef61a9e59d3c883660f269d63094bbc5c32af/test/test.core.collections.pas#L166">our regression tests</a> for more reference code about how to use those interfaces.</p>
<h3>Behind The Scene</h3>
<p>As I wrote, behind the scene, a <code>TDynArray</code> or a <code>TSynDictionary</code> is involved. So most of the code relies on plain RTTI, but in a very optimized way: no bloated code generated for each type, but a single set of carefully tuned code.<br />
For instance, searching for a <code>byte</code>, <code>word</code> or <code>integer</code> as in <code>IList<...>.IndexOf()</code> would use SSE2 fast assembly and not a naive <code>for i := 0 to Count - 1 do...</code> loop.</p>
<p>Please take a little time and look at how the <a href="https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.collections.pas">mormot.core.collections.pas unit</a> is written.<br />
You will see that most of the code is the <code>interface</code> definitions (and comments/documentation), with small wrapper classes (one non-generic, another generic, to reduce the code size), and that the implementation part is mostly about the specialization to the main types to reuse the VMT. The resulting unit is much smaller and easier to debug and maintain than other libraries, which use generics down to the implementation - which we did not.</p>
<p>We tried to make our enumerators as efficient as possible. A <code>for i in li do</code> loop makes no call, is efficiently inlined, and only uses a single pointer on the stack. No memory allocation is involved.</p>
<p>Of course, <em>mORMot 2</em> ORM methods could use this <code>IList<TOrm></code> type, if you prefer to use this syntax when retrieving <code>TOrm</code> instances.<br />
We still need to leverage their power within the interface definitions, for our SOA stack.<br />
Stay tuned!</p>
<p>As usual, <a href="https://synopse.info/forum/viewtopic.php?id=6096">discussion and feedback are welcome in our forum</a>!</p>EKON 25 Slidesurn:md5:aed86aeed11190901cf050e9842ec0bc2021-11-16T12:34:00+00:002021-11-16T12:34:00+00:00Arnaud BouchezmORMot Framework64bitAESAES-CTRAES-GCMAES-NiauthenticationCertificatesCrossPlatformDDDDelphiECCECDHECIESECSDAed25519EKONFreePascalinterfacelibdeflatemORMotmORMot2multithreadOpenSSLperformancerandomSOASourceWebSockets<p><a href="https://entwickler-konferenz.de/">EKON 25 at Düsseldorf</a> was a great conference (konference?).</p>
<p>At last, a <strong>physical</strong> gathering of Delphi developers, mostly from Germany, but also from the rest of Europe - and even some from the USA! No more virtual meetings, which may trigger the well-known 'Abstract Error' in modern pascal coders.<br />
There were some happy FPC users too - as I am now. <img src="https://blog.synopse.info?pf=smile.svg" alt=":)" class="smiley" /></p>
<p><img src="https://blog.synopse.info?post/public/blog/Ekon25.png" alt="" /></p>
<p>I have published the slides of my conferences, mostly about mORMot 2.<br />
By the way, I hope we will be able to officially release mORMot 2 in December, before Christmas. I think it is now becoming stable, and it is already known to be used in production. We expect no more breaking changes in the next weeks.</p> <p>Here are the slides of my two 1-hour sessions.</p>
<h5>mORMot Cryptography</h5>
<p>The OpenSource mORMot framework has a strong set of cryptography features. It offers symmetric cryptography with hashing and encryption, together with asymmetric cryptography via private/public key pairs. Its optimized pascal and assembly engines can be embedded into your executable, but you could also call an external OpenSSL library if needed. This session will present mormot.crypt.* units, and apply them to some use cases, from low-level algorithms to high-level JWT or file encryption and signing.</p>
<p><a href="https://www.slideshare.net/ArnaudBouchez1/ekon25-mormot-2-cryptography">mORMot 2 Cryptography on SlideShare</a></p>
<p>I just had an interesting discussion with Michael on <a href="https://gitlab.com/freepascal.org/fpc/source/-/commit/3229cb712e33374b85258aed43726058be633bed#note_734398698">the new FPC GitLab platform</a>: the FPC RTL is gaining some official cryptography functions, and I proposed to use the mORMot code base as reference, and to introduce some RTL wrapper functions which could redirect to a plain pascal FPC RTL version, or use other engines, like OpenSSL or mORMot, if available.</p>
<h5>Server-Side REST Notifications with mORMot</h5>
<p>The most powerful way of writing REST services is to define them via interfaces, then let the SOA/REST framework do all the routing, data marshalling and communication behind the scenes. One distinctive feature of mORMot is to define a method parameter as a notification interface, and let the server call back the client when needed, as with regular Delphi code. This session will present the benefit of defining REST services using interfaces, and how WebSockets can offer real-time notifications into your rich Delphi client applications.</p>
<p><a href="https://www.slideshare.net/ArnaudBouchez1/ekon25-mormot-2-serverside-notifications">mORMot 2 Server-Side Notifications on SlideShare</a></p>
<p>Feedback is <a href="https://synopse.info/forum/viewtopic.php?id=6051">welcome on our forum, as usual.</a></p>