Tag - performance

2021-05-14

Enhanced HTTP/HTTPS Support in mORMot 2

HTTP(S) is the main protocol of the Internet.
We enhanced the mORMot 2 socket client so that its implementation covers more use cases. The main new feature is perhaps WGET-like processing, with hashing, resuming, console feedback, and direct file download.

Continue reading

2021-05-08

Enhanced Faster ZIP Support in mORMot 2

The .zip format dates from the last century, back to the early DOS days, but can still be found everywhere. It is even hidden behind the scenes when you open a .docx document, run a .jar application, or install any Android app!
It is therefore (ab)used not only as an archive format, but also as an application file format / container - even if, in this respect, using SQLite3 may make much more sense.
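
As a minimal sketch of the "container" idea above, the following Python snippet builds a tiny in-memory .zip and lists its entries. It uses only the Python standard library, not mormot.core.zip.pas, and the entry names are hypothetical, loosely modelled on what a .docx file may contain.

```python
import io
import zipfile


def build_container():
    """Build a tiny in-memory .zip that mimics a document container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical entry names, loosely modelled on a .docx layout.
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document>Hello</document>")
    return buffer.getvalue()


def list_entries(data):
    """Return the entry names stored in a .zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


container = build_container()
# A .zip file starts with a local file header whose signature begins with "PK".
print(container[:2])
print(list_entries(container))
```

Any file using .zip as its container can be inspected this way, whatever extension it carries.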

We recently enhanced our mormot.core.zip.pas unit:

  • to support Zip64,
  • with enhanced .zip read/write,
  • with a huge performance boost during processing,
  • and to integrate better with signed executables.

Continue reading

2021-02-22

OpenSSL 1.1.1 Support for mORMot 2

Why OpenSSL? OpenSSL is the reference library for cryptography and secure TLS/HTTPS communication. It is part of most Linux/BSD systems, and covers a lot of use cases and algorithms. Even if it had some vulnerabilities in the past, it has been audited and validated for business use. Some algorithms  […]

Continue reading

2021-02-13

Fastest AES-PRNG, AES-CTR and AES-GCM Delphi implementation

Last week, I committed new ASM implementations of our AES-PRNG, AES-CTR and AES-GCM for mORMot 2.
They handle eight 128-bit blocks at once in an interleaved fashion, as permitted by the CTR chaining mode. The AES-NI opcodes (aesenc, aesenclast) are used for the AES processing, and the GMAC of the AES-GCM mode is computed using the pclmulqdq opcode.
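
To illustrate why CTR mode permits such interleaving, here is a small Python sketch. It is not the mORMot asm code: the block function is a stand-in built on SHA-256 (an assumption made only to keep the sketch dependency-free, it is NOT AES), and all function names are hypothetical. The point is that each keystream block depends only on the key, the nonce and its own counter value, so eight blocks can be computed independently and still give the same result as one-by-one processing.

```python
import hashlib

BLOCK_SIZE = 16  # bytes, i.e. one 128-bit block


def keystream_block(key, nonce, counter):
    # Stand-in for the AES encryption of (nonce || counter).
    # SHA-256 is NOT AES; it only keeps this sketch self-contained.
    data = key + nonce + counter.to_bytes(8, "big")
    return hashlib.sha256(data).digest()[:BLOCK_SIZE]


def ctr_sequential(key, nonce, data):
    """Process one 128-bit block per iteration."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        ks = keystream_block(key, nonce, offset // BLOCK_SIZE)
        chunk = data[offset:offset + BLOCK_SIZE]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)


def ctr_batched(key, nonce, data, lanes=8):
    """Process up to eight 128-bit blocks per iteration."""
    out = bytearray()
    batch_bytes = lanes * BLOCK_SIZE
    for start in range(0, len(data), batch_bytes):
        first = start // BLOCK_SIZE
        # Each lane only needs its own counter value: no lane waits for
        # the output of another one, which is what allows interleaving.
        keystream = b"".join(
            keystream_block(key, nonce, first + lane) for lane in range(lanes)
        )
        chunk = data[start:start + batch_bytes]
        out += bytes(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)


key, nonce = b"k" * 16, b"n" * 8
message = bytes(range(256)) * 2 + b"tail"
print(ctr_sequential(key, nonce, message) == ctr_batched(key, nonce, message))
```

A chained mode such as CBC encryption would not allow this, since each block there needs the previous ciphertext block as input.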

Resulting performance is amazing: on my simple Core i3, I reach 2.6 GB/s for aes-128-ctr, and 1.5 GB/s for aes-128-gcm for instance - the first being actually faster than OpenSSL!

Continue reading

2021-02-12

New AesNiHash for mORMot 2

I have just committed some new AesNiHash32, AesNiHash64 and AesNiHash128 hashers for mORMot 2. They use AES-NI and SSE4.1 opcodes on x86_64 and i386. This implementation is faster than the fastest SSE4.1 crc32c, with much higher usability (fewer collisions). Logic was extracted from the Go  […]

Continue reading

2020-11-04

EKON 24 Presentation Slides

EKON 24 just finished. "The conference for Delphi & more" was fully online this year, due to the viral context... But this was a great event, and I am very happy to have been part of it. Please find the slides on my two sessions: mORMot 2 Performance: from Delphi to AVX2 Of course,  […]

Continue reading

2020-06-05

SQlite3 Encryption Not Possible Any More Since 3.32.x

The latest SQLite3 3.32.x releases have a big problem with codecs.

Critical changes to the public SQLite code were introduced on Feb 7, 2020: “Simplify the code by removing the unsupported and undocumented SQLITE_HAS_CODEC compile-time option”. With the release of SQLite version 3.32.0 on May 22, 2020, these changes finally took effect, although they were not officially announced.

As a sad and unexpected consequence, we are NO LONGER able to compile the new SQLite3 amalgamation with our encryption patch.

Continue reading

2020-05-07

New Multi-thread Friendly Memory Manager for FPC written in x86_64 assembly

As a gift to the FPC community, I just committed a new Memory Manager for FPC.
Check mormot.core.fpcx64mm.pas in our mORMot2 repository.
This is a stand-alone unit for FPC only.

It targets Windows and Linux multi-threaded Service applications - typically mORMot daemons.
It is written in almost pure x86_64 assembly, and uses some tricks which are unique in the Delphi/FPC Memory Manager world.

It is based on FastMM4 (not FastMM5), and we didn't follow the path of the FastMM4-AVX version - instead of AVX, we use good old plain (non-temporal) SSE2 opcodes, and we rely on the mremap API on Linux for very efficient reallocation. Using mremap is perhaps the biggest benefit of this memory manager - it leverages a killer feature of the Linux kernel for sure. By the way, we call the kernel directly, without needing the libc.

We tuned our x86_64 assembly a lot, and made it cross-platform (Windows and POSIX). We profiled the multi-threading, especially by adding some additional small blocks for GetMem (which is a less expensive notion of "arenas" as used in FastMM5 and most C allocators), introducing an innovative and very efficient round-robin of tiny blocks (<128 bytes), and proper spinning for FreeMem and medium blocks.

It runs all our regression tests with huge performance and stability - including multi-threaded tests with almost no slowdown: sleep is reported as less than 1 ms during a 1 minute test. It has also been validated on some demanding multi-threaded tasks.

Continue reading

2020-03-28

Faster Double-To-Text Conversion

On the server side, a lot of CPU time is spent converting to or from text - mainly JSON these days.

In mORMot, we care a lot about performance, so we have rewritten most conversion functions to have something faster than what the Delphi or FPC RTL can offer.
Only float to text conversion was not available. And the performance of the RTL str/FloatToText functions, at least under Delphi, is not consistent among platforms.
So we just added a new Double-To-Text set of functions.

Continue reading

2020-02-17

New move/fillchar optimized sse2/avx asm version

Our Open Source framework includes some optimized asm alternatives to RTL's move() and fillchar(), named MoveFast() and FillCharFast().

We just rewrote from scratch the x86_64 version of those, which was previously taken from third-party snippets.
The brand new code is meant to be more efficient and maintainable. In particular, we switched to SIMD 128-bit SSE2 or 256-bit AVX memory access (if available), whereas the previous version was using 64-bit regular registers. Small blocks (i.e. < 32 bytes) are processed very often, e.g. when handling strings, so this path has been tuned a lot. Non-temporal instructions (i.e. bypassing the CPU cache) are used for the biggest chunks of data. We tested ERMS support, but found it of no benefit compared to our optimized SIMD, and it was actually slower than our non-temporal variants. So the ERMS code is currently disabled in the source, and may be enabled on demand by a conditional.

FPC move() was not bad. Delphi's Win64 version was far from optimized - even ERMS was poorly introduced in the latest RTL, since it should be triggered only for blocks > 2KB. Sadly, Delphi doesn't support AVX assembly yet, so those opcodes would be available only on FPC.

The resulting numbers speak for themselves. Working on Win64 and Linux, of course.

Continue reading

2018-11-12

EKON 22 Slides and Code

I've uploaded two sets of slides from my presentations at EKON 22 : Object Pascal Clean Code Guidelines Proposal High Performance Object Pascal Code on Servers with the associated source code The WorkShop about "Getting REST with mORMot" has a corresponding new Samples folder in our  […]

Continue reading

2018-03-12

New AES-based SQLite3 encryption

We just committed a deep refactoring of the SynSQlite3Static.pas unit - and all units using static linking for FPC. It also includes a new encryption format for SQlite3, using AES, so much more secure than the previous one. This is a breaking change, so worth a blog article! Now all static .o .a  […]

Continue reading

2018-02-07

Status of mORMot ORM SOA MVC with FPC

In the last weeks/months, we worked a lot with FPC.
Delphi is still our main IDE, due to its better debugging experience under Windows, but we aim to offer premium support for FPC, on all platforms, especially Linux.

The new Delphi Linux compiler is out of scope, since it is heavily priced, its performance is not so good, and ARC broke memory management, so it would need a deep review/rewrite of our source code, which we can't afford - since we have FPC which is, in our opinion, a much better compiler for Linux.
Of course, you can create clients for Delphi Linux and FMX, as usual, using the cross-platform client parts of mORMot. But for server side, this compiler is not supported, and will probably never be.

Continue reading

2016-04-09

AES-256 based Cryptographically Secure Pseudo-Random Number Generator (CSPRNG)

Everyone knows about the pascal random() function.
It returns some numbers, using a linear congruential generator, with a multiplier of 134775813, in its Delphi implementation.
It is fast, but not really secure. Output is very predictable, especially if you forgot to call the Randomize procedure, which initializes RandSeed.
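
As a rough sketch of why such a generator is predictable, here is a linear congruential generator in Python using the multiplier quoted above. The increment of 1, the 32-bit state and the scaling step are assumptions about how the Delphi RTL is commonly described, not something stated in this article, and the function names are hypothetical.

```python
MULTIPLIER = 134775813   # multiplier quoted in the text ($08088405)
INCREMENT = 1            # assumption: increment is not given in the text
STATE_MASK = 0xFFFFFFFF  # assumption: 32-bit state, like a RandSeed integer


def next_state(state):
    """Advance the generator by one step."""
    return (state * MULTIPLIER + INCREMENT) & STATE_MASK


def random_below(state, limit):
    """Return (new_state, value) with 0 <= value < limit."""
    state = next_state(state)
    # Assumption: the 32-bit state is scaled to the requested range.
    return state, (state * limit) >> 32


def sequence(seed, count, limit):
    """Generate count values below limit, starting from seed."""
    values = []
    state = seed
    for _ in range(count):
        state, value = random_below(state, limit)
        values.append(value)
    return values


# Same seed, same output: nothing is random once the seed is known.
print(sequence(0, 5, 100) == sequence(0, 5, 100))
```

Since the whole state is a single 32-bit integer, anyone who learns or guesses it can compute every following value, which is why such a generator should not be used for keys, nonces or salts.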

In real world scenarios, safety always requires random numbers, e.g. for key/nonce/IV/salt/challenge generation.
The less predictable, the better.
We just included a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) into our SynCrypto.pas unit.
The TAESPRNG class would use real system entropy to generate a sequence of pseudorandom bytes, using AES-256, so returning highly unpredictable content.

Continue reading

2015-11-17

Benefits of interface callbacks instead of class messages

If you compare with existing client/server SOA solutions (in Delphi, Java, C# or even in Go or other frameworks), mORMot's interface-based callback mechanism sounds pretty unique and easy to work with.

Most Events Oriented solutions do use a set of dedicated messages to propagate the events, with a centralized Message Bus (like MSMQ or JMS), or a P2P/decentralized approach (see e.g. ZeroMQ or NanoMsg). In practice, you are expected to define one class per message, the class fields being the message values. You would define e.g. one class to notify a successful process, and another class to notify an error. SOA services would eventually tend to be defined by a huge number of individual classes, with the temptation of re-using existing classes in several contexts.

Our interface-based approach allows gathering all events:

  • In a single interface type per notification, i.e. probably per service operation;
  • With one method per event;
  • Using method parameters defining the event values.
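
The three points above can be sketched as follows, in Python rather than Object Pascal and without any mORMot API. All names (ConversionCallback, on_success, on_error, convert) are hypothetical and only illustrate the shape of the design: one interface type, one method per event, and the event values passed as method parameters instead of one message class per event.

```python
from abc import ABC, abstractmethod


class ConversionCallback(ABC):
    """Hypothetical notification interface: one method per event."""

    @abstractmethod
    def on_success(self, job_id, output_size):
        ...

    @abstractmethod
    def on_error(self, job_id, reason):
        ...


class RecordingCallback(ConversionCallback):
    """Client-side implementation which simply records what it receives."""

    def __init__(self):
        self.events = []

    def on_success(self, job_id, output_size):
        self.events.append(("success", job_id, output_size))

    def on_error(self, job_id, reason):
        self.events.append(("error", job_id, reason))


def convert(job_id, payload, callback):
    """Hypothetical service operation notifying its caller via the interface."""
    if not payload:
        callback.on_error(job_id, "empty payload")
    else:
        callback.on_success(job_id, len(payload))


callback = RecordingCallback()
convert(1, b"abc", callback)
convert(2, b"", callback)
print(callback.events)
```

With a message-class design, the same two events would need two dedicated classes plus some dispatching code on the receiver side.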

Since asynchronous notifications are needed most of the time, method parameters would be one-way, i.e. defined only as const - in such a case, an evolved algorithm would transparently gather those outgoing messages, to enhance scalability when processing such asynchronous events. Blocking requests may also be defined, using var/out parameters, as we will see below, in Workflow adaptation.

Behind the scene, the framework would still transmit raw messages over IP sockets (currently over a WebSockets connection), like other systems, but events notification would benefit from using interfaces, on both server and client sides.
We will now see how...

Continue reading

2015-09-16

Feedback from the Wild

We just noticed some nice feedback from a mORMot user. Vojko Cendak commented on the well-known "DataSnap analysis based on Speed & Stability tests" blog article written by Roberto some months years (!) ago. It is not meant to be the final word, perhaps there was some tuning possible for RTC (which is  […]

Continue reading

2015-09-14

Performance issue in NextGen ARC model - much better now

Back in 2013, I found a weakness in the RTL implementation of ARC weak references.
A giant lock was freezing all threads and cores, so it would greatly reduce the performance of any ARC application, especially a multi-threaded one.

I just investigated, and things are now better.

Continue reading

2015-08-15

Breaking Change in mORMot WebSockets binary protocol

Among all its means of transmission, our mORMot framework features WebSockets, allowing bidirectional communications, and interface-based callbacks for real time notification of SOA events.
After several months of use in production, we identified some needed changes for this newly introduced feature.

We committed today a breaking change of the data layout used for our proprietary WebSockets binary protocol.
From our tests, it would increase the performance and decrease the resource consumption, especially in case of high number of messages.

Continue reading

2015-06-30

Faster String process using SSE 4.2 Text Processing Instructions STTNI

A lot of our code, and probably yours, relies heavily on text processing. In our mORMot framework, most features use JSON text, encoded as UTF-8. Profiling shows that a lot of time is spent computing the end of a text buffer, or comparing text content. You may know that in its SSE4.2 feature  […]

Continue reading

2015-06-21

Why FPC may be a better compiler than Delphi

Almost every time I'm debugging some core part of our framework, I like to see the generated asm, and trying to optimize the pascal code for better speed - when it is worth it, of course! I just made a nice observation, when comparing the assembler generated by Delphi to FPC's output. Imagine you  […]

Continue reading
