Delphi or FPC Generics

Delphi has its own generics support since Delphi 2009. FPC 2.6 did follow, with a more template-like approach, but almost the same behavior in "Delphi mode".
Both compilers are still buggy about generics compilation. It is pretty easy to generate internal errors. Using generics on early Delphi revisions was almost impossible. Advanced generics work is not really possible before Delphi XE8. FPC seems more stable, but it has also limits and could be broken on some edge cases, and Lazarus parser is still confused about them.

But from generics come great expressiveness, and you can do wonders with them, like this per-type integer trick.
And it could induce some strong typing in your regular code, so they are worth considering.

Generics Collections

There are several collection libraries using generics, in the modern object pascal world, i.e. Delphi and FPC.

The one from the Delphi RTL was the first to exist, and has been enhanced a lot during the last decade. It uses compile time intrinsics like IsManagedType() GetTypeKind() SizeOf() to compile efficiently, and has a very verbose unrolled code for most type sizes.

The one from the FPC RTL is compatible with most of the Delphi RTL, and has been contributed by our friend Maciej (from DaThoX). Its somewhat huge code base has a lot of bells and whistles, like several hashing or memory expansion algorithms. Its performance was not the main point, but is was a good proof that FPC generics were stable and usable.

Spring4D is a well known and very well crafted library. It has a lot of very clever features, and is very close to what you could have in modern C# collection libraries.

All those generics collections tend to generate huge executable size. The Delphi compiler tries to reduce the redundant code, but the pre-compiled units are still huge (the Delphi .dcu files for instance), and the compile time and resource consumption suffer from it.

Spring4D Version 2

The upcoming Spring4D version 2 tries to resolve the binary bloat, and also the Spring4D 1.x performance issues - performance was not the main goal at that time, expressiveness and usefulness was.

This blog article is worth a read.
Here is an extract:

- as you know generics can cause quite a huge chunk of binary code because for every specialization the code is basically duplicated even for binary identical types including their RTTI. In 2.0 all RTTI for the implementing classes which you will never touch anyway is turned off and with some trickery, many generic specializations are folded and forced into the Spring.Collections.dcu to not pollute each and every dcu you compile a list or dictionary into. And it does not only shrink your binary, but also speeds up your compilation as the compiler simply has less work to do - no generating code (RTTI and executable code) into dcu which the linker later has to go through and eliminate duplicates.

Nice work Stephan!

Entering The mORMot Zone

Since the beginning, we have some very powerful data structures in mORMot.
Just to mention the TDynArray and TDynHashedDynArray wrappers, which have a lot of features, and are very efficient. They are used everywhere in the framework core. Also our TSynDictionary is well crafted, thread-safe, and has all the basic features you expect in your daily work on any kind of key/value efficient storage. Even with persistence!

Our new mormot.core.collections.pas unit published those two data structure algorithms as two sets of generic interfaces.

The IList<> type holds a (thread-safe) list of items, with the most useful methods:

  IList<T> = interface
    function Add(const value: T): PtrInt;
    procedure Insert(ndx: PtrInt; const value: T);
    function Delete(ndx: PtrInt): boolean;
    function Remove(const value: T): boolean;
    function Pop(var dest: T; opt: TListPop = []): boolean;
    procedure Clear;
    procedure Reverse;
    procedure Sort(customcompare: TDynArraySortCompare = nil); overload;
    procedure Sort(start, stop: integer;
      customcompare: TDynArraySortCompare = nil); overload;
    procedure Sort(var indexes: TIntegerDynArray;
      customcompare: TDynArraySortCompare = nil); overload;
    procedure Sort(const customcompare: TOnDynArraySortCompare;
      descending: boolean = false); overload;
    function AddSorted(const value: T; wasadded: PBoolean = nil): integer;
    function Sorted: boolean;
    function IndexOf(const value: T): PtrInt;
    function Find(const value: T; customcompare: TDynArraySortCompare = nil): PtrInt;
    function GetEnumerator: TSynEnumerator<T>;
    function Range(Offset: PtrInt = 0; Limit: PtrInt = 0): TSynEnumerator<T>;
    function First: pointer;
    function AsArray(Offset: PtrInt = 0; Limit: PtrInt = 0): TArray<T>;
    property Items[ndx: PtrInt]: T; default;
    property Count: PtrInt;
    property Capacity: PtrInt;
    property Comparer: TDynArraySortCompare;
    function Safe: PRWLock;
    function Data: PDynArray;
  end;

Not a lot of methods, but there is access to the underlying Data: PDynArray instance e.g. for JSON or binary serialization (unique among all libraries), built-in multi-read/exclusive-write thread-safety (Safe), stack-like or queue-like behavior (Pop), fast search with optional case insensitivity and hashed or binary (sorted) indexes (IndexOf() Find()).

Most collections libraries tend to multiply classes and types. There is a class<T> for a sorted list, another class<T> for a queue, another class<T> for a stack, another class<T> for hashed list, another class<T> for thread-oriented list, another class<T> for serializable list... in mORMot 2, you have one IList<> which features all, just by settings some options at factory level.
As often in mORMot, we started from the use cases and actual needs for your projects, not from how the collections are implemented.

The IKeyValue<> dictionary type is even smaller:

IKeyValue<TKey, TValue> = interface
  procedure Add(const key: TKey; const value: TValue);
  function TryAdd(const key: TKey; const value: TValue): boolean;
  function TryGetValue(const key: TKey; var value: TValue): boolean;
  function GetValueOrDefault(const key: TKey; const defaultValue: TValue): TValue;
  function Remove(const key: TKey): boolean;
  function Extract(const key: TKey; var value: TValue): boolean;
  function ContainsKey(const key: TKey): boolean;
  function ContainsValue(const value: TValue): boolean;
  function DeleteDeprecated: integer;
  procedure Clear; overload;
  function Count: integer;
  property Items[const key: TKey]: TValue; default;
  property Capacity: integer;
  property TimeOutSeconds: cardinal;
  function Data: TSynDictionary;
end;

Here the processing is fully thread-safe, and has some unique features from TSynDictionary like binary or JSON serialization, key lookup using an internal hash table, thread-safety (with several kind of locks, or no lock), and an optional TimeOutSeconds property to delete deprecated items after a while - typical process when caching data.
If those high-level thread-safe functions are not enough, you have access to the internal Data: TSynDictionary instance, and here you could work on the data with its own Safe lock, and run index-based methods, or more complex key/value processing via callbacks.

Our Collections.NewList<T> and Collections.NewKeyValue<TKey, TValue> factories leverage latest IsManagedType() GetTypeKind() SizeOf() compiler intrinsics for efficiency. For instance, a huge case GetTypeKind(...) of will be reduced at compile time into a single call to a specialized shared factory method - e.g. reusing TSynListSpecialized<integer> for all compatible types.

To be fair, both IList<> and IKeyValue<> are pretty basic, in comparison to RTL collections or Spring4D libraries.
But they do what they do: store values, or key/value pairs, in a straightforward and efficient way.
If you need to have some cascaded filters, or build a fluent interface, use Spring4D. But I guess that for most of usual work, our little interfaces are enough.

Face Your Interface

The first thing you will notice about our types, in respect to the RTL collections, is that they are defined as generic interface, not class.

One obvious benefit is that it may ease memory management. When the interface reference count reaches 0, the whole list or dictionary will be released and all its stored items will be freed - or not, depending on your needs. Nice and easy. No more try/finally to write.

But another benefit is that, as Spring4D 2.0 does, we will be able to reuse the VMT of each interface.

With a regular class, it you have a TObjectList<TObject1> and another TObjectList<TObject2> then the very same code will be generated twice, once for TObject1 and another time for TObject2. Even if both objects are in fact just... pointers in the type definition, and in the generated asm. A lot of duplicated code, which the compiler linker tries to identify and remove from the executable, but it is not always working. This is the "binary bloat" Stephan was talking about in his article.

Using interfaces and factories allow to reuse the very same interface for all similar types.
For instance, on Win32, if you use an IList<integer> or an IList<pointer> or an IList<TObject1> or an IList<TObject2>, they will use in fact a single interface - a IList<integer> to be precise, which is the so-called "specialized" 32-bit ordinal/pointer processing class. Only the interface definition is duplicated/specific. But a single implementation class will be reused for all those types, sharing the very same IList<> VMT.
And, in respect to plain RTL collections, this implementation class will stay within mormot.core.collections.pas, and not expanded/defined/generated in each unit using the IList<> definitions. Only the interface and its associated enumerators are generated in each unit. This reduces the .dcu .ppu size a lot, and also eases the compiler work and resources consumption.

To let this magic happen, you would not call any class instance - like TSynListSpecialized<T>, but you will use some factories:

var
  i: integer;
  li: IList<integer>;
begin
  li := Collections.NewList<integer>;
  li.Capacity := MAX + 1;       // faster Add() thanks to pre-allocation
  for i := 0 to MAX do          // populate with some data
    Check(li.Add(i) = i);
  for i := 0 to li.Count - 1 do // regular Items[] access
    Check(li[i] = i);
  for i in li do                // use an enumerator - safe and clean
    Check(cardinal(i) <= MAX);
  for i in li.Range(-5) do      // use an enumerator for the last 5 items
    Check(i > MAX - 5);
  for i in li do
    Check(li.IndexOf(i) = i);   // O(n) brute force search using SSE2 asm
end; // no need to set li := nil or write any try..finally Free end; block

And if you used instead:

  li := Collections.NewList<integer>([loCreateUniqueIndex]);

then it will use a hash table for the IList<integer>.Find() searches:

  for i in li do
    Check(li.Find(i) = i);      // O(1) hash table search

To create a dictionary, it is as simple as using the Collections.NewKeyValue<TKey, TValue> factory method.

Check our regression tests for more reference code about how to use those interfaces.

Behind The Scene

As I wrote, behind the scene, a TDynArray or a TSynDictionary is involved. So most of the code is using plain RTTI, but very optimized code. Not bloated per-type code generated for each type, but very tuned code.
For instance, searching for a byte, word or integer as in IList<...>.IndexOf() would use SSE2 fast assembly and not a naive for i := 0 to Count - 1 do... loop.

Please take a little time and look at how the mormot.core.collections.pas unit is written.
You will see that most of the code is the interface definitions (and comments/documentation), with small wrapper classes (one non-generic, another generic, to reduce the code size), and that the implementation part is mostly about the specialization to the main types to reuse the VMT. The resulting unit is much smaller and easier to debug and maintain than other libraries, which use generics down to the implementation - which we did not.

We tried to make our enumerators as efficient as possible. A for i in li do loop would make no call, will be efficiently inlined and only use a single pointer on the stack. No memory allocation involved.

Of course, mORMot 2 ORM methods could use those IList<TOrm> type, if you prefer to use this syntax when retrieving TOrm instances.
We still need to leverage their power within the interface definitions, for our SOA stack.
Stay tuned!

As usual, discussion and feedback are welcome in our forum!