ORM Master/Slave Replication - Synopse Open Source

Enable synchronization

In order to enable this replication mechanism, you should define a TRecordVersion published property in the TSQLRecord class type definition:

  TSQLRecordPeopleVersioned = class(TSQLRecordPeople)
  protected
    fFirstName: RawUTF8;
    fLastName: RawUTF8;
    fVersion: TRecordVersion;
  published
    property FirstName: RawUTF8 read fFirstName write fFirstName;
    property LastName: RawUTF8 read fLastName write fLastName;
    property Version: TRecordVersion read fVersion write fVersion;
  end;

Only a single TRecordVersion field is allowed per TSQLRecord class - it would not mean anything to manage more than one field of this type.

Note that this field will be somewhat "hidden" to most ORM process: a regular TSQLRest.Retrieve won't fill this Version property, since it is an internal implementation detail.
If you want to lookup its value, you would have to explicitly state its field name at retrieval. Any TRecordVersion is indeed considered as a "non simple field", just like BLOB fields, so would need explicit retrieval of its value.

In practice, any TSQLRest.Add and TSQLRest.Update on this TSQLRecordPeopleVersioned class will increase this Version revision number field, and a TSQLRest.Delete will populate an external TSQLRecordTableDelete table with the ID of the deleted record, associated with a TRecordVersion revision.

As consequences:

The monotonic TRecordVersion number is shared at TSQLRestServer level, among all tables containing a TRecordVersion published field;
The TSQLRecordTableDelete table should be part of the TSQLModel, in conjunction with TSQLRecordPeopleVersioned;
If the TSQLRecordTableDelete table is not part of the TSQLModel, the TSQLRestServer will add it - but you should better make it explicitly appearing in the data model;
A single TSQLRecordTableDelete table will maintain the list of all deleted data rows, of all tables containing a TRecordVersion published field;
The TSQLRecordPeopleVersioned table appearance order in the TSQLModel will matter, since TSQLRecordTableDelete.ID will use this table index order in the database model to identify the table type of the deleted row - in a similar way to TRecordReference and TRecordReferenceToBeDeleted.

All the synchronization preparation will be taken care by the ORM kernel on its own, during any write operation. There is nothing particular to maintain or setup, in addition to this TRecordVersion field definition, and the global TSQLRecordTableDelete table.

From master to slave

To replicate this TSQLRecordPeopleVersioned table from another TSQLRestServer instance, just call the following method:

 aServer.RecordVersionSynchronizeSlave(TSQLRecordPeopleVersioned,aClient);

This single line will request a remote server via a Client: TSQLRestClientURI connection (which may be over HTTP) for any pending modifications since its last call, then will fill the local aServer: TSQLRestServer database so that the local TSQLRecordPeopleVersioned table will contain the very same content as the remote master TSQLRestServer.

You can safely call TSQLRestServer.RecordVersionSynchronizeSlave from several clients, to replicate the master data in several databases.

Only the modified data will be transmitted over the wire, as two REST/JSON queries (one for the insertions/updates, another for the deletions), and all the local write process will use optimized BATCH writing. This means that the synchronization process will try to use as minimal bandwidth and resources as possible, on both sides.

Of course, the slaves should be considered as read-only, otherwise the version numbers may conflict, and the whole synchronization may become a failure.
But you can safely replicate servers in cascade, if needed: the version numbers will be propagated from masters to slaves, and the data will always be in a consistent way.

Replication use cases

We may consider a very common corporate infrastructure:

This kind of installation, with a main central office, and a network of local offices, would benefit from this master/slave replication.
Simple redirection may be used - see Redirect to an external TSQLRest - but it would expect the work to continue, even in case of Internet network failure.
REST redirection would expect a 100% connection uplink, which may be critical in some cases.

You could therefore implement replication in several ways:

Either the main office is the master, and any write would be push to the Main Server, whereas local offices would have a replicated copy of the information - drawback is that in case of network failure, the local office would be able to only read the data;
Or each local office may host its own data in a dedicated table, synchronized as a master database; the main office will replicate (as a slave) the private data of each local servers; in addition, all this data gathered by the Main Server may be further replication to the other local offices, and be still accessible in read mode - in case of network failure, all the data is available on the local servers, and the local private table is still writable.

Of course, the second solution seems preferable, even if a bit more difficult to implement. The ablity of all local offices to work offline on their own private data, but still having all the other data accessible as read-only, would be a huge ROI.

As a benefit of using replication, the central main server would be less stressed, since most of the process would take place in local servers, and the main office server would only be used for shared data backup and read-only gathering of the other local databases. Only a small network bandwith would be necessary (much less than a pure web solution), and CPU/storage resources would be minimal.

Feedback is welcome on our forum, as usual!
See also the associated feature request, for upcoming features related to replication.
And the corresponding documentation article, which will be updated often, so is preferred to this static blog article!