MongoDB client
The SynMongoDB.pas unit features direct optimized access to a MongoDB server.
It gives access to any BSON data, including documents, arrays, and MongoDB's custom types (like ObjectID, dates, binary, regex or JavaScript):
- For instance, a TBSONObjectID can be used to create some genuine document identifiers on the client side (MongoDB does not generate the IDs for you: a common way is to generate unique IDs on the client side);
- Generation of BSON content from any Delphi type (via TBSONWriter);
- Fast in-place parsing of the BSON stream, without any memory allocation (via TBSONElement);
- A TBSONVariant custom variant type, to store MongoDB's custom type values;
- Interaction with the SynCommons' TDocVariant custom variant type, as document storage with late-binding access;
- Marshalling BSON to and from JSON, with the MongoDB extended syntax for handling its custom types.
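As a minimal sketch of such client-side identifier generation - assuming the ComputeNew and ToText methods of the TBSONObjectID record as published in SynMongoDB.pas - a fresh ObjectID can be produced without any server round-trip:

```pascal
uses
  SynCommons, SynMongoDB;

var oid: TBSONObjectID;
begin
  // compute a new unique identifier on the client side:
  // by MongoDB's ObjectID design, no server round-trip is needed
  oid.ComputeNew;
  writeln(oid.ToText); // displays the 24 hexadecimal characters of the ID
end;
```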
This unit defines some objects able to connect to and manage databases and collections of documents on any MongoDB server farm:
- Connection to one or several servers, including secondary hosts, via the TMongoClient class;
- Access to any database instance, via the TMongoDatabase class;
- Access to any collection, via the TMongoCollection class;
- Some nice speed-oriented features, like BULK insert or delete mode, and explicit Write Concern settings.
At collection level, you can have direct access to the data, with high-level structures like TDocVariant/TBSONVariant, with easy-to-read JSON, or with low-level BSON content.
You can also tune most aspects of the client process, e.g. about error handling or write concerns (i.e. how remote data modifications are acknowledged).
Connecting to a server
Here is some sample code, which connects to a MongoDB server and returns the server time:
var Client: TMongoClient;
    DB: TMongoDatabase;
    serverTime: TDateTime;
    res: variant; // we will return the command result as TDocVariant
    errmsg: RawUTF8;
begin
  Client := TMongoClient.Create('localhost',27017);
  try
    DB := Client.Database['mydb'];
    writeln('Connecting to ',DB.Name); // will write 'mydb'
    errmsg := DB.RunCommand('hostInfo',res); // run a command
    if errmsg<>'' then
      exit; // quit on any error
    serverTime := res.system.currentTime; // direct conversion to TDateTime
    writeln('Server time is ',DateTimeToStr(serverTime));
  finally
    Client.Free; // will release the DB instance
  end;
end;
Note that for this low-level command, we used a TDocVariant and its late-binding abilities.
In fact, if you put your mouse over the res variable during debugging, you will see the following JSON content:
{"system":{"currentTime":"2014-05-06T15:24:25","hostname":"Acer","cpuAddrSize":64,"memSizeMB":3934,"numCores":4,"cpuArch":"x86_64","numaEnabled":false},"os":{"type":"Windows","name":"Microsoft Windows 7","version":"6.1 SP1 (build 7601)"},"extra":{"pageSize":4096},"ok":1}
And we simply access the server time by writing res.system.currentTime.
Adding some documents to the collection
We will now explain how to add documents to a given collection.
We assume that we have a DB: TMongoDatabase instance available.
Then we will create the documents with a TDocVariant instance, which will be filled via late-binding, using its doc.Clear pseudo-method to flush any previous property value:
var Coll: TMongoCollection;
    doc: variant;
    i: integer;
begin
  Coll := DB.CollectionOrCreate[COLL_NAME];
  TDocVariant.New(doc);
  for i := 1 to 10 do
  begin
    doc.Clear;
    doc.Name := 'Name '+IntToStr(i+1);
    doc.Number := i;
    Coll.Save(doc);
    writeln('Inserted with _id=',doc._id);
  end;
end;
Thanks to the TDocVariant late-binding abilities, the code is pretty easy to understand and maintain.
This code will display the following on the console:
Inserted with _id=5369029E4F901EE8114799D9
Inserted with _id=5369029E4F901EE8114799DA
Inserted with _id=5369029E4F901EE8114799DB
Inserted with _id=5369029E4F901EE8114799DC
Inserted with _id=5369029E4F901EE8114799DD
Inserted with _id=5369029E4F901EE8114799DE
Inserted with _id=5369029E4F901EE8114799DF
Inserted with _id=5369029E4F901EE8114799E0
Inserted with _id=5369029E4F901EE8114799E1
Inserted with _id=5369029E4F901EE8114799E2
It means that the Coll.Save() method was clever enough to detect that the supplied document did not have any _id field, so it computed one on the client side before sending the document data to the MongoDB server.
We may have written:
for i := 1 to 10 do
begin
  doc.Clear;
  doc._id := ObjectID; // compute a new ObjectID on the client side
  doc.Name := 'Name '+IntToStr(i+1);
  doc.Number := i;
  Coll.Save(doc);
  writeln('Inserted with _id=',doc._id);
end;
This computes the document identifier explicitly, before calling Coll.Save().
In this case, we may have called Coll.Insert() directly, which is somewhat faster.
Note that you are not obliged to use a MongoDB ObjectID as identifier. You can use any value, provided it is unique within the collection. For instance, you can use an integer:
for i := 1 to 10 do
begin
  doc.Clear;
  doc._id := i;
  doc.Name := 'Name '+IntToStr(i+1);
  doc.Number := i;
  Coll.Insert(doc);
  writeln('Inserted with _id=',doc._id);
end;
The console will now display:
Inserted with _id=1
Inserted with _id=2
Inserted with _id=3
Inserted with _id=4
Inserted with _id=5
Inserted with _id=6
Inserted with _id=7
Inserted with _id=8
Inserted with _id=9
Inserted with _id=10
Note that the mORMot ORM will compute a genuine series of integers in a similar way, which will be used as expected by the TSQLRecord.ID primary key property.
The TMongoCollection class can also write a list of documents, and send them at once to the MongoDB server: this BULK insert mode - close to the Array Binding feature of some SQL providers, and implemented in our SynDB classes (see BATCH sequences for adding/updating/deleting records) - can increase insertion speed by a factor of 10, even when connected to a local instance: imagine how much time it may save over a physical network!
For instance, you may write:
var docs: TVariantDynArray;
...
  SetLength(docs,COLL_COUNT);
  for i := 0 to COLL_COUNT-1 do
  begin
    TDocVariant.New(docs[i]);
    docs[i]._id := ObjectID; // compute new ObjectID on the client side
    docs[i].Name := 'Name '+IntToStr(i+1);
    docs[i].FirstName := 'FirstName '+IntToStr(i+COLL_COUNT);
    docs[i].Number := i;
  end;
  Coll.Insert(docs); // insert all values at once
...
You will find some numbers about the speed increase due to such BULK inserts later in this article.
Retrieving the documents
You can retrieve a document as a TDocVariant instance:
var doc: variant;
...
  doc := Coll.FindOne(5);
  writeln('Name: ',doc.Name);
  writeln('Number: ',doc.Number);
Which will write on the console:
Name: Name 6
Number: 5
You have access to the whole Query parameter, if needed:
doc := Coll.FindDoc('{_id:?}',[5]);
doc := Coll.FindOne(5); // same as previous
This Query filter is similar to a WHERE clause in SQL. You can write complex search patterns, if needed - see http://docs.mongodb.org/manual/reference/method/db.collection.find for reference.
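For instance, a query operator like $gte can be bound as a parameter - here is a hedged sketch, assuming a FindDocs() overload accepting an extended-JSON Criteria with bound parameters and a Projection argument, in the same spirit as the FindDoc() call above:

```pascal
var docs: TVariantDynArray;
    i: integer;
...
  // retrieve all documents with Number >= 5, using the $gte query operator
  Coll.FindDocs('{Number:{$gte:?}}',[5],docs,null);
  for i := 0 to high(docs) do
    writeln('Name: ',docs[i].Name,' Number: ',docs[i].Number);
```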
You can retrieve a list of documents, as a dynamic array of TDocVariant:
var docs: TVariantDynArray;
...
  Coll.FindDocs(docs);
  for i := 0 to high(docs) do
    writeln('Name: ',docs[i].Name,' Number: ',docs[i].Number);
Which will output:
Name: Name 2 Number: 1
Name: Name 3 Number: 2
Name: Name 4 Number: 3
Name: Name 5 Number: 4
Name: Name 6 Number: 5
Name: Name 7 Number: 6
Name: Name 8 Number: 7
Name: Name 9 Number: 8
Name: Name 10 Number: 9
Name: Name 11 Number: 10
If you want to retrieve the documents directly as JSON, you can write:
var json: RawUTF8;
...
  json := Coll.FindJSON(null,null);
  writeln(json);
...
This will append the following to the console:
[{"_id":1,"Name":"Name 2","Number":1},{"_id":2,"Name":"Name 3","Number":2},{"_id":3,"Name":"Name 4","Number":3},{"_id":4,"Name":"Name 5","Number":4},{"_id":5,"Name":"Name 6","Number":5},{"_id":6,"Name":"Name 7","Number":6},{"_id":7,"Name":"Name 8","Number":7},{"_id":8,"Name":"Name 9","Number":8},{"_id":9,"Name":"Name 10","Number":9},{"_id":10,"Name":"Name 11","Number":10}]
You can note that FindJSON() has two parameters, which are the Query filter, and a Projection mapping (similar to the column names in a SELECT col1,col2 statement).
So we may have written:
json := Coll.FindJSON('{_id:?}',[5]);
writeln(json);
Which would output:
[{"_id":5,"Name":"Name 6","Number":5}]
Note here that we used an overloaded FindJSON() method, which accepts the MongoDB extended syntax (here, the field name is unquoted), and parameters as variables.
We can specify a projection:
json := Coll.FindJSON('{_id:?}',[5],'{Name:1}');
writeln(json);
Which will only return the "Name" and "_id" fields (since _id is, by MongoDB convention, always returned):
[{"_id":5,"Name":"Name 6"}]
To return only the "Name" field, you can specify '_id:0,Name:1' as extended JSON for the projection parameter:
[{"Name":"Name 6"}]
There are other methods able to retrieve data directly as BSON binary data. They will be used for best speed, e.g. in conjunction with our ORM, but for most end-user code, using TDocVariant is safer and easier to maintain.
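For example - as a hedged sketch, assuming a FindBSON() method returning a TBSONDocument binary buffer with a signature similar to FindJSON(), and a BSONDocumentToJSON() conversion function, as published by SynMongoDB.pas - you could retrieve the raw BSON and only convert it to JSON when actually needed:

```pascal
var bson: TBSONDocument;
...
  // retrieve the raw BSON binary, without any TDocVariant allocation
  bson := Coll.FindBSON('{_id:?}',[5]);
  // convert to JSON only if/when needed
  writeln(BSONDocumentToJSON(bson));
```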
Updating or deleting documents
The TMongoCollection class has some methods dedicated to altering existing documents.
First, the Save() method can be used to update a document which has been retrieved beforehand:
doc := Coll.FindOne(5);
doc.Name := 'New!';
Coll.Save(doc);
writeln('Name: ',Coll.FindOne(5).Name);
Which will write:
Name: New!
Note that we used here an integer value (5) as key, but we may use an ObjectID instead, if needed.
The Coll.Save() method could be changed into Coll.Update(), which expects an explicit Query operator, in addition to the updated document content:
doc := Coll.FindOne(5);
doc.Name := 'New!';
Coll.Update(BSONVariant(['_id',5]),doc);
writeln('Name: ',Coll.FindOne(5).Name);
Note that by MongoDB's design, any call to Update() will replace the whole document.
For instance, if you write:
writeln('Before: ',Coll.FindOne(3));
Coll.Update('{_id:?}',[3],'{Name:?}',['New Name!']);
writeln('After: ',Coll.FindOne(3));
Then the Number field will disappear!
Before: {"_id":3,"Name":"Name 4","Number":3}
After: {"_id":3,"Name":"New Name!"}
If you need to update only some fields, you will have to use the $set modifier:
writeln('Before: ',Coll.FindOne(4));
Coll.Update('{_id:?}',[4],'{$set:{Name:?}}',['New Name!']);
writeln('After: ',Coll.FindOne(4));
Which will write on the console the value as expected:
Before: {"_id":4,"Name":"Name 5","Number":4}
After: {"_id":4,"Name":"New Name!","Number":4}
Now the Number field remains untouched.
You can also use the Coll.UpdateOne() method, which will update the supplied fields, and leave the non-specified fields untouched:
writeln('Before: ',Coll.FindOne(2));
Coll.UpdateOne(2,_Obj(['Name','NEW']));
writeln('After: ',Coll.FindOne(2));
Which will output as expected:
Before: {"_id":2,"Name":"Name 3","Number":2}
After: {"_id":2,"Name":"NEW","Number":2}
You can refer to the documentation of the SynMongoDB.pas unit to find out all the functions, classes and methods available to work with MongoDB.
Write Concern and Performance
You can take a look at the MongoDBTests.dpr sample - located in the SQLite3-MongoDB sub-folder of the source code repository - and the TTestDirect classes, to find out some performance information.
In fact, this TTestDirect class is inherited twice, to run the same tests with a different write concern.
The difference between the two classes will take place at client initialization:
procedure TTestDirect.ConnectToLocalServer;
...
  fClient := TMongoClient.Create('localhost',27017);
  if ClassType=TTestDirectWithAcknowledge then
    fClient.WriteConcern := wcAcknowledged
  else if ClassType=TTestDirectWithoutAcknowledge then
    fClient.WriteConcern := wcUnacknowledged;
...
wcAcknowledged is the default safe mode: the MongoDB server confirms the receipt of the write operation. Acknowledged write concern allows clients to catch network, duplicate key, and other errors. But it adds an additional round-trip from the client to the server, and waits for the command to be finished before returning the error status: so it will slow down the write process.
With wcUnacknowledged, MongoDB does not acknowledge the receipt of write operations. Unacknowledged is similar to errors ignored; however, drivers will attempt to receive and handle network errors when possible. The driver's ability to detect network errors depends on the system's networking configuration.
The speed difference between the two is worth mentioning, as stated by the regression tests status, running on a local MongoDB instance:
1. Direct access
1.1. Direct with acknowledge:
 - Connect to local server: 6 assertions passed  4.72ms
 - Drop and prepare collection: 8 assertions passed  9.38ms
 - Fill collection: 15,003 assertions passed  558.79ms
   5000 rows inserted in 548.83ms i.e. 9110/s, aver. 109us, 3.1 MB/s
 - Drop collection: no assertion  856us
 - Fill collection bulk: 2 assertions passed  74.59ms
   5000 rows inserted in 64.76ms i.e. 77204/s, aver. 12us, 7.2 MB/s
 - Read collection: 30,003 assertions passed  2.75s
   5000 rows read at once in 9.66ms i.e. 517330/s, aver. 1us, 39.8 MB/s
 - Update collection: 7,503 assertions passed  784.26ms
   5000 rows updated in 435.30ms i.e. 11486/s, aver. 87us, 3.7 MB/s
 - Delete some items: 4,002 assertions passed  370.57ms
   1000 rows deleted in 96.76ms i.e. 10334/s, aver. 96us, 2.2 MB/s
Total failed: 0 / 56,527 - Direct with acknowledge PASSED  4.56s
1.2. Direct without acknowledge:
 - Connect to local server: 6 assertions passed  1.30ms
 - Drop and prepare collection: 8 assertions passed  8.59ms
 - Fill collection: 15,003 assertions passed  192.59ms
   5000 rows inserted in 168.50ms i.e. 29673/s, aver. 33us, 4.4 MB/s
 - Drop collection: no assertion  845us
 - Fill collection bulk: 2 assertions passed  68.54ms
   5000 rows inserted in 58.67ms i.e. 85215/s, aver. 11us, 7.9 MB/s
 - Read collection: 30,003 assertions passed  2.75s
   5000 rows read at once in 9.99ms i.e. 500150/s, aver. 1us, 38.5 MB/s
 - Update collection: 7,503 assertions passed  446.48ms
   5000 rows updated in 96.27ms i.e. 51933/s, aver. 19us, 7.7 MB/s
 - Delete some items: 4,002 assertions passed  297.26ms
   1000 rows deleted in 19.16ms i.e. 52186/s, aver. 19us, 2.8 MB/s
Total failed: 0 / 56,527 - Direct without acknowledge PASSED  3.77s
As you can see, the reading speed is not affected by the Write Concern settings.
But data writing can be several times faster, when each write command is not acknowledged.
Since there is no error handling, wcUnacknowledged is not to be used in production. You may use it for replication, or for data consolidation, e.g. feeding a database with a lot of existing data as fast as possible.
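As an illustration, such a one-shot data consolidation could combine wcUnacknowledged with the BULK insert mode shown earlier - a sketch using only the methods introduced above (the 'mydb' and 'people' names are illustrative):

```pascal
var Client: TMongoClient;
    Coll: TMongoCollection;
    docs: TVariantDynArray;
    i: integer;
begin
  Client := TMongoClient.Create('localhost',27017);
  try
    Client.WriteConcern := wcUnacknowledged; // no per-write round-trip
    Coll := Client.Database['mydb'].CollectionOrCreate['people'];
    SetLength(docs,100000);
    for i := 0 to high(docs) do
    begin
      TDocVariant.New(docs[i]);
      docs[i]._id := ObjectID; // compute identifiers on the client side
      docs[i].Name := 'Name '+IntToStr(i+1);
    end;
    Coll.Insert(docs); // BULK insert, sent at once, not acknowledged
  finally
    Client.Free;
  end;
end;
```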
Stay tuned for the next article, which will detail the MongoDB integration's ORM... and the last one of the series, with benchmarks of MongoDB against SQL engines, via our ORM...
Feedback is welcome on our forum, as usual!