<h2>Synopse Open Source - Tag - BigTable</h2>
<p>mORMot MVC / SOA / ORM and friends (feed updated 2024-02-02)</p>
<h3>Synopse Big Table 1.12a</h3>
<p>Published 2011-01-22, updated 2012-11-28, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi, MetaData, Source.</p>
<p><em>Synopse Big Table</em> is an open source Delphi unit for very fast data
storage and access, using key/value pairs, or records organized with
fields.</p>
<p>With this 1.12a version, the unit has evolved into a true field-oriented
database, with two new classes:<br />
- <em>TSynBigTableRecord</em> to store an unlimited number of
records with fields;<br />
- <em>TSynBigTableMetaData</em> to store any data (pictures, HTML, text)
associated with metadata fields.</p>
<p>Both classes handle variable-length storage of integers, floats, currency and
text (Unicode or not), each under a field name. They also support adding fields
on the fly, integrated indexing and search.<br />
Data access can be either fast direct access, or via late-binding (i.e. using
<em>Record.Field</em> in your Delphi code).</p>
<p>Classic <em>Key/Value</em> storage is still possible via
<em>TSynBigTable</em> or <em>TSynBigTableString</em>, but is now faster and
safer. A few issues were corrected.</p>
<p><strong>Update: version 1.12b has been published (same <a href="http://synopse.info/files/SynBigTable.zip">download link</a>).<br />
Some <a href="http://synopse.info/forum/viewtopic.php?id=230">issues with packing
and the two new class types have been fixed</a>.</strong></p> <h4>Database creation</h4>
<p>In order to understand how the two new classes work, we will create a new
database with some fields:</p>
<pre>
var Table: TSynBigTableRecord;
    FieldText, FieldInt: TSynTableFieldProperties;
begin
  Table := TSynBigTableRecord.Create('FileName.ext','TableName');
  FieldText := Table.AddField('text',tftWinAnsi,[tfoIndex]);
  FieldInt := Table.AddField('Int',tftInt32,[tfoIndex,tfoUnique]);
  Table.AddFieldUpdate;
</pre>
<p>The database will be stored in the <em>FileName.ext</em> file,
and will use <em>TableName</em> internally as its table name (this table name
will be used later, when our main framework interfaces with these classes via SQL;
you don't have to care about this for now).<br />
Its first field will be named TEXT and will contain some ANSI text (we don't need
true Unicode here, and WinAnsi will save some disk space). It will use an
index.<br />
The second field is named INT, and will contain a 32-bit integer value. It
will also have an index, and during record creation, every value will be
checked for uniqueness.</p>
<p>Of course, if the file already exists, the <em>AddField</em> calls won't do
anything: the field layout is stored in the file, so the fields won't be
created each time, only if needed.<br />
The <em>AddFieldUpdate</em> method must be called last, because if some
fields were just added to a file already containing some data, this method
will process the rows in order to prepare the storage of any new field.</p>
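<p>As a minimal sketch of the behaviour described above, the following program
reopens the same file and adds a third field to it. It only relies on the calls
shown earlier; the <em>Count</em> field name is hypothetical, the unit names in
the <em>uses</em> clause are an assumption, and so is the explicit
<em>UpdateToFile</em> call (assumed to be inherited from <em>TSynBigTable</em>).</p>
<pre>
program AddFieldSketch;
{$APPTYPE CONSOLE}
uses
  SynCommons, SynBigTable; // assumption: units providing the classes used here
var Table: TSynBigTableRecord;
begin
  // if the file exists, its field layout is read back from disk
  Table := TSynBigTableRecord.Create('FileName.ext','TableName');
  try
    // expected to change nothing when both fields are already in the file
    Table.AddField('text',tftWinAnsi,[tfoIndex]);
    Table.AddField('Int',tftInt32,[tfoIndex,tfoUnique]);
    // hypothetical new field, without index
    Table.AddField('Count',tftInt32,[]);
    // call last: existing rows are processed to make room for the new field
    Table.AddFieldUpdate;
    // assumption: flushes pending changes, as with TSynBigTable
    Table.UpdateToFile;
  finally
    Table.Free;
  end;
end.
</pre>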
<p>It's worth noting that the storage layout on disk will follow a
"performance" order: fixed size fields will be placed first in every record,
and indexed fields will also be first. Layout on disk won't follow the order in
which the <em>AddField</em> method has been called. You can even store integer
values in a variable-length format. It will be a bit slower, but it could save a lot of
disk space.<br />
In all cases, just know that it was designed to be fast, and to use as little disk
space as possible.</p>
<h4>Fields and records handling</h4>
<p>OK. We have a database with fields.<br />
But how do we handle the data?</p>
<p>There are two ways of handling fields: via direct access or via
late-binding.</p>
<p>Direct access will use a <em>TSynTableData</em> record type to
store the data of a table row:</p>
<pre>
var rec: TSynTableData;
    aID: integer;
begin
  rec.Init(Table.Table);
  rec.Field['TEXT'] := 'Some text';
  rec.SetFieldValue(FieldInt,12345);
  aID := Table.RecordAdd(rec);
  if aID=0 then
    ShowMessage('Error adding record');
</pre>
<p>The above code will initialize the local <em>rec</em> instance to work
with the <em>Table</em> field layout, via the <em>Table.Table</em>
property.<br />
Note that the <em>rec</em> instance is allocated on the stack: you
don't have to call any <em>rec.Free</em> or add any <em>try..finally</em>
block.<br />
Then a value is set to the TEXT field. The <em>rec.Field[fieldname]</em> property can be
read or set with any variant value.<br />
The INT field is accessed directly, via the <em>SetFieldValue</em> method (which
is faster than the <em>Field</em> property, because it's not necessary to search
for the field name).<br />
Then the record content is added to the database (<em>RecordAdd</em> returns
the ID of the added row).</p>
<p>Late-binding makes use of a custom variant type:</p>
<pre>
var vari: Variant;
begin
  vari := Table.VariantVoid;
  vari.text := 'Some text';
  vari.int := 12345;
  if Table.VariantAdd(vari)=0 then
    ShowMessage('Error adding record');
</pre>
<p>The above code will initialize the local <em>vari</em> instance with a custom
variant type "knowing" the <em>Table</em> field layout, via
the <em>Table.VariantVoid</em> property.<br />
Then, you can access the record fields just by using their names. The
custom variant type will retrieve the <em>TSynTableFieldProperties</em> field
by late-binding (i.e. during execution). An exception will be raised if
a field name is wrong.</p>
<p>Of course, this has a cost: using this variant type will be slower than
direct <em>TSynTableData</em> record access (and the fastest will
be the <em>TSynTableData.SetFieldSBFValue</em> method, because it won't
use any variant). But for common use, using a variant could make your code
cleaner.</p>
<p>You can retrieve a record field content by using one of the two types:</p>
<pre>
rec := Table.RecordGet(aID);
assert(rec.ID=aID);
assert(rec.GetFieldValue(FieldText)='Some text');
vari := Table.VariantGet(aID);
assert(vari.ID=aID);
assert(vari.Text='Some text');
</pre>
<p>Some dedicated methods are of course available to update or delete some
records.</p>
<h4>Search opportunities</h4>
<p>Both classes offer advanced search features.<br />
They can quickly iterate through all records looking for a value, or use an
internal index for immediate retrieval:</p>
<pre>
var IDs: TIntegerDynArray;
    Count: integer;
begin
  assert(Table.Search(FieldText,'Some text',IDs,Count));
  assert(Count=1);
  assert(IDs[0]=aID);
</pre>
<p>As shown in the above code, you can search for records matching a specified
field value. If the field has an index (an index can also be added later
to any existing field), the search will use it, and will be
immediate.<br />
The <em>Search</em> method returns its results in an array of integers,
containing all matching IDs.</p>
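<p>A minimal sketch of how the result array might be consumed, reusing the
<em>Table</em> and <em>FieldText</em> variables of the previous examples. It
reads only the first <em>Count</em> entries of <em>IDs</em>, on the assumption
that the array may be allocated larger than the number of matches.</p>
<pre>
var IDs: TIntegerDynArray;
    Count, i: integer;
    rec: TSynTableData;
begin
  if Table.Search(FieldText,'Some text',IDs,Count) then
    for i := 0 to Count-1 do begin
      rec := Table.RecordGet(IDs[i]); // load the matching row
      assert(rec.ID=IDs[i]);
      assert(rec.GetFieldValue(FieldText)='Some text');
    end;
</pre>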
<h4>Benchmarks</h4>
<p>Along with moderate disk space usage, speed is one major goal of this
unit.<br />
Thanks to its unique design, I think you have at hand the fastest database
engine for Delphi, much faster than any SQL engine around, in all cases.</p>
<p>Creating 1,000,000 records with some text and an integer value, both fields
using an index, and the integer field set as unique, takes less than 880 ms on my
laptop.<br />
Reading all 1,000,000 records and checking both field values takes 220 ms
with direct access, 360 ms using <em>TSynTableData</em>, and 1560 ms using
late-binding (i.e. using a variant type, which is, as expected, the slowest but
cleanest method).<br />
Writing the content to file takes about 70 ms, opening a file 30 ms, and adding a
field then recreating the file layout 470 ms.<br />
Searching for 50 text values by iterating takes 1970 ms; searching for 200 text
values using an index takes only 0.3 ms.<br />
Searching for 50 integer values by iterating takes 1660 ms; searching for 200
integer values using an index takes only 0.1 ms.<br />
The file is only 19 MB in size, including all data, indexes, and field layout.</p>
<p>We provide a sample executable with the source code, so that you can test
it on your own PC.</p>
<h4>Get the source and judge for yourself</h4>
<p>Available from <a href="http://synopse.info/fossil">our Source Code
repository</a> and <a href="http://synopse.info/files/SynBigTable.zip">from a
zip archive</a>.<br />
Compiles with Delphi 6 up to Delphi XE (fully Unicode-compatible, even before
Delphi 2009).<br />
Licensed under a <a href="http://synopse.info/forum/viewtopic.php?id=27">MPL/GPL/LGPL tri-license</a>,
ready to be embedded in any application. </p>
<p><a href="http://synopse.info/forum/viewtopic.php?id=200">Feedback, full
benchmarks and comments are welcome on our forum</a>.</p>
<h3>Synopse Big Table 1.12</h3>
<p>Published 2010-12-21, updated 2011-01-23, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi.</p>
<p><em>Synopse Big Table</em> is an open source Delphi unit for very fast data
storage and access, using key/value pairs.<br />
If you just need to save raw data on disk, and retrieve it with a unique ID
number or string, this unit could fit your needs.<br />
The unit has been deeply rewritten for the new version 1.12.</p>
<p>The main enhancements are a great speed improvement, lower disk space use, and new
dedicated methods (including direct update of any record content).</p>
<p>About this version 1.12:<br />
- this is a MAJOR update: the file format changed (new magics
$ABAB0004/5);<br />
- now uses the <code>SynCommons</code> unit (to avoid duplicated code);<br />
- buffered writing and reading to file: a major speed-up of the unit, since
the Windows file access APIs are very slow; for instance, reading now uses
memory-mapped files for best possible performance;<br />
- all previous caching (which was not working in fact) has been disabled (the caching is
now implemented more efficiently at OS level, within memory-mapped
files);<br />
- <code>TSynBigTableString</code> no longer has a 65535 key length
limitation;<br />
- values or UTF-8 keys of fixed size are now stored in the most efficient
way;<br />
- new <code>Update()</code> methods, allowing the content of any
record to be changed;<br />
- new <code>GetPointer()</code> methods, to retrieve a pointer to the data
directly in the memory-mapped buffer (faster than a standard Get() call);<br />
- new <code>GetAsStream()</code> methods, to retrieve data into an in-memory
stream, pointing into the memory-mapped buffer in most cases;<br />
- new <code>GetIterating()</code> method, which will loop over all data items,
calling a callback with pointers to each data element (very fast
method);<br />
- the <code>fDeleted[]</code> array is now stored in ascending order, to make the whole
unit faster.</p>
<p><br />
<a href="http://synopse.info/forum/viewtopic.php?pid=1011#p1011">New benchmarks
are impressive</a>.<br />
Here for 1,000,000 items of 8 bytes key/values (Y Axis is seconds):<br />
<img src="http://chart.apis.google.com/chart?chxl=1:|Write|Read&chxr=0,0,9.2|1,-5,100&chxt=y,x&chbh=a&chs=500x250&cht=bvg&chco=A2C180,3D7930,FF9900,FF3100,005BB1,FF999B,00E200&chds=0,9.2,0,9.2,0,9.2,0,9.2,0,9.2,0,9.2,0,9.2&chd=t:2.896,2.263|2.779,0.962|5.118,3.55|9.108,3.109|7.219,0.789|0.402,0.334|0.320,0.026&chdl=B%2B+tree+API+of+TC+(at+random)+|Quick+Database+Manager+1.8.77|New+Database+Manager+5.1|Berkeley+DB+4.6.21|Trivial+Database+1.0.6|Tokyo+Cabinet+|Synopse+Big+Table+1.12&chtt=DBM+performance" alt="" /></p>
<p>In short: it performs better than every other key/value library in this chart, even
Tokyo Cabinet.</p>
<p>Our results (on my laptop, i.e. less powerful than the Xeon quad core of the
reference pdf used for the other DBMs on this graph) with 1,000,000 items: write
time 320.2 ms, read time (new GetIterating() method) 26.8 ms, file size
18,984,259 bytes (18.1 MB).<br />
The new buffered reading and writing, and the use of memory-mapped files,
improve performance a lot.<br />
File storage has also been much enhanced: the indexes and offsets are stored in a
very optimized way, reducing the disk space needed.</p>
<p>It's worth adding that the main purpose of our unit is not to offer the fastest
access to small data items (e.g. 8 bytes key/values, as in this test), but to
store data items of any size (up to 1 GB) on disk, with no limit on total size
(64 bit indexes).</p>
<p>Available <a href="http://synopse.info/fossil/finfo?name=SynBigTable.pas">from our Source Code
repository</a> and from a <a href="http://synopse.info/files/SynBigTable.zip">zip archive</a> (with bundled exe
to make your own benchmarks). Compiles with Delphi 6 up to Delphi XE.</p>
<p><a href="http://synopse.info/forum/viewtopic.php?id=181">Feedback and
comments are welcome on our forum</a>.</p>
<h3>Synopse Big Table 1.9.2</h3>
<p>Published 2010-09-20, updated 2011-01-23, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi.</p>
<p>Synopse Big Table has been updated to version 1.9.2.<br />
It adds some new methods, and string key values can now safely be longer than
65535 chars.</p>
<p>New benchmark available:<br />
36 seconds for creating more than 150,000 records, storing 3 GB of data.<br />
393 ms to create 1,000,000 records, with an associated string key.<br />
Delphi rocks!</p> <p>Changes in version 1.9.2:<br />
- new TSynBigTable.GetLength() method;<br />
- new TSynBigTable.ReadToStream() method;<br />
- can set additional file open mode flags in TSynBigTable.Create;<br />
- fixed an obscure possible issue for saving/loading TSynBigTableString with
string IDs bigger in size than 65535 chars;<br />
- Range Checking forced OFF to avoid problems with some projects;<br />
- fFile type modified to THandle, instead of integer.</p>
<p>To be downloaded from <a href="http://synopse.info/files/SynBigTable.zip">http://synopse.info/files/SynBigTable.zip</a>.</p>
<p>I tried TSynBigTable with huge data, to benchmark our little
database engine.<br />
That is, more than 150,000 records, for a total of 3 GB of data.<br />
It worked without any problem, with good performance. The only bottleneck
here is the hard drive speed itself.</p>
<p>In short:<br />
- creating 150,450 records, for 3 GB of data: 36 seconds;<br />
- opening and reading all records: 2 seconds (all 3 GB of data remain in the
PC RAM);<br />
- opening and reading all records, random access: 3 seconds;<br />
- packing after deletion: 46 seconds (the whole 3 GB is read then written in
place);<br />
- the benchmarks are very close with integer keys (i.e. TSynBigTable) or string
keys (i.e. TSynBigTableString);<br />
- all benchmarks were performed on my laptop, with 6 GB of RAM, running under
Windows 7 64 bit, with Nod32 running, and an internal 2.5-inch hard
drive.</p>
<p>More details about this benchmark, comments and feedback are welcome <a href="http://synopse.info/forum/viewtopic.php?id=113">on our forum</a>.</p>
<h3>Synopse Big Table v1.8</h3>
<p>Published 2010-06-12, updated 2011-01-23, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi, Unicode.</p>
<p>The Synopse Big Table library has been updated to the 1.8 version.</p>
<p>Some bug fixes, a thread-safe mode and some new methods
(AddFile, GetPart).</p> <p>Here are the version history details:</p>
<p><em>Version 1.0</em><br />
- initial release <br />
<br />
<em>Version 1.1</em><br />
- Fix save on disk issue, when some items are deleted but none
added <br />
- enhanced unit testing procedure <br />
<br />
<em>Version 1.2</em><br />
- new TSynBigTableString class to store data from a UTF-8 encoded string
ID instead of a numerical ID<br />
- added caching for last Get() items (may speed up next Get() a little
bit) <br />
- custom Get() method for range retrieval into a dynamic
array <br />
- TSynBigTable modified in order to handle custom data in header (used
to store string IDs for TSynBigTableString for instance)<br />
- whole engine more robust against any file corruption or type
mismatch <br />
- Count property returned an incorrect value (including deleted
values) <br />
- added timing (in 1/10 ms) for test steps <br />
- version 1.2b: even (much) faster TSynBigTableString.Add() <br />
<br />
<em>Version 1.3</em><br />
- new Open() Read() and Seek() methods to read data like in a
TStream <br />
- new Clear method to flush the table and rebuild from
scratch <br />
- don't cache data bigger than 1 MB (to save RAM) </p>
<p><em>Version 1.4</em><br />
- added RawByteStringFromFile() and FileFromRawByteString()
procedures <br />
- added TSynBigTable.AddFile() method <br />
<br />
<em>Version 1.7</em><br />
- Thread safe version of the Synopse Big Table <br />
<br />
<em>Version 1.8</em><br />
- new GetPart() method for retrieving a part of a stored file (to be
used especially for big file content)<br />
- fix issue with files > 2 GB (thanks to sanyin for the
report) </p>
<p>You can freely download the updated 1.8 version from <a href="http://synopse.info/files/SynBigTable.zip">http://synopse.info/files/SynBigTable.zip</a></p>
<p>Released under a MPL/GPL/LGPL tri-license.</p>
<p>Enjoy!</p>
<h3>Synopse Big Table v1.3</h3>
<p>Published 2010-03-22, updated 2011-01-23, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi, Unicode.</p>
<p>Synopse Big Table has been updated to the 1.3 version.<br />
You can now access your data with the more familiar Open(), Read() and Seek()
methods. There are some other enhancements too.</p> <p>Here are the main changes in this version:<br />
- new Open() Read() and Seek() methods to read data like in
a <em>TStream</em><br />
- new Clear method to flush the table and rebuild from
scratch<br />
- don't cache data bigger than 1 MB (to save RAM)</p>
<p>You can freely download the 1.3 updated version from <a href="http://synopse.info/files/SynBigTable.zip">http://synopse.info/files/SynBigTable.zip</a></p>
<p>The purpose of this unit may not be very clear. Here are some
typical uses:</p>
<p><strong>What are these classes meant for?</strong><br />
- store thumbnails of pictures whose content is not intended to change<br />
- a logging or read-only audit trail mechanism (this kind of data is not often
deleted)<br />
- access to compressed data on a CD-ROM or DVD-ROM (see our Open Source
compression libraries on our web site); you can even add items to the list,
since they will remain in memory, but they will be lost when you close the
file<br />
- cache a huge quantity of generated HTML or XML pages<br />
- add a simple storage and data persistence to your application, in a
few KB of code<br />
- have an easy way to share data between Delphi 7 and Delphi 2010 applications,
without any Unicode headache<br />
- etc... etc...</p>
<p><strong>What are these classes NOT for?</strong><br />
- replacing NTFS - go to Linux and pick the right file system you want<br />
- storing hierarchical data (like directories) - use B-Tree instead<br />
- store big data items (more than a few MB)<br />
- store data elements which change a lot, and which are often deleted<br />
- replace a SQL database engine - use our <a href="http://blog.synopse.info/category/Open-Source-Projects/SQLite3-Framework">SQLite3
framework</a> instead<br />
- save your soul or make me rich</p>
<h3>Synopse Big Table</h3>
<p>Published 2010-03-16, updated 2011-01-23, by AB4327-GANDI. Category: Synopse BigTable. Tags: BigTable, Database, Delphi.</p>
<p>An open source Delphi unit for very fast data storage and access. If
you just need to save raw data on disk, and retrieve it with a unique ID
number, you can use this unit, which is much faster than any database engine.
Works from Delphi 2 to Delphi 2010. Licensed under a MPL/GPL/LGPL
tri-license.</p>
<p><em>TSynBigTable</em> is a class to store huge amounts of data, each item identified
by an integer ID:<br />
- data is stored in a single file<br />
- added data is appended at the end of this file (a caching
mechanism makes adding immediate)<br />
- added data is kept in memory until the <em>UpdateToFile</em> method is
called<br />
- retrieval is very fast (can't be faster IMHO)<br />
- data items can be deleted<br />
- the file can be packed using the <em>Pack</em> method in order to reclaim free
space from deleted entries (similar to a VACUUM command, but faster)<br />
- the total size of the file has no limit (other than your hard disk, of course)<br />
- the size limit of one data block depends on RAM (<em>RawByteString</em> is used as
storage for a data block)<br />
- before Delphi 2007, much faster when using the FastMM4 memory manager (see the
Project1.dpr source file)<br />
- after profiling, most of the time is spent in the Windows kernel, waiting
for the hard disk to write raw data; in all cases, this class is much faster than
any SQL engine storing BLOBs, and than plain Win32 files.<br />
<br />
Source code example (extracted from the <em>TestBigTable</em> function):</p>
<pre>
T := TSynBigTable.Create(FN);
try
  for i := 1 to n do
    if T.Add(CreateString(i))&lt;&gt;i then
      exit else
    if T.CurrentInMemoryDataSize&gt;10 shl 20 then // write on disk every 10 MB
      T.UpdateToFile;
  for i := 1 to n do
    if not T.Get(i,Data) or not TestString(i,Data) then
      exit;
finally
  T.Free;
end;
</pre>
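<p>For a single item, the same calls can be put together as a stand-alone
program. This is only a sketch: the <em>demo.big</em> file name is
hypothetical, and the unit names in the <em>uses</em> clause are an
assumption.</p>
<pre>
program KeyValueSketch;
{$APPTYPE CONSOLE}
uses
  SynCommons, SynBigTable; // assumption: units providing the types used here
var T: TSynBigTable;
    ID: integer;
    Data: RawByteString;
begin
  T := TSynBigTable.Create('demo.big'); // hypothetical file name
  try
    ID := T.Add('some raw content');    // Add returns the ID of the new item
    T.UpdateToFile;                     // flush the in-memory data to disk
    if T.Get(ID,Data) and (Data='some raw content') then
      writeln('item ',ID,' stored and read back');
  finally
    T.Free;
  end;
end.
</pre>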
<p>You can download the source code of the unit and the test program from
<a href="http://synopse.info/files/SynBigTable.zip">http://synopse.info/files/SynBigTable.zip</a></p>
<p>Note that the test program, embedding the full database engine, is less than
32 KB in size (without any UPX)...</p>