How to make it fast?

On our forum, a clever question was posted about publishing some enhanced RTL functions for newer versions of Delphi - as we did for Delphi 7 and 2007.

I was looking for a faster IntToStr implementation and discovered SynCommons.pas.
That's really too bad, SynCommons.pas really does contain some seriously fast stuff, people would greatly benefit from it if it was made general-purpose.

In fact, it would not be enough to change the RTL function implementations.
IMHO, to write something scalable, you need to get rid of such functions.

Continue reading


Go language and Delphi

Do you know the Go language?

It is a strong-typed, compiled, cross-platform, and concurrent.
It features some nice high-level structures, like maps and strings, and still have very low-level access to the generated code: pointers are there, in a safe strong-typed implementation just like in pascal, and there is even a "goto", which sounds like an heresy to dogmatic coders, but does make sense to me, at least when you want to optimize code speed, in some rare cases.

It is created/pushed by Google, used internally by the company in their computer farms, and was designed by one of the original C creators.

Continue reading


How function results are allocated

One potential issue with Delphi coding, is about how the result of a functions are implemented.

If you forget to set a result value to a function, you'll get a compiler warning.
Never underestimate such warning: IMHO this is not a warning, but an error.

And you should better be aware of the handling of reference-counted types (e.g. string) in a function results: those are passed the stack as var parameters, so the result of a function may be set even if an exception is raised during function execution!

Continue reading


Currency is your friend

The currency type is the standard Delphi type to be used when storing and handling monetary values. It will avoid any rounding problems, with 4 decimals precision. It is able to safely store numbers in the range -922337203685477.5808 .. 922337203685477.5807. Should be enough for your pocket change.

As stated by the official Delphi documentation:

Currency is a fixed-point data type that minimizes rounding errors in monetary calculations. On the Win32 platform, it is stored as a scaled 64-bit integer with the four least significant digits implicitly representing decimal places. When mixed with other real types in assignments and expressions, Currency values are automatically divided or multiplied by 10000.

In fact, this type matches the corresponding OLE and .Net implementation of currency, and the one used by most database providers (when it comes to money, a dedicated type is worth the cost in a "rich man's world"). It is still implemented the same in the Win64 platform (since XE 2). The Int64 binary representation of the currency type (i.e. value*10000 as accessible via PInt64(aCurrencyValue)^) is a safe and fast implementation pattern.

In our framework, we tried to avoid any unnecessary conversion to float values when dealing with currency values. Some dedicated functions have been implemented for fast and secure access to currency published properties via RTTI, especially when converting values to or from JSON text. Using the Int64 binary representation can be not only faster, but also safer: you will avoid any rounding problem which may be introduced by the conversion to a float type. Rounding issues are a nightmare to track - it sounds safe to have a framework handling natively a currency type from the ground up.

Continue reading


Using Extended in Delphi XE2 64 bit

Unfortunately, Delphi's 64-bit compiler (dcc64) and RTL do not support 80-bit extended floating point values on Win64, but silently alias Extended = Double on Win64.

There are situations, however, where this is clearly undesirable, e.g. if the additional precision gained from Extended is required.

The Open-source uTExtendedX87 unit provides a replacement FPU-backed 80-bit Extended floating point type (TExtendedX87) for Win64.

Continue reading


Multi-threading and Delphi

Writing working multi-threaded code is not easy - it's even hard, as as a Delphi expert just wrote in his blog.

In fact, the first step into multi-thread application development could be:

"protect your shared variables with locks (aka critical sections), because you are not sure that the data you read/write is the same for all threads".

The CPU per-core cache is just one of the possible issues, which will lead into reading wrong values. Another issue which may lead into race condition is two threads writing to a resource at the same time: it's impossible to know which value will be stored afterward.

Continue reading


Our mORMot won't hibernate this winter, thanks to FireMonkey

Everybody is buzzing about FireMonkey...

Our little mORMot will like FireMonkey!
Here is why...

Continue reading


Which Delphi compiler produces faster code?

After a question on StackOverflow, I wanted to comment about the speed of generated code by diverse Delphi compiler versions.

Since performance matters when we write general purpose libraries like ours, we have some feedback to propose:

Continue reading


Intercepting exceptions: a patch to rule them all

In order to let our TSynLog logging class intercept all exceptions, we use the low-level global RtlUnwindProc pointer, defined in System.pas.

Alas, under Delphi 5, this global RtlUnwindProc variable is not existing. The code calls directly the RtlUnWind Windows API function, with no hope of custom interception.

Two solutions could be envisaged:

  • Modify the Sytem.pas source code, adding the new RtlUnwindProc variable, just like Delphi 7; 
  • Patch the assembler code, directly in the process memory.

The first solution is simple. Even if compiling System.pas is a bit more difficult than compiling other units, we already made that for our Enhanced RTL units. But you'll have to change the whole build chain in order to use your custom System.dcu instead of the default one. And some third-party units (only available in .dcu form) may not like the fast that the System.pas interface changed...

So we used the second solution: change the assembler code in the running process memory, to let call our RtlUnwindProc variable instead of the Windows API.

Continue reading

True per-class variable

For our ORM, we needed a class variable to be available for each TSQLRecord class type.

This variable is used to store the properties of this class type, i.e. the database Table properties (e.g. table and column names and types) associated with a particular TSQLRecord class, from which all our ORM objects inherit.

The class var statement was not enough for us:
- It's not available on earlier Delphi versions, and we try to have our framework work with Delphi 6-7 up to XE;
- This class var instance will be shared by all classes inheriting from the class where it is defined - and we need ONE instance PER class type, not ONE instance for ALL

We needed to find another way to implement this class variable

An unused VMT slot in the class type description was identified, then each class definition was patched in the process memory to contain our class variable.

Continue reading


Calling a 64 bit library from a Delphi 32 bit process

Since we are still waiting for a Delphi 64 bit compiler, the only available solution to access a 64 bit library from an application written in Object pascal, is to use the 64 bit version of the FreePascal Compiler.

But you just can not recompile your VCL/GUI based Delphi application with FPC:

  • Some low-level part of your code may not be directly compatible with a 64 bit process (e.g. since the pointer size changed);
  • The GUI part of the application can not be ported directly with FPC - the Lazarus project try to be as close as possible to VCL, but it can be a very difficult, either impossible if you use some third-party components.
I just found out a solution from CodeCentral, allowing to call any 64 bit dll from a Delphi 32 bit process.

Continue reading


How to write fast multi-thread Delphi applications

How to make your software run fast, especially in a multi-threaded architecture?

We tried to remove the Memory Manager scaling problems in our SynScaleMM. It worked as expected in a multi-threaded server environment. Scaling is much better than FastMM4, for some critical tests. But it's not ready for production yet...

To be honest, the Memory Manager is perhaps not the bigger bottleneck in Multi-Threaded applications.

Here are some (not dogmatic, just from experiment and knowledge of low-level Delphi RTL) advice if you want to write FAST multi-threaded application in Delphi.

Continue reading


SynScaleMM - a multi-thread scaling Memory Manager

We just released a new unit to the source code repository.

It's a simple, small and compact MM, built on top of the main Memory Manager(FastMM4 is a good candidate, standard since Delphi 2007), architectured in order to scale on multi core CPU's (which is what FastMM4 is lacking).

Original code is ScaleMM - Fast scaling memory manager for Delphi by André Mussche.

Continue reading

When inlining works

It was told on this Blog that the Delphi memory manager (FastMM4 since Borland 2006 - but it was even worse with the previous "Borland's" MM), doesn't scale well on multi-code CPU. That is, if you have a multi-threaded application with a lot of memory handling (e.g. aString := aString+someString), the Delphi MM won't scale with multi cores. When I mean "don't scale", I mean that the optimistic though of "my CPU has 4 cores, therefore the same work run in 4 threads will be 4 times faster than with 1 thread" is false. It performs even worse with multiple threads than with 1 thread...

So we went into forking a nice project, named ScaleMM, and created our scalable optimized MM, named SynScaleMM. Our forked modifications were even included in the main ScaleMM branch.

During my profiling of our SynScaleMM, I discovered some very nice results with Delphi compiler inlining features.

Continue reading


Writing Delphi code for 64 bits compiler

There will be an upcoming 64 bits Delphi compiler. Embarcadero promised it.

Florian (the architect of FPC) showed a first "Hello world" program for Win64 in March 2006.
This was remarkable since GCC and the binutils don't even support this target at this time.
In fact, FPC used its Internal linker on Win32 and Win64 platforms, just like Delphi does.

Here are some points on how you could make your code ready to compile under FPC 64 bits, therefore (I hope) under future Delphi 64 bits compiler.

Continue reading


Compiler enhancement proposal: threadlocalvar

As I wrote in a previous post, Delphi string, dynamic array and memory manager don't like multi-core CPU.

My proposal is to add a threadlocalvar keyword, to be used instead of var in your code, to mark some variables to be used in only the current thread. Then the compiler and RTL won't have to use the LOCK instruction, and the application will be MUCH faster in multi-thread environment.

Continue reading


Delphi doesn't like multi-core CPUs (or the contrary)

If you're like me, you are proud of the new CPU your computer runs on - in my case a i7-720Q with 8 embedded cores...

But Delphi is not very multi-thread or multi-core friendly... guess why....

Continue reading


Mac OS X Stack Alignment, asm and trolls

In a very interesting commentPhiS spoke about the Mac OS X Stack Alignment problem, and the way asm code should be written for the future Cross Platform Delphi compiler. Here are some (hope without any Troll hidden) reflections I went through.

Continue reading


Fast JPEG decoder using SSE/SSE2 version 1.2

The Fast JPEG decoder using SSE/SSE2 library file has been updated, and is now in version 1.2, released under a MPL/GPL/LGPL tri-license. It's mainly a bug issue fix.

Continue reading


CopyRecord faster proposal

After some speed debates occurred in the Delphi community, I've rewritten the _CopyRecord function of the system.pas unit, with speed in mind.

Continue reading

- page 2 of 3 -