Tag - SynScaleMM

Entries feed - Comments feed

2020-05-07

New Multi-thread Friendly Memory Manager for FPC written in x86_64 assembly

As a gift to the FPC community, I just committed a new Memory Manager for FPC.
Check mormot.core.fpcx64mm.pas in our mORMot2 repository.
This is a stand-alone unit for FPC only.

It targets Windows and Linux multi-threaded Service applications - typically mORMot daemons.
It is written in almost pure x86_64 assembly, and some unique tricks in the Delphi/FPC Memory Manager world.

It is based on FastMM4 (not FastMM5), and we didn't follow the path of the FastMM4-AVX version - instead of AVX, we use plain good (non-temporal) SSE2 opcode, and we rely on the mremap API on Linux for very efficient reallocation. Using mremap is perhaps the biggest  benefit of this memory manager - it leverages a killer feature of the Linux kernel for sure. By the way, we directly call the Kernel without the need of the libc.

We tuned our x86_64 assembly a lot, and made it cross-platform (Windows and POSIX). We profiled the multi-threading, especially by adding some additional small blocks for GetMem (which is a less expensive notion of "arenas" as used in FastMM5 and most C allocators), introducing an innovatice and very efficient round-robin of tiny blocks (<128 bytes), and proper spinning for FreeMem and medium blocks.

It runs all our regression tests with huge performance and stability - including multi-threaded tests with almost no slow down: sleep is reported as less than 1 ms during a 1 minute test. It has also been validated on some demanding multi-threaded tasks.

Continue reading

2013-12-05

New Open Source Multi-Thread ready Memory Manager: SAPMM

Do you remember this former article about scalability of the Delphi memory manager, in multi-thread execution context?

Our SynScaleMM is still experimental.
But did pretty well, for an experiment!

At first, you can take a look at ScaleMM2, which is more stable, and based on the same ground.

But a new multi-thread friendly memory manager for Delphi just came out.
It is in fact the anonymous (and already famous) "NN memory manager" Primož talked about in his article about string building and memory managers.


(Note that in this article, our SynScaleMM was found to be scaling very well, but on the other hand, Primož did compile its benchmark program in Debug mode, so our TTextWriter was not in good shape: when you compile in Release mode, optimizations and inlining are ON, and our good TTextWriter just flies... See the note at the beginning of the article - this is why I never find those benchmarks very informative. I always prefer profiling from the real world with real useful process… and was never convinced by any such naive benchmark.)

OK, back to our business!

SapMM is an interesting beast.
https://code.google.com/p/sapmm

Sounds like if Alexei (the initial coder) has a C coding background. But that's fine when you have to deal with low-level structures and algorithms, as required by a memory manager. :)
It features everything we may ask for such a piece of code: clear design, optimized code (mostly by inlining process), memory leak reporting, some parameters for tuning.

It is only for Delphi XE (and up) under Win32 by now, but contributors are welcome!
It is used in production since more than half a year, and it passed all FastcodeMM benchmark tests.

If you want a direct link of the today's source code, without SVN, you may try this direct link from our site.
(but it probably will never be updated - you are warned)

Continue reading

2012-12-20

How to make it fast?

On our forum, a clever question was posted about publishing some enhanced RTL functions for newer versions of Delphi - as we did for Delphi 7 and 2007.

I was looking for a faster IntToStr implementation and discovered SynCommons.pas.
(....)
That's really too bad, SynCommons.pas really does contain some seriously fast stuff, people would greatly benefit from it if it was made general-purpose.

In fact, it would not be enough to change the RTL function implementations.
IMHO, to write something scalable, you need to get rid of such functions.

Continue reading

2011-12-08

Avoiding Garbage Collector: Delphi and Apple side by side

Among all trolling subject in forums, you'll find out the great Garbage Collection theme.

Fashion languages rely on it. At the core of the .Net and Java framework, and all scripting languages (like JavaScript, Perl, Python or Ruby), you'll find a Garbage Collector. New developers, just released from schools, do learn about handling memory only in theory, and just can't understand how is memory allocated - we all have seen such rookies involved in Delphi code maintenance, leaking memory as much as they type. In fact, most of them did not understood how a computer works. I warned you this will be a trolling subject.

And, in Delphi, there is no such collector. We handle memory in several ways:

  • Creating static variables - e.g. on the stack, inside a class or globally;
  • Creating objects with class instances allocated on heap - in at least three ways: with a try..finally Free block, with a TComponent ownership model in the VCL, or by using an interface (which creates an hidden try..finally Free block);
  • Creating reference-counted variables, i.e. string, array of, interface or variant kind of variables.

It is a bit complex, but it is also deadly powerful. You have several memory allocation models at hand, which can be very handy if you want to tune your performance and let program scale. Just like manual recycling at home will save the planet. Some programmers will tell you that it's a waste of cell brain, typing and time. Linux kernel gurus would not say so, I'm afraid.

Then came the big Apple company, which presented its new ARC model (introduced in Mac OS X 10.7 Lion) as a huge benefit for Objective-C in comparison with the Garbage Collection model. And let's face it: this ARC just sounds like the Delphi memory model.

Continue reading

2011-08-28

Multi-threading and Delphi

Writing working multi-threaded code is not easy - it's even hard, as as a Delphi expert just wrote in his blog.

In fact, the first step into multi-thread application development could be:

"protect your shared variables with locks (aka critical sections), because you are not sure that the data you read/write is the same for all threads".

The CPU per-core cache is just one of the possible issues, which will lead into reading wrong values. Another issue which may lead into race condition is two threads writing to a resource at the same time: it's impossible to know which value will be stored afterward.

Continue reading

2010-12-04

SynScaleMM - a multi-thread scaling Memory Manager

We just released a new unit to the source code repository.

It's a simple, small and compact MM, built on top of the main Memory Manager(FastMM4 is a good candidate, standard since Delphi 2007), architectured in order to scale on multi core CPU's (which is what FastMM4 is lacking).

Original code is ScaleMM - Fast scaling memory manager for Delphi by André Mussche.

Continue reading

When inlining works

It was told on this Blog that the Delphi memory manager (FastMM4 since Borland 2006 - but it was even worse with the previous "Borland's" MM), doesn't scale well on multi-code CPU. That is, if you have a multi-threaded application with a lot of memory handling (e.g. aString := aString+someString), the Delphi MM won't scale with multi cores. When I mean "don't scale", I mean that the optimistic though of "my CPU has 4 cores, therefore the same work run in 4 threads will be 4 times faster than with 1 thread" is false. It performs even worse with multiple threads than with 1 thread...

So we went into forking a nice project, named ScaleMM, and created our scalable optimized MM, named SynScaleMM. Our forked modifications were even included in the main ScaleMM branch.

During my profiling of our SynScaleMM, I discovered some very nice results with Delphi compiler inlining features.

Continue reading