Tempering Garbage Collection
By A.Bouchez on 2013, Wednesday July 24, 11:01 - Pascal Programming
I'm currently fighting out-of-memory errors on a heavily loaded Java server.
If only it had been implemented in Delphi and mORMot!
But at this time, the mORMot was still in its burrow.
Copy-On-Write and a good heap manager can do wonders of stability.
Here are some thoughts about Garbage Collectors, and how to temper
their limitations.
They may apply to both the JVM and the .Net runtime, by the way.
Some general patterns about Garbage Collection (GC):
- It is almost impossible to know how much memory a data structure actually uses at runtime, since objects that are no longer referenced may not have been collected yet, and therefore still occupy memory even though they are unreachable;
- Direct object references could in principle be handled by a simple internal reference-counting mechanism - until circular references appear. Sadly, most GC algorithms are much more complex than plain reference counting: since a GC favors allocation speed, it tends to allocate as many objects as possible, and to re-use and collect them as late as possible;
- You can force the GC to collect memory, but it is usually a blocking, stop-the-world process (so it may be a bad idea in a real-time service);
- And since the GC does not behave deterministically, you cannot be sure which heap-usage threshold would be a good trigger for a garbage collection;
- Some authors state that most GC algorithms expect 3 to 5 times the used memory to be available (i.e. if you expect 200 MB of data, you need 800 MB of free RAM for your process) - this is mostly due to performance optimization;
- On the other hand, giving too much memory may have the opposite effect and reduce overall performance, depending on how the VM works;
- From my experiments, the .Net memory model seems to be more aggressive than Java's, especially in multi-threaded processes.
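A minimal Java sketch (the class and method names are just illustrative) of two of the points above: heap figures reported at runtime still include not-yet-collected garbage, and forcing a collection is only a hint to the JVM, typically honored as a blocking, stop-the-world pause:

```java
public class GcProbe {
    // Bytes currently claimed on the heap - this still counts unreachable
    // objects that the GC has not collected yet
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Create short-lived garbage: none of it is reachable afterwards,
        // yet usedBytes() may still count it until the GC actually runs
        for (int i = 0; i < 1_000_000; i++) {
            String junk = new String("garbage-" + i);
        }
        System.out.println("after allocating: " + usedBytes() + " bytes");
        // Only a *request* to collect: the JVM may ignore it, and when it
        // honors it, the collection usually pauses the whole process
        System.gc();
        System.out.println("after System.gc(): " + usedBytes() + " bytes");
    }
}
```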
Some usual fixes/optimizations paths:
- Re-use existing objects instead of creating new instances (use object pools);
- Use arrays of pre-allocated objects, and restrict their use to POJOs/POCOs;
- Some memory structures use less memory and have less overhead (e.g. an array of structs in C# is much faster and uses much less memory than a list of objects);
- Limit object cloning/marshalling/wrapping as much as possible, and pass data by reference;
- Pre-allocate and re-use memory, e.g. for storing text (the string builder is the typical efficient pattern);
- Multi-threaded synchronization (object locking and monitoring) consumes a lot of resources, so instead of locking at the object level, mutexes around small critical sections of code are much more efficient;
- Do not create more threads than the number of CPU cores the process runs on - in general, one optimized thread is more efficient than multiple threads: the main processing should happen in one non-blocking thread, while other threads pre-process or post-process the data, e.g. when something slow like serialization or network access takes place;
- Profile the execution, then identify the real bottlenecks to optimize - for instance, working with many individual small files is an awful practice;
- Use fast unmanaged in-process storage (e.g. SQLite3, BerkeleyDB, memcached…) instead of storing long-term objects in GC memory.
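The object-pool advice above can be sketched in Java - the class and method names here are illustrative, not from any particular library. Recycled instances never become garbage, so allocation pressure stays low:

```java
import java.util.ArrayDeque;

// A minimal, single-threaded object pool: buffers are recycled instead of
// being left to the garbage collector
class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;

    BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Hand out a recycled buffer if one is available, else allocate
    byte[] acquire() {
        byte[] b = free.poll();
        return (b != null) ? b : new byte[bufferSize];
    }

    // Return a buffer so the next acquire() can re-use it
    void release(byte[] b) {
        free.push(b);
    }
}
```

A production pool would also bound its size and be thread-safe; this sketch only shows the re-use cycle.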
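Java has no value-type structs like C#, but the same memory-layout principle applies when choosing primitive arrays over boxed collections - a rough sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class Layout {
    public static void main(String[] args) {
        int n = 1_000_000;
        // Contiguous primitives: n * 4 bytes of payload, one allocation
        int[] packed = new int[n];
        for (int i = 0; i < n; i++) packed[i] = i;
        // Boxed list: one Integer object per element (header + padding),
        // plus a backing Object[] of references - several times the memory,
        // and n + 1 heap objects for the GC to track
        List<Integer> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) boxed.add(i);
        System.out.println(packed[n - 1] + " " + boxed.get(n - 1));
    }
}
```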
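The string-builder pattern mentioned above, shown in Java with two illustrative helper methods:

```java
public class Concat {
    // Naive concatenation: each += allocates a new String and copies the
    // whole accumulated text, producing O(n^2) work and lots of garbage
    static String slowJoin(String[] parts) {
        String s = "";
        for (String p : parts) s += p;
        return s;
    }

    // StringBuilder appends into one growing internal buffer,
    // re-using pre-allocated memory instead of creating garbage
    static String fastJoin(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);
        return sb.toString();
    }
}
```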
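Locking only a small critical section instead of a whole method, as suggested above, might look like this in Java (the counter is just a stand-in for shared state):

```java
public class Counter {
    private final Object lock = new Object(); // private monitor, small scope
    private long value;

    // Coarse: holds the object's monitor for the whole method,
    // including work that needs no synchronization at all
    public synchronized long slowIncrement() {
        long ignored = expensiveComputation(); // needlessly inside the lock
        return ++value;
    }

    // Fine: only the shared-state mutation is inside the critical section,
    // so other threads are blocked for a much shorter time
    public long fastIncrement() {
        long ignored = expensiveComputation(); // done outside the lock
        synchronized (lock) {
            return ++value;
        }
    }

    // Placeholder for thread-local work that does not touch shared state
    private long expensiveComputation() {
        long s = 0;
        for (int i = 0; i < 1000; i++) s += i;
        return s;
    }
}
```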
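One way to read the threading advice above: keep the hot path on a single non-blocking thread, and delegate slow steps to a small pool of helpers bounded by the CPU core count. A hypothetical sketch with a standard executor:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class Pipeline {
    // The fast processing stays on the calling thread; slow work goes to
    // a helper pool sized to the number of CPU cores
    static int run(int items) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService helpers = Executors.newFixedThreadPool(cores);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < items; i++) {
            // fast, non-blocking work would happen here, on this thread...
            helpers.submit(() -> {
                // ...while slow steps (serialization, network access)
                // are handled by the helper threads
                done.incrementAndGet();
            });
        }
        helpers.shutdown();
        helpers.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + run(100));
    }
}
```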
For server processes, or mobile execution, unmanaged environments like Delphi are still a perfect fit!