New crc32c() function using optimized asm and SSE 4.2 instruction
Cyclic Redundancy Check (CRC) codes are widely used for integrity checking
of data in fields such as storage and networking.
There is an ever-increasing need for very high-speed CRC computations on processors for end-to-end integrity checks.
We just introduced to mORMot's core unit (SynCommons.pas) a fast and efficient crc32c() function.
It will use either:
- Optimized x86 asm code, with unrolled loops;
- SSE 4.2 hardware crc32 instruction, if available.
Resulting speed is very good.
This is for sure the fastest CRC function available in Delphi.
Note that there is a dedicated version for each of the Win32 and Win64 platforms - both perform at the same speed!
In fact, most popular file formats and protocols (Ethernet, MPEG-2, ZIP, RAR, 7-Zip, GZip, and PNG) use the polynomial $04C11DB7.
Intel's hardware implementation is based on another polynomial,
$1EDC6F41 (used in iSCSI and Btrfs).
So you would not use this new crc32c() function to replace zlib's crc32() function, but as a convenient, very fast hashing function at application level.
For instance, our TDynArray wrapper will use it for fast items hashing.
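To make the polynomial difference concrete, here is a minimal bit-by-bit sketch, written in Python rather than Delphi so that it can be checked against the standard zlib module. It assumes the usual reflected form of both algorithms (initial value and final XOR of $FFFFFFFF); the function and constant names are ours, for illustration only, and are not part of SynCommons.pas.

```python
import zlib

# Bit-reversed ("reflected") forms of the two polynomials discussed above.
CRC32C_POLY_REFLECTED = 0x82F63B78  # $1EDC6F41, used by iSCSI, Btrfs and SSE 4.2
CRC32_POLY_REFLECTED = 0xEDB88320   # $04C11DB7, used by zlib, ZIP, PNG, Ethernet


def crc32_bitwise(data: bytes, poly: int, crc: int = 0) -> int:
    """Reference CRC, one bit at a time: slow, but easy to verify."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; apply the polynomial when the dropped bit was set.
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


sample = b"123456789"
crc32c_value = crc32_bitwise(sample, CRC32C_POLY_REFLECTED)
crc32_value = crc32_bitwise(sample, CRC32_POLY_REFLECTED)

print(f"CRC-32C: {crc32c_value:08X}")
print(f"CRC-32:  {crc32_value:08X} (zlib gives {zlib.crc32(sample):08X})")
```

Same input, same algorithm, two unrelated checksums: this is why a CRC-32C routine cannot be swapped in wherever a file format expects the zlib value.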
Here are some speed results, run on a Core i7 notebook.
We hashed 10,000 random strings, from 1 to 1250 chars long.
- Our optimized unrolled x86 version - aka crc32cfast() - performs the test at a very good pace of 1.7 GB/s;
- The SSE 4.2 version - aka crc32csse42() - gives an amazing 3.7 GB/s speed (on both Win32 and Win64 platforms);
- A simple rolled version of the algorithm (similar to the one in the Delphi zlib unit) runs at 330 MB/s.
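The gap between the rolled and unrolled software versions comes from how many bytes are consumed per loop iteration. The Python sketch below contrasts a byte-at-a-time table lookup with a "slicing-by-4" variant, which consumes a 32-bit word per iteration using four tables. We are assuming this general technique for illustration; it is not a transcription of the crc32cfast() asm, whose exact unrolling may differ, and all names here are ours.

```python
POLY = 0x82F63B78  # reflected CRC-32C polynomial ($1EDC6F41)


def make_tables(count: int = 4):
    """Build `count` lookup tables of 256 entries each."""
    tables = [[0] * 256 for _ in range(count)]
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        tables[0][i] = crc
    # Table t holds the CRC of a byte followed by t zero bytes.
    for i in range(256):
        crc = tables[0][i]
        for t in range(1, count):
            crc = (crc >> 8) ^ tables[0][crc & 0xFF]
            tables[t][i] = crc
    return tables


TABLES = make_tables()


def crc32c_rolled(data: bytes, crc: int = 0) -> int:
    """Simple rolled loop: one table lookup per input byte."""
    t0 = TABLES[0]
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ t0[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


def crc32c_sliced(data: bytes, crc: int = 0) -> int:
    """Slicing-by-4: four independent lookups per 32-bit word."""
    t0, t1, t2, t3 = TABLES
    crc ^= 0xFFFFFFFF
    aligned = len(data) & ~3
    for i in range(0, aligned, 4):
        crc ^= int.from_bytes(data[i:i + 4], "little")
        crc = (t3[crc & 0xFF] ^ t2[(crc >> 8) & 0xFF]
               ^ t1[(crc >> 16) & 0xFF] ^ t0[crc >> 24])
    # Remaining 0..3 bytes fall back to the byte-at-a-time loop.
    for byte in data[aligned:]:
        crc = (crc >> 8) ^ t0[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

Both functions return the same value for any input. In compiled code, the sliced form has fewer dependent steps per byte, which is where the speed-up comes from; in Python the interpreter overhead dominates, so this only illustrates the structure.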
For comparison, on the same random content:
- Our optimized unrolled kr32() function (i.e. the standard Kernighan & Ritchie hash taken from "The C Programming Language", 2nd edition) hashes at 898.8 MB/s;
- Our simple proprietary Hash32() function runs at 2.5 GB/s, but with many more collisions.
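For readers who want to try a comparable measurement, here is a rough Python sketch of the test setup: random buffers of 1 to 1250 bytes, a throughput figure, and a collision count. It is an assumption-laden stand-in, not our actual Delphi benchmark. zlib.crc32 is used only because the Python standard library has no CRC-32C, and kr_hash() is the textbook hash * 31 + byte form, which may differ from the kr32() code in SynCommons.pas. Absolute numbers from an interpreter are not comparable with the figures above.

```python
import random
import time
import zlib


def kr_hash(data: bytes) -> int:
    """Textbook K&R-style hash (hash * 31 + byte), truncated to 32 bits."""
    h = 0
    for byte in data:
        h = (h * 31 + byte) & 0xFFFFFFFF
    return h


def make_samples(count=10000, min_len=1, max_len=1250, seed=1):
    """Generate `count` random byte strings with lengths in [min_len, max_len]."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        n = rng.randint(min_len, max_len)
        samples.append(rng.getrandbits(8 * n).to_bytes(n, "little"))
    return samples


def measure(hash_func, samples):
    """Return (throughput in MB/s, number of colliding distinct inputs)."""
    total_bytes = sum(len(s) for s in samples)
    start = time.perf_counter()
    hashes = [hash_func(s) for s in samples]
    elapsed = max(time.perf_counter() - start, 1e-9)
    # Distinct inputs that failed to get a distinct hash value.
    collisions = len(set(samples)) - len(set(hashes))
    return total_bytes / elapsed / 1e6, collisions


if __name__ == "__main__":
    data = make_samples()
    for name, func in (("zlib.crc32", zlib.crc32), ("kr_hash", kr_hash)):
        rate, collisions = measure(func, data)
        print(f"{name}: {rate:.1f} MB/s, {collisions} collisions")
```

Reporting collisions next to throughput matters here: a hash can be very fast and still be a poor choice for item hashing if it maps many distinct inputs to the same value.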