As stated by Wikipedia:
In computer programming, a branch table or jump table is a method of transferring program control (branching) to another part of a program (or a different program that may have been dynamically loaded) using a table of branch or jump instructions. It is a form of multiway branch. The branch table construction is commonly used when programming in assembly language but may also be generated by a compiler, especially when implementing an optimized switch statement where known, small ranges are involved with few gaps.
Here is a simple case ... of ... end statement, as found in our SynCrossPlatformJSON.pas unit:
case VType of
  {$ifndef NEXTGEN}
  vtString: result := string(VString^);
  vtAnsiString: result := string(AnsiString(VAnsiString));
  vtChar: result := string(VChar);
  vtPChar: result := string(VPChar);
  vtWideString: result := string(WideString(VWideString));
  {$endif}
  {$ifdef UNICODE}
  vtUnicodeString: result := string(VUnicodeString);
  {$endif}
  vtPWideChar: result := string(VPWideChar);
  vtWideChar: result := string(VWideChar);
  vtBoolean: if VBoolean then result := '1' else result := '0';
  vtInteger: result := IntToStr(VInteger);
  vtInt64: result := IntToStr(VInt64^);
  vtCurrency: DoubleToJSON(VCurrency^,result);
  vtExtended: DoubleToJSON(VExtended^,result);
  vtObject: result := ObjectToJSON(VObject);
  vtVariant: if TVarData(VVariant^).VType<=varNull then
      result := 'null' else begin
      wasString := VarIsStr(VVariant^);
      result := VVariant^;
    end;
  else result := '';
end;
Here is the code generated by Delphi on Win64:
SynCrossPlatformJSON.pas.727: case VType of
0000000000560F40 480FB64608 movzx rax,byte ptr [rsi+$08]
0000000000560F45 4883F809 cmp rax,$09
0000000000560F49 7F6B jnle VarRecToValue + $B6
0000000000560F4B 4883F809 cmp rax,$09
0000000000560F4F 0F842F010000 jz VarRecToValue + $184
0000000000560F55 4883F803 cmp rax,$03
0000000000560F59 7F33 jnle VarRecToValue + $8E
0000000000560F5B 4883F803 cmp rax,$03
0000000000560F5F 0F8496010000 jz VarRecToValue + $1FB
0000000000560F65 4883E801 sub rax,$01
0000000000560F69 4883F8FF cmp rax,-$01
0000000000560F6D 0F844F010000 jz VarRecToValue + $1C2
0000000000560F73 4885C0 test rax,rax
0000000000560F76 0F8419010000 jz VarRecToValue + $195
0000000000560F7C 4883E801 sub rax,$01
0000000000560F80 4885C0 test rax,rax
0000000000560F83 0F85C4010000 jnz VarRecToValue + $24D
0000000000560F89 E9A5000000 jmp VarRecToValue + $133
0000000000560F8E 4883E804 sub rax,$04
0000000000560F92 4885C0 test rax,rax
0000000000560F95 747C jz VarRecToValue + $113
0000000000560F97 4883E802 sub rax,$02
0000000000560F9B 4885C0 test rax,rax
0000000000560F9E 0F84A0000000 jz VarRecToValue + $144
0000000000560FA4 4883E801 sub rax,$01
0000000000560FA8 4885C0 test rax,rax
0000000000560FAB 0F859C010000 jnz VarRecToValue + $24D
0000000000560FB1 E956010000 jmp VarRecToValue + $20C
0000000000560FB6 4883F80D cmp rax,$0d
0000000000560FBA 7F32 jnle VarRecToValue + $EE
0000000000560FBC 4883F80D cmp rax,$0d
0000000000560FC0 0F8456010000 jz VarRecToValue + $21C
0000000000560FC6 4883E80A sub rax,$0a
0000000000560FCA 4885C0 test rax,rax
0000000000560FCD 0F84A1000000 jz VarRecToValue + $174
0000000000560FD3 4883E801 sub rax,$01
0000000000560FD7 4885C0 test rax,rax
0000000000560FDA 7447 jz VarRecToValue + $123
0000000000560FDC 4883E801 sub rax,$01
0000000000560FE0 4885C0 test rax,rax
0000000000560FE3 0F8564010000 jnz VarRecToValue + $24D
0000000000560FE9 E9F3000000 jmp VarRecToValue + $1E1
0000000000560FEE 4883E80F sub rax,$0f
0000000000560FF2 4885C0 test rax,rax
0000000000560FF5 745D jz VarRecToValue + $154
0000000000560FF7 4883E801 sub rax,$01
0000000000560FFB 4885C0 test rax,rax
0000000000560FFE 0F84CD000000 jz VarRecToValue + $1D1
0000000000561004 4883E801 sub rax,$01
0000000000561008 4885C0 test rax,rax
000000000056100B 0F853C010000 jnz VarRecToValue + $24D
0000000000561011 EB51 jmp VarRecToValue + $164
And here is the code generated by FPC on Win64:
mov eax, dword ptr [rsi] ; 0027 _ 8B. 06
cmp eax, 2 ; 0029 _ 83. F8, 02
jc ?_0067 ; 002C _ 72, 15
cmp eax, 3 ; 002E _ 83. F8, 03
stc ; 0031 _ F9
jz ?_0067 ; 0032 _ 74, 0F
sub eax, 12 ; 0034 _ 83. E8, 0C
cmp eax, 2 ; 0037 _ 83. F8, 02
jc ?_0067 ; 003A _ 72, 07
cmp eax, 4 ; 003C _ 83. F8, 04
stc ; 003F _ F9
jz ?_0067 ; 0040 _ 74, 01
clc ; 0042 _ F8
?_0067: setae byte ptr [rdi] ; 0043 _ 0F 93. 07
mov rax, qword ptr [rsi] ; 0046 _ 48: 8B. 06
cmp rax, 16 ; 0049 _ 48: 83. F8, 10
ja ?_0084 ; 004D _ 0F 87, 000001F6
lea rdx, [?_0086] ; 0053 _ 48: 8D. 15, 00000000(rel)
movsxd rax, dword ptr [rdx+rax*4] ; 005A _ 48: 63. 04 82
lea rax, [rdx+rax] ; 005E _ 48: 8D. 04 02
jmp rax ; 0062 _ FF. E0
...
?_0086 label dword ; switch/case jump table
dd ?_0077-$ ; 0000 _ 00000172 (rel)
dd ?_0075-$+4H ; 0004 _ 00000148 (rel)
dd ?_0070-$+8H ; 0008 _ 00000098 (rel)
dd ?_0080-$+0CH ; 000C _ 000001DF (rel)
dd ?_0068-$+10H ; 0010 _ 00000078 (rel)
dd ?_0084-$+14H ; 0014 _ 00000261 (rel)
dd ?_0071-$+18H ; 0018 _ 000000CC (rel)
dd ?_0081-$+1CH ; 001C _ 00000204 (rel)
dd ?_0084-$+20H ; 0020 _ 0000026D (rel)
dd ?_0074-$+24H ; 0024 _ 00000144 (rel)
dd ?_0073-$+28H ; 0028 _ 00000124 (rel)
dd ?_0069-$+2CH ; 002C _ 000000AB (rel)
dd ?_0079-$+30H ; 0030 _ 000001E0 (rel)
dd ?_0082-$+34H ; 0034 _ 0000023D (rel)
dd ?_0084-$+38H ; 0038 _ 00000285 (rel)
dd ?_0072-$+3CH ; 003C _ 00000114 (rel)
dd ?_0078-$+40H ; 0040 _ 000001CF (rel)
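As you can see, the FPC 2.7.1 compiler generates a branch table, whereas Delphi emits a chain of comparisons and conditional jumps, so the FPC code should perform much better.
After a single range check (cmp rax, 16 / ja), the movsxd rax, dword ptr [rdx+rax*4] table lookup, followed by lea and jmp rax, replaces a huge list of cmp/jz instructions.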
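To make the idea concrete, here is a minimal hand-written analogue of such a branch table in Pascal. It is only a sketch: the names THandler, Handlers, TagToText and the three handler functions are hypothetical and do not come from SynCrossPlatformJSON.pas, and it dispatches through an array of function pointers (an indirect call) rather than through the relative jump offsets the compiler emits. What it does share with the FPC listing is the shape: one range check, one indexed load, one indirect transfer.

```pascal
program DispatchSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}

uses
  SysUtils;

type
  // Hypothetical handler signature, for illustration only.
  THandler = function(Value: Integer): string;

function IntegerToText(Value: Integer): string;
begin
  Result := IntToStr(Value);
end;

function BooleanToText(Value: Integer): string;
begin
  if Value <> 0 then Result := '1' else Result := '0';
end;

function CharToText(Value: Integer): string;
begin
  Result := Chr(Value);
end;

const
  // The table: the index is the tag value, the entry is the target code.
  Handlers: array[0..2] of THandler = (
    IntegerToText, BooleanToText, CharToText);

function TagToText(Tag: Byte; Value: Integer): string;
begin
  if Tag > High(Handlers) then
    Result := '' // out of range: plays the role of the "else" branch
  else
    Result := Handlers[Tag](Value); // indexed load + indirect call
end;

begin
  WriteLn(TagToText(0, 42)); // prints 42
  WriteLn(TagToText(1, 1));  // prints 1
  WriteLn(TagToText(2, 65)); // prints A
  WriteLn(TagToText(9, 0));  // prints an empty line
end.
```

Whatever the number of entries, the cost of selecting a handler stays the same, which is the property the compiler-generated table gives to the case statement above.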
It sounds as if the open-source Free Pascal compiler generates better code than Delphi, not only for floating-point computations but also for simple general-purpose code.
BTW, the floating-point regression issue in XE6 was marked as resolved in QC and fixed in XE6 Update 1. But it is still slower than FPC on 32-bit...