long long multiply-and-add far from optimal
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linaro GCC | Fix Released | Low | Unassigned |
Bug Description
This testcase compiles correctly, but gives suboptimal code on ARM:
long long foolong (long long x, short *a, short *b)
{
return x + (long long)*a * (long long)*b;
}
Here's the output from upstream gcc:
ldrsh r3, [r3, #0]
ldrsh r2, [r2, #0]
push {r4, r5}
asrs r4, r3, #31
asrs r5, r2, #31
mul r4, r2, r4
mla r4, r3, r5, r4
umull r2, r3, r2, r3
adds r3, r4, r3
adds r0, r0, r2
adc r1, r1, r3
pop {r4, r5}
bx lr
It really should look more like this:
ldrh r2, [r2, #0]
ldrh r3, [r3, #0]
smlalbb r0, r1, r2, r3
bx lr
It's the same in gcc-4.6.0-RC-20110314.
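To illustrate why the four-instruction sequence should be sufficient, here is a rough C model of what it computes. This is only a sketch: `smlalbb_model` and `foolong_expected` are hypothetical names made up for illustration, and the model assumes the accumulation does not overflow (the hardware instruction wraps, which is not modelled here). The point is that `smlalbb` sign-extends only the bottom halfword of each source register, so a zero-extending `ldrh` is good enough and no explicit 64-bit sign extension of the operands is needed.

```c
#include <stdint.h>

/* The testcase from the report, unchanged. */
long long foolong (long long x, short *a, short *b)
{
  return x + (long long)*a * (long long)*b;
}

/* Hypothetical helper: sign-extend the bottom 16 bits of a register value. */
int32_t bottom_half_signed (uint32_t reg)
{
  int32_t v = (int32_t)(reg & 0xffffu);
  return v >= 0x8000 ? v - 0x10000 : v;
}

/* Hypothetical model of SMLALBB: multiply the signed bottom halfwords of
   rn and rm and add the 32-bit product to a 64-bit accumulator.  The top
   halfwords of rn and rm are ignored.  Overflow of the accumulator is
   assumed not to happen in this sketch. */
int64_t smlalbb_model (int64_t acc, uint32_t rn, uint32_t rm)
{
  int32_t product = bottom_half_signed (rn) * bottom_half_signed (rm);
  return acc + (int64_t)product;
}

/* Model of the suggested sequence: two ldrh loads, then smlalbb. */
long long foolong_expected (long long x, short *a, short *b)
{
  uint32_t r2 = (uint16_t)*a;   /* ldrh zero-extends the halfword */
  uint32_t r3 = (uint16_t)*b;
  return smlalbb_model (x, r2, r3);
}
```

Because the product of two 16-bit values always fits in 32 bits, the separate `mul`/`mla`/`umull` cross terms in the current output are unnecessary for this testcase.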