gromacs ns.c miscompiled on armel with -O1 -fcse-follow-jumps -finline-small-functions

Bug #693502 reported by Dr. David Alan Gilbert
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linaro GCC
Fix Released
Medium
Richard Sandiford
Linaro GCC Tracking
Fix Released
Undecided
Unassigned
gcc
Fix Released
Critical
gcc-4.5 (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

Gromacs (as included in the SPEC2006 benchmark) fails with the error:

Fatal error: One of the box vectors has become shorter than twice the cut-off length.

when built with -O2 but works with -O1; I've followed it down to
-O1 -fcse-follow-jumps -finline-small-functions failing, which seems to be the minimal set and this is the
ns.c file of Gromacs that needs those options to fail.

Dave

Related branches

visibility: public → private
Changed in gcc-linaro:
assignee: nobody → Dr. David Alan Gilbert (davidgil-uk)
Revision history for this message
Dr. David Alan Gilbert (davidgil-uk) wrote :

Mis-inlining of norm2 into search_neighbours; norm2 is a simple vector x[0]*x[0]+x[1]*x[1]+x[2]*x[2] but it looks like it's using the wrong base address for the 2nd and 3rd calls to norm2 (thinking it can share with the 1st one).

Dave

Revision history for this message
Dr. David Alan Gilbert (davidgil-uk) wrote :

I think it's just the 2nd call that's wrong; the calls are to norm2 all based off the same matrix, with the first call being at the base, the 2nd at base+12 and the 3rd at base+24; it looks to me like some of the accesses in the 2nd call are being made to base rather than base+12 and then shared with the loads from the 1st call.

From the rtl dumps, I think ns.c.169r.loop2_done is ok, but ns.c.193r.split2 is bad.

I'm not sure about ns.c.184r.subreg2 ; it looks like it's trying to post increment 12 onto the address but then I suspect it's that post increment that isn't being properly followed through:

-----------------------------------------------------------------
(insn 2430 105 106 2 ns.c:1677 (set (reg/f:SI 571 [ D.9349 ])
        (reg/v/f:SI 849 [ box ])) 591 {*thumb2_movsi_vfp} (nil))

(insn 106 2430 107 2 ns.c:1677 (set (reg:SF 861)
        (mem/s/j:SF (post_modify:SI (reg/f:SI 571 [ D.9349 ])
                (plus:SI (reg/f:SI 571 [ D.9349 ])
                    (const_int 12 [0xc]))) [0 S4 A32])) 598 {*thumb2_movsf_vfp} (expr_list:REG_INC (reg/f:SI 571 [ D.9349 ])
        (nil)))

(insn 107 106 109 2 ns.c:1677 (set (mem/s/j:SF (plus:SI (reg/f:SI 25 sfp)
                (const_int -12 [0xfffffffffffffff4])) [0 box_size+0 S4 A32])
        (reg:SF 861)) 598 {*thumb2_movsf_vfp} (nil))

(insn 109 107 110 2 ns.c:1677 (set (reg:SF 862)
        (mem/s/j:SF (plus:SI (reg/f:SI 571 [ D.9349 ])
                (const_int 4 [0x4])) [0 S4 A32])) 598 {*thumb2_movsf_vfp} (nil))

-----------------------------------------------------------------

then later we have:
(insn 131 129 133 4 vec.h:355 (set (reg:SF 389 [ D.11242 ])
        (mem:SF (plus:SI (reg/v/f:SI 849 [ box ])
                (const_int 12 [0xc])) [0 S4 A32])) 598 {*thumb2_movsf_vfp} (nil))

(insn 133 131 134 4 vec.h:355 (set (reg:SF 396 [ D.11235 ])
        (mem:SF (plus:SI (reg/f:SI 571 [ D.9349 ])
                (const_int 8 [0x8])) [0 S4 A32])) 598 {*thumb2_movsf_vfp} (nil))

(note 134 133 135 4 NOTE_INSN_DELETED)

(insn 135 134 136 4 vec.h:355 (set (reg:SF 869)
        (mult:SF (reg:SF 862)
            (reg:SF 862))) 615 {*mulsf3_vfp} (expr_list:REG_DEAD (reg:SF 862)
        (nil)))
-----------------------------------------------------------------

and I believe that insn 133/131/134 should be using the value from box+12+8, and insn 135 is uisng the SF 862 which
I believe should be the value from box+12+4 which will only happen if that post inc happens.

Dave

visibility: private → public
Revision history for this message
In , Ibolton (ibolton) wrote :
Download full text (3.7 KiB)

As of r162558 on trunk, and r164498 on 4.5 branch, SpecCPU2000 Ammp has failed for ARM -O3 thumb. The same patch was committed for both of these revisions, intending to fix pr45051:

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c (revision 164495)
+++ gcc/reload1.c (revision 164498)
@@ -8325,6 +8325,8 @@ delete_output_reload (rtx insn, int j, i
   int n_inherited = 0;
   rtx i1;
   rtx substed;
+ unsigned regno;
+ int nregs;

   /* It is possible that this reload has been only used to set another reload
      we eliminated earlier and thus deleted this instruction too. */
@@ -8376,6 +8378,12 @@ delete_output_reload (rtx insn, int j, i
   if (n_occurrences > n_inherited)
     return;

+ regno = REGNO (reg);
+ if (regno >= FIRST_PSEUDO_REGISTER)
+ nregs = 1;
+ else
+ nregs = hard_regno_nregs[regno][GET_MODE (reg)];
+
   /* If the pseudo-reg we are reloading is no longer referenced
      anywhere between the store into it and here,
      and we're within the same basic block, then the value can only
@@ -8387,7 +8395,7 @@ delete_output_reload (rtx insn, int j, i
       if (NOTE_INSN_BASIC_BLOCK_P (i1))
        return;
       if ((NONJUMP_INSN_P (i1) || CALL_P (i1))
- && reg_mentioned_p (reg, PATTERN (i1)))
+ && refers_to_regno_p (regno, regno + nregs, PATTERN (i1), NULL))
        {
          /* If this is USE in front of INSN, we only have to check that
             there are no more references than accounted for by inheritance. */

I assume the patch was meant to prevent deletions that shouldn't occur. This
might be what happens for the original symptomatic test, but I am now seeing
extra deletions that shouldn't happen for Ammp.

For example, without this patch, you get these insns somewhere in the ira dump
for mm_fv_update_nonbon() from rectmm.c:

 (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1)
    (plus:SI (reg:SI 1 r1)
       (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

 (insn 3164 3163 1730 107 rectmm.c:1041 (set (reg:SI 3 r3)
    (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil))

With the patch, you lose the add and just get this:

 (insn 3164 3161 1730 107 rectmm.c:1041 (set (reg:SI 3 r3)
    (reg:SI 1 r1)) 586 {*thumb2_movsi_vfp} (nil))

The incrementing of r1 is perfectly legitimate and useful and removing it is a
bug. Other increments of r9, ip, r0 and r3 are also lost.

I think the issue might be that reg_mentioned_p() considers output registers to
have been "mentioned", whereas the refers_to_regno_p() does not consider an
output register to have been "referred to". I can see the problem with only
using reg_mentioned_p() because it doesn't handle subregs, but there is also a
problem with only using refers_to_regno_p(), as we can see with this segfault
in Ammp.

I therefore wonder if the fix might be this:

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c (revision 168082)
+++ gcc/reload1.c (working copy)
@@ -8395,7 +8395,8 @@ delete_output_reload (rtx insn, int j, i
       if (NOTE_INSN_BASIC_BLOCK_P (i1))
        return;
       if ((NONJUMP_INS...

Read more...

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

Please post a preprocessed source file and the exact compiler options that must be passed to cc1 to produce incorrect assembly. If possible, also include the two dump files from before/after things go wrong.

Revision history for this message
In , Ibolton (ibolton) wrote :

Created attachment 22896
Preprocessed source, before/after .s file, before/after IRA dump

Contains the following:

pr47166/rectmm.i - the preprocessed source

pr47166/CMD - how to compile it

pr47166/rectmm.r164495.i.188r.ira - working IRA dump
pr47166/rectmm.r164495.s - working assembly

pr47166/rectmm.r164498.i.188r.ira - broken IRA dump
pr47166/rectmm.r164498.s - broken assembly

Note that the revision numbers represent working/broken revisions of gcc 4.5 branch.

Revision history for this message
In , Ibolton (ibolton) wrote :

>
> pr47166/CMD - how to compile it
>

Sorry, I forgot to give cc1 commands:

cc1 -fpreprocessed rectmm.i -quiet -dumpbase rectmm.i -mthumb -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=vfpv3-d16 -auxbase rectmm -O3 -version -fno-tree-vectorize -fdump-rtl-ira-details -o rectmm.s

Revision history for this message
In , Hp-gcc (hp-gcc) wrote :

(In reply to comment #0)
> (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1)
> (plus:SI (reg:SI 1 r1)
> (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

Please consider my request in <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>.

Revision history for this message
In , Hp-gcc (hp-gcc) wrote :

(In reply to comment #4)
> (In reply to comment #0)
> > (insn 3163 3161 3164 107 rectmm.c:1041 (set (reg:SI 1 r1)
> > (plus:SI (reg:SI 1 r1)
> > (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))
>
> Please consider my request in
> <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45051#c8>.

I mean, the <But r1 is an input as well as an output , i.e. "referred to", so insn 3163 should have matched without your patch. I'm missing analysis on why that didn't happen> part.

Changed in gcc-linaro:
status: New → Confirmed
assignee: Dr. David Alan Gilbert (davidgil-uk) → Richard Sandiford (rsandifo)
status: Confirmed → In Progress
Revision history for this message
In , Ibolton (ibolton) wrote :

> I mean, the <But r1 is an input as well as an output , i.e. "referred to", so
> insn 3163 should have matched without your patch. I'm missing analysis on why
> that didn't happen> part.

OK, I will do more analysis to try to determine what's going on.

Revision history for this message
In , Ibolton (ibolton) wrote :

I have altered the source, so that I call both refers_to_regno_p and reg_mentioned_p and print out when the two disagree. For the attached rectmm.i input file, there are only 5 disagreements:

mismatch for regno=2343: refersto=F,mentions=T
reg = (reg/f:SI 2343)
i1 = (insn 3162 1730 3165 99 rectmm.c:1041 (set (reg/f:SI 2343)
        (reg:SI 1 r1)) -1 (nil))

mismatch for regno=2355: refersto=F,mentions=T
reg = (reg/f:SI 2355)
i1 = (insn 3166 1745 3169 99 rectmm.c:1043 (set (reg/f:SI 2355)
        (reg:SI 9 r9)) -1 (nil))

mismatch for regno=2377: refersto=F,mentions=T
reg = (reg/f:SI 2377)
i1 = (insn 3170 1770 1731 99 rectmm.c:1045 (set (reg/f:SI 2377)
        (reg:SI 12 ip)) -1 (nil))

mismatch for regno=2349: refersto=F,mentions=T
reg = (reg/f:SI 2349)
i1 = (insn 3174 1737 3177 99 rectmm.c:1042 (set (reg/f:SI 2349)
        (reg:SI 0 r0)) -1 (nil))

mismatch for regno=2361: refersto=F,mentions=T - reg and i1 follow ...
reg = (reg/f:SI 2361)
il = (insn 3178 1752 1772 99 rectmm.c:1044 (set (reg/f:SI 2361)
        (reg:SI 3 r3)) -1 (nil))

This occurs because reg_mentioned_p looks at output regs, but refers_to_regno_p does not.

The knock-on effect of this difference must lead to those increments being lost, but I don't know why yet.

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

I think the real problem here is that when reloading autoincs, we somehow end up with

(gdb) p spill_reg_store[3]
$42 = (rtx) 0xf7a1118c
(gdb) pr
(insn 3163 3161 3164 99 rectmm.c:1041 (set (reg:SI 1 r1)
        (plus:SI (reg:SI 1 r1)
            (const_int 280 [0x118]))) 4 {*arm_addsi3} (nil))

which doesn't set R3 at all - insn 3164 does.

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

Created attachment 22945
Test patch

Could you try the following? It's a variant of a patch Richard Sandiford recently posted.

Revision history for this message
In , Ibolton (ibolton) wrote :

(In reply to comment #9)
> Created attachment 22945 [details]
> Test patch
>
> Could you try the following? It's a variant of a patch Richard Sandiford
> recently posted.

Thanks for looking into this.

The patch has worked on gcc 4.5 for Ammp, but I still need to test Applu (which is failing for current version of gcc 4.5 branch) and then Ammp on trunk. I'll get back to you. It's looking promising though.

Revision history for this message
In , Ibolton (ibolton) wrote :

I have now confirmed Richard's patch (http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00548.html) is doing the right thing on 4.5 branch and trunk, for the tests that were previously failing.

Thanks very much for debugging this, Bernd.
Many thanks for your patch, Richard.

Revision history for this message
In , Ibolton (ibolton) wrote :

(In reply to comment #9)
> Created attachment 22945 [details]
> Test patch
>
> Could you try the following? It's a variant of a patch Richard Sandiford
> recently posted.

Just noticed you said this is a "variant" of Richard's patch. Hopefully the two patches can be reconciled?

Revision history for this message
In , Ebotcazou (ebotcazou) wrote :

Please consider reverting the reload change on the branch. I don't think we want to keep fiddling with reload there.

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

(In reply to comment #13)
> Please consider reverting the reload change on the branch. I don't think we
> want to keep fiddling with reload there.

I don't see how this makes any sense. We've identified two bugs in reload, both leading to incorrect code, and you'd rather revert the fix for one than fix both?

Revision history for this message
In , Ebotcazou (ebotcazou) wrote :

> I don't see how this makes any sense. We've identified two bugs in reload, both
> leading to incorrect code, and you'd rather revert the fix for one than fix
> both?

Yes, because the first one wasn't a regression and fixing the second may well cause a third to pop up, you never know with reload. This is called RM...

Revision history for this message
In , Hp-gcc (hp-gcc) wrote :

(In reply to comment #15)
> the first one wasn't a regression and fixing the second may well
> cause a third to pop up, you never know with reload. This is called RM...

FWIW, you'd then be in danger of regressing 41085.
I have to say it won't be very RM:y to proactively unfix two bugs just because it's reload, given two separate fixes...

Revision history for this message
In , Ebotcazou (ebotcazou) wrote :

> I have to say it won't be very RM:y to proactively unfix two bugs just because
> it's reload, given two separate fixes...

Well, the second is still not fixed as of this writing, that's why I suggested reverting the problematic first fix in the first place. As for "just because it's reload", yes, that's precisely the point; patching reload is always risky.

Revision history for this message
In , Rguenth (rguenth) wrote :

So am I right and this is a regression on the branch? Please fill in
known-to-work and known-to-fail versions. Thx.

Revision history for this message
In , Hp-gcc (hp-gcc) wrote :

Correct dependencies to include PR41085 (which _was_ a 4.5 regression, unlike PR45051 which was only a regression on 4.6).

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

Author: bernds
Date: Sun Jan 23 21:11:24 2011
New Revision: 169144

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169144
Log:
 PR rtl-optimization/47166
 * reload1.c (emit_reload_insns): Disable the spill_reg_store
 mechanism for PRE_MODIFY and POST_MODIFY.
 (inc_for_reload): For PRE_MODIFY, return the insn that sets the
 reloadreg.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/reload1.c

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

So is this now fixed on the trunk? Can anyone run SPEC2k?

Revision history for this message
In , Ibolton (ibolton) wrote :

(In reply to comment #21)
> So is this now fixed on the trunk? Can anyone run SPEC2k?

I can run it. I will report back when done.

Revision history for this message
In , Ibolton (ibolton) wrote :

(In reply to comment #21)
> So is this now fixed on the trunk? Can anyone run SPEC2k?

Spec2K's Ammp now runs correctly for trunk, with -mthumb -O3.

The rest of Spec2K is OK too (apart from mesa, which is the subject of another bug).

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Fixed on the trunk then.

Revision history for this message
In , Ramana-gcc (ramana-gcc) wrote :

(In reply to comment #15)
> > I don't see how this makes any sense. We've identified two bugs in reload, both
> > leading to incorrect code, and you'd rather revert the fix for one than fix
> > both?
>
> Yes, because the first one wasn't a regression and fixing the second may well
> cause a third to pop up, you never know with reload. This is called RM...

I am curious to know what we are doing with this ?.

The patch for PR41085 caused this regression for ARM with the 4.5 branch and this fix is to fix the issue exposed by the fix for PR41085. Are we going to revert the fix for PR41085 or are we going to backport the current fix for this ?

Thanks
Ramana

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

I'll be testing the backport and committing the patch.

Revision history for this message
In , Ramana Radhakrishnan (ramana) wrote :

On Tue, Feb 1, 2011 at 1:35 AM, bernds at gcc dot gnu.org
<email address hidden> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47166
>
> --- Comment #26 from Bernd Schmidt <bernds at gcc dot gnu.org> 2011-02-01 01:34:59 UTC ---
> I'll be testing the backport and committing the patch.

Thanks - I wasn't sure I knew what was happening.

Ramana

>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
>

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

Author: bernds
Date: Fri Feb 11 16:01:19 2011
New Revision: 170053

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170053
Log:
 PR rtl-optimization/47166
 * reload1.c (emit_reload_insns): Disable the spill_reg_store
 mechanism for PRE_MODIFY and POST_MODIFY.
 (inc_for_reload): For PRE_MODIFY, return the insn that sets the
 reloadreg.

Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/reload1.c

Revision history for this message
In , Bernds-gcc (bernds-gcc) wrote :

I've had some problems with timeouts in my test setup, but I now have runs that differ only in that pthread1.cc times out for one multilib and no longer times out for another. I took that as good enough to commit the patch.

Revision history for this message
In , Ibolton (ibolton) wrote :

(In reply to comment #29)
> I've had some problems with timeouts in my test setup, but I now have runs that
> differ only in that pthread1.cc times out for one multilib and no longer times
> out for another. I took that as good enough to commit the patch.

Just wanted to confirm that it is working for me with 4.5 now as well. Thanks for all your efforts, Bernd.

Michael Hope (michaelh1)
Changed in gcc-linaro:
importance: Undecided → Medium
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

Richard has posted a merge request to fix this bug.

Changed in gcc:
importance: Unknown → Critical
status: Unknown → Fix Released
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

Richard's patch has now been committed to Linaro GCC 4.5.

Changed in gcc-linaro:
milestone: none → 4.5-2011.03-0
status: In Progress → Fix Committed
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

This is a backport from FSF 4.6.

Related: lp:gcc-linaro/4.5,revno=99482

Changed in gcc-linaro-tracking:
milestone: none → 4.6.0
status: New → Fix Committed
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Fix Committed → Fix Released
Revision history for this message
In , Rsandifo-gcc (rsandifo-gcc) wrote :

Author: rsandifo
Date: Mon Mar 14 13:46:12 2011
New Revision: 170939

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170939
Log:
gcc/testsuite/
 PR rtl-optimization/47166
 * gcc.c-torture/execute/postmod-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    trunk/gcc/testsuite/ChangeLog

Revision history for this message
In , Rsandifo-gcc (rsandifo-gcc) wrote :

Author: rsandifo
Date: Mon Mar 14 13:48:46 2011
New Revision: 170940

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170940
Log:
gcc/testsuite/
 PR rtl-optimization/47166
 * gcc.c-torture/execute/postmod-1.c: New test.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog

Revision history for this message
In , Rsandifo-gcc (rsandifo-gcc) wrote :

Author: rsandifo
Date: Tue Mar 15 09:38:07 2011
New Revision: 170982

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170982
Log:
gcc/testsuite/
 PR rtl-optimization/47166
 * gcc.c-torture/execute/postmod-1.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/postmod-1.c
Modified:
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog

Changed in gcc-linaro-tracking:
status: Fix Committed → Fix Released
Revision history for this message
Matthias Klose (doko) wrote :

fixed in natty

Changed in gcc-4.5 (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.