[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[patches] fix ev500v2 build failure

The attached patch fixes a build failure with recent versions of GCC
when compiling eglibc for the PPC e500v2. The failure mode is that when

  /* Increment the counter.  */
  atomic_increment (&fromp->here->count);

from elf/dl-profile.c, which expands to:

  __asm__ __volatile__("1:	lwarx	%0,0,%2\n"
                       "	addi	%0,%0,1\n"
                       "	stwcx.	%0,0,%2\n"
                       "	bne-	1b"
                       :"=&b" __val, "=m" *&fromp->here->count
                       :"b" &fromp->here->count, "m" *&fromp->here->count
                       :"memory", "cr0");

GCC does not consider the output expression '*&fromp->here->count' to be
properly aligned (the e500 is a STRICT_ALIGNMENT target) and generates a
byte-wise load for the expression.  This places it in a register, rather
than in memory, as the constraint requires, which then causes an ICE.

'fromp' is a 'struct here_fromstruct':

  struct here_fromstruct
    struct here_cg_arc_record volatile *here;
    uint16_t link;

'struct here_cg_arc_record' is defined as follows:

  struct here_cg_arc_record
    uintptr_t from_pc;
    uintptr_t self_pc;
    uint32_t count;
  } __attribute__ ((packed));

The fix is to declare the 'count' member as being properly aligned
despite the 'packed' attribute declared on the containing structure:

struct here_cg_arc_record
    uintptr_t from_pc;
    uintptr_t self_pc;
    uint32_t count __attribute__((aligned(__alignof__(uint32_t))));
  } __attribute__ ((packed));

Otherwise GCC believes that 'count' could fall on a non-aligned address
and therefore exhibit the failure described above.  However, we need to
show that 'count' will always fall on a 'uint32_t' boundary.

Working backwards, the expected size of the data structure 'struct
here_cg_arc_record' resides in is:

  expected_size = (sizeof (struct gmon_hdr)
                   + 4 + sizeof (struct gmon_hist_hdr) + kcountsize
                   + 4 + 4 + fromssize * sizeof (struct here_cg_arc_record));

Everything in this expression is a multiple of four, except for
'kcountsize'.  If it weren't for the 'kcountsize' in there, then the
members of 'struct here_cg_arc_record' would always be at least
four-byte aligned and we would have no issues.  The question, then, is
whether 'kcountsize' is a multiple of four.

Looking at the source code, 'kcountsize' is determined by the following

  lowpc = ROUNDDOWN (mapstart + GL(dl_profile_map)->l_addr,
                     HISTFRACTION * sizeof (HISTCOUNTER));
  highpc = ROUNDUP (mapend + GL(dl_profile_map)->l_addr,
                    HISTFRACTION * sizeof (HISTCOUNTER));
  textsize = highpc - lowpc;
  kcountsize = textsize / HISTFRACTION;

'mapstart' and 'mapend' are page-aligned values, so they will be
multiples of four in any reasonable operating environment.  HISTFRACTION
is 2 and HISTCOUNTER is 'unsigned short', so 'HISTFRACTION * sizeof
(HISTCOUNTER)' will be 4.  So 'lowpc' and 'highpc' will be four-byte
aligned, as will 'textsize'.

The problem, then, is that 'textsize' is divided by 2, giving
'kcountsize' only two-byte alignment.  However, if we assume that
'GL(dl_profile_map)->l_addr' is *also* page-aligned and that
page-alignment implies at least eight-byte alignment, then 'lowpc' and
'highpc' also have at least eight-byte alignment.  It follows, then,
that kcountsize has at least four-byte alignment.  Which means that the
bits of 'struct here_arc_cg_record' always wind up on at least four-byte

And so we have shown that placing an '__attribute__ ((aligned
(__alignof__(uint32_t))))' on the 'count' member in 'struct
here_cg_arc_record' only serves to inform the compiler of how things
will look at runtime so the compiler can generate proper code.

OK to commit?

2007-03-12  Nathan Froyd  <froydnj@xxxxxxxxxxxxxxxx>

	* elf/dl-profile.c (struct here_cg_arc_record): Declare 'count'
	as being properly aligned.

Index: elf/dl-profile.c
--- elf/dl-profile.c	(revision 165331)
+++ elf/dl-profile.c	(working copy)
@@ -131,7 +131,7 @@ struct here_cg_arc_record
     uintptr_t from_pc;
     uintptr_t self_pc;
-    uint32_t count;
+    uint32_t count __attribute__((aligned(__alignof__(uint32_t))));
   } __attribute__ ((packed));
 static struct here_cg_arc_record *data;