[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[RP] Re: window numbering bug/race?

From: Mark Eichin
Subject: [RP] Re: window numbering bug/race?
Date: Sat, 26 Nov 2005 15:34:01 -0500

Ah, you found the line.  I've been poking at it for a couple of days,
and just found that line too, but from a different direction...

It *is* a code bug, not a compiler bug.  It's tricky though:
numset_find_empty_cell realloc's numbers taken... *which can cause it
to move*.  So the original assignment would then be writing into the
old memory address, if it looks up ns->numbers_taken before the call
and then makes the call and then does the assignment...  what made me
see this was printing out the numset and seeing it go from 0 1 2 3 4 5
6 7 8 9 to 0 1 2 3 4 5 6 7 8 9 65529, when the assignment clearly
*should* have been happenning.

So, this might be *masked* on some platforms by compiler differences
(though I'd have to dig into the ANSI spec and reread the stuff on
sequence points to convince me the compiler's allowed to do it both
ways - I *suspect* that any compiler that does the lookup after the
call (such that it doesn't show the problem) actually has a bug.)

This also tells me that I should have *started* this effort by firing
up Electric Fence - it would have caught this, and caused it to
segfault at the point of the assignment to old memory.  (But since
ratpoison wasn't crashing, I didn't suspect memory issues.)

So, does that convince you to commit the change in that form?

ps.  Here are some helper functions I found useful; everywhere that
had a "ns=%p", ns became a "ns=%ps(%s)", ns, nsname(ns), and
debug_numset got called in a bunch of places.  Now that you've nailed
the problem you probably don't need them, though...

static char *
nsname (struct numset *ns)
  if (ns == rp_window_numset) return "rp_window_numset";
  if (ns == rp_frame_numset) return "rp_frame_numset";
  /* various: rp_screen.frames_numset */
  return "???";

debug_numset (struct numset *ns)
#ifdef DEBUG
  int i;
  printf("DN: ns=%p(%s) taken=%d max=%d\n", ns, nsname(ns),
         ns->num_taken, ns->max_taken);
  for (i = 0; i < ns->max_taken; i++) {
    if (i < ns->num_taken)
      printf(" %d", ns->numbers_taken[i]);
      printf("(%d)", ns->numbers_taken[i]);
  printf("[nt=%p]\n", ns->numbers_taken);

On 11/25/05, Joshua Neuheisel <address@hidden> wrote:
> On 11/23/05, address@hidden <address@hidden> wrote:
> >
> > "ratpoison -c windows" shows that I have 12 windows, two of which are
> > numbered "10".  If I select window 9 and go "next", I get one of
> > them; if I select 0 and go "prev", I get the other.  One is an xterm,
> > the other is a dclock; they were all started sequentially using
> > xtoolwait (thus rapidly, but in a well defined order.)
> >
> > First noticed it with 1.3.0-7 under ubuntu; that doesn't mean it
> > wasn't happening under debian, but I hadn't *noticed* it there.  Built
> > from CVS a few days ago, with the latest ChangeLog entry being
> > 2005-11-05, and it still happens the same way.
> Alright, I think I have some new info here.  I was able to reliably
> reproduce the problem on MacOS X Tiger with gcc version as such:
> powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple Computer, Inc.
> build 4061)
> To fix it, I changed line 79 in src/number.c from
> ns->numbers_taken[numset_find_empty_cell(ns)] = n;
> to
> int ec;
> ec = numset_find_empty_cell(ns);
> ns->numbers_taken[ec] = n;
> and the problem went away.  When I stepped through the code, I saw that the
> return value from numset_find_empty_cell was being discarded, and the
> assigment was being ignored.  Obviously, this is a compiler error.  I was
> not able to reproduce it on my i686/linux machine running gcc version:
> gcc (GCC) 3.4.5 20051026 (prerelease).
> I was also not able to reproduce it on my MacOS X with the gcc-3.3 version:
> gcc-3.3 (GCC) 3.3 20030304 (Apple Computer, Inc. build 1809).
> What gcc version did the original poster use?  Or was it another compiler?
> Joshua

_Mark_ <address@hidden> <address@hidden>

reply via email to

[Prev in Thread] Current Thread [Next in Thread]