bug-grub
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: address@hidden: Bug#293722: grub: simulated stack not marked PROT_EX


From: Peter Jones
Subject: Re: address@hidden: Bug#293722: grub: simulated stack not marked PROT_EXEC, causes segfaults on new hardware]
Date: Tue, 08 Feb 2005 18:13:16 -0500

On Sun, 2005-02-06 at 01:25 +1100, Jason Thomas wrote:
> ----- Forwarded message from Colin Watson <address@hidden> -----
> 
> > Date: Sat, 5 Feb 2005 10:50:40 +0000
> > From: Colin Watson <address@hidden>
> > To: address@hidden
> > Subject: Bug#293722: grub: simulated stack not marked PROT_EXEC, causes 
> > segfaults on new hardware
> > 
> > Package: grub
> > Version: 0.95+cvs20040624-12
> > Severity: important
> > 
> > When using Linux 2.6.10, grub's 'install' command segfaults on new
> > hardware that has the NX bit available (e.g. AMD64, and I think also new
> > Pentium 4 systems). This turns out to be because:

[agreed on all parts]

> > The attached patch corrects this problem (tested), and I believe should
> > be harmless on older systems. Please apply. Most of it came from the
> > mprotect() man page and/or is probably too obvious/short to be
> > copyrightable, but if I need to sign an assignment to have this go
> > upstream then I'll be happy to do so.

The mprotect man page is wrong and buggy in quite a variety of ways,
especially the example code that's been copied here.

More comments inline, and a new patch at the end.

> > --- grub-0.95+cvs20040624.orig/grub/asmstub.c
> > +++ grub-0.95+cvs20040624/grub/asmstub.c
> > @@ -42,6 +42,12 @@
> >  #include <sys/time.h>
> >  #include <termios.h>
> >  #include <signal.h>
> > +#include <sys/mman.h>
> > +
> > +#include <limits.h>
> > +#ifndef PAGESIZE
> > +#define PAGESIZE 4096
> > +#endif

Using PAGESIZE to determine addresses for PROT_EXEC regions is a bug.
The return value from sysconf(_SC_PAGESIZE) should be used instead, in
_any_ situation where you need to know the page size of the current
execution environment.

There's absolutely no guarantee that the page size in limits.h will be
the page size when a program is executed, or even that the page size
will be unchanged between two concurrent executions on the same machine.
This is especially true on multi-arch boxes, such as x86_64/em64t.

I don't know how much we really have to worry about this _on x86_ today,
but there's a right way, and grub should use it, so there aren't
problems when somebody does something fancy with new hardware in the
future.  This is, after all, the exact situation being fixed.

Arbitrarily deciding upon a page size, as the code above does, is also
an incredibly bad idea.  In practice, though, I think limits.h will
always set it; it just may not reflect page size in the compiled
executable's environment.

> > +    /* Align to a multiple of PAGESIZE, assumed to be a power of two. */
> > +    p = (char *) (((long) scratch) & ~(PAGESIZE - 1));

There are two big problems here.  First, scratch comes from malloc().
Second is the way it forms an aligned address.

mprotect() is defined as a function which manipulates protection flags
set on mapped files.  malloc() may be implemented using mmap(), but it
may also use brk() or sbrk() to allocate memory.  As such it is
forbidden to change protection flags using mprotect() on memory
allocated with malloc().

There's also no guarantee that malloc(), the rest of libc, or even the
kernel aren't using the memory in nearby locations for things that
depend on it not having PROT_EXEC.  It could well be, for instance, that
there's a guard page for the *real* stack there, or any number of other
important things.  In any case, changing the protection bits on memory
allocated with malloc() is forbidden.  So memory really should be
_allocated_ on a page boundary, and the size of the allocation should
also be a multiple of the page size.  The program may also execute if
posix_memalign() is used for allocation, but that'd still be incorrect,
as it may use malloc() underneath the covers.  Really, mmap() is the
only right answer.

Zeroing low-order bits from a pointer to get an aligned address for
memory that's already allocated is just plain wrong.

This means the program is changing the protection flags on parts of
itself other than those regions which have been allocated for this
purpose.  See above as to the dangers here.

The rule for mprotect() is that you must give it a page-aligned address
which points to memory which has been correctly allocated.  Any address
that isn't within a range which has been correctly allocated is not a
valid value to be passed to mprotect().

Strictly speaking, mprotect() should be returning -1 and setting errno
to EFAULT when it is passed this pointer.

> > +    /* The simulated stack needs to be executable, since GCC uses stack
> > +     * trampolines to implement nested functions.
> > +     */
> > +    ret = mprotect (p, 0x100000 + EXTENDED_MEMSIZE + 15,
> > +               PROT_READ | PROT_WRITE | PROT_EXEC);

And then because "p" is at some (near random) offset before "scratch",
this line doesn't tell mprotect to set the tail end of the buffer as
executable.

That is:

   |    allocated        |
--------------------------
|     protected       |

The line in the middle is our address space, the region above is where
memory has been allocated, and the area below is what's mapped
PROT_EXEC.  The next call to malloc() may well return the protected area
below our allocation, or we could later try to execute in our allocated
(but unprotected) space, either of which could cause catastrophic
results.

Also, just for the purposes of a dry-run harness such as this, it's
probably a good idea to grab an extra page at the end of this artificial
stack and set it to PROT_NONE.  But doing it sometimes by accident is
not the way :)

> > +    assert (ret == 0);
> > +  }
> > +
> >    grub_scratch_mem = (char *) ((((int) scratch) >> 4) << 4);

I realize this line is unchanged, but even so it's incredibly ugly.
Once again, the resulting address is below the start of the allocation.

Here's a patch.  I've only tested it on linux, where we have
MAP_ANONYMOUS, but I suspect it'll work elsewhere.  It was tested on
both an x86 and x86_64 hardware.  If anybody spots any problems or has
any suggestions, I'd be glad to hear them.

--- grub-0.95/grub/asmstub.c.nxstack    2005-02-08 12:17:23.000000000 -0500
+++ grub-0.95/grub/asmstub.c    2005-02-08 17:14:23.000000000 -0500
@@ -42,6 +42,7 @@
 #include <sys/time.h>
 #include <termios.h>
 #include <signal.h>
+#include <sys/mman.h>
 
 #ifdef __linux__
 # include <sys/ioctl.h>                /* ioctl */
@@ -82,7 +83,7 @@
 struct apm_info apm_bios_info;
 
 /* Emulation requirements. */
-char *grub_scratch_mem = 0;
+void *grub_scratch_mem = 0;
 
 struct geometry *disks = 0;
 
@@ -106,14 +107,62 @@
 static unsigned int serial_speed;
 #endif /* SIMULATE_SLOWNESS_OF_SERIAL */
 
+/* This allocates page-aligned storage of the specified size, which must be
+ * a multiple of the page size as determined by calling sysconf(_SC_PAGESIZE)
+ */
+#ifdef __linux__
+static void *
+grub_mmap_alloc(size_t len)
+{
+  int mmap_flags = MAP_ANONYMOUS|MAP_PRIVATE|MAP_EXECUTABLE;
+
+#ifdef MAP_32BIT
+  mmap_flags |= MAP_32BIT;
+#endif
+  /* Mark the simulated stack executable, as GCC uses stack trampolines
+   * to implement nested functions. */
+  return mmap(NULL, len, PROT_READ|PROT_WRITE|PROT_EXEC, mmap_flags, -1, 0);
+}
+#else /* !defined(__linux__) */
+static void *
+grub_mmap_alloc(size_t len)
+{
+  int fd = 0, offset = 0, ret = 0;
+  void *pa = MAP_FAILED; 
+  char template[] = "/tmp/grub_mmap_alloc_XXXXXX";
+  errno_t e;
+
+  fd = mkstemp(template);
+  if (fd < 0)
+    return pa;
+
+  unlink(template);
+
+  ret = ftruncate(fd, len);
+  if (ret < 0)
+    return pa;
+
+  /* Mark the simulated stack executable, as GCC uses stack trampolines
+   * to implement nested functions. */
+  pa = mmap(NULL, len, PROT_READ|PROT_WRITE|PROT_EXEC,
+                  MAP_PRIVATE|MAP_EXECUTABLE, fd, offset);
+
+  e = errno;
+  close(fd);
+  errno = e;
+  return pa;
+}
+#endif /* defined(__linux__) */
+
 /* The main entry point into this mess. */
 int
 grub_stage2 (void)
 {
   /* These need to be static, because they survive our stack transitions. */
   static int status = 0;
-  static char *realstack;
-  char *scratch, *simstack;
+  static void *realstack;
+  void *simstack_alloc_base, *simstack;
+  size_t simstack_size, page_size;
   int i;
 
   /* We need a nested function so that we get a clean stack frame,
@@ -143,9 +193,35 @@
   }
 
   assert (grub_scratch_mem == 0);
-  scratch = malloc (0x100000 + EXTENDED_MEMSIZE + 15);
-  assert (scratch);
-  grub_scratch_mem = (char *) ((((int) scratch) >> 4) << 4);
+
+  /* Allocate enough pages for 0x100000 + EXTENDED_SIZE + 15, and
+   * make sure the memory is aligned to a multiple of the system's
+   * page size */
+  page_size = sysconf (_SC_PAGESIZE);
+  simstack_size = ( 0x100000 + EXTENDED_MEMSIZE + 15);
+  if (simstack_size % page_size)
+    {
+      /* If we're not on a page_size boundary, round up to the next one */
+      simstack_size &= ~(page_size-1);
+      simstack_size += page_size;
+    }
+
+  /* Add one for a PROT_NONE boundary page at each end. */
+  simstack_size += 2 * page_size;
+
+  simstack_alloc_base = grub_mmap_alloc(simstack_size);
+  assert (simstack_alloc_base != MAP_FAILED);
+
+  /* mark pages above and below our simstack area as innaccessable.
+   * If the implementation we're using doesn't support that, then the
+   * new protection modes are undefined.  It's safe to just ignore
+   * them, though.  It'd be nice if we knew that we'd get a SEGV for
+   * touching the area, but that's all.  it'd be nice to have. */
+  mprotect (simstack_alloc_base, page_size, PROT_NONE);
+  mprotect ((void *)((unsigned long)simstack_alloc_base +
+                         simstack_size - page_size),  page_size, PROT_NONE);
+
+  grub_scratch_mem = (void *)((unsigned long)simstack_alloc_base + page_size);
 
   /* FIXME: simulate the memory holes using mprot, if available. */
 
@@ -218,7 +294,7 @@
   device_map = 0;
   free (disks);
   disks = 0;
-  free (scratch);
+  munmap(simstack_alloc_base, simstack_size);
   grub_scratch_mem = 0;
 
   if (serial_device)
--- grub-0.95/stage2/shared.h.nxstack   2005-02-08 14:56:38.000000000 -0500
+++ grub-0.95/stage2/shared.h   2005-02-08 17:17:23.000000000 -0500
@@ -36,7 +36,7 @@
 
 /* Maybe redirect memory requests through grub_scratch_mem. */
 #ifdef GRUB_UTIL
-extern char *grub_scratch_mem;
-# define RAW_ADDR(x) ((x) + (int) grub_scratch_mem)
+extern void *grub_scratch_mem;
+# define RAW_ADDR(x) ((x) + (unsigned long) grub_scratch_mem)
 # define RAW_SEG(x) (RAW_ADDR ((x) << 4) >> 4)
 #else

-- 
        Peter





reply via email to

[Prev in Thread] Current Thread [Next in Thread]