[Axiom-developer] Re: [Gcl-devel] exec-shield mmap & brk randomization

From: Camm Maguire
Subject: [Axiom-developer] Re: [Gcl-devel] exec-shield mmap & brk randomization
Date: 19 Nov 2003 10:44:39 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2


Roland McGrath <address@hidden> writes:

> > Bingo.  In fact, we use exactly emacs' unexec.  I'm hoping your past
> > tense of the verb to break here indicates that a fix for unexec is
> > already at hand?
> Not exactly.  I do plan to resolve the situation in the Fedora Core 2
> development cycle, but exactly how is not yet clear.  I'd like to
> understand GCL's constraints as well as Emacs's before deciding what to do.

Thanks as always for your efforts!

> > Thankfully, GCL does not have to know precisely where its heap lower
> > bound is (apart from unexec) until the first call to sbrk(0) is made,
> > but we do absolutely need contiguity of this heap.  As long as you
> > don't dump brk altogether, I may retain my head of hair. :-).  
> Do you make many repeated sbrk calls to expand the heap?  If you make just

Yes.  This is to keep images from becoming unnecessarily large -- we
expand the heap just enough to hold the working code.

> one big allocation and don't really care what the base address is, then
> mmap is just as good for you.  However, it sounds like you want to choose
> the actual address range at compile time.  Is that so?

In short, at this moment, we just need a coarse floor to the heap
start at compile time.  We don't need to specify it, just to know what
it is.  There may be a relatively simple modification possible in
which we could forgo this compile time step in favor of a runtime
initialized static variable.

GCL uses several statically defined arrays to hold marking and type
information on the pages in the heap -- statically defined to separate
them from GCL's memory management system.  So basically all GCL needs
to my knowledge is some reliable way to map the elements of these
arrays to the pages in the heap.  Right now, this mapping is done by
running a little test in configure to get an idea of where sbrk will
start, and then rounding that address down to a multiple of some large
divisor.  The result is compiled in as the DBEGIN constant, which on
i386 Linux is currently 0x8000000.  A given heap page address is then
mapped to the static array element by subtracting this value and
right-shifting by PAGEWIDTH, the log of the page size (>>12).  Even
though sbrk will commence some pages above DBEGIN, this is not
critical; it merely wastes some initial elements in the array.  So the
random start of sbrk, as long as it is above DBEGIN, will merely waste
a few more such elements and reduce the amount of heap actually
allocatable somewhat further.
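
For concreteness, here is a minimal sketch of the address-to-element
mapping described above.  DBEGIN and the shift of 12 are the values
quoted in this message; MAXPAGE, type_map, page_index and
set_page_type are hypothetical names made up for illustration and are
not taken from the GCL sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DBEGIN    0x8000000UL /* heap floor quoted above for i386 Linux */
#define PAGEWIDTH 12          /* log2 of a 4096-byte page */
#define MAXPAGE   4096        /* hypothetical table length, illustration only */

/* Hypothetical stand-in for one of the static per-page arrays. */
unsigned char type_map[MAXPAGE];

/* Map a heap address to its element in the static per-page arrays. */
size_t page_index(uintptr_t addr)
{
    /* An address below the floor would yield a negative offset. */
    assert(addr >= DBEGIN);
    return (size_t)((addr - DBEGIN) >> PAGEWIDTH);
}

/* Record a type code for the page containing addr. */
void set_page_type(uintptr_t addr, unsigned char type)
{
    size_t i = page_index(addr);
    assert(i < MAXPAGE);
    type_map[i] = type;
}
```

If the first sbrk were to land at, say, 0x8100000, page_index would
return 256 for it, so the first 256 elements of each such array would
simply go unused.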

This extra 'heap wastage' will, alas, not be a one-time event in
typical GCL usage.  Applications built on GCL usually go through
several cycles of compile and load, save(unexec), restart new image,
compile and load, etc.  With each iteration, there will be a hole
introduced into the heap (presumably sbrk will in no circumstances
return an address below the existing .data section end).  GCL should
still continue to function, if I understand correctly.  How big is the
random sbrk range?  Could be quite small if one just wants to achieve
protection from attacks, no?
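
To put a rough bound on the cumulative cost, the arithmetic can be
sketched as below.  This assumes each save/restart cycle adds at most
one gap of at most max_gap_bytes, rounded up to whole pages; the
function name and both parameters are hypothetical, and the actual
size of the random range is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGEWIDTH 12 /* log2 of a 4096-byte page */

/* Upper bound on wholly wasted pages after `cycles` save/restart
   cycles, assuming one gap of at most max_gap_bytes per cycle. */
size_t worst_case_wasted_pages(size_t cycles, size_t max_gap_bytes)
{
    size_t page = (size_t)1 << PAGEWIDTH;
    /* Round each gap up to a whole number of pages. */
    size_t pages_per_gap = max_gap_bytes / page
                         + (max_gap_bytes % page != 0);
    /* Guard against overflow in the multiplication. */
    assert(pages_per_gap == 0 || cycles <= SIZE_MAX / pages_per_gap);
    return cycles * pages_per_gap;
}
```

The bound grows linearly with the number of cycles, which is why a
small random range would matter much less than a large one.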

The modification I referred to above would be to skip the compile-time
test, define DBEGIN as a static variable, and set it to the address
returned by the first sbrk.  I think the Windows port might already do
something like this.
This would eliminate the first hole in the heap, as well as the
possibility of negative array offsets due to the skewing of the
compile-time and runtime sbrk.  It will not prevent subsequent holes
introduced through the typical development cycle mentioned above.
Right now I *think* we could survive this, but we won't know for sure
until all GCL-built software is thoroughly tested.
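
A minimal sketch of that runtime variant might look like the
following.  The names dbegin and init_dbegin are hypothetical, and
rounding the break down to a page boundary is an assumption added
here, not something GCL is known to do.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() in glibc headers */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define PAGEWIDTH 12 /* log2 of a 4096-byte page */

/* Hypothetical runtime replacement for the compiled-in DBEGIN;
   zero means "not yet initialized". */
uintptr_t dbegin;

/* Set the heap floor once, from the current program break. */
void init_dbegin(void)
{
    if (dbegin == 0) {
        uintptr_t brk0 = (uintptr_t)sbrk(0);
        /* Assumption: round down to a page boundary. */
        dbegin = brk0 & ~(((uintptr_t)1 << PAGEWIDTH) - 1);
    }
}

/* Same mapping as before, but against the runtime floor. */
size_t page_index(uintptr_t addr)
{
    assert(dbegin != 0 && addr >= dbegin);
    return (size_t)((addr - dbegin) >> PAGEWIDTH);
}
```

Because the floor is taken from the running process, the array offset
can never be negative.  Guarding on dbegin == 0 is meant to let a
value saved in an image survive a restart, on the assumption that
unexec preserves initialized static data; that is the case in which
the later holes still appear.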

> > Only in unexec.  If emacs has a fix, we can use it directly.
> But, in the binary produced by unexec, do you rely on the _end/end and
> _edata/edata symbols being adjusted to include brk data allocated by the
> loadup run before the unexec?  (I haven't yet checked whether Emacs does.)

The only place I can imagine this being used is in subsequent
unexecs.  That is, after load, save, restart, load, save, restart, the
final image must contain the contents of the first load as well as the
second.  If unexec uses some other algorithm to achieve this, I don't
think GCL
cares about _end and _edata (though I'd need to double check to be
sure).

> That is, if what unexec did were to just restore some particular memory
> allocated in the first run, disjoint from the original data segment, would
> that make you happy?

See above.  Disjoint should be workable, though a bit wasteful, as
long as we never lose old pages in multiple cycles.

> > I'm not really sure how much memory could be wasted, but this likely
> > seems a very small consideration compared to the complexity of
> > redesigning the garbage collector, etc.
> Sure.  Contiguity is inherently limited in the ways I mentioned, but there
> are plenty of reasons to like it if those limitations aren't your primary
> concern.  If you like contiguity, you just need to find the best ways to
> ask up front for all the contiguity you really need.

The linker script sounds interesting, though a bit complex.  Does the
image size immediately balloon to its full stature, or is it like
mmapping a file with a hole in it?

> > 2) come up with a configure time absolute lower bound to the first
> > sbrk after exec
> That is not something you ought to try to rely on in the current situation.
> It is in fact a known range at the moment, but if the brk randomization
> feature remains, you can't be sure the range will remain the same, or that
> a compile-time determination would apply correctly to running on slightly
> different kernels or different hardware configurations.

We may be able to change this at runtime as mentioned above.  At the
very worst (if the former does not work), we could put in a value by
hand.

> > Else we must
> > 
> > 3) use setarch
> This is certainly the right stop-gap solution if you are concerned about
> people building GCL on FC1 tomorrow.  It's trivial to implement in the src
> rpm spec, and probably not worth putting in configure now since it likely
> won't be required for very long.

Tim, I'd like to know what your feelings/time horizons are.  I'm
thinking the best thing for GCL in general is to replace DBEGIN with a
runtime variable in any case if that does not break anything, and in
addition, offer a configure option
--enable-i-dont-want-holes-in-my-heap which will invoke setarch on
each created image.  (Would unexec preserve the effects of setarch?)
Then we wait for an unexec fix from our trusty friends at Red Hat.  In
the meantime, the above configure option is mandatory.  Thoughts?

Take care,

> Thanks,
> Roland

Camm Maguire                                            address@hidden
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
