Re: SEGFAULT if bash script make "source" for itself

From: bogun.dmitriy
Subject: Re: SEGFAULT if bash script make "source" for itself
Date: Thu, 28 Aug 2014 14:59:48 -0700

2014-08-28 14:43 GMT-07:00 Eric Blake <address@hidden>:

> On 08/28/2014 03:00 PM, address@hidden wrote:
> >>> Why is there a check for division by zero? Can we predict it? No. But
> >>> we can detect it... and we print a nice, detailed error message.
> >>
> >> Actually, division by zero is fairly easy to check, and this is probably
> >> a case where bash is checking for division by 0 up front rather than
> >> handling SIGFPE after the fact.
> >>
> > Is it so expensive to check the length of the $BASH_SOURCE array?
>
> Checking the length of $BASH_SOURCE array (or indeed, ANY check of some
> counter compared to the current recursion depth) is only an
> approximation.  It tells whether you are nearing an artificial limit.
> It does NOT tell you if you will overflow the stack (it's possible to
> set the variable too high, and still trigger a stack overflow; more
> likely, if you set the variable too low, someone can come up with a
> recursive program that _would_ have completed had it been granted full
> access to the stack but now fails because your limit got in the way).
>
> For _certain_ cases of programming, determining maximum stack usage is
> computable (start at the leaves, figure out how much they allocate, then
> work your way up the call stack).  But the moment you introduce
> recursion into the mix, where the recursion is conditionally gated on
> user input, it is _inherently impossible_ to compute the maximum stack
> depth for all possible program execution flows, shy of actually
> executing the program.  Bash, and many other scripting languages, are in
> such a boat - by giving the end user the power to write a recursive
> function, they are also giving the end user the power to exhaust an
> unknowable stack depth.
>
> As long as the interpreter implements user recursion by using recursive
> functions itself, there is no way to say "if I call your next function,
> I will overflow the stack, so pre-emptively error out now, but keep my
> interpreter running".  The _BEST_ we can do is detect that "I just ran
> out of stack, but in jumping to the SIGSEGV handler, I can't guarantee
> whether I interrupted a malloc or any other locked code, therefore I
> cannot safely use malloc or any other locking function between now and
> calling _exit()".
>
> It is possible to write a class of programs that GUARANTEE that if stack
> overflow happens, that it did not happen within any core function that
> might hold a lock, and therefore the program can longjmp back to a safe
> point, abort the overflowing operation, and carry on with life.  But it
> is EXTREMELY TRICKY to do - you have to be absolutely vigilant that you
> separate your code into two buckets - the set of code that might obtain
> any lock, but is used non-recursively (and therefore you can compute the
> maximum stack depth of that code), and the set of code that recurses,
> but cannot obtain any lock without first checking that the current stack
> depth plus the maximum depth of the locking code will still fit in the
> stack.  With a program like that, you can then pre-emptively detect
> stack overflow for the next call into non-recursive code without relying
> on SIGSEGV (you'd still want the SIGSEGV handler for the recursive part,
> but can now longjmp back to your non-recursive outer handler).  But it
> is not practical, and would mean a complete rewrite of the bash source
> code, and probably even a parallel stripped-down rewrite of glibc.
>
> In many cases, it is also possible to convert recursive code into
> iterative code; but usually, conversions like this involve trade-offs,
> such as requiring heap storage to track progress between iterations
> where the old code used the stack.  Again, doing such conversions to the
> bash code base would mean a complete rewrite.  And such a conversion is
> worthwhile only if everything doable in one leg of the recursion is
> known up front - but bash is a scripting language and can't predict what
> all user input code will want to do at each level of recursion, short of
> executing the script.
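
The trade-off described above (heap storage in place of stack frames) can be
illustrated at the script level. This is only a sketch of the idea, not of
anything in the bash source: children_of, walk_recursive and walk_iterative
are hypothetical names, and the toy tree is made up for the example.

```bash
# Hypothetical toy tree; children_of stands in for whatever produces
# the next work items in a real script.
children_of() {
  case $1 in
    root) echo "a b" ;;
    a)    echo "c d" ;;
    *)    echo "" ;;
  esac
}

# Recursive walk: each level of the tree is one more nested function call.
walk_recursive() {
  local child
  printf '%s\n' "$1"
  for child in $(children_of "$1"); do
    walk_recursive "$child"
  done
}

# Iterative walk: pending nodes are kept in an array instead of in
# nested calls, so the call depth stays constant.
walk_iterative() {
  local -a stack=("$1")
  local -a kids
  local node top i
  while (( ${#stack[@]} > 0 )); do
    top=$(( ${#stack[@]} - 1 ))
    node=${stack[top]}
    unset "stack[$top]"
    printf '%s\n' "$node"
    kids=( $(children_of "$node") )
    # Push children in reverse so they are popped in their original order.
    for (( i = ${#kids[@]} - 1; i >= 0; i-- )); do
      stack+=( "${kids[i]}" )
    done
  done
}
```

Both "walk_recursive root" and "walk_iterative root" visit the nodes in the
same order; only the second keeps its bookkeeping in an array.
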
> >
> >>> So why should I get SIGSEGV instead of a nice, detailed error message on
> >>> recursion? Can we detect it?
> >>
> >> GNU libsigsegv proves that it is possible to detect when SIGSEGV was
> >> caused by stack overflow.  It can't help prevent stack overflow, and you
> >> _don't_ want to penalize your code by adding checking code into the
> >> common case (if I'm about to overflow, error out instead), but leave
> >> stack overflow as the exceptional case (if I've already overflowed and
> >> received SIGSEGV, convert it into a nice error message to the user
> >> before exiting cleanly, instead of the default behavior of dumping
> >> core).  But someone would have to write the patch for bash to link
> >> against libsigsegv.
> >>
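
Until such a patch exists, the "report after the fact" idea can be
approximated from outside the failing process. A minimal sketch, assuming the
script is run as a child process: report_segv is a hypothetical wrapper, it
does not use libsigsegv, and it cannot tell a stack overflow apart from any
other cause of SIGSEGV.

```bash
# Run a command and add a readable message if it was killed by SIGSEGV.
# report_segv is a hypothetical helper name, not part of bash.
report_segv() {
  local status segv
  segv=$(kill -l SEGV)   # numeric value of SIGSEGV on this system
  "$@"
  status=$?
  # A child killed by signal N is reported by the shell as status 128+N.
  if (( status == 128 + segv )); then
    printf '%s: died from SIGSEGV; runaway recursion is one possible cause\n' "$1" >&2
  fi
  return "$status"
}
```

It would be used as "report_segv bash ./script.sh", where script.sh is a
placeholder; the original exit status is passed through unchanged.
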
> > I understand. It is better than getting SIGSEGV, but I don't think it is
> > a solution for this issue.
>
> It's not clear what issue you think needs solving.
>
> If you are trying to solve the issue of "prevent all possible SIGSEGV
> from stack overflow", the answer is "it's impossible".
>
> If you are trying to solve the issue of "bash dumps core based on user
> input, but I'd prefer a nice error message telling me my program is
> buggy", the answer is "write a patch to use libsigsegv".
>
> If you are trying to solve the issue of "bash needs to give me a way to
> optionally artificially limit source recursion, so that I reduce the set
> of programs that can be run, but am less likely to trigger stack
> overflow", the answer is "write a patch to copy (or reuse) FUNCNEST
> limitations when sourcing" - but such a solution will not be on by
> default.  Which means that you will only know if the solution makes a
> difference for your program if you first run the program unlimited - but
> then you are back to the earlier question of whether bash can give a
> nice error message instead of a core dump when exiting due to stack
> overflow.
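
For reference, the opt-in artificial limit described in the last case can be
sketched inside a script itself, without patching bash. SOURCE_NEST_MAX and
source_depth_ok are hypothetical names chosen for this example. As noted
above, such a counter is only an approximation and does not prove the stack
would have overflowed.

```bash
# Hypothetical opt-in limit, in the spirit of FUNCNEST.
SOURCE_NEST_MAX=${SOURCE_NEST_MAX:-100}

# Return non-zero once the nesting depth exceeds the limit.
# BASH_SOURCE has one entry per active function call or sourced file.
source_depth_ok() {
  # Subtract 1 for this function's own entry.
  local depth=$(( ${#BASH_SOURCE[@]} - 1 ))
  if (( depth > SOURCE_NEST_MAX )); then
    printf '%s: source nesting depth exceeds limit (%d)\n' \
      "${BASH_SOURCE[1]}" "$SOURCE_NEST_MAX" >&2
    return 1
  fi
}

# Intended use at the top of a file that may end up sourcing itself:
#   source_depth_ok || return 1
#   source "${BASH_SOURCE[0]}"
```

With the guard in place, a self-sourcing file stops with an error message and
a non-zero status once the limit is reached, provided the limit is low enough
that the stack is not exhausted first.
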

Perfect BUG solving. Thank you for "support".

