Re: Dynamic loading progress

emacs-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Dynamic loading progress

From:	Eli Zaretskii
Subject:	Re: Dynamic loading progress
Date:	Fri, 20 Nov 2015 23:56:04 +0200
> From: Philipp Stephani <address@hidden>
> Date: Fri, 20 Nov 2015 19:58:25 +0000
> Cc: address@hidden, address@hidden, address@hidden
> 
>     So you are saying that module_init runs in the main thread, but
>     dynamic initializers might run in another thread? How's that
>     possible?
> 
> According to http://stackoverflow.com/a/29776416 it is not guaranteed that
> dynamic initializers run in the main thread.

Ah, you are talking about C++ dynamic initializers!  So the model is
that someone writes a module in C++, starts a thread there, and then
inside some dynamic initializer calls emacs-module interface
functions, is that it?  If that's the situation, I'd suggest a
prominent commentary describing this in the source.

>     On what OS can this happen? It cannot happen on Windows, AFAIK,
> 
> http://blogs.msdn.com/b/oldnewthing/archive/2010/08/27/10054832.aspx seems to
> indicate that it is possible that other threads continue running after the 
> main
> thread has exited.

Not in Emacs: we use the C runtime, and never call ExitThread, let
alone ExitProcess.  Doing that would be a silly bug.  There's no need
to make a single component of Emacs, and not the most important one,
protected to such extreme degree against calamities that are only
possible if Emacs developers will lose their senses.

> My initial code didn't use a thread identifier at all because it could have
> been reused after the main thread ended, introducing a subtle bug.

No, it cannot be reused, because we hold a handle on the main thread,
see w32term.c around line 6985.  As long as a single handle is open on
a thread, it still exists from the OS POV, and its ID cannot be
reused.

>     > This is undefined behavior, but we included an explicit check if
>     > checking is enabled because that case is somewhat subtle.
>     
>     I'm surprised this could happen on some OS. Can you point to any
>     details about this?
>     
> 
> It's not ruled out by the C standard or the APIs of operating systems. If 
> Emacs
> doesn't call ExitThread or manipulates the C runtime, then indeed it can't
> happen (at least on Windows), but I think it's brittle to rely on such
> subtleties without explicitly checking for them (somebody might introduce a
> call to ExitThread in the future without looking at module code).

I think these precautions are unnecessary.

Anyway, thanks for explaining this, I now know how to change the code
to DTRT on MS-Windows wrt to the thread checks.

>     OK, but then either there should have been a comment explaining this,
>     or, better, the test should have been after the addition. (Which, by
>     some strange luck -- or maybe something else -- is just what Paul
>     already did ;-)
> 
> If the test gets moved after the addition, then we should have a verify
> (MAX_EMACS_UINT - 1 > MOST_POSITIVE_FIXNUM) or use __builtin_add_overflow to
> make sure the addition doesn't cause UB.

I don't think so.  Please take a look at the code Paul wrote there, I
don't see a problem with it.

>     > No validity checks on 'fin'?
>     >
>     > How should it be validated? In C an arbitrary (invalid) pointer could be
>     > passed. I think we just have to accept that this is UB.
>     
>     Given the extensive checks elsewhere in the file, this surprised me,
>     that's all. How to test it? calling valid_pointer_p would be one
>     thing I'd suggest. Maybe there are other (perhaps platform-dependent)
>     checks possible here, I would have to look around. For example, it
>     would be good to make sure this points into executable code.
>     
> You can't validate pointers in C. For example, is the following pointer valid?
> 
> long a;
> double *p = (double *)(&a);
> 
> The answer is no; this is an aliasing violation, and dereferencing p is UB, 
> but
> you won't detect this by trying to read the process's address space.

The function valid_pointer_p validates a pointer in the sense that it
points into the program's address space.  That's not a full
validation, and won't catch unaligned pointers or pointers into
non-executable portions of memory, but it's better than nothing.  I'd
certainly consider it under "#ifdef ENABLE_CHECKING" at least.

> See also
> http://blogs.msdn.com/b/oldnewthing/archive/2006/09/27/773741.aspx.

We don't use IsBadWritePtr on Windows to check this, see
w32_valid_pointer_p for how this is actually implemented.

Anyway, I'm surprised by this extreme POV: even if we cannot validate
a pointer 100%, surely it doesn't mean we cannot or shouldn't do some
partial job?  Why this "all or nothing" approach?

>     > if (len > INT_MAX || len < envptr->min_arity || (envptr->max_arity >= 0
>     &&
>     > len > envptr->max_arity))
>     > xsignal2 (Qwrong_number_of_arguments, module_format_fun_env (envptr),
>     > make_number (len));
>     >
>     > Why the test against INT_MAX? EMACS_INT can legitimately be a 64-bit
>     > data type with values far exceeding INT_MAX. Why impose this
>     > limitation here?
>     >
>     > Because the nargs argument in the module interface is an int.
>     
>     Shouldn't it be EMACS_INT instead?
> 
> EMACS_INT is an internal type that isn't exposed to module authors.

OK, but the type of nargs is ptrdiff_t, so the test should be against
the maximum of that type.  On 64-bit hosts, ptrdiff_t is a 64-bit
type, while an int is still 32-bit wide.

> Could xmalloc grow a variant that is guaranteed not to call longjmp?

We need to devise a way for it to detect that it was called from
emacs-module.c, the rest is simple, I think.

Thanks.
[Prev in Thread]
Current Thread
[Next in Thread]
Re: Dynamic loading progress, (continued)
Prev by Date: Re: Future release schedules
Next by Date: Re: Dynamic loading progress
Previous by thread: Re: Dynamic loading progress
Next by thread: Re: Dynamic loading progress
Index(es):
- Date
- Thread