Re: Dynamic loading progress

emacs-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Dynamic loading progress

From:	Eli Zaretskii
Subject:	Re: Dynamic loading progress
Date:	Fri, 20 Nov 2015 11:53:19 +0200
> From: Philipp Stephani <address@hidden>
> Date: Thu, 19 Nov 2015 22:41:09 +0000
> Cc: address@hidden
> 
> Thanks for the thorough review; I'll prepare patches for all of these issues.

Thank you.

>     This comment in module.c:
>     
>     /* If checking is enabled, abort if the current thread is not the
>     Emacs main thread. */
>     static void check_main_thread (void);
>     
>     confused me, because a later comment says:
>     
>     void module_init (void)
>     {
>     /* It is not guaranteed that dynamic initializers run in the main thread,
>     therefore we detect the main thread here. */
>     
>     If dynamic initializers might run in a thread different from the Emacs
>     main thread, then the code in module_init will record that other
>     thread, not the Emacs main thread, right?
> 
> No, because module_init is guaranteed to be called from the main thread 
> because
> main calls it explicitly.

So you are saying that module_init runs in the main thread, but
dynamic initializers might run in another thread?  How's that
possible?

>     Also, another comment:
>     
>     /* On Windows, we store both a handle to the main thread and the
>     thread ID because the latter can be reused when a thread
>     terminates. */
>     
>     seems to imply that 'main_thread' here is not the Emacs's main thread,
>     because that thread never terminates as long as the Emacs session is
>     alive.
>     
>     So what's the deal here? what does this thread checking supposed to
>     detect?
> 
> This guards against the Emacs main thread having exited while module code in
> some other thread is still running and attempting to call Emacs functions.

On what OS can this happen?  It cannot happen on Windows, AFAIK, so
complicating the code by using a handle is not necessary on Windows,
because the thread ID of the main Emacs thread will never be reused as
long as the Emacs session is alive.  Moreover, we already have that
thread's ID in a global variable called dwMainThreadId, computed
during the session startup (albeit after module_init, so it would need
to be moved to an earlier stage), so we could simply compare against
its value in check_main_thread.

> This is undefined behavior, but we included an explicit check if
> checking is enabled because that case is somewhat subtle.

I'm surprised this could happen on some OS.  Can you point to any
details about this?

>     In this code from module.c:
>     
>     Lisp_Object value = HASH_VALUE (h, i);
>     eassert (NATNUMP (value));
>     const EMACS_UINT refcount = XFASTINT (value);
>     if (refcount >= MOST_POSITIVE_FIXNUM)
>     {
>     module_non_local_exit_signal_1 (env, Qoverflow_error, Qnil);
>     return NULL;
>     }
>     
>     how can the 'if' clause ever be true? refcount is an Emacs integer,
>     as you have just verified, no? And if this somehow can happen, then
>     why isn't there a similar check in the other functions?
> 
> refcount can be MOST_POSITIVE_FIXNUM because that's an inclusive bound. It's
> important to check that case because later refcount is incremented by one, and
> if it's equal to MOST_POSITIVE_FIXNUM it would be outside the allowed range
> afterwards. No other function increments numbers, thus no other functions need
> this check.

OK, but then either there should have been a comment explaining this,
or, better, the test should have been after the addition.  (Which, by
some strange luck -- or maybe something else -- is just what Paul
already did ;-)

>     Re this fragment from module.c:
>     
>     Lisp_Object ret = list4 (Qlambda,
>     list2 (Qand_rest, Qargs),
>     documentation ? build_string (documentation) : Qnil,
>     list3 (module_call_func,
>     envobj,
>     Qargs));
>     
>     Thou shalt not use build_string, except when you _know_ the argument
>     will always be a pure-ASCII string. Practically, this means the
>     argument must be a constant ASCII string. See these messages (and the
>     preceding discussion, if you are interested) for the gory details:
>     
>     http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00955.html
>     http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00976.html
>     http://lists.gnu.org/archive/html/bug-gnu-emacs/2013-10/msg00979.html
>     
>     The above should call make_multibyte_string instead.
>     
> 
> We had a discussion about encodings in
> https://github.com/aaptel/emacs-dynamic-module/issues/37. Sorry that this
> didn't get resolved earlier; it's an important point. My suggestion would be 
> to
> always mandate specifying an encoding whenever a char* is passed, and limit
> that to two or three functions dealing with creating strings and accessing
> string contents. Would that address your concerns?

No, this is not about encoding at all.  This is about calling
build_string: you should only call it when the argument is a fixed
ASCII string.  If the argument is not fixed or might include non-ASCII
characters, you should call make_multibyte_string instead.  That's
because build_string might decide on its own whether to produce a
unibyte or a multibyte string, out of your control, whereas we always
want a multibyte string in this context.

build_string doesn't encode or decode its argument, it creates a Lisp
object whose text is taken from the argument.  It's similar to
make_number in that respect.

>     Again in module.c:
>     
>     /*
>     * Emacs internal encoding is more-or-less UTF8, let's assume utf8
>     * encoded emacs string are the same byte size.
>     */
>     
>     if (!buffer || length == 0 || *length-1 < raw_size)
>     {
>     *length = raw_size + 1;
>     return false;
>     }
>     
>     I don't understand why you need this assumption. You are going to
>     encode the string in a moment, so why not test 'length' against the
>     size you actually obtain there? (The above test will misfire when the
>     original string includes characters above the Unicode codespace, which
>     require 5 bytes internally, but their encoding maps them to Unicode
>     codepoints which cannot take more than 4 bytes. So you might reject
>     perfectly valid calls.)
>     
>     In module_make_string you have:
>     
>     /* Assume STR is utf8 encoded */
>     return lisp_to_value (env, make_string (str, length));
>     
>     The discussion I pointed to above concluded that <quote>make_string is
>     a bug</quote>. So please use make_multibyte_string here instead.
>     
> 
> See above; my suggestion would be to change the string handling code by
> limiting encoding and decoding to a small set of functions where the encoding
> would have to be specified explicitly.

Once again, that's not the issue.  AFAIU, you have already specified
(albeit implicitly) that all strings passed as arguments to and from
the module functions are UTF-8 encoded.  And that's fine by me, that's
not what these comments are about.

The first comment is about a minor issue in
module_copy_string_contents.  Consider this fragment from that
function:

  ptrdiff_t raw_size = SBYTES (lisp_str);

  /* Emacs internal encoding is more-or-less UTF8, let's assume utf8
     encoded emacs string are the same byte size.  */

  if (!buffer || length == 0 || *length-1 < raw_size)
    {
      *length = raw_size + 1;
      return false;
    }

  Lisp_Object lisp_str_utf8 = ENCODE_UTF_8 (lisp_str);
  eassert (raw_size == SBYTES (lisp_str_utf8));
  *length = raw_size + 1;
  memcpy (buffer, SDATA (lisp_str_utf8), SBYTES (lisp_str_utf8));
  buffer[raw_size] = 0;

The comment and the 'if' clause after it in effect tell that you want
to check whether 'buffer' has enough space to hold the encoded string,
but you don't know how many bytes will be needed for that.  So you
make some assumption (which, as I pointed out, could be wrong in rare
cases), and reject the strings that fail your test.  What I'm sating
is that after the call to ENCODE_UTF_8, you can compute the exact
length in bytes of the UTF-8 encoded string, so if you move the test
there, you won't have to make any assumptions and you won't err.

The second comment is again about calling make_string (which
build_string calls internally): use make_multibyte_string instead.

>     static void module_set_user_finalizer (emacs_env *env,
>     emacs_value uptr,
>     emacs_finalizer_function fin)
>     {
>     check_main_thread ();
>     eassert (module_non_local_exit_check (env) == emacs_funcall_exit_return);
>     const Lisp_Object lisp = value_to_lisp (uptr);
>     if (! USER_PTRP (lisp)) module_wrong_type (env, Quser_ptr, lisp);
>     XUSER_PTR (lisp)->finalizer = fin; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>     }
>     
>     No validity checks on 'fin'?
>     
> How should it be validated? In C an arbitrary (invalid) pointer could be
> passed. I think we just have to accept that this is UB.

Given the extensive checks elsewhere in the file, this surprised me,
that's all.  How to test it? calling valid_pointer_p would be one
thing I'd suggest.  Maybe there are other (perhaps platform-dependent)
checks possible here, I would have to look around.  For example, it
would be good to make sure this points into executable code.

>     Shouldn't module-init call dynlib_close before it returns? Otherwise
>     we are leaking descriptors here, no?
> 
> My impression from reading dlclose(3) is that modules shouldn't be unloaded
> while they are still used.

Yes, it was my mistake in understanding how the dynamic linking works
in this case.  Sorry.

>     if (len > INT_MAX || len < envptr->min_arity || (envptr->max_arity >= 0 &&
>     len > envptr->max_arity))
>     xsignal2 (Qwrong_number_of_arguments, module_format_fun_env (envptr),
>     make_number (len));
>     
>     Why the test against INT_MAX? EMACS_INT can legitimately be a 64-bit
>     data type with values far exceeding INT_MAX. Why impose this
>     limitation here?
> 
> Because the nargs argument in the module interface is an int.

Shouldn't it be EMACS_INT instead?

> If you think functions with more than INT_MAX arguments should be supported,
> the type for nargs should be changed to int64.

I thought your design decision was already to use int64 for any
integer argument, no?

Anyway, something that comes from Lisp should allow EMACS_INT, IMO.

>     allocate_emacs_value calls malloc; shouldn't it call xmalloc instead,
>     or at least conform to the XMALLOC_BLOCK_INPUT_CHECK protocol?
> 
> If xmalloc is called, then we need to make sure that no signals (longjmps) can
> escape to module code. If appropriate setjmps are in place that should be
> doable, but I need to check whether there are edge cases.

Paul already addressed that: xmalloc in Emacs does double or even
triple duty, and it would be a shame to lose that here.  So I think we
should try to find a way to modify xmalloc such that it satisfies the
module code as well.

>     Btw, I wonder whether we should provide a more friendly capabilities
>     for retrieving the function name and its module. dladdr is not
>     portable enough, and we have all the information at our fingertips
>     in the module's init function, we just need to keep it instead of
>     throwing it away. I envision debugging module-related problems will
>     not be a very rare situation, so we need any help we can get. WDYT?
> 
> Hmm, I don't know whether we have access to the function name without using
> dladdr.

I thought about recording their names when you bind them to Lisp
symbols.

>     About the tests: The Makefile in mod-test is Unix-specific: it uses a
>     literal .so extension. I also think the Python script should be
>     rewritten in Emacs Lisp, so that Python installation is not required.
>     Finally, all of the module tests and associated files should be moved
>     into test/, preferably even test/automated/ and made part of the "make
>     check" run.
> 
> Yes, tracked in https://github.com/aaptel/emacs-dynamic-module/issues/34 

You will see that I put there some preliminary stuff, so at least
these tests can be run manually on Windows, by overriding a single
Make variable from the command line.

Thanks.
[Prev in Thread]
Current Thread
[Next in Thread]
Re: Dynamic loading progress, (continued)
Prev by Date: Re: Making --with-wide-int the default
Next by Date: RE: Apparent regression on the undo command
Previous by thread: Re: Dynamic loading progress
Next by thread: Re: Dynamic loading progress
Index(es):
- Date
- Thread