[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Dynamic loading progress

From: Eli Zaretskii
Subject: Re: Dynamic loading progress
Date: Thu, 19 Nov 2015 17:26:59 +0200

Thanks for making this possible.

Please allow me a few questions/comments about the dynamic modules
code, based on some initial reading:

Why module.c, but confusingly emacs_module.h?  Are there any reasons
not to use the same base name, like we do for other C sources?

This comment in module.c:

  /* If checking is enabled, abort if the current thread is not the
     Emacs main thread. */
  static void check_main_thread (void);

confused me, because a later comment says:

  void module_init (void)
    /* It is not guaranteed that dynamic initializers run in the main thread,
       therefore we detect the main thread here. */

If dynamic initializers might run in a thread different from the Emacs
main thread, then the code in module_init will record that other
thread, not the Emacs main thread, right?

Also, another comment:

  /* On Windows, we store both a handle to the main thread and the
     thread ID because the latter can be reused when a thread
     terminates. */

seems to imply that 'main_thread' here is not the Emacs's main thread,
because that thread never terminates as long as the Emacs session is

So what's the deal here? what does this thread checking supposed to

In this code from module.c:

      Lisp_Object value = HASH_VALUE (h, i);
      eassert (NATNUMP (value));
      const EMACS_UINT refcount = XFASTINT (value);
      if (refcount >= MOST_POSITIVE_FIXNUM)
          module_non_local_exit_signal_1 (env, Qoverflow_error, Qnil);
          return NULL;

how can the 'if' clause ever be true?  refcount is an Emacs integer,
as you have just verified, no?  And if this somehow can happen, then
why isn't there a similar check in the other functions?

Re this fragment from module.c:

  Lisp_Object ret = list4 (Qlambda,
                           list2 (Qand_rest, Qargs),
                           documentation ? build_string (documentation) : Qnil,
                           list3 (module_call_func,

Thou shalt not use build_string, except when you _know_ the argument
will always be a pure-ASCII string.  Practically, this means the
argument must be a constant ASCII string.  See these messages (and the
preceding discussion, if you are interested) for the gory details:


The above should call make_multibyte_string instead.

Again in module.c:

   * Emacs internal encoding is more-or-less UTF8, let's assume utf8
   * encoded emacs string are the same byte size.

  if (!buffer || length == 0 || *length-1 < raw_size)
      *length = raw_size + 1;
      return false;

I don't understand why you need this assumption.  You are going to
encode the string in a moment, so why not test 'length' against the
size you actually obtain there?  (The above test will misfire when the
original string includes characters above the Unicode codespace, which
require 5 bytes internally, but their encoding maps them to Unicode
codepoints which cannot take more than 4 bytes.  So you might reject
perfectly valid calls.)

In module_make_string you have:

  /* Assume STR is utf8 encoded */
  return lisp_to_value (env, make_string (str, length));

The discussion I pointed to above concluded that <quote>make_string is
a bug</quote>.  So please use make_multibyte_string here instead.

It looks like XUSER_PTR is used both as an lvalue and an rvalue.  This
is different from any other object, where we have separate Xfoo and
XSETfoo macros.  Suggest to follow suit.

  static void module_set_user_finalizer (emacs_env *env,
                                             emacs_value uptr,
                                             emacs_finalizer_function fin)
    check_main_thread ();
    eassert (module_non_local_exit_check (env) == emacs_funcall_exit_return);
    const Lisp_Object lisp = value_to_lisp (uptr);
    if (! USER_PTRP (lisp)) module_wrong_type (env, Quser_ptr, lisp);
    XUSER_PTR (lisp)->finalizer = fin; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

No validity checks on 'fin'?

In module_vec_get:

  /* Prevent error-prone comparison between types of different signedness. */
  const size_t size = ASIZE (lvec);
  eassert (size >= 0);

How can the assertion be ever violated?

In module-load:

  CHECK_STRING (file);
  handle = dynlib_open (SDATA (file));

Every Lisp primitive that accepts file arguments _must_ call
expand-file-name on the file name, before using it.  Otherwise,
relative file names will produce subtle and hard-to-debug problems
when the Lisp program calling them involves changing the current
directory of the Emacs process.

The other mandatory thing absent from the above is ENCODE_FILE.  You
cannot pass unencoded file names to C runtime functions.

  struct {
    struct emacs_runtime pub;
    struct emacs_runtime_private priv;
  } runtime = {
    .pub = {
      .size = sizeof runtime.pub,
      .get_environment = module_get_environment,
      .private_members = &runtime.priv

Is this portable enough?

  int r = module_init (&runtime.pub);

I think calling this function "module_init" is sub-optimal: you have a
void function by the same name in this source file.  How about
renaming it to something else?

Also, it seems that once a module is loaded, it cannot be unloaded
(i.e., no unload-feature equivalent is provided).  Is that on purpose?
An Emacs process can live for a very long time, so keeping all of the
modules open in it at all times is not necessarily TRT.  E.g., how do
I update to a newer version of a module?

Shouldn't module-init call dynlib_close before it returns?  Otherwise
we are leaking descriptors here, no?

In module-call:

  const EMACS_INT len = XINT (Flength (arglist));
  eassert (len >= 0);
    xsignal0 (Qoverflow_error);

How can the 'if' clause ever be true?  XINT by definition cannot
produce anything but a valid EMACS_INT, can it?

  if (len > INT_MAX || len < envptr->min_arity || (envptr->max_arity >= 0 && 
len > envptr->max_arity))
    xsignal2 (Qwrong_number_of_arguments, module_format_fun_env (envptr), 
make_number (len));

Why the test against INT_MAX?  EMACS_INT can legitimately be a 64-bit
data type with values far exceeding INT_MAX.  Why impose this
limitation here?

allocate_emacs_value calls malloc; shouldn't it call xmalloc instead,
or at least conform to the XMALLOC_BLOCK_INPUT_CHECK protocol?

In module_format_fun_env, you produce a unibyte string, and then use
that in calls to functions like xsignal1, which expect Lisp strings in
their internal multibyte representation.  You should instead decode
the unibyte string (using UTF-8) before you return it.

Btw, I wonder whether we should provide a more friendly capabilities
for retrieving the function name and its module.  dladdr is not
portable enough, and we have all the information at our fingertips
in the module's init function, we just need to keep it instead of
throwing it away.  I envision debugging module-related problems will
not be a very rare situation, so we need any help we can get.  WDYT?

In syms_of_module:

  /* Unintern `module-environments' because it is only used
     internally. */
  Funintern (Qmodule_environments, Qnil);

What if some Lisp defines a (interned) symbol by that name?  Won't
they clash?

The Windows-specific parts of dynlib.c need work, e.g. you cannot pass
UTF-8 encoded file names to Windows APIs.  And there are some other
issues.  I'll take care of that.

About the tests: The Makefile in mod-test is Unix-specific: it uses a
literal .so extension.  I also think the Python script should be
rewritten in Emacs Lisp, so that Python installation is not required.
Finally, all of the module tests and associated files should be moved
into test/, preferably even test/automated/ and made part of the "make
check" run.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]