Re: Concurrency via isolated process/thread

emacs-devel

From:	Ihor Radchenko
Subject:	Re: Concurrency via isolated process/thread
Date:	Sun, 09 Jul 2023 13:58:51 +0000
Eli Zaretskii <eliz@gnu.org> writes:

>> >> > Which variables can safely and usefully be made thread-local?
> ...
> In this case, the assumption is that the gap is always accurate except
> for short periods of time, during which buffer text cannot be accessed
> via buffer positions.

Thanks for the clarification.
But note that I was suggesting which global state variables may be
converted to thread-local.

I now understand that the gap can be moved by code that is not
actually writing text into the buffer. However, I do not see how this is
a problem we need to care about more than the generic problem of
simultaneous writes.

If a variable or object value is being written, we need to block it.
If a buffer object is being written (like when moving the gap or writing
text), we need to block it. And this blocking will generally pose a
problem only when multiple threads try to access the same object, which
is generally unlikely.
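
A minimal sketch of such per-object blocking in C11, assuming a lock
word attached to every object. The names (toy_object, toy_try_read,
etc.) are hypothetical and nothing like this exists in Emacs today; the
try-style calls return false where a real implementation would wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-object lock word:
   -1 = one writer, 0 = free, N > 0 = N concurrent readers.  */
struct toy_object
{
  atomic_int lock;
};

void
toy_object_init (struct toy_object *o)
{
  atomic_init (&o->lock, 0);
}

/* A writer needs the object all to itself.  */
bool
toy_try_write (struct toy_object *o)
{
  int expected = 0;
  return atomic_compare_exchange_strong (&o->lock, &expected, -1);
}

/* Readers may share the object, but not with a writer.  */
bool
toy_try_read (struct toy_object *o)
{
  int cur = atomic_load (&o->lock);
  while (cur >= 0)
    /* On failure CUR is refreshed with the current lock value.  */
    if (atomic_compare_exchange_weak (&o->lock, &cur, cur + 1))
      return true;
  return false;
}

void
toy_end_write (struct toy_object *o)
{
  assert (atomic_load (&o->lock) == -1);
  atomic_store (&o->lock, 0);
}

void
toy_end_read (struct toy_object *o)
{
  assert (atomic_load (&o->lock) > 0);
  atomic_fetch_sub (&o->lock, 1);
}
```

While a writer holds one object, toy_try_read on that object fails but
toy_try_read on any other object succeeds, so threads only get in each
other's way when they touch the same object.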

The global state is another story. Redisplay, consing data, current
buffer, point, buffer narrowing, and C variables corresponding to
buffer-local Elisp variables are shared across all the threads, and
are often set by all the threads. And because point, narrowing, and some
buffer-locals (like `case-fold-search') are so ubiquitous, blocking them
will block everything. (Also, unwinding may contribute here, judging
from how thread.c juggles it.)

So, if we want to have async support in Emacs, we need to find out how
to deal with each component of the global state without global locking:

1. There is consing, which is fundamentally not asynchronous.

   It is not guaranteed, but I _hope_ that a lock during consing
   will be manageable.

   We need to ensure that simultaneous consing will never happen. AFAIU,
   it should be ok if something that does not involve consing is running
   at the same time as consing (correct me if I am wrong here).

2. Redisplay cannot be asynchronous, in the sense that it makes no
   sense for multiple threads, possibly working with different buffers
   and different points in those buffers, to request redisplay
   simultaneously. Of course, it is impossible to display several places
   in a buffer at once.

   Only a single `main-thread' should be allowed to modify frames,
   window configurations, and generally trigger redisplay. And a thread
   that attempts to do such modifications must wait to become
   `main-thread' first.

   This means that any code that is using things like
   `save-window-excursion', `display-buffer', and other display-related
   stuff cannot run asynchronously.

   But I still believe that useful Elisp code can be written without a
   need to trigger redisplay. I have seen plenty of examples in Org and
   I have refactored a number of functions to avoid stuff like
   `switch-to-buffer' in favour of `with-current-buffer'.

3. Current buffer, point position, and narrowing.

   In the current design, Emacs always has a single global current
   buffer, current point position, and narrowing state in that buffer.
   Even when we switch cooperative threads, a thread must update its
   thread->current_buffer to previous_thread->current_buffer; and update
   point and narrowing by calling set_buffer_internal_2.

   The current design is incompatible with async threads - they must be
   able to have different buffers, points, and narrowing states current
   within each thread.

   That's why I suggested converting PT, BEGV, and ZV into
   thread-locals.

   Note that PT, BEGV, and ZV are currently stored in the buffer object
   before leaving a buffer and recovered when setting a new buffer.
   Async threads will invalidate the assumption that, after
   (set-buffer "1") (goto-char 100) (set-buffer "2") (set-buffer "1"),
   (= (point) 100) holds.

4. Buffer-local variables defined in C have C variable equivalents that
   are updated as Emacs changes current_buffer.

   AFAIU, their purpose is to make buffer-local variables and normal
   Elisp variables uniformly accessible from C code - C code does not
   need to worry about Vfoo being buffer-local or not, and can just set
   it.

   This is not compatible with async threads that work with several buffers.

   I currently do not fully understand how defining C variables works in
   DEFVAR_LISP.
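
To make items 3 and 4 above concrete, here is a toy C model of the
store-on-leave / recover-on-enter scheme, with two threads interleaved
by hand. It is only a sketch under simplifying assumptions: toy_buffer,
toy_thread, toy_Vfoo, and the functions are all made up for
illustration and do not mirror the real struct buffer, thread.c, or
DEFVAR machinery.

```c
#include <assert.h>
#include <stddef.h>

/* Toy buffer: remembers point and one buffer-local value.  */
struct toy_buffer
{
  ptrdiff_t saved_pt;   /* point stored when a thread leaves the buffer */
  int local_foo;        /* buffer-local value of a hypothetical `foo' */
};

/* A single C-level variable mirroring `foo' in the current buffer,
   in the spirit of a Vfoo.  It is shared by all threads.  */
int toy_Vfoo;

/* Toy per-thread state: current buffer and point in it.  */
struct toy_thread
{
  struct toy_buffer *buf;
  ptrdiff_t pt;
};

void
toy_set_buffer (struct toy_thread *th, struct toy_buffer *b)
{
  if (th->buf)
    th->buf->saved_pt = th->pt;   /* store point on leave */
  th->buf = b;
  if (b)
    {
      th->pt = b->saved_pt;       /* recover point on enter */
      toy_Vfoo = b->local_foo;    /* load the buffer-local into the C variable */
    }
}

void
toy_goto_char (struct toy_thread *th, ptrdiff_t pos)
{
  assert (th->buf != NULL);
  th->pt = pos;
}

/* Run the interleaving and report what thread A observes.  */
void
toy_interleave (ptrdiff_t *point_seen_by_a, int *foo_seen_by_a)
{
  struct toy_buffer b1 = { 1, 10 }, b2 = { 1, 20 };
  struct toy_thread a = { NULL, 0 }, b = { NULL, 0 };

  /* Thread A: (set-buffer "1") (goto-char 100) (set-buffer "2")  */
  toy_set_buffer (&a, &b1);
  toy_goto_char (&a, 100);
  toy_set_buffer (&a, &b2);

  /* Thread B runs in between: visits "1", moves point, leaves.  */
  toy_set_buffer (&b, &b1);
  toy_goto_char (&b, 5);
  toy_set_buffer (&b, NULL);

  /* Thread A: (set-buffer "1") (point)  */
  toy_set_buffer (&a, &b1);
  *point_seen_by_a = a.pt;

  /* Thread B switches to "2" while A is still working in "1".  */
  toy_set_buffer (&b, &b2);
  *foo_seen_by_a = toy_Vfoo;
}
```

In this model thread A comes back to buffer "1" with point at 5 rather
than 100, and then reads toy_Vfoo as 20 although buffer "1" holds 10.
Keeping point (and the loaded value) in per-thread state instead of in
shared storage is what would avoid both surprises.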

>> Asynchronous writing is probably a poor idea anyway - the very
>> idea of a gap does not work well when we need to write in multiple
>> far-away places in buffer.
>
> What if the main thread modifies buffer text, while one of the other
> threads wants to read from it?

Reading and writing should be blocked while the buffer is being modified.

>> >> For example, `org-element-interpret-data' converts Org mode AST to
>> >> string. Just now, I tried it using AST of one of my large Org buffers.
>> >> It took 150seconds to complete, while blocking Emacs.
>> >
>> > It isn't side-effect-free, though.
>> 
>> It is, just not declared so.
>
> No, it isn't.  For starters, it changes obarray.

Do you mean `intern'? `intern-soft' would be equivalent there.

>> We do not, but it may be possible to add assertions that will ensure
>> purity in whatever sense we need.
>
> Those assertions will fire in any useful program with 100% certainty.
> Imagine the plight of an Emacs Lisp programmer who has to write and
> debug such programs.
>
> We have in Emacs gazillion lines of Lisp code, written, debugged, and
> tested during 4 decades.  We use those, almost without thinking, every
> day for writing Lisp programs.  What you suggest means throwing away
> most of that and starting from scratch.

There will indeed be a lot of work to make the range of Lisp functions
available for async code large enough. But it does not have to be done
all at once.

Of course, we first need to make sure that there are no hard blockers,
like global state. I do not think that Elisp code will be the blocker if
we find out how to deal with Emacs global state on C level.

> I mean, take the simplest thing, like save-buffer-excursion or
> with-selected-window, something that we use all the time, and look how
> much of the global state they access and change.  Then imagine that
> you don't have these and need to write programs that switch buffers
> and windows temporarily in thread-safe way.  Then reflect on what this
> means for all the other useful APIs and subroutines we have.

These examples touch on very basic aspects that we need to take
care of for async: (1) point/buffer; (2) unwind; (3) redisplay.
I think that (3) is not something that should be allowed as async. (1)
and (2) are to be discussed.

P.S. I am struggling to understand swap_in_symval_forwarding:

      /* Unload the previously loaded binding.  */
      tem1 = blv->valcell;

Is the above assignment redundant?

      if (blv->fwd.fwdptr)
        set_blv_value (blv, do_symval_forwarding (blv->fwd));

      /* Choose the new binding.  */
      {
        Lisp_Object var;
        XSETSYMBOL (var, symbol);
        tem1 = assq_no_quit (var, BVAR (current_buffer, local_var_alist));

This assignment always triggers after the first one, overriding it.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>