Re: Consideration for Rust contributions in Emacs
From: Po Lu
Subject: Re: Consideration for Rust contributions in Emacs
Date: Sun, 22 Jan 2023 15:44:19 +0800
User-agent: Gnus/5.13 (Gnus v5.13)
Troy Hinckley <comms@dabrev.com> writes:
> Let assume for the sake of this discussion that there was a some Rust
> code that someone wanted to contribute and the maintainers wanted the
> functionality it provided. What would be the consideration/objections?
It is hard to say for certain, because you have not said what code you
have in mind.
> 1 The Rust tool-chain is Apache licensed and so is LLVM. There is work
> on a GCC backend, but it is not production ready yet. Would Emacs
> allow the current Rust tool-chain?
No. Emacs code for a platform where GCC is available must be written so
that it can be compiled with GCC. We already have this policy in place,
and it prevents us from using Objective-C features that are only
supported by Clang in the Nextstep port.
> 2 LLVM (and hence Rust) support fewer targets than GCC. Are there
> certain target that LLVM doesn’t support that are important to Emacs?
For example: MS-DOS, via DJGPP.
Or, Windows 9X. I use Emacs to code-convert files on a Windows 98
machine used to run government software.
And also the various Unix systems that LLVM currently does not support:
HP-UX and AIX.
> 3 Many Rust libraries (crates) are MIT and/or Apache licensed. Do all
> Libraries used by GNU Emacs need to be GPL or is it sufficient to have
> a GPL compatible license?
I would be more concerned about how Rust libraries are distributed.
Isn't the Rust package manager one of those which download libraries
directly from a source code repository?
> 4 How sizable of a contribution would be needed for the maintainers to
> accept Rust in Emacs core? Would auxiliary functionality be considered
> (such as Rust in the Linux Kernel) or would it need to have major
> impact.
It will probably not be accepted.
> 5 Concerns over having more than one core language in GNU Emacs.
Yes.
> 6 Concerns over using such a new language. Rust still changes at a
> fast pace relative to C and it’s future is less certain then a more
> established language.
Yes. Rust is not widely available at all, while C is available for
every platform, from 8-bit microcontrollers, to 16-bit and 24-bit digital
signal processors, to 32-bit and 64-bit consumer computers, and will
remain that way for the foreseeable future.
Emacs is a portable program written in C. Thus, any code that is not
strictly a port to some other platform should also be written in
standard C99.
In the past, people wanted to rewrite Emacs in Scheme. Then, it was
C++. Then, it was Java. Now, it is Rust.
Part of the reason Emacs has existed for so long is that it has not
given in to those demands, and remains a highly portable program,
written in a standardized language, that has remained more or less
constant since Emacs was written. Rust, on the other hand,
frequently releases breaking changes with new versions of the
programming language, and is not standardized in any way. This shows in
that even an operating system supposedly written in Rust provides a C
toolchain and C runtime library.
Now, judging by recent internet chatter, you probably think Rust will
magically allow Emacs Lisp to run on different threads.
Some people seem to have this idea that because the Rust compiler will
try to prevent two threads from having access to a variable at the same
time, writing Emacs in Rust will, as if by magic, allow multiple Lisp
threads to run at once.
That is not true. The Rust compiler does not try to avoid concurrency
pitfalls aside from the simple data race: for example, locking a
non-recursive mutex twice is not only a programming error, it is
undefined behavior!
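To make the mutex point concrete in POSIX terms, here is a minimal sketch.
It assumes a pthreads implementation; the function name
relock_errorcheck_mutex is made up for illustration. With the default mutex
type, relocking from the owning thread is undefined, so the sketch asks for
PTHREAD_MUTEX_ERRORCHECK instead, where POSIX specifies that the second lock
fails with EDEADLK rather than hanging or doing something worse.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock an error-checking mutex twice from the
   same thread and return the result of the second lock attempt.  */
int
relock_errorcheck_mutex (void)
{
  pthread_mutexattr_t attr;
  pthread_mutex_t mutex;
  int rc, second;

  rc = pthread_mutexattr_init (&attr);
  assert (rc == 0);
  rc = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
  assert (rc == 0);
  rc = pthread_mutex_init (&mutex, &attr);
  assert (rc == 0);
  pthread_mutexattr_destroy (&attr);

  pthread_mutex_lock (&mutex);
  /* With PTHREAD_MUTEX_DEFAULT this second call would be undefined
     behavior; the error-checking type reports the mistake instead.  */
  second = pthread_mutex_lock (&mutex);

  pthread_mutex_unlock (&mutex);
  pthread_mutex_destroy (&mutex);
  return second;
}
```

Calling relock_errorcheck_mutex () should return EDEADLK. The mistake is
only detected at run time, and only because the mutex type was chosen
with that in mind.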
In addition, locking (and not locking) needs to be carefully thought
out, or it will slow down single threaded execution. For example, a
common pattern found inside Emacs code is:
  for (; CONSP (tem); tem = XCDR (tem))
    /* Do something with XCAR (tem) */;
XCAR and XCDR work fine unchanged on most machines without needing any
kind of locking, as Lisp_Cons is always aligned to Lisp_Object.
However, it does not work on two kinds of machine, which need either
explicit memory barrier instructions or (in the case of vectors) mutexes:
On the (64-bit) Alpha, memory ordering across CPUs is extremely lax, and
without the appropriate barrier instructions, even aligned reads and
writes of 64-bit words are not atomic.
On x86 (and possibly other platforms) with --with-wide-int, reads and
writes of Lisp_Object require separate moves and stores to and from two
32-bit registers, which is obviously not atomic.
If XCDR (tem) then reads from a cons whose cdr cell is in the process
of being written to, it may dereference a nonsense pointer.
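The --with-wide-int case can be illustrated with a single-threaded
simulation. This is only a sketch: wide_word and the helper functions are
hypothetical stand-ins, not Emacs's Lisp_Object, and the "interleaving" is
simply a read placed between the two 32-bit stores that a real writer
would perform.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit word kept as two 32-bit halves,
   as on a 32-bit machine.  */
typedef struct
{
  uint32_t lo;
  uint32_t hi;
} wide_word;

static void
store_lo (wide_word *w, uint64_t v)
{
  w->lo = (uint32_t) v;
}

static void
store_hi (wide_word *w, uint64_t v)
{
  w->hi = (uint32_t) (v >> 32);
}

static uint64_t
load_both (const wide_word *w)
{
  return ((uint64_t) w->hi << 32) | w->lo;
}

/* Simulate a reader running after the writer has stored the low half
   of NEW_VAL but before it has stored the high half.  */
uint64_t
torn_read_demo (uint64_t old_val, uint64_t new_val)
{
  wide_word w;

  store_lo (&w, old_val);
  store_hi (&w, old_val);

  store_lo (&w, new_val);	/* the writer is interrupted here */
  return load_both (&w);	/* new low half, old high half */
}
```

With an old value of 0x1111111122222222 and a new value of
0xAAAAAAAABBBBBBBB, the reader sees 0x11111111BBBBBBBB, which is neither
of the two. If that word were a pointer, it would point nowhere sensible.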
By contrast, this code is perfectly safe in C on x86. The
assert will never trigger:
  #include <assert.h>

  static unsigned int foo;

  void
  thread_1 (void)
  {
    foo++;
  }

  void
  thread_2 (void)
  {
    assert (foo <= 1);
  }

  int
  main (void)
  {
    /* start thread_1 and thread_2 at the same time. */
  }
I believe the Rust compiler will force you to add some code in that
case. Imagine how much overhead it would add if Emacs had to lock a
Lisp_Cons before reading or writing to it.
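To give a rough idea of that overhead, here is a sketch of a list walk
where every cell carries its own lock. Everything in it is an assumption
made for illustration: struct locked_cons, locked_list_sum and
locks_needed_for are hypothetical names, and nothing like this exists in
Emacs. It only counts how many lock acquisitions a simple traversal needs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { MAX_CELLS = 16 };

/* Hypothetical cons cell with a per-cell lock; not Emacs's Lisp_Cons.  */
struct locked_cons
{
  pthread_mutex_t lock;
  int car;
  struct locked_cons *cdr;
};

/* Sum the cars of LIST, locking each cell around the reads of its car
   and cdr.  Store the number of lock acquisitions in *LOCKS.  */
long
locked_list_sum (struct locked_cons *list, unsigned long *locks)
{
  long sum = 0;

  *locks = 0;
  while (list)
    {
      struct locked_cons *next;

      pthread_mutex_lock (&list->lock);
      ++*locks;
      sum += list->car;
      next = list->cdr;
      pthread_mutex_unlock (&list->lock);
      list = next;
    }
  return sum;
}

/* Build a list of N cells (0 <= N <= MAX_CELLS), walk it once, and
   return how many locks the walk took.  */
unsigned long
locks_needed_for (int n)
{
  struct locked_cons cells[MAX_CELLS];
  unsigned long locks;
  int i;

  assert (n >= 0 && n <= MAX_CELLS);
  for (i = 0; i < n; i++)
    {
      pthread_mutex_init (&cells[i].lock, NULL);
      cells[i].car = i + 1;
      cells[i].cdr = i + 1 < n ? &cells[i + 1] : NULL;
    }

  locked_list_sum (n > 0 ? cells : NULL, &locks);

  for (i = 0; i < n; i++)
    pthread_mutex_destroy (&cells[i].lock);
  return locks;
}
```

Walking a list of N cells takes N lock and unlock pairs, even when no
other thread ever touches the list, whereas the plain
CONSP/XCDR loop takes none.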
But the Lisp interpreter is the easy part, especially since we already
have much of the necessary interpreter state moved off into struct
thread_state. A lot of other code which is algorithmically
non-reentrant will have to be made reentrant, and papering mutexes and
atomics over them to satisfy compile-time checks will not do that. For
example, how do you propose to rewrite process.c to allow two threads to
enter wait_reading_process_output at the same time?
Who gets SIGCHLD? Who gets to read process output? In which thread do
process filters run?
In the end, you will have to do the same work you would need to in C,
with the added trouble of adding code in a new language, making everyone
else learn the new language, while throwing portability down the drain.
That doesn't sound like a good tradeoff to me.