On Thu, 3 Oct 2024 at 23:36, Alexey Ulasevich wrote:
03.10.2024 08:46, Jack Perry wrote:
...and judging from the
modula-2 reference at
....there is indeed no
NEWPROCESS procedure in SYSTEM.
Correct. In PIM, coroutines were provided by the pseudo-module SYSTEM. ISO Modula-2 changed this and moved them into a separate module, COROUTINES.
I expected the two processes to work in parallel.
But they work sequentially in the same thread. It doesn't look like `multithreading`. :-)
That is a misconception.
First, there is nothing about the terminology "thread" and "multi-threading" that requires true parallelism.
There are two approaches: (1) user level threads and (2) kernel level threads.
With user level threads, context switching is initiated explicitly in user space. With kernel level threads, context switching is initiated preemptively in kernel space. Concurrency in Modula-2 is based on user level threads, also known as coroutines. Before kernel level threads became standardised and generally available in most operating system kernels, the default concurrency model was user level threads. Modula-2 is from that era and thus it uses user level threads.
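To make the "initiated explicitly in user space" part concrete, here is a minimal sketch in Python, using generators as stand-ins for coroutines. This is not Modula-2 and not how any Modula-2 compiler implements coroutines; the names `worker` and `run_round_robin` are made up for illustration. Each `yield` plays the role of an explicit transfer of control, and no kernel scheduler is involved.

```python
from collections import deque

def worker(name, steps, log):
    """Hypothetical coroutine: does one step, then explicitly gives up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # explicit context switch, initiated in user space

def run_round_robin(coroutines):
    """Hypothetical scheduler: resume each coroutine in turn until all finish."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)          # resume the coroutine until its next yield
        except StopIteration:
            continue          # this coroutine has finished; drop it
        ready.append(co)      # still alive, so queue it again

log = []
run_round_robin([worker("A", 3, log), worker("B", 2, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'A:2']
```

The two workers interleave, so they are concurrent, yet everything runs sequentially in one thread. That is exactly the behaviour described in the original question.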
Kernel level threads are not necessarily better. In fact, kernel level threads do not scale unless you have hardware with a massively parallel architecture. User level threads can scale very high with negligible overhead on any type of hardware, even single-core/single-CPU hardware.
When I was working on software based telephony, we used two different telephony servers, one called Asterisk, the other FreeSwitch. With Asterisk you could do up to about 200 concurrent telephone calls NO MATTER HOW POWERFUL THE HARDWARE!!! At that point the overhead of context switching using kernel level threads overwhelmed the system, and the CPU (or CPUs) spent more time on switching than on processing telephone audio. Now, Asterisk was extremely badly written, with wide-ranging MUTEX locks -- I called it Rocky Mountain locking, a pun on fine grained locking, as Asterisk was doing the exact opposite of fine grained. FreeSwitch is a better test environment, as its MUTEXes are fine grained. With FreeSwitch you could do about 1000-1200 concurrent telephone calls before the kernel spent more time on context switching overhead than on processing telephone audio.
At the time I had a request for a highly scalable server, so I did various experimental implementations with coroutines faked in C using a technique called Duff's Device, and also using various open source coroutine libraries based on setjmp and longjmp. These experimental servers -- using user level threading -- scaled to 40,000 concurrent telephone calls running on a Mac Mini from about 2005 or 2006.
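As a rough illustration of why this approach scales, the sketch below keeps 40,000 coroutines alive at once in a single thread. It is written in Python with generators, not the C techniques mentioned above, and it is not the code of those experimental servers. The names `call` and `run_calls` and the "frames" counter are hypothetical, and the example says nothing about real telephony workloads or timings.

```python
from collections import deque

def call(call_id, frames, processed):
    """Hypothetical stand-in for one telephone call: handle one frame, then yield."""
    for _ in range(frames):
        processed[call_id] += 1   # pretend to process one chunk of audio
        yield                     # hand control back to the scheduler

def run_calls(n_calls, frames):
    """Run n_calls coroutines round-robin in the current thread."""
    processed = [0] * n_calls
    ready = deque(call(i, frames, processed) for i in range(n_calls))
    while ready:
        co = ready.popleft()
        try:
            next(co)              # resume this call until its next yield
        except StopIteration:
            continue              # call finished; do not requeue it
        ready.append(co)
    return processed

processed = run_calls(40_000, 3)
print(len(processed), min(processed), max(processed))  # 40000 3 3
```

Each suspended coroutine costs only a small heap object, and a switch is an ordinary function resume, so the count is limited by memory and not by kernel scheduling overhead.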
Unfortunately, Gaius has chosen to use the GNU portable threads library (aka pth) to implement coroutines in GM2. The pth library is built on top of kernel threads. It merely gives you a user interface that looks like user level threads, but the threads are actually kernel threads, which means there is no scalability gain.
I recommend reading a paper by Roberto Ierusalimschy of the Catholic University of Rio de Janeiro titled "Revisiting Coroutines". You can find it here:
BTW, Ierusalimschy is the designer and author of the Lua programming language, which, not surprisingly, also uses user level threads, just like Modula-2, and it has a reputation for extraordinary performance in its class.
regards
benjamin