bug-hurd

Re: RFC: Lightweight synchronization mechanism for gnumach v3


From: Agustina Arzille
Subject: Re: RFC: Lightweight synchronization mechanism for gnumach v3
Date: Thu, 28 Apr 2016 23:16:31 -0300

Hello, Samuel

On 04/25/2016 09:37 PM, Samuel Thibault wrote:
> Well, I wouldn't say so. Our current implementation does yield to other
> threads, which is not what is usually done by spin lock implementations:
> usually they really spin on the value, without making system calls, so
> as to acquire as fast as possible. Such kinds of locks are of course
> delicate to use: you have to control where threads are running,
> otherwise you could be spinning for a whole scheduling quantum.
>
> It happens that the current use of spin locks from translators assumes
> that the spin locks are somehow yielding: they really don't do control
> where threads are running.  These were converted as such from the
> cthreads library, which does yield.  Probably we should just turn them
> into using mutexes, which should become very lightweight with gsync.
> But let's do step by step, so for now we have to keep pthread_spin_lock
> somehow yield.  But yielding blindly like currently is done is not
> the best way to achieve things, especially when we have the gsync
> facility which allows to exactly get an optimized behavior with not much
> overhead. Since we'll want to turn __spin_lock_solid (which is really
> supposed to be somehow yielding) into using gsync anyway, that'll make
> our current pthread_spin_lock implementation block with gsync, and get
> better performance.
>
> We can turn translators into using mutexes and then fix
> pthread_spin_lock into really spinning, but independently.

You make a very good point. I also hadn't considered that, since Hurd tasks
are always multithreaded (because of the signal thread), rewriting the spin
locks will improve performance even when libpthread isn't linked.
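
To make the distinction concrete, here is a minimal sketch of the two
behaviours being discussed: a lock that really spins on the value, and one
that yields between attempts. It uses plain C11 atomics and sched_yield();
the demo_spin_* names are made up for illustration and this is not the
glibc or libpthread code.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical type for illustration; not glibc's __spin_lock_t.
   0 = free, 1 = held.  */
typedef struct { atomic_int held; } demo_spin_t;
#define DEMO_SPIN_INIT { 0 }

/* Returns 1 if the lock was acquired, 0 if it was already held.  */
static int demo_spin_trylock(demo_spin_t *l)
{
  return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Pure spinning: no system calls, fastest hand-off, but it can burn a
   whole scheduling quantum if the owner is not currently running.  */
static void demo_spin_lock(demo_spin_t *l)
{
  while (!demo_spin_trylock(l))
    ;
}

/* Yielding variant: gives up the CPU after every failed attempt.  */
static void demo_spin_lock_yield(demo_spin_t *l)
{
  while (!demo_spin_trylock(l))
    sched_yield();
}

static void demo_spin_unlock(demo_spin_t *l)
{
  int prev = atomic_exchange_explicit(&l->held, 0, memory_order_release);
  assert(prev == 1);   /* unlocking a free lock is a caller bug */
  (void) prev;
}
```

The yielding variant still wakes up blindly on every scheduling pass, which
is the part a kernel-assisted wait is meant to replace.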

> You mean the lock protecting __pthread_threads?  That's only used on
> thread creation and pthread_self calls, which are really not that often,
> actually, so that's not really the problem.

It's more than that. For each pthread_t value, we need the thread descriptor
structure, and in order to fetch it, a global rwlock must be acquired. That
is a totally unnecessary performance hit (and it wastes memory, too).

>> Still worth to rewrite, because the algorithm in hlpt is very fast :)
> We could integrate that optimization, yes. That's just not the most
> pressing thing to fix for performances :)
>
> Samuel

Very well.

Anyway, here's my plan. Since it seems like a good idea to post multiple
patches instead of a single large one, I have the following in mind:

Patch 1: Add gsync-based locks to glibc
- Add lowlevellock.* to the mach directory
- Modify the errno script to generate error codes for robust locks
- Rewrite __spin_lock and co. so that they use lowlevellocks
- Rewrite all the __libc_lock* stuff to also use them
- (Maybe) Do the same for the stdio locks
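
As a rough idea of what a lowlevellock looks like, here is a sketch of the
usual three-state, futex-style lock word. Everything named demo_* is
hypothetical, and demo_wait/demo_wake_one are stand-ins that merely yield
or do nothing; they are not the gsync interface, only placeholders for
"block while the word still holds this value" and "wake one waiter".

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* 0 = unlocked, 1 = locked, 2 = locked and there may be waiters.  */
typedef atomic_int demo_lll_t;

/* Stand-in (assumption): a real version would sleep in the kernel for
   as long as *addr still equals 'expected'.  Here we only yield.  */
static void demo_wait(demo_lll_t *addr, int expected)
{
  if (atomic_load_explicit(addr, memory_order_relaxed) == expected)
    sched_yield();
}

/* Stand-in (assumption): a real version would wake one sleeper.  */
static void demo_wake_one(demo_lll_t *addr)
{
  (void) addr;
}

/* Returns nonzero if the lock was taken.  */
static int demo_lll_trylock(demo_lll_t *l)
{
  int unlocked = 0;
  return atomic_compare_exchange_strong_explicit(
      l, &unlocked, 1, memory_order_acquire, memory_order_relaxed);
}

static void demo_lll_lock(demo_lll_t *l)
{
  /* Fast path: uncontended 0 -> 1, no kernel involvement.  */
  if (demo_lll_trylock(l))
    return;

  /* Slow path: mark the word as contended and wait until we are the
     one who swaps it away from 0.  */
  while (atomic_exchange_explicit(l, 2, memory_order_acquire) != 0)
    demo_wait(l, 2);
}

static void demo_lll_unlock(demo_lll_t *l)
{
  int prev = atomic_exchange_explicit(l, 0, memory_order_release);
  assert(prev != 0);        /* must have been locked */
  if (prev == 2)            /* only pay for a wake-up under contention */
    demo_wake_one(l);
}
```

The point of the design is that both lock and unlock stay entirely in user
space when there is no contention.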

Patch 2: Basic pthread objects
- Rewrite pthread mutexes and condvars to use gsync
- These two need to go together because condvars depend on some internal
  details of mutexes in order to implement the wait morphing optimization
  that prevents the 'thundering herd' issue.
- Additionally, fix a few things, such as cancellation handling

Patch 3: Semaphores and rwlocks
- Implement POSIX semaphores, needed by PostgreSQL
- Rewrite read-write locks, including the lockless algorithm from hlpt
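
For the semaphore part, the core is just a counter updated with
compare-and-swap. The sketch below shows that core only; the demo_sem_*
names are invented, and the blocking path simply yields where a real
implementation would sleep in the kernel and be woken by a post.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical type for illustration; not the POSIX sem_t layout.  */
typedef struct { atomic_uint value; } demo_sem_t;

/* Returns 0 if the count was decremented, -1 if it was already zero.  */
static int demo_sem_trywait(demo_sem_t *s)
{
  unsigned v = atomic_load_explicit(&s->value, memory_order_relaxed);
  while (v != 0)
    {
      /* On failure 'v' is reloaded with the current count.  */
      if (atomic_compare_exchange_weak_explicit(
              &s->value, &v, v - 1,
              memory_order_acquire, memory_order_relaxed))
        return 0;
    }
  return -1;
}

static void demo_sem_wait(demo_sem_t *s)
{
  while (demo_sem_trywait(s) != 0)
    sched_yield();   /* stand-in for blocking while the count is zero */
}

static void demo_sem_post(demo_sem_t *s)
{
  atomic_fetch_add_explicit(&s->value, 1, memory_order_release);
  /* A real implementation would wake one waiter here, if any.  */
}
```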

What do you guys think?



