Re: [PATCH 00/19] Add Thread Sanitizer support to QEMU


From: Robert Foley
Subject: Re: [PATCH 00/19] Add Thread Sanitizer support to QEMU
Date: Tue, 26 May 2020 11:18:47 -0400

On Sat, 23 May 2020 at 17:36, Emilio G. Cota <address@hidden> wrote:
>
> On Fri, May 22, 2020 at 12:07:36 -0400, Robert Foley wrote:
> > This patch series continues the work done by Emilio Cota and others to add
> > Thread Sanitizer (TSan) support to QEMU.
> >
> > The starting point for this work was Emilio's branch here:
> > https://github.com/cota/qemu/commits/tsan
> > specifically this commit: 0be125fc0afd47218b34d2019abdd19b644f3199
> >
> > The purpose of this patch series is not to fix all the TSan warnings, but to
> > enable the TSan support so that QEMU developers can start using the tool.
> > We found this tool useful and even ran it on our recent changes in
> > the cpu-locks series.
> > Clearly there is work to do here to clean up all the warnings. :)
> > We have made a start on cleaning up these warnings by getting a VM to boot
> > cleanly with no TSan warnings.
> > We have also made an effort to introduce enough of the TSan suppression
> > mechanisms so that others can continue this work.
>
> Thank you for doing this work! Great to see this finally coming along.
>
> What are your plans wrt the per-cpu-lock branch? Given that some of
> the races reported here are fixed there, I think it would make sense to
> defer those patches to the per-cpu-lock series (i.e. patches 2/19, parts
> of 13/19, and 18/19) so that this series goes in first (it is a lot
> less likely to break anything).

I agree with you that we should defer my patches which overlap with
the per-cpu-lock patch series.

>
> Also, I would not worry too much about rushing to bring the warnings to
> 0; what's important is that with the warnings we now know where to
> focus, and then we can carefully fix races. In particular I think
> all annotations we add must have a comment, since otherwise we
> risk blindly silencing warnings of real races.

In order to re-focus the series a bit, we are planning to drop the
annotations from it.  This puts more of the focus on enabling TSan and
less on bringing the warnings to 0, as you mentioned.

At the same time, we're also going to add more detail to testing.rst
on how to use the various suppression mechanisms: annotations, the
blacklist, and suppressions.txt.

>
> > Issues:
> > - When running docker-test-quick under TSan there are several tests which hang
> >   - The unit tests which seem to hang under TSan:
> >     test-char, test-qdev-global-props, and test-qga.
> >   - If we comment out those tests, check-unit finishes, albeit with
> >     a couple of warnings. :)
>
> I've noticed another issue (that I did not notice on my previous
> machine), which is that tsan blows up when in qht we lock all
> of the bucket's locks. We then get this assert from tsan, since
> it has a static limit of 64 mutexes held at any given time:
>
> FATAL: ThreadSanitizer CHECK failed: /build/llvm-toolchain-10-yegZYJ/llvm-toolchain-10-10.0.0/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector.h:67
>  "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40)
<snip>

> A workaround for now is to run qemu with TSAN_OPTIONS=detect_deadlocks=0,
> although selectively disabling tsan for qht_map_lock/unlock_buckets would
> be ideal (not sure if it's possible).

We have been using detect_deadlocks=0 to avoid this.
We will see if it is possible to disable TSan for just
qht_map_lock/unlock_buckets.
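
If it turns out to be possible at all, the obvious candidate is clang's
per-function attribute.  A minimal sketch (placeholder code, not the real
qht functions):

    /* Sketch only: no_sanitize("thread") disables TSan instrumentation
     * for this one function.  The names here are placeholders. */
    #include <pthread.h>

    #define N_BUCKETS 128
    static pthread_mutex_t bucket_locks[N_BUCKETS];

    __attribute__((no_sanitize("thread")))
    static void lock_all_buckets(void)
    {
        for (int i = 0; i < N_BUCKETS; i++) {
            pthread_mutex_lock(&bucket_locks[i]);
        }
    }

The caveat is that TSan intercepts pthread_mutex_lock() itself, so
skipping instrumentation of the caller may not be enough to keep the
deadlock detector's per-thread lock count under its limit of 64; we'll
report back once we have tried it.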

>
> Another warning I'm reliably seen is:
> WARNING: ThreadSanitizer: double lock of a mutex (pid=489006)
>     #0 pthread_mutex_lock <null> (qemu-system-aarch64+0x457596)
>     #1 qemu_mutex_lock_impl /home/cota/src/qemu/util/qemu-thread-posix.c:79:11 (qemu-system-aarch64+0xaf7e3c)
>
>   Location is heap block of size 328 at 0x7b4800114900 allocated by main thread:
>     #0 calloc <null> (qemu-system-aarch64+0x438b80)
>     #1 g_malloc0 <null> (libglib-2.0.so.0+0x57d30)
>
>   Mutex M57 (0x7b4800114960) created at:
>     #0 pthread_mutex_init <null> (qemu-system-aarch64+0x43b74d)
>     #1 qemu_rec_mutex_init /home/cota/src/qemu/util/qemu-thread-posix.c:119:11 (qemu-system-aarch64+0xaf815b)
>
> But this one seems safe to ignore.
>

We disabled this one in the suppressions.txt file:
TSan reports a double lock even when the mutex is flagged as recursive.
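
For reference, the entry is just a mutex: suppression keyed off a frame
in the report, something along these lines (the exact frame we match may
differ in the final version of the series):

    # TSan reports the second lock of a PTHREAD_MUTEX_RECURSIVE mutex as
    # a double lock, so ignore double-lock reports that go through QEMU's
    # recursive-mutex wrapper.
    mutex:qemu_rec_mutex_lock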


Thanks & Regards,
-Rob
> Thanks,
>                 E.


