Re: [Qemu-devel] [kvm-unit-tests PATCH v2 2/2] run_tests: allow run tests in parallel


From: Radim Krčmář
Subject: Re: [Qemu-devel] [kvm-unit-tests PATCH v2 2/2] run_tests: allow run tests in parallel
Date: Thu, 5 Jan 2017 20:44:02 +0100

2017-01-05 11:07+0800, Peter Xu:
> On Wed, Jan 04, 2017 at 04:09:39PM +0100, Radim Krčmář wrote:
>> 2017-01-03 18:10+0800, Peter Xu:
>> > run_tests.sh is getting slow. This patch is trying to make it faster by
>> > running the tests concurrently.
>> > 
>> > We provide a new parameter "-j" for the run_tests.sh, which can be used
>> > to specify how many run queues we want for the tests. Default queue
>> > length is 1, which is the old behavior.
>> > 
>> > Quick test on my laptop (4 cores, 2 threads each) shows 3x speed boost:
>> > 
>> >    |------------------+-----------|
>> >    | command          | time used |
>> >    |------------------+-----------|
>> >    | run_tests.sh     | 75s       |
>> >    | run_tests.sh -j8 | 27s       |
>> >    |------------------+-----------|
>> > 
>> > Suggested-by: Radim Krčmář <address@hidden>
>> > Signed-off-by: Peter Xu <address@hidden>
>> > ---
>> >  run_tests.sh           | 12 ++++++++++--
>> >  scripts/functions.bash | 15 ++++++++++++++-
>> >  scripts/global.bash    | 11 +++++++++++
>> >  3 files changed, 35 insertions(+), 3 deletions(-)
>> 
>> I like this diffstat a lot more, thanks :)
>> 
>> The script doesn't handle ^C well now (at least), which can be worked
>> around with
>> 
>>   trap exit SIGINT
>> 
>> but it would be nice to know if receiving signals in `wait` can't be
>> fixed.
> 
> When I send SIGINT to "run_tests.sh -j8", I see the process hang. Not
> sure whether you see the same thing:
> 
> #0  0x00007f7af2e1559a in waitpid () from /lib64/libc.so.6
> #1  0x00005613edf8953e in waitchld.isra ()
> #2  0x00005613edf8aae5 in wait_for ()
> #3  0x00005613edf8b682 in wait_for_any_job ()
> #4  0x00005613edfc7e64 in wait_builtin ()
> #5  0x00005613edf616ea in execute_builtin.isra ()
> #6  0x00005613edf623ee in execute_simple_command ()
> #7  0x00005613edf79e77 in execute_command_internal ()
> #8  0x00005613edf7b972 in execute_command ()
> #9  0x00005613edf62aca in execute_while_or_until ()
> #10 0x00005613edf7a156 in execute_command_internal ()
> #11 0x00005613edf79d88 in execute_command_internal ()
> ...

I do.  And sometimes, I also caught it in a signal handler:

  #0  0x00007f7461bcd637 in kill () at ../sysdeps/unix/syscall-template.S:84
  #1  0x00000000004476b9 in wait_sigint_handler (sig=<optimized out>) at 
jobs.c:2504
  #2  <signal handler called>
  #3  0x00007f7461c674ca in __GI___waitpid (address@hidden, address@hidden, 
      address@hidden) at ../sysdeps/unix/sysv/linux/waitpid.c:29
  #4  0x0000000000449cfb in waitchld (address@hidden, wpid=-1) at jobs.c:3474
  #5  0x000000000044b2eb in wait_for (address@hidden) at jobs.c:2718
  #6  0x000000000044becd in wait_for_any_job () at jobs.c:3015
  #7  0x000000000048c38b in wait_builtin (list=0x0) at ./wait.def:154
  ...

> 
> If I change the "wait -n" into "wait" (this will work, but is of course
> slower, since we'll wait for all subprocesses to end before starting
> another one), the problem disappears.

Yeah, and just `wait` doesn't work with -j1 ... using `jobs -r` gets rid
of it.
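The `jobs -r` approach can be sketched as a throttle on the count of running jobs, which sidesteps `wait -n` (and its SIGINT problem) entirely; `NJOBS`, the sleep times and the task body here are illustrative rather than taken from the patch.

```shell
#!/usr/bin/env bash
# Throttle on `jobs -r` (running jobs only) instead of `wait -n`;
# a polling sketch.

NJOBS=2
launched=0

for task in 1 2 3 4; do
    # wait while the queue is full; `jobs -r` lists only the jobs that
    # are still running, so each finished one frees a slot
    while [ "$(jobs -r | wc -l)" -ge "$NJOBS" ]; do
        sleep 0.05
    done
    sleep 0.1 &             # stand-in for one unit test
    launched=$((launched + 1))
done
wait                        # drain the remaining jobs
```

The cost is polling latency between jobs, but plain `wait` at the end behaves sanely under signals, and the loop works for any queue length including -j1.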

>                                   Not sure whether that means a "wait
> -n" bug.

Most likely a bash bug:
 - doesn't happen if you SIGINT before the first job has finished
 - doesn't happen with normal wait

> Anyway, IMHO squashing you suggestion of "trap exit SIGINT" at the
> entry of for_each_unittest() is an acceptable solution - it works in
> all cases.

Seems like the best solution at this time ...
We actually want to "exit 130", as would happen when using a normal wait,
and maybe it could be improved a bit by adding a wait:

  trap 'wait; exit 130' SIGINT
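For reference, the 130 comes from the shell convention of reporting death by signal as 128 plus the signal number, and SIGINT is signal 2:

```shell
#!/usr/bin/env bash
# Where 130 comes from: shells report a child killed by a signal as
# 128 + signal number, and SIGINT is signal 2, so an interrupted run
# conventionally exits with status 130.  `exit 130` in the trap mimics that.

signum=$(kill -l INT)       # signal number of SIGINT, i.e. 2
status=$((128 + signum))
echo "$status"              # prints 130
```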


