Re: [Qemu-block] Combining synchronous and asynchronous IO
From: Kevin Wolf
Subject: Re: [Qemu-block] Combining synchronous and asynchronous IO
Date: Fri, 15 Mar 2019 16:50:10 +0100
User-agent: Mutt/1.11.3 (2019-02-01)

On 15.03.2019 at 16:33, Sergio Lopez wrote:
>
> Stefan Hajnoczi writes:
>
> > On Thu, Mar 14, 2019 at 06:31:34PM +0100, Sergio Lopez wrote:
> >> Our current AIO path does a great job at unloading the work from the VM,
> >> and combined with IOThreads provides a good performance in most
> >> scenarios. But it also comes with its costs, in both a longer execution
> >> path and the need of the intervention of the scheduler at various
> >> points.
> >>
> >> There's one particular workload that suffers from this cost, and that's
> >> when you have just 1 or 2 cores on the Guest issuing synchronous
> >> requests. This happens to be a pretty common workload for some DBs and,
> >> in a general sense, on small VMs.
> >>
> >> I did a quick'n'dirty implementation on top of virtio-blk to get some
> >> numbers. This comes from a VM with 4 CPUs running on an idle server,
> >> with a secondary virtio-blk disk backed by a null_blk device with a
> >> simulated latency of 30us.
> >
> > Can you describe the implementation in more detail? Does "synchronous"
> > mean that hw/block/virtio_blk.c makes a blocking preadv()/pwritev() call
> > instead of calling blk_aio_preadv/pwritev()? If so, then you are also
> > bypassing the QEMU block layer (coroutines, request tracking, etc) and
> > that might explain some of the latency.
>
> The first implementation, the one I've used for getting these numbers,
> is just preadv/pwrite from virtio_blk.c, as you correctly guessed. I
> know it's unfair, but I wanted to take a look at the best possible
> scenario, and then measure the cost of the other layers.
>
> I'm working now on writing non-coroutine counterparts for
> blk_co_[preadv|pwrite], so we have SIO without bypassing the block layer.

Maybe try to keep the change local to file-posix.c? I think you would
only have to modify raw_thread_pool_submit() so that it doesn't go
through the thread pool, but just calls func directly.
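
To illustrate the difference between the two submission paths, here is a
rough standalone sketch. It is not QEMU code: work_fn, read_req, do_pread(),
submit_via_thread(), submit_direct() and demo_read() are all hypothetical
names made up for this example, and a single pthread stands in for the thread
pool. The only point is that the "direct" variant runs func in the caller
instead of handing it to a worker.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the work function type passed to the pool. */
typedef int (*work_fn)(void *arg);

/* Hypothetical request descriptor for a single positional read. */
struct read_req {
    int fd;
    void *buf;
    size_t len;
    off_t offset;
};

/* The blocking work item: one pread(), returns bytes read or -1. */
static int do_pread(void *arg)
{
    struct read_req *r = arg;
    ssize_t n = pread(r->fd, r->buf, r->len, r->offset);
    return n < 0 ? -1 : (int)n;
}

struct job {
    work_fn func;
    void *arg;
    int ret;
};

static void *worker(void *opaque)
{
    struct job *j = opaque;
    j->ret = j->func(j->arg);
    return NULL;
}

/* Model of the pooled path: hand func to another thread, wait for it. */
static int submit_via_thread(work_fn func, void *arg)
{
    struct job j = { func, arg, -1 };
    pthread_t t;

    assert(func != NULL);
    if (pthread_create(&t, NULL, worker, &j) != 0) {
        return -1;
    }
    pthread_join(t, NULL);
    return j.ret;
}

/* Model of the suggested change: skip the worker, call func in place. */
static int submit_direct(work_fn func, void *arg)
{
    assert(func != NULL);
    return func(arg);
}

/* Demo helper: write "hello" to a temp file, read it back via either path. */
static int demo_read(int direct, char *out, size_t len)
{
    char path[] = "/tmp/sio_demo_XXXXXX";
    int fd = mkstemp(path);
    int ret;

    if (fd < 0) {
        return -1;
    }
    unlink(path); /* file disappears once fd is closed */
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    struct read_req r = { fd, out, len, 0 };
    ret = direct ? submit_direct(do_pread, &r)
                 : submit_via_thread(do_pread, &r);
    close(fd);
    return ret;
}
```

Calling demo_read(1, buf, 5) and demo_read(0, buf, 5) should return the same
data either way. The trade-off is that the direct variant blocks the calling
thread for the whole duration of the I/O, which is exactly what makes it
"synchronous".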

I don't think avoiding coroutines is possible without bypassing the block
layer altogether because everything is really expecting to be run in
coroutine context.

Kevin