qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] qcow2 performance plan


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] qcow2 performance plan
Date: Tue, 14 Sep 2010 17:03:43 +0100

On Tue, Sep 14, 2010 at 4:47 PM, Kevin Wolf <address@hidden> wrote:
> Am 14.09.2010 17:20, schrieb Anthony Liguori:
>> On 09/14/2010 10:11 AM, Kevin Wolf wrote:
>>> Am 14.09.2010 15:43, schrieb Anthony Liguori:
>>>
>>>> Hi Avi,
>>>>
>>>> On 09/14/2010 08:07 AM, Avi Kivity wrote:
>>>>
>>>>>   Here's a draft of a plan that should improve qcow2 performance.  It's
>>>>> written in wiki syntax for eventual upload to wiki.qemu.org; lines
>>>>> starting with # are numbered lists, not comments.
>>>>>
>>>> Thanks for putting this together.  I think it's really useful to think
>>>> through the problem before anyone jumps in and starts coding.
>>>>
>>>>
>>>>> = Basics =
>>>>>
>>>>> At the minimum level, no operation should block the main thread.  This
>>>>> could be done in two ways: extending the state machine so that each
>>>>> blocking operation can be performed asynchronously
>>>>> (<code>bdrv_aio_*</code>)
>>>>> or by threading: each new operation is handed off to a worker thread.
>>>>> Since a full state machine is prohibitively complex, this document
>>>>> will discuss threading.
>>>>>
>>>> There's two distinct requirements that must be satisfied by a fast block
>>>> device.  The device must have fast implementations of aio functions and
>>>> it must support concurrent request processing.
>>>>
>>>> If an aio function blocks in the process of submitting the request, it's
>>>> by definition slow.  But even if you may the aio functions fast, you
>>>> still need to be able to support concurrent request processing in order
>>>> to achieve high throughput.
>>>>
>>>> I'm not going to comment in depth on your threading proposal.  When it
>>>> comes to adding concurrency, I think any approach will require a rewrite
>>>> of the qcow2 code and if the author of that rewrite is more comfortable
>>>> implementing concurrency with threads than with a state machine, I'm
>>>> happy with a threaded implementation.
>>>>
>>>> I'd suggest avoiding hyperbole like "a full state machine is
>>>> prohibitively complex".  QED is a full state machine.  qcow2 adds a
>>>> number of additional states because of the additional metadata and sync
>>>> operations but it's not an exponential increase in complexity.
>>>>
>>> It will be quite some additional states that qcow2 brings in, but I
>>> suspect the really hard thing is getting the dependencies between
>>> requests right.
>>>
>>> I just had a look at how QED is doing this, and it seems to take the
>>> easy solution, namely allowing only one allocation at the same time.
>>
>> One L2 allocation, not cluster allocations.  You can allocate multiple
>> clusters concurrently and you can read/write L2s concurrently.
>>
>> Since L2 allocation only happens every 2GB, it's a rare event.
>
> Then your state machine is too complicated for me to understand. :-)
>
> Let me try to chase function pointers for a simple cluster allocation:
>
> bdrv_qed_aio_writev
> qed_aio_setup
> qed_aio_next_io
> qed_find_cluster
> qed_read_l2_table
> ...
> qed_find_cluster_cb
>
> This function contains the code to check if the cluster is already
> allocated, right?
>
>    n = qed_count_contiguous_clusters(s, request->l2_table->table,
>                                      index, n, &offset);
>    ret = offset ? QED_CLUSTER_FOUND : QED_CLUSTER_L2;
>
> The callback called from there is qed_aio_write_data(..., ret =
> QED_CLUSTER_L2, ...) which means
>
>    bool need_alloc = ret != QED_CLUSTER_FOUND;
>    /* Freeze this request if another allocating write is in progress */
>    if (need_alloc) {
>    ...
>
> So where did I start to follow the path of a L2 table allocation instead
> of a simple cluster allocation?

qed_aio_write_main() writes the main body of data into the cluster.
Then it decides whether to update/allocate L2 tables if this is an
allocating write.

qed_aio_write_l2_update() is the function that gets called to touch
the L2 table (it also handles the allocation case).

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]