qemu-devel

Re: [Qemu-devel] qcow2x


From: Frediano Ziglio
Subject: Re: [Qemu-devel] qcow2x
Date: Tue, 2 Aug 2011 17:29:31 +0200

2011/8/2 Kevin Wolf <address@hidden>:
> Am 02.08.2011 16:30, schrieb Frediano Ziglio:
>> Hi,
>>   I spent some time trying to find a way to speed up qcow2 performance
>> on allocation and snapshots, so I branched a qcow2x branch from the
>> kevin/coroutine-block branch. Currently it just writes using a
>> different algorithm (it is fully compatible with qcow2, there is no
>> ICountAsZero(TM) method :) ). Mainly it is a complete rewrite of
>> qcow2_co_writev. Some problems I encountered in the way the current
>> qcow2 code works:
>
> Do you have a public git repo of your code?
>
>> - reference decrements are not optimized (well, this is easy to fix in
>> the current code)
>> - any L2 update blocks all other L2 updates; this is a problem if the
>> guest is writing a large file sequentially, because most writes need
>> to be serialized
>
> Yes, I am well aware of that. In my old coroutine-devel branch (it's
> still online) I started pushing the locks down and parallelising things
> this way. The code that is there is broken because the locking isn't
> completely right (L2 allocation vs. cache users of that L2, and something
> with refcounts). The changes to the cache should be about right, though.
>
>> - L2 allocation can be done with relative data (this is not easy to do
>> with current code)
>
> What do you mean by that?
>

Let's take an example. By allocation I mean assigning a position in the
file to data, an L2 table or a refcount table. Usually you cannot
update/write L2 before the data is allocated and written, otherwise a
failure could leave an L2 entry pointing to garbage or unwritten data
(I assume that a physical write to a sector leaves either the new data
or, on failure, the old data, never anything else).
The exception to this is when the L2 table itself is not yet allocated.
In this case you can write the L2 table together with the data, because
in case of failure the new L2 table is simply not attached to anything
(L1 still points to the old L2 table or is not allocated). My patch can
collapse these two cluster writes into a single one. The key point of
the patch is mainly to collapse as many writes as possible while not
blocking other writes unless needed.
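
As a rough illustration of the ordering described above, here is a toy
model in Python. It is not the qcow2x code: the function names and the
step labels ("data", "l2_entry", "l2_table", "l1_entry") are made up for
this sketch, refcount updates are ignored, and writes inside one group
are assumed to complete in any order.

```python
from itertools import combinations
from typing import List, Set


def plan_cluster_write(l2_table_exists: bool) -> List[List[str]]:
    """Plan the writes for one newly allocated data cluster.

    Each inner list is a group of writes that may be issued together;
    a group must complete before the next group is started.
    """
    if l2_table_exists:
        # The L2 table is already reachable from L1, so the entry may
        # only be written once the data is on disk.
        return [["data"], ["l2_entry"]]
    # A new L2 table is unreachable until L1 points to it, so the table
    # (already containing the entry) and the data can go out together.
    return [["l2_table", "data"], ["l1_entry"]]


def reachable_garbage(written: Set[str], l2_table_exists: bool) -> bool:
    """True if metadata reachable from L1 points at unwritten clusters."""
    if l2_table_exists:
        return "l2_entry" in written and "data" not in written
    return "l1_entry" in written and not {"l2_table", "data"} <= written


def crash_safe(steps: List[List[str]], l2_table_exists: bool) -> bool:
    """Check every crash point, including partially completed groups."""
    done: Set[str] = set()
    for group in steps:
        for k in range(len(group) + 1):
            for partial in combinations(group, k):
                if reachable_garbage(done | set(partial), l2_table_exists):
                    return False
        done |= set(group)
    return True
```

In this model both plans returned by plan_cluster_write() pass
crash_safe(), while merging "data" and "l2_entry" into one group for an
already attached L2 table does not, which is the case where the two
writes have to stay ordered.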

>> - data/l2 are allocated sequentially (if there are no freed holes) but
>> written in another order. This causes excessive file fragmentation with
>> the default cache mode; for instance, on xfs the file is allocated quite
>> sequentially on every write, so any non-sequential write creates a
>> different fragment.
>>
>> Currently I'm getting these times with iotests (my_cleanup is another,
>> more conservative branch with a patch to collapse reference
>> decrements; note that 011 and 026 are missing, still not working)
>
> Note that qemu-iotests is often a good indicator, but the tools often
> show different behaviour from real guests, so you should also run
> benchmarks in a VM.
>

I know. One reason is that guests usually issue a lot of small
writes/reads (probably this is how the hardware works, but I don't know
that side very well; I usually didn't see requests longer than 128
sectors).

>>     X    C    B
>> 001 6    3    7
>> 002 3    3    4
>> 003 3    3    3
>> 004 0    1    0
>> 005 0    0    0
>> 007 35   32   36
>> 008 3    4    3
>> 009 1    0    0
>> 010 0    0    0
>> 012 0    0    2
>> 013 125  err  158
>> 014 189  err  203
>> 015 48   70   610
>> 017 4    4    4
>> 018 5    5    5
>> 019 4    4    4
>> 020 4    4    4
>> 021 0    0    0
>> 022 74   103  103
>> 023 75   err  95
>> 024 3    3    3
>> 025 3    3    6
>> 027 1    1    0
>> 028 1    1    1
>>
>> X qcow2x
>> C my_cleanup
>> B kevin/coroutine-block
>>
>> Currently the code is quite "spaghetti" code (it needs a lot of cleanup,
>> checks, better error handling and so on). Taking into account that the
>> code requires additional optimizations and is full of internal
>> debugging, the times are quite good.
>>
>> Main questions are:
>> - is anybody interested?
>> - how can I send such a patch for review, considering that it is quite
>> big (I know, I have to clean it up a bit too)?
>
> You'll need to split it up into reviewable pieces. But let me have a
> look at your git tree first.
>

I have an account on gitorious but my repo is still only local. Do you
suggest a different provider, or is gitorious OK for you?

> Are you in the #qemu IRC channel? I think we should coordinate our qcow2
> work a bit in order to avoid conflicting or duplicate work.
>
> Kevin
>

No, I don't use IRC that much (time difference problems and connection
problems too). When are you online?

Frediano


