From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 1/2] rbd: use the higher level librbd instead of just librados
Date: Tue, 12 Apr 2011 22:14:31 +0100
On Tue, Apr 12, 2011 at 4:38 PM, Sage Weil <address@hidden> wrote:
> On Tue, 12 Apr 2011, Stefan Hajnoczi wrote:
>> On Tue, Apr 12, 2011 at 1:18 AM, Josh Durgin <address@hidden> wrote:
>> > On 04/08/2011 01:43 AM, Stefan Hajnoczi wrote:
>> >>
>> >> On Mon, Mar 28, 2011 at 04:15:57PM -0700, Josh Durgin wrote:
>> >>>
>> >>> librbd stacks on top of librados to provide access
>> >>> to rbd images.
>> >>>
>> >>> Using librbd simplifies the qemu code, and allows
>> >>> qemu to use new versions of the rbd format
>> >>> with few (if any) changes.
>> >>>
>> >>> Signed-off-by: Josh Durgin<address@hidden>
>> >>> Signed-off-by: Yehuda Sadeh<address@hidden>
>> >>> ---
>> >>> block/rbd.c | 785 +++++++++++++++--------------------------------------
>> >>> block/rbd_types.h | 71 -----
>> >>> configure | 33 +--
>> >>> 3 files changed, 221 insertions(+), 668 deletions(-)
>> >>> delete mode 100644 block/rbd_types.h
>> >>
>> >> Hi Josh,
>> >> I have applied your patches onto qemu.git/master and am running
>> >> ceph.git/master.
>> >>
>> >> Unfortunately qemu-iotests fails for me.
>> >>
>> >>
>> >> Test 016 seems to hang in qemu-io -g -c write -P 66 128M 512
>> >> rbd:rbd/t.raw. I can reproduce this consistently. Here is the
>> >> backtrace of the hung process (not consuming CPU, probably deadlocked):
>> >
>> > This hung because it wasn't checking the return value of rbd_aio_write.
>> > I've fixed this in the for-qemu branch of
>> > http://ceph.newdream.net/git/qemu-kvm.git. Also, the existing rbd
>> > implementation is not 'growable' - writing to a large offset will not
>> > expand the rbd image correctly. Should we implement bdrv_truncate to
>> > support this (librbd has a resize operation)? Is bdrv_truncate useful
>> > outside of qemu-img and qemu-io?
>>
>> If librbd has a resize operation then it would be nice to wire up
>> bdrv_truncate() for completeness. Note that bdrv_truncate() can also
>> be called online using the block_resize monitor command.
>>
>> Since rbd devices are not growable we should fix qemu-iotests to skip
>> 016 for rbd.
>
> There is a resize operation, but it's expected that you'll use it for any
> bdev size change (grow or shrink). Does qemu grow a device by writing to
> the (new) highest offset, or is there another operation that should be
> wired up? We want to avoid a situation where RBD isn't aware of the qemu
> bdev resize and has to grow a bit each time we write to a larger offset,
> as resize is a somewhat expensive operation...
Good, it sounds like RBD and QEMU have similar concepts here. The
bdrv_truncate() operation is a (rare) explicit image resize operation.
It is not the extend-beyond-EOF grow operation, which QEMU simply
performs as a write past bdrv_getlength() bytes.
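
To make the distinction concrete, here is a toy model in C. It is only a
sketch, not QEMU or librbd code: the toy_* names are made up for
illustration, and returning -EIO for a write past EOF on a non-growable
image is an assumption of the model rather than a statement about what
any particular QEMU version does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an image; hypothetical, not a QEMU or librbd type. */
struct toy_image {
    uint64_t size;     /* current image length in bytes */
    bool growable;     /* false models rbd as described in this thread */
    unsigned resizes;  /* number of explicit resize operations so far */
};

/* Explicit resize (grow or shrink): the role bdrv_truncate() plays. */
static int toy_truncate(struct toy_image *img, uint64_t new_size)
{
    assert(img != NULL);
    img->size = new_size;
    img->resizes++;
    return 0;
}

/* Write path: only a growable image is extended by writing past EOF. */
static int toy_write(struct toy_image *img, uint64_t offset, uint64_t len)
{
    assert(img != NULL);
    if (len > UINT64_MAX - offset) {
        return -EINVAL;            /* offset + len would overflow */
    }
    uint64_t end = offset + len;
    if (end > img->size) {
        if (!img->growable) {
            return -EIO;           /* assumed error for this sketch */
        }
        img->size = end;           /* implicit extend, no resize call */
    }
    return 0;
}
```

In this model a non-growable image only changes size through
toy_truncate(), so the expensive resize happens once per explicit request
and never as a side effect of a write to a larger offset.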
Stefan