qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 3/5] dma-helpers: rewrite completion/cancellatio


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 3/5] dma-helpers: rewrite completion/cancellation
Date: Fri, 09 Sep 2011 14:59:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

Am 09.09.2011 14:43, schrieb Paolo Bonzini:
>>> @@ -120,9 +132,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
>>>      dbs->acb = dbs->io_func(dbs->bs, dbs->sector_num, &dbs->iov,
>>>                              dbs->iov.size / 512, dma_bdrv_cb, dbs);
>>>      if (!dbs->acb) {
>>> -        dma_bdrv_unmap(dbs);
>>> -        qemu_iovec_destroy(&dbs->iov);
>>> -        return;
>>> +        dbs->common.cb = NULL;
>>> +        dma_complete(dbs, -ENOMEM);
>>
>> Why don't we call the callback here? I know that it already was this
>> way before your patch, but isn't that a bug?
>>
>> Also, I think it should be -EIO instead of -ENOMEM (even though it
>> doesn't make any difference if we don't call the callback)
> 
> If I understood the code correctly, dbs->io_func can only fail if it
> fails to get an AIOCB, which is basically out-of-memory.  

Yeah, maybe you're right with the error code. Anyway, should we call the
callback?

> By the way, I
> remember reading (from you?) that this is bogus and it would be cleaner
> if callers of functions returning an AIOCB just assumed the return value
> to be non-NULL.
> 
> I checked now, and the following block drivers can return a NULL AIOCB
> even if qemu_aio_get succeeds:
> 
> * blkdebug (easily fixed ;))
> 
> * curl (seems like it also boils down to out-of-memory)
> 
> * rbd (can fail in librbd; might defer completion with an error to a
> bottom half instead)
> 
> * linux-aio (when io_submit fails; might fall back to posix-aio-compat
> instead).

I think it would make sense to require block drivers to return a valid
ACB (qemu_aio_get never returns NULL). If they have an error to report
they should schedule a BH that calls the callback.

>>> @@ -131,8 +142,12 @@ static void dma_aio_cancel(BlockDriverAIOCB
>>> *acb)
>>>      DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
>>>
>>>      if (dbs->acb) {
>>> -        bdrv_aio_cancel(dbs->acb);
>>> +        BlockDriverAIOCB *acb = dbs->acb;
>>> +        dbs->acb = NULL;
>>> +        bdrv_aio_cancel(acb);
>>>      }
>>> +    dbs->common.cb = NULL;
>>> +    dma_complete(dbs, 0);
>>
>> Did you consider that there are block drivers that implement
>> bdrv_aio_cancel() as waiting for completion of outstanding requests? I
>> think in that case dma_complete() may be called twice. For most of it,
>> this shouldn't be a problem, but I think it doesn't work with the
>> qemu_aio_release(dbs).
> 
> Right.  But then what to do (short of inventing reference counting
> of some sort for AIOCBs) with those that don't?  Leaking should not
> be acceptable, should it?

Hm, not sure. This whole cancellation stuff is so broken...

Maybe we should really refcount dbs (actually it would be more like a
bool in_cancel that means that dma_complete doesn't release the AIOCB)

> In short, this patch can be dropped, but it shows more problems. :)

I'd rather have it fixed than dropped :-)

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]