Re: [Qemu-devel] [PATCH] honor IDE_DMA_BUF_SECTORS


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH] honor IDE_DMA_BUF_SECTORS
Date: Fri, 27 Mar 2009 12:52:15 +0300
User-agent: Thunderbird 2.0.0.21 (X11/20090320)

Samuel Thibault wrote:
> Ah, I thought you understood that the posix driver has the same kind of
> limitation

It's not the same limitation. The posix driver has no limits on DMA size; it will happily transfer a gigabyte of data if you ask it to.


> (and qemu is actually _bugged_ in that regard).

It has a bug in that it does not correctly interpret the return value of pread()/pwrite(). It's a minor bug, since no system supported by qemu will actually return a short read or write (I think), and since we hope disk errors are rare. Nevertheless it should be fixed (it's an easy fix, too). However, implementing DMA limits as you propose (IDE_DMA_BUF_SECTORS) will not fix the bug, only reduce performance.
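
A minimal sketch of what handling the return value correctly looks like; full_pread() is an invented name and this is not qemu's actual code:

#include <errno.h>
#include <unistd.h>

/* Loop until the whole request is satisfied, retrying on EINTR and
 * continuing after short reads; a return of 0 means end-of-file. */
static ssize_t full_pread(int fd, void *buf, size_t count, off_t offset)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, (char *)buf + done, count - done, offset + done);
        if (n < 0) {
            if (errno == EINTR) {
                continue;               /* interrupted, just retry */
            }
            return -1;                  /* real I/O error */
        }
        if (n == 0) {
            break;                      /* end of file: return what we got */
        }
        done += n;
    }
    return done;
}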

> I'm here just pointing out that the problem is not
> _only_ in the xen-specific driver, but also in the posix driver, on any
> OS that doesn't necessarily do all the work the caller asked for (which
> is _allowed_ by POSIX).
But that's not limited DMA (or at least, not limited up-front). And it's easily corrected: place a while loop around preadv()/pwritev(); there is no need to split a request a priori somewhere up the stack.
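
The preadv()/pwritev() variant of that loop only has to walk the iovec forward after a short transfer. A rough sketch, assuming the Linux preadv() call and an invented full_preadv() name (the iovec is modified in place, so a caller would pass a scratch copy):

#define _GNU_SOURCE                     /* for preadv() on glibc */
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

static ssize_t full_preadv(int fd, struct iovec *iov, int iovcnt, off_t offset)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        ssize_t n = preadv(fd, iov, iovcnt, offset);
        if (n < 0) {
            if (errno == EINTR) {
                continue;               /* interrupted, just retry */
            }
            return total ? total : -1;
        }
        if (n == 0) {
            break;                      /* end of file */
        }
        total += n;
        offset += n;
        /* consume the bytes just transferred from the front of the iovec */
        while (iovcnt > 0 && (size_t)n >= iov->iov_len) {
            n -= iov->iov_len;
            iov++;
            iovcnt--;
        }
        if (iovcnt > 0 && n > 0) {
            iov->iov_base = (char *)iov->iov_base + n;
            iov->iov_len -= n;
        }
    }
    return total;
}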

> Sure, and I could do the same in the block-vbd driver, hence my
> original remark "it should be centralized in the block layer instead of
> placing the burden on all block format drivers".  Just to make sure: I'm
> _not_ saying that should be done in the DMA code.  I said it should be
> done in the block layer, shared by all block drivers.

A generic fix will have to issue a new aio request; block-raw-posix need not do that, a while loop is enough.
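
A rough illustration of the difference, not qemu code: the generic path cannot loop synchronously, so on a short completion its callback resubmits the remainder as a fresh request. aio_pread() and the surrounding names are hypothetical stand-ins, stubbed synchronously here just so the fragment is self-contained:

#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

typedef void aio_cb(void *opaque, ssize_t ret);

/* stand-in for an asynchronous read primitive; a real one would
 * complete later from an event loop rather than immediately */
static void aio_pread(int fd, off_t offset, uint8_t *buf, size_t len,
                      aio_cb *cb, void *opaque)
{
    cb(opaque, pread(fd, buf, len, offset));
}

struct retry {
    int fd;
    uint8_t *buf;
    off_t offset;
    size_t remaining;
    aio_cb *user_cb;
    void *user_opaque;
};

static void retry_cb(void *opaque, ssize_t ret)
{
    struct retry *r = opaque;

    if (ret <= 0 || (size_t)ret >= r->remaining) {
        r->user_cb(r->user_opaque, ret < 0 ? ret : 0);  /* error, eof, or fully done */
        free(r);
        return;
    }
    /* short completion: issue a new request for what is left */
    r->buf       += ret;
    r->offset    += ret;
    r->remaining -= ret;
    aio_pread(r->fd, r->offset, r->buf, r->remaining, retry_cb, r);
}

/* entry point: allocate the retry state and submit the first request */
static void aio_pread_full(int fd, off_t offset, uint8_t *buf, size_t len,
                           aio_cb *cb, void *opaque)
{
    struct retry *r = malloc(sizeof(*r));

    r->fd = fd;
    r->buf = buf;
    r->offset = offset;
    r->remaining = len;
    r->user_cb = cb;
    r->user_opaque = opaque;
    aio_pread(fd, offset, buf, len, retry_cb, r);
}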


>> And it wouldn't be right for block-vbd - you should split your requests as late as possible, IMO.

> Why make it "late"?  Exposing the lower limits to let upper layers
> decide how they should manage fragmentation usually gets better
> performance.  (Note that in my case there is no system involved, so it's
> really _not_ costly to do the fragmentation on the qemu side).

If ring entries can be more than a page (if the request is contiguous), then the limit can be expanded. In other words, it's a worst-case limit, not a hard limit. Exposing the worst-case limit will lead to pessimistic choices.

That's how virtio-blk works; I don't know about xen vbd (it might not work due to the need to transfer grants?).
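
To make the worst-case point concrete: a toy calculation (invented page-frame numbers, neither Xen nor virtio code) of how many ring entries a request needs if an entry may cover a run of physically contiguous pages:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* count ring entries needed when contiguous pages can share an entry */
static size_t segments_needed(const uint64_t *pfns, size_t npages)
{
    size_t segs = 0;

    for (size_t i = 0; i < npages; i++) {
        if (i == 0 || pfns[i] != pfns[i - 1] + 1) {
            segs++;             /* start of a new, non-contiguous run */
        }
    }
    return segs;
}

int main(void)
{
    /* eight pages: six contiguous, a gap, then two more contiguous */
    uint64_t pfns[] = { 100, 101, 102, 103, 104, 105, 300, 301 };
    size_t npages = sizeof(pfns) / sizeof(pfns[0]);

    printf("worst case: %zu entries, actual: %zu entries\n",
           npages, segments_needed(pfns, npages));      /* 8 vs. 2 */
    return 0;
}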

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.