I think the key points in Avi's message are these:
Avi Kivity writes:
> You don't know afterwards either. Maybe read() is specced as you
> say, but practical implementations will return the minimum bytes
> read, not exact.
And this:
> I really doubt that any guest will be affected by this. It's a tradeoff
> between decent performance and needlessly accurate emulation. I don't
> see how we can choose the latter.
I don't think this is the right way to analyse this situation. We are
trying to define a general-purpose DMA API for _all_ emulated devices,
not just the IDE emulation and block devices that you seem to be
considering.
If there is ever any hardware which behaves `properly' with partial
DMA, and any host kernel device which can tell us what succeeded and
what failed, then it is necessary for the DMA API we are now inventing
to allow that device to be properly emulated.
Even if we can't come up with an example right now of such a device
then I would suggest that it's very likely that we will encounter one
eventually. But actually I can think of one straight away: a SCSI
tape streamer. Tape streamers often give partial transfers at the end
of tape files; hosts (i.e., qemu guests) talking to the SCSI controller
do not expect the controller to DMA beyond the successful SCSI
transfer length; and the (qemu host's) kernel's read() call will not
overwrite beyond the successful transfer length either.
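To make the tape example concrete, here is a minimal sketch in C of a
DMA completion that reports how much was actually transferred. The
names dma_result and dma_write_to_guest are hypothetical, invented for
this illustration; they are not part of qemu or of any proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of one DMA request; not an existing qemu type. */
typedef struct {
    size_t transferred;  /* bytes actually placed in guest memory */
    int short_transfer;  /* nonzero if fewer bytes than requested arrived */
} dma_result;

/*
 * Hypothetical helper: copy the `got` bytes the backend delivered into
 * the guest buffer and leave everything past `got` untouched, as a
 * guest talking to a real SCSI controller would expect.
 */
static dma_result dma_write_to_guest(unsigned char *guest, size_t requested,
                                     const unsigned char *data, size_t got)
{
    dma_result r;

    assert(got <= requested);
    memcpy(guest, data, got);
    r.transferred = got;
    r.short_transfer = (got < requested);
    return r;
}
```

A device model that cares about partial transfers (the tape streamer)
could pass r.transferred on to the guest as the residual count; one
that does not care could simply ignore it.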
If it is difficult for a block device to provide the faithful
behaviour then it might be acceptable for the block device to always
indicate to the DMA API that the entire transfer had taken place, even
though actually some of it had failed.
But personally I think you're mistaken about the behaviour of the
(qemu host's) kernel's {aio_,p,}read(2).
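The behaviour I am describing is easy to check from userspace. The
following is only a sketch, using a pipe rather than a real tape or
block device, and short_read_demo is a name made up for this mail. It
fills a buffer with a sentinel, asks read(2) for more data than is
available, and checks whether the bytes beyond the returned count were
left alone. What one host kernel does with a pipe is of course not
proof of what every driver does.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns read()'s result, and sets *untouched to 1
 * if every buffer byte past the returned count still holds the sentinel.
 */
static ssize_t short_read_demo(int *untouched)
{
    int fds[2];
    unsigned char buf[16];
    ssize_t n;
    size_t i;

    if (pipe(fds) != 0)
        return -1;
    memset(buf, 0xAA, sizeof buf);          /* sentinel pattern */
    if (write(fds[1], "hello", 5) != 5) {   /* only 5 bytes available */
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    n = read(fds[0], buf, sizeof buf);      /* ask for 16 bytes */
    close(fds[0]);

    *untouched = 1;
    for (i = (n > 0) ? (size_t)n : 0; i < sizeof buf; i++) {
        if (buf[i] != 0xAA)
            *untouched = 0;
    }
    return n;
}
```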