[Qemu-devel] [PATCH v3 00/29] block: Support for 512b-on-4k emulation

From: Kevin Wolf
Subject: [Qemu-devel] [PATCH v3 00/29] block: Support for 512b-on-4k emulation
Date: Fri, 17 Jan 2014 15:14:50 +0100

This patch series adds code to the block layer that allows performing
I/O requests in smaller granularities than required by the host backend
(most importantly, O_DIRECT restrictions). It achieves this for reads
by rounding the request to host-side block boundaries, and for writes
by performing a read-modify-write cycle (and serialising requests
touching the same block so that the RMW doesn't write back stale data).
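
As a rough illustration of the rounding and the overlap check described
above (a minimal sketch, not the code from this series; AlignedRange,
align_request(), write_needs_rmw() and ranges_overlap() are hypothetical
names, and the alignment is assumed to be a power of two):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the aligned range covering a guest request. */
typedef struct AlignedRange {
    uint64_t offset;  /* start, rounded down to the alignment */
    uint64_t bytes;   /* length of the aligned range */
    uint64_t head;    /* padding bytes before the guest data */
    uint64_t tail;    /* padding bytes after the guest data */
} AlignedRange;

/* Widen [offset, offset + bytes) to multiples of align (a power of two). */
static AlignedRange align_request(uint64_t offset, uint64_t bytes,
                                  uint64_t align)
{
    AlignedRange r;
    uint64_t end = offset + bytes;
    uint64_t aligned_end = (end + align - 1) & ~(align - 1);

    r.offset = offset & ~(align - 1);
    r.bytes = aligned_end - r.offset;
    r.head = offset - r.offset;
    r.tail = aligned_end - end;
    return r;
}

/* A write needs a read-modify-write cycle if it has head or tail padding. */
static bool write_needs_rmw(const AlignedRange *r)
{
    return r->head != 0 || r->tail != 0;
}

/* Two byte ranges overlap if each one starts before the other one ends. */
static bool ranges_overlap(uint64_t off_a, uint64_t bytes_a,
                           uint64_t off_b, uint64_t bytes_b)
{
    return off_a < off_b + bytes_b && off_b < off_a + bytes_a;
}
```

With a 4k alignment, a 512 byte write at offset 512 is widened to the
range [0, 4096) and needs an RMW cycle. A second 512 byte write at offset
2048 is widened to the same block, so the two requests overlap after
alignment and would have to be serialised, even though the guest-visible
ranges are disjoint.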

Originally I intended to reuse a lot of code from Paolo's previous
patch series. However, because I tried to integrate pread/pwrite, which
already do a very similar thing (except that they don't consider
concurrency), and because I wanted to implement zero-copy, most of this
series ended up being new code.

Zero-copy is possible in a common case because while XFS defaults to a
4k sector size and therefore 4k on-disk O_DIRECT alignment for 512e
disks, it still only has a 512 byte memory alignment requirement.
(Unfortunately the XFS_IOC_DIOINFO ioctl claims 4k even for memory, but
we know that the value is wrong and can probe it.)
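
One way such a probe could look (a sketch under assumptions, not
necessarily what the raw driver patch in this series does;
probe_mem_align() and MAX_PROBE_ALIGN are hypothetical names): try reads
into buffers of increasing alignment and take the first alignment that
the kernel accepts. The file descriptor is assumed to have been opened
with O_DIRECT; without it, any alignment succeeds.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_PROBE_ALIGN 4096  /* assumed upper bound for this sketch */

/*
 * Return the smallest power-of-two memory alignment (>= 512) at which a
 * read into the buffer succeeds, or 0 if nothing up to MAX_PROBE_ALIGN
 * works (or the buffer cannot be allocated).
 */
static size_t probe_mem_align(int fd)
{
    void *mem;
    char *buf;
    size_t align, result = 0;

    /* Over-allocate so that buf + align plus the read fits the buffer. */
    if (posix_memalign(&mem, MAX_PROBE_ALIGN, 2 * MAX_PROBE_ALIGN) != 0) {
        return 0;
    }
    buf = mem;

    for (align = 512; align <= MAX_PROBE_ALIGN; align <<= 1) {
        /*
         * buf is aligned to MAX_PROBE_ALIGN, so for align below that
         * bound buf + align is aligned to 'align' but not to 2 * align.
         * Offset and length stay aligned to MAX_PROBE_ALIGN so that only
         * the memory alignment is being tested.
         */
        if (pread(fd, buf + align, MAX_PROBE_ALIGN, 0) >= 0) {
            result = align;
            break;
        }
    }

    free(mem);
    return result;
}
```

On the XFS setup described above, a probe like this would be expected to
stop at 512 even though XFS_IOC_DIOINFO reports 4k.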

Changes in v2 -> v3:
- Fixed I/O throttling bypass by converting to byte granularity [Wenchao]
- Made 'bytes' argument to tracked_request_overlaps() unsigned [Max]
- Fixed a corruption bug that came from using outdated RMW buffers after
  waiting for another request and added some assertions to check the
  assumptions [Peter]
- Fixed bytes vs. sectors error in zero-after-EOF code of
  bdrv_co_do_preadv [Max]
- Removed orphaned prototype in block.h [Max]
- A qemu-iotests case and some infrastructure to support it

Changes in v1 -> v2:
- Fixed overlap_bytes calculation in mark_request_serialising()
- Fixed wait_serialising_requests() deadlock
- iscsi: Set bs->request_alignment [Peter]
- iscsi: Query block limits only in iscsi_open() when no other requests
  are in flight, and in iscsi_refresh_limits() copy the stored values
  into bs->bl [Peter]

Changes in RFC -> v1:
- Moved opt_mem_alignment into BlockLimits [Paolo]
- Changed BlockLimits in turn to work a bit more like the
  .bdrv_opt_mem_align() callback of the RFC; allows updating the
  BlockLimits later when the chain changes or bdrv_reopen() toggles
- Fixed a typo in a commit message [Eric]

Kevin Wolf (26):
  block: Move initialisation of BlockLimits to bdrv_refresh_limits()
  block: Inherit opt_transfer_length
  block: Update BlockLimits when they might have changed
  qemu_memalign: Allow small alignments
  block: Detect unaligned length in bdrv_qiov_is_aligned()
  block: Don't use guest sector size for qemu_blockalign()
  block: Introduce bdrv_aligned_preadv()
  block: Introduce bdrv_co_do_preadv()
  block: Introduce bdrv_aligned_pwritev()
  block: write: Handle COR dependency after I/O throttling
  block: Introduce bdrv_co_do_pwritev()
  block: Switch BdrvTrackedRequest to byte granularity
  block: Allow waiting for overlapping requests between begin/end
  block: Make zero-after-EOF work with larger alignment
  block: Generalise and optimise COR serialisation
  block: Make overlap range for serialisation dynamic
  block: Allow wait_serialising_requests() at any point
  block: Align requests in bdrv_co_do_pwritev()
  block: Assert serialisation assumptions in pwritev
  block: Change coroutine wrapper to byte granularity
  block: Make bdrv_pread() a bdrv_prwv_co() wrapper
  block: Make bdrv_pwrite() a bdrv_prwv_co() wrapper
  blkdebug: Make required alignment configurable
  qemu-io: New command 'sleep'
  qemu-iotests: Test pwritev RMW logic
  block: Switch bdrv_io_limits_intercept() to byte granularity

Paolo Bonzini (3):
  block: rename buffer_alignment to guest_block_size
  raw: Probe required direct I/O alignment
  iscsi: Set bs->request_alignment

 block.c                    | 644 +++++++++++++++++++++++++++++++--------------
 block/backup.c             |   7 +-
 block/blkdebug.c           |  24 ++
 block/iscsi.c              |  47 ++--
 block/qcow2.c              |  11 +-
 block/qed.c                |  11 +-
 block/raw-posix.c          | 102 +++++--
 block/raw-win32.c          |  41 +++
 block/stream.c             |   2 +
 block/vmdk.c               |  22 +-
 hw/block/virtio-blk.c      |   2 +-
 hw/ide/core.c              |   2 +-
 hw/scsi/scsi-disk.c        |   2 +-
 hw/scsi/scsi-generic.c     |   2 +-
 include/block/block.h      |  15 +-
 include/block/block_int.h  |  27 +-
 qemu-io-cmds.c             |  42 +++
 tests/qemu-iotests/077     | 278 +++++++++++++++++++
 tests/qemu-iotests/077.out | 202 ++++++++++++++
 tests/qemu-iotests/group   |   1 +
 util/oslib-posix.c         |   5 +
 21 files changed, 1234 insertions(+), 255 deletions(-)
 create mode 100755 tests/qemu-iotests/077
 create mode 100644 tests/qemu-iotests/077.out


