Re: [Qemu-devel] [Qemu-block] [PATCH RFC for-3.0-rc3 0/3] qemu-img: Disable copy offloading by default
Fri, 27 Jul 2018 16:40:05 +0300
On Fri, Jul 27, 2018 at 3:15 PM Fam Zheng <address@hidden> wrote:
> On Fri, Jul 27, 2018 at 6:29 PM Kevin Wolf <address@hidden> wrote:
> > Am 27.07.2018 um 05:33 hat Fam Zheng geschrieben:
> > > Kevin pointed out that both glibc and kernel provide a slow fallback
> > > copy_file_range which hurts thin provisioning. This is particularly true for
> > > thin LVs, because host_device driver cannot get allocation info from
> > > volume, and copy_file_range is called on every sector, making the dst
> > > allocated.
> > >
> > > NFS mount points also don't support SEEK_DATA well, so the allocation
> > > information is unknown to QEMU.
NFS >= 4.2 supports SEEK_DATA/HOLE.
> > >
> > > That leaves only iscsi:// which seems to do what we want so far, but it is a
> > > smaller use case.
> > >
> > > Add an option to qemu-img convert, "-C", to enable (attempting) copy
> > > explicitly. And mark it incompatible with "-S" and "-c".
> > Reviewed-by: Kevin Wolf <address@hidden>
> > Not sure why you made this an RFC only, but I think we absolutely need
> > this. People are used to using 'qemu-img convert' to compact images and
> > this would regress with automatic copy offloading.
> > Do you think we need more discussion?
> I think merging this for 3.0 is the right thing to do.
> What worries me is the general usability of the feature. We could
> probably explore ideas about how we can better take advantage of copy
> offloading. I don't think counting on the user to make the right
> decision between disk efficiency (thin provisioning) and BW efficiency
> (copy offloading) will ever work. Even if we don't care about breaking
> the default '-S 4k' behavior, the lack of SEEK_DATA/SEEK_HOLE support
> on host NFS and block devices will make it very hard to use. Making it
> worse, if the network to NFS server is good enough, convert with
> pread64/pwrite64 with host page cache is also more efficient than
> copy_file_range, so we'll be slower by trying to play clever. :(
In oVirt we always disable caching (-t none -T none), so using
host page cache is not an option.
I think adding an option for copy offloading is the right thing. This way
we can introduce the feature early without breaking the rest of the stack.
In the long term it would be nicer if qemu could select the best way to do
the copy automatically, keeping the allocation policy specified by the user.
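To make the idea concrete, here is a rough sketch (not taken from QEMU, and
not how qemu-img is implemented) of a copy that keeps the source's sparseness
while still attempting offloading. It assumes Linux, Python 3.8+ and regular
files; the names data_extents, copy_range and sparse_copy are hypothetical.
It walks the data extents reported by SEEK_DATA/SEEK_HOLE, tries
copy_file_range on each extent, and falls back to pread/pwrite if the call
fails.

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) byte ranges that the filesystem reports as data."""
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data at or after offset
                return
            raise
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end
        offset = end


def copy_range(src_fd, dst_fd, start, end, chunk=1 << 20):
    """Copy [start, end): try copy_file_range, else fall back to pread/pwrite."""
    offload = hasattr(os, "copy_file_range")
    offset = start
    while offset < end:
        count = min(chunk, end - offset)
        n = 0
        if offload:
            try:
                n = os.copy_file_range(src_fd, dst_fd, count, offset, offset)
            except OSError:
                offload = False  # e.g. EXDEV or ENOSYS: use plain I/O from now on
        if not offload or n == 0:
            buf = os.pread(src_fd, count, offset)
            if not buf:
                break  # unexpected end of file
            n = os.pwrite(dst_fd, buf, offset)
        offset += n


def sparse_copy(src_path, dst_path):
    """Copy a regular file, writing only the ranges reported as data."""
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src_fd).st_size
            os.ftruncate(dst_fd, size)  # holes stay unallocated in the destination
            for start, end in data_extents(src_fd, size):
                copy_range(src_fd, dst_fd, start, end)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

If the source storage cannot report holes, lseek(SEEK_DATA) typically reports
the whole file as one data extent, so the destination ends up fully written.
That is the thin provisioning problem described above. A successful
copy_file_range call also says nothing about whether the copy was actually
offloaded or served by a fallback.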
oVirt still has bugs related to converting preallocated images to sparse
and sparse images to preallocated. Users like to have control over this, so
qemu-img cannot change the policy.
Do we have some documentation on the useful cases for copy
offloading? Did we benchmark this with different kinds of storage?
The interesting use cases for oVirt are:
- block storage: copying a regular LV on shared storage to
  another LV in the same VG, possibly on a different PV.
- file storage: copying files on the same NFS or GlusterFS mount.
- copying between different servers or file types. I guess
  copy_file_range will not help in this case, right?
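
For the file storage cases, a quick way to tell the second case from the
third is to compare the device IDs of the two paths. The sketch below is only
a heuristic, and same_filesystem is a hypothetical helper name. As far as I
know, Linux kernels before 5.3 rejected copy_file_range across filesystems
with EXDEV, so equal st_dev values are a precondition for offloading there,
not proof that the server offloads anything. It is not meaningful for block
device nodes such as LVs, where stat() describes the device node rather than
the storage behind it.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same st_dev (same mounted filesystem)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Two images on the same NFS or GlusterFS mount would return True here; images
on mounts from different servers would return False.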