
Re: backup_calculate_cluster_size does not consider source

From: Wolfgang Bumiller
Subject: Re: backup_calculate_cluster_size does not consider source
Date: Wed, 6 Nov 2019 11:34:50 +0100
User-agent: NeoMutt/20180716

On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote:
> On 06.11.19 09:32, Stefan Hajnoczi wrote:
> > On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
> >> Example: Backup from ceph disk (rbd_cache=false) to local disk:
> >>
> >> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
> >>
> >> Then the backup job starts to read 64K blocks from ceph.
> >>
> >> But ceph always reads 4M blocks, so this is incredibly slow and produces
> >> way too much network traffic.
> >>
> >> Why does backup_calculate_cluster_size not consider the block size from
> >> the source disk?
> >>
> >> cluster_size = MAX(block_size_source, block_size_target)
> So Ceph always transmits 4 MB over the network, no matter what is
> actually needed?  That sounds, well, interesting.

Or at least it generates that much I/O - in the end, it can slow down
the backup by up to a multi-digit factor...
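
To put a rough number on that, here is a back-of-the-envelope sketch. It
assumes the behaviour Dietmar describes (every request makes the source
fetch a whole 4M object) and that nothing caches the object between
requests. Both helpers are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: bytes the source fetches to serve one request,
 * assuming it always reads whole objects and keeps nothing cached. */
static uint64_t bytes_fetched(uint64_t offset, uint64_t len,
                              uint64_t object_size)
{
    assert(len > 0 && object_size > 0);
    uint64_t first = offset / object_size;             /* first object touched */
    uint64_t last = (offset + len - 1) / object_size;  /* last object touched */
    return (last - first + 1) * object_size;
}

/* How many times one object is fetched while it is copied in
 * chunk-sized requests. */
static uint64_t read_amplification(uint64_t chunk, uint64_t object_size)
{
    assert(chunk > 0 && object_size > 0);
    uint64_t fetched = 0;
    for (uint64_t off = 0; off < object_size; off += chunk) {
        uint64_t rest = object_size - off;
        uint64_t len = chunk < rest ? chunk : rest;
        fetched += bytes_fetched(off, len, object_size);
    }
    return fetched / object_size;
}
```

Under these assumptions, read_amplification(64 * 1024, 4 * 1024 * 1024)
comes out at 64, while 4M requests would fetch each object exactly once.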

> backup_calculate_cluster_size() doesn’t consider the source size because
> to my knowledge there is no other medium that behaves this way.  So I
> suppose the assumption was always that the block size of the source
> doesn’t matter, because a partial read is always possible (without
> having to read everything).

Unless you enable qemu-side caching, this only holds as long as the
block/cluster size of the source does not exceed that of the target.
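
For reference, Dietmar's suggestion boils down to something like the
following standalone sketch. It is not the actual QEMU function: the name,
the 64K fallback (taken from the value reported above for the .raw target)
and the convention of passing 0 for an unknown size are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Assumed fallback, matching the 64K reported in this thread. */
#define SKETCH_DEFAULT_CLUSTER_SIZE ((int64_t)64 * 1024)

/* Hypothetical helper: pick a copy granularity that is at least as large
 * as both the source and the target block size. Pass 0 if a size is
 * unknown. */
static int64_t sketch_cluster_size(int64_t source_block_size,
                                   int64_t target_block_size)
{
    assert(source_block_size >= 0 && target_block_size >= 0);
    int64_t size = SKETCH_MAX(source_block_size, target_block_size);
    return SKETCH_MAX(size, SKETCH_DEFAULT_CLUSTER_SIZE);
}
```

With a 4M source and a 64K target this yields 4M, so each source object
would be read only once.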

> What would make sense to me is to increase the buffer size in general.
> I don’t think we need to copy clusters at a time, and
> 0e2402452f1f2042923a5 has indeed increased the copy size to 1 MB for
> backup writes that are triggered by guest writes.  We haven’t yet
> increased the copy size for background writes, though.  We can do that,
> of course.  (And probably should.)
> The thing is, it just seems unnecessary to me to take the source cluster
> size into account in general.  It seems weird that a medium only allows
> 4 MB reads, because, well, guests aren’t going to take that into account.

But guests usually have a page cache, which is why in many setups qemu
(and thereby the backup process) often doesn't.
