Re: backup_calculate_cluster_size does not consider source
From: Wolfgang Bumiller
Subject: Re: backup_calculate_cluster_size does not consider source
Date: Wed, 6 Nov 2019 11:34:50 +0100
User-agent: NeoMutt/20180716
On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote:
> On 06.11.19 09:32, Stefan Hajnoczi wrote:
> > On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
> >> Example: Backup from ceph disk (rbd_cache=false) to local disk:
> >>
> >> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
> >>
> >> Then the backup job starts to read 64K blocks from ceph.
> >>
> >> But ceph always reads 4M blocks, so this is incredibly slow and produces
> >> way too much network traffic.
> >>
> >> Why does backup_calculate_cluster_size not consider the block size of
> >> the source disk?
> >>
> >> cluster_size = MAX(block_size_source, block_size_target)
>
> So Ceph always transmits 4 MB over the network, no matter what is
> actually needed? That sounds, well, interesting.
Or at least it generates that much I/O - in the end, it can slow down
the backup by up to a multi-digit factor...
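To put a rough number on that, here is a minimal sketch. It assumes that every aligned read smaller than the backend's object size costs a full object fetch and that nothing is cached on the qemu side. The helper name `read_amplification` and the sizes are illustrative only; they are not taken from qemu or Ceph.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many times more data the backend fetches than
 * the caller asked for, rounded up. Assumes aligned requests, no caching,
 * and that any request smaller than backend_block triggers a read of the
 * whole block.
 */
static uint64_t read_amplification(uint64_t request_size, uint64_t backend_block)
{
    assert(request_size > 0);

    /* Requests at least as large as one block are not amplified. */
    if (request_size >= backend_block) {
        return 1;
    }

    /* Ceiling division: full block fetched per request_size bytes used. */
    return (backend_block + request_size - 1) / request_size;
}
```

With 64 KiB requests against 4 MiB objects this returns 64, so under these assumptions the backup would pull up to 64 times more data than it needs.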
> backup_calculate_cluster_size() doesn’t consider the source size because
> to my knowledge there is no other medium that behaves this way. So I
> suppose the assumption was always that the block size of the source
> doesn’t matter, because a partial read is always possible (without
> having to read everything).
Unless you enable qemu-side caching, this only works as long as the
block/cluster size of the source does not exceed that of the target.
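The MAX(block_size_source, block_size_target) idea from the original report could be sketched roughly as follows. This is not qemu's backup_calculate_cluster_size(); the function name `pick_copy_size`, its parameters, and the 64 KiB floor (chosen to match the 64K mentioned above) are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed floor for the copy granularity, matching the 64K in the report. */
#define DEFAULT_COPY_SIZE (64 * 1024ULL)

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical helper: choose a copy granularity that is at least as large
 * as both the source's preferred block size and the target's cluster size.
 * A value of 0 means "unknown" and falls back to the default floor.
 */
static uint64_t pick_copy_size(uint64_t source_block, uint64_t target_cluster)
{
    uint64_t size = max_u64(source_block, target_cluster);

    return max_u64(size, DEFAULT_COPY_SIZE);
}
```

For a 4 MiB source block and a 64 KiB target cluster this yields 4 MiB, so each read from the source would cover exactly one full block instead of a 64 KiB slice of it.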
> What would make sense to me is to increase the buffer size in general.
> I don’t think we need to copy clusters at a time, and
> 0e2402452f1f2042923a5 has indeed increased the copy size to 1 MB for
> backup writes that are triggered by guest writes. We haven’t yet
> increased the copy size for background writes, though. We can do that,
> of course. (And probably should.)
>
> The thing is, it just seems unnecessary to me to take the source cluster
> size into account in general. It seems weird that a medium only allows
> 4 MB reads, because, well, guests aren’t going to take that into account.
But guests usually have a page cache, which is why in many setups qemu
(and thereby the backup process) runs without one.