[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: backup_calculate_cluster_size does not consider source

From: Dietmar Maurer
Subject: Re: backup_calculate_cluster_size does not consider source
Date: Wed, 6 Nov 2019 14:09:37 +0100 (CET)

> Let me elaborate: Yes, a cluster size generally means that it is most
> “efficient” to access the storage at that size.  But there’s a tradeoff.
>  At some point, reading the data takes sufficiently long that reading a
> bit of metadata doesn’t matter anymore (usually, that is).

Any network storage suffers from long network latencies, so it always
matters if you do more IOs than necessary.

> There is a bit of a problem with making the backup copy size rather
> large, and that is the fact that backup’s copy-before-write causes guest
> writes to stall. So if the guest just writes a bit of data, a 4 MB
> buffer size may mean that in the background it will have to wait for 4
> MB of data to be copied.[1]

We use this for several years now in production, and it is not a problem.
(Ceph storage is mostly on 10G (or faster) network equipment).

> Hm.  OTOH, we have the same problem already with the target’s cluster
> size, which can of course be 4 MB as well.  But I can imagine it to
> actually be important for the target, because otherwise there might be
> read-modify-write cycles.
> But for the source, I still don’t quite understand why rbd has such a
> problem with small read requests.  I don’t doubt that it has (as you
> explained), but again, how is it then even possible to use rbd as the
> backend for a guest that has no idea of this requirement?  Does Linux
> really prefill the page cache with 4 MB of data for each read?

No idea. I just observed that upstream qemu backups with ceph are 
quite unusable this way.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]