[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-block] [PATCH 1/1] qcow2: handle cluster leak happening with a gue
From: |
Denis V. Lunev |
Subject: |
[Qemu-block] [PATCH 1/1] qcow2: handle cluster leak happening with a guest TRIM command |
Date: |
Sun, 21 May 2017 17:21:46 +0300 |
qemu-img create -f qcow2 1.img 64G
qemu-io -c "write -P 0x32 0 64k" 1.img
results in
324 -rw-r--r-- 1 den den 393216 May 21 16:48 1.img
Subsequent
qemu-io -c "write -z 0 64k" 1.img
qemu-io -c "write -P 0x32 0 64k" 1.img
results in
388 -rw-r--r-- 1 den den 458752 May 21 16:50 1.img
which looks like we have 1 cluster leaked.
Indeed, qcow2_co_pwrite_zeroes calls qcow2_zero_clusters/zero_single_l2,
which does not update refcount for the host cluster and keep the offset
as used. Later on handle_copied() does not take into account
QCOW2_CLUSTER_ZERO type of the cluster.
For now we comes into a very bad situation after qcow2_zero_cluster.
We have a hole in the host file and we have the offset allocated for
that guest cluster. Now there are 2 options:
1) allocate new offset once the write will come into this guest
cluster (actually happens, but original cluster offset is leaked)
2) re-use host offset, i.e. fix handle_copied() to allow to reuse
offset not only for QCOW2_CLUSTER_NORMAL but for QCOW2_CLUSTER_ZERO
too
Option 2) seems worse to me as in this case we can easily have host
fragmentation in that cluster if writes will come in small pieces.
This is not a very big deal if we have filesystem with PUCH_HOLE support,
but without this feature the cluster is actually leaked forever.
The patch replaces zero_single_l2 with discard_single_l2 and removes
now unused zero_single_l2 to fix the situation.
Signed-off-by: Denis V. Lunev <address@hidden>
CC: Kevin Wolf <address@hidden>
CC: Max Reitz <address@hidden>
---
block/qcow2-cluster.c | 46 ++--------------------------------------------
1 file changed, 2 insertions(+), 44 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 100398c..1e53a7c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1548,49 +1548,6 @@ fail:
return ret;
}
-/*
- * This zeroes as many clusters of nb_clusters as possible at once (i.e.
- * all clusters in the same L2 table) and returns the number of zeroed
- * clusters.
- */
-static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
- uint64_t nb_clusters, int flags)
-{
- BDRVQcow2State *s = bs->opaque;
- uint64_t *l2_table;
- int l2_index;
- int ret;
- int i;
-
- ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
- if (ret < 0) {
- return ret;
- }
-
- /* Limit nb_clusters to one L2 table */
- nb_clusters = MIN(nb_clusters, s->l2_size - l2_index);
- assert(nb_clusters <= INT_MAX);
-
- for (i = 0; i < nb_clusters; i++) {
- uint64_t old_offset;
-
- old_offset = be64_to_cpu(l2_table[l2_index + i]);
-
- /* Update L2 entries */
- qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
- if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
- l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
- qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
- } else {
- l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
- }
- }
-
- qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
-
- return nb_clusters;
-}
-
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
int flags)
{
@@ -1609,7 +1566,8 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t
offset, int nb_sectors,
s->cache_discards = true;
while (nb_clusters > 0) {
- ret = zero_single_l2(bs, offset, nb_clusters, flags);
+ ret = discard_single_l2(bs, offset, nb_clusters,
+ QCOW2_DISCARD_REQUEST, false);
if (ret < 0) {
goto fail;
}
--
2.7.4
- [Qemu-block] [PATCH 1/1] qcow2: handle cluster leak happening with a guest TRIM command,
Denis V. Lunev <=