Re: [PATCH 1/1] Fix qcow2 corruption on discard
From: Kevin Wolf
Subject: Re: [PATCH 1/1] Fix qcow2 corruption on discard
Date: Mon, 23 Nov 2020 17:09:41 +0100

On 23.11.2020 at 16:49, Maxim Levitsky wrote:
> Commit 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> introduced a subtle change to code in zero_in_l2_slice:
>
> It swapped the order of
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
> 3. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>
> To
>
> 1. qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
> 2. qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
> 3. set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
>
> It seems harmless, however the call to qcow2_free_any_clusters
> can trigger a cache flush which can mark the L2 table as clean,
> and assuming that this was the last write to it,
> a stale version of it will remain on the disk.
>
> Now we have a valid L2 entry pointing to a freed cluster. Oops.
>
> Fixes: 205fa50750 ("qcow2: Add subcluster support to zero_in_l2_slice()")
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
> block/qcow2-cluster.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 485b4cb92e..267b46a4ca 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -2010,11 +2010,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
>              continue;
>          }
>
> -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>          if (unmap) {
>              qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
>          }
>          set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
> +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);

Good catch, but I think your order is wrong, too. We need the original
order from before 205fa50750:

1. qcow2_cache_entry_mark_dirty()
   set_l2_entry() + set_l2_bitmap()
2. qcow2_free_any_cluster()

The order between qcow2_cache_entry_mark_dirty() and set_l2_entry()
shouldn't matter, but it's important that we update the refcount table
only after the L2 table has been updated to not reference the cluster
any more.

Otherwise a crash could lead to a situation where the cluster can be
allocated again (because it has refcount 0) while it is still in use in
an L2 table. This is a classic corruption scenario.
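
To make the ordering argument concrete, here is a minimal toy model in C.
It is not QEMU code: struct toy_image, the toy_* functions and the two
TOY_* constants are hypothetical names made up for this sketch, and it
assumes that freeing a cluster always flushes the L2 cache before the
refcount is written (in the real code this only may happen). The on-disk
state is checked after every step, so each step doubles as a possible
crash point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DATA_CLUSTER 0x50000ULL /* hypothetical host cluster offset */
#define TOY_ZERO_FLAG    1ULL       /* stands in for QCOW_OFLAG_ZERO */

/* Toy model: a single L2 entry (cached + on disk) and a single refcount. */
struct toy_image {
    uint64_t l2_cached;        /* L2 entry in the in-memory cache */
    bool     l2_dirty;         /* cache entry still needs writeback */
    uint64_t l2_on_disk;       /* L2 entry as stored on disk */
    int      refcount_on_disk; /* on-disk refcount of the data cluster */
};

enum toy_step { MARK_DIRTY, SET_ENTRY, FREE_CLUSTER };

/* Write the cached L2 entry back only if it is marked dirty. */
static void toy_flush_l2(struct toy_image *img)
{
    if (img->l2_dirty) {
        img->l2_on_disk = img->l2_cached;
        img->l2_dirty = false;
    }
}

/* Assumption of this sketch: freeing flushes the L2 cache first,
 * then drops the refcount on disk. */
static void toy_free_cluster(struct toy_image *img)
{
    assert(img->refcount_on_disk > 0);
    toy_flush_l2(img);
    img->refcount_on_disk--;
}

/* Corrupt: the on-disk L2 entry still maps a cluster with refcount 0. */
static bool toy_disk_corrupt(const struct toy_image *img)
{
    return img->l2_on_disk == TOY_DATA_CLUSTER &&
           img->refcount_on_disk == 0;
}

/* Run the three steps in the given order. Returns true if the on-disk
 * state is corrupt after any step or after a final flush. */
bool toy_order_corrupts(const enum toy_step order[3])
{
    struct toy_image img = {
        .l2_cached = TOY_DATA_CLUSTER, .l2_dirty = false,
        .l2_on_disk = TOY_DATA_CLUSTER, .refcount_on_disk = 1,
    };
    bool corrupt = false;

    for (int i = 0; i < 3; i++) {
        switch (order[i]) {
        case MARK_DIRTY:   img.l2_dirty = true;            break;
        case SET_ENTRY:    img.l2_cached = TOY_ZERO_FLAG;  break;
        case FREE_CLUSTER: toy_free_cluster(&img);         break;
        }
        corrupt = corrupt || toy_disk_corrupt(&img);
    }
    toy_flush_l2(&img);
    return corrupt || toy_disk_corrupt(&img);
}

/* Order before 205fa50750, order after it, and order in this patch. */
const enum toy_step order_original[3] =
    { MARK_DIRTY, SET_ENTRY, FREE_CLUSTER };
const enum toy_step order_205fa50750[3] =
    { MARK_DIRTY, FREE_CLUSTER, SET_ENTRY };
const enum toy_step order_patch[3] =
    { FREE_CLUSTER, SET_ENTRY, MARK_DIRTY };
```

In this model toy_order_corrupts(order_original) returns false, while
order_205fa50750 and order_patch both return true: the first leaves a
stale L2 entry on disk for good, the second only for the window between
the refcount update and the next L2 flush, which is exactly where a
crash hurts.
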
Kevin