qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate f


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration
Date: Fri, 03 Oct 2014 14:12:06 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.2

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/03/2014 12:52 AM, Stefan Hajnoczi wrote:
> On Thu, Oct 02, 2014 at 06:52:52PM +1000, Alexey Kardashevskiy wrote:
>> When migrated using libvirt with "--copy-storage-all", at the end
>> of migration there is race between NBD mirroring task trying to do
>> flush and migration completion, both end up invalidating cache.
>> Since qcow2 driver does not handle this situation very well, random
>> crashes happen.
>> 
>> This disables the BDRV_O_INCOMING flag for the block device being
>> migrated and restores it when NBD task is done.
>> 
>> Signed-off-by: Alexey Kardashevskiy <address@hidden> ---
>> 
>> 
>> The commit log is not full and most likely incorrect as well as the
>> patch :) Please, help. Thanks!
>> 
>> The patch seems to fix the initial problem though.
>> 
>> 
>> btw is there any easy way to migrate one QEMU to another using NBD
>> (i.e. not using "migrate -b") and not using libvirt? What would the
>> command line be? Debugging with libvirt is real pain :(
>> 
>> 
>> --- block.c     | 17 ++++------------- migration.c |  1 - nbd.c
>> | 11 +++++++++++ 3 files changed, 15 insertions(+), 14 deletions(-)
>> 
>> diff --git a/block.c b/block.c index c5a251c..ed72e0a 100644 ---
>> a/block.c +++ b/block.c @@ -5073,6 +5073,10 @@ void
>> bdrv_invalidate_cache_all(Error **errp) QTAILQ_FOREACH(bs,
>> &bdrv_states, device_list) { AioContext *aio_context =
>> bdrv_get_aio_context(bs);
>> 
>> +        if (!(bs->open_flags & BDRV_O_INCOMING)) { +
>> continue; +        } +
> 
> We shouldn't touch bs before acquiring the AioContext.  Acquiring the 
> AioContext is basically the "lock" for the BDS.
> 
> It needs to be moved...
> 
>> aio_context_acquire(aio_context);
> 
> ...in here.
> 
>> bdrv_invalidate_cache(bs, &local_err); 
>> aio_context_release(aio_context); @@ -5083,19 +5087,6 @@ void
>> bdrv_invalidate_cache_all(Error **errp) } }
>> 
>> -void bdrv_clear_incoming_migration_all(void) -{ -
>> BlockDriverState *bs; - -    QTAILQ_FOREACH(bs, &bdrv_states,
>> device_list) { -        AioContext *aio_context =
>> bdrv_get_aio_context(bs); - -
>> aio_context_acquire(aio_context); -        bs->open_flags =
>> bs->open_flags & ~(BDRV_O_INCOMING); -
>> aio_context_release(aio_context); -    } -} - int
>> bdrv_flush(BlockDriverState *bs) { Coroutine *co; diff --git
>> a/migration.c b/migration.c index 8d675b3..c49a05a 100644 ---
>> a/migration.c +++ b/migration.c @@ -103,7 +103,6 @@ static void
>> process_incoming_migration_co(void *opaque) } qemu_announce_self();
>> 
>> -    bdrv_clear_incoming_migration_all(); /* Make sure all file
>> formats flush their mutable metadata */ 
>> bdrv_invalidate_cache_all(&local_err);
> 
> BDRV_O_INCOMING needs to be cleared, otherwise the block drivers will 
> think that incoming migration is still taking place and treat the
> file as effectively read-only during open.
> 
> On IRC I suggested doing away with the bdrv_invalidate_cache_all()
> name since it no longer calls bdrv_invalidate_cache() on all BDSes.
> 
> Combine both clearing BDRV_O_INCOMING and calling 
> bdrv_invalidate_cache() if BDRV_O_INCOMING was previously set into
> one function - you could reuse bdrv_clear_incoming_migration_all() for
> that.


I reread why bdrv_invalidate_cache() was added in the first place and
what BDRV_O_INCOMING is, every time I look at the code it plays in new
colors (not sure if it is a correct figure of speech :) ).

BDRV_O_INCOMING is only set when QEMU is about to receive migration and
we do not want QEMU to check the file at opening time as there is likely
garbage. Is there any other use of BDRV_O_INCOMING? There must be some as
bdrv_clear_incoming_migration_all() is called at the end of live
migration and I believe there must be a reason for it (cache does not
migrate, does it?).

bdrv_invalidate_cache() flushes cache as it could be already initialized
even if QEMU is receiving migration - QEMU could have cached some of real
disk data. Is that correct? I do not really understand why it would
happen if there is BDRV_O_INCOMING set but ok. I also thought that
bdrv_invalidate_cache() is called more often as its name suggests, not
during migration only, I was wrong here.

The patch below then should solve the initial problem.

bdrv_clear_incoming_migration_all() is unclear though...


diff --git a/block.c b/block.c
index c5a251c..6314af7 100644
- --- a/block.c
+++ b/block.c
@@ -5048,6 +5048,11 @@ void bdrv_invalidate_cache(BlockDriverState *bs,
Error **errp)
         return;
     }

+    if (!(bs->open_flags & BDRV_O_INCOMING)) {
+        return;
+    }
+    bs->open_flags &= ~(BDRV_O_INCOMING);
+
     if (bs->drv->bdrv_invalidate_cache) {
         bs->drv->bdrv_invalidate_cache(bs, &local_err);
     } else if (bs->file) {
@@ -5083,19 +5088,6 @@ void bdrv_invalidate_cache_all(Error **errp)
     }
 }

- -void bdrv_clear_incoming_migration_all(void)
- -{
- -    BlockDriverState *bs;
- -
- -    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
- -        AioContext *aio_context = bdrv_get_aio_context(bs);
- -
- -        aio_context_acquire(aio_context);
- -        bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING);
- -        aio_context_release(aio_context);
- -    }
- -}
- -
 int bdrv_flush(BlockDriverState *bs)
 {
     Coroutine *co;
diff --git a/migration.c b/migration.c
index 8d675b3..c49a05a 100644
- --- a/migration.c
+++ b/migration.c
@@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque)
     }
     qemu_announce_self();

- -    bdrv_clear_incoming_migration_all();
     /* Make sure all file formats flush their mutable metadata */
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
diff --git a/nbd.c b/nbd.c
index e9b539b..953c378 100644
- --- a/nbd.c
+++ b/nbd.c
@@ -972,6 +972,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t
dev_offset,
     exp->ctx = bdrv_get_aio_context(bs);
     bdrv_ref(bs);
     bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp);
+    bdrv_invalidate_cache(bs, NULL);
     return exp;
 }




>> if (local_err) { diff --git a/nbd.c b/nbd.c index e9b539b..7b479c0
>> 100644 --- a/nbd.c +++ b/nbd.c @@ -106,6 +106,7 @@ struct NBDExport
>> { off_t dev_offset; off_t size; uint32_t nbdflags; +    bool
>> restore_incoming;
> 
> Not needed, it does not make sense to restore BDRV_O_INCOMING because 
> once we've written to a file it cannot be in use by another host at
> the same time.
> 
>> QTAILQ_HEAD(, NBDClient) clients; QTAILQ_ENTRY(NBDExport) next;
>> 
>> @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs,
>> off_t dev_offset, exp->ctx = bdrv_get_aio_context(bs); 
>> bdrv_ref(bs); bdrv_add_aio_context_notifier(bs, bs_aio_attached,
>> bs_aio_detach, exp); + +    if (bs->open_flags & BDRV_O_INCOMING) {
> 
> I think the flag has to be cleared before calling 
> bdrv_invalidate_cache() because the .bdrv_open() function looks at
> the flag.
> 


- -- 
Alexey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJULiIVAAoJEIYTPdgrwSC57ZIP+gI2ckqbMTf2LFl4MVP1oOEZ
IkbZGiWJRfVsUitKD4+fjYB7ysnkt1hlrqdaaoTaPPzst676cGek91KdkpxqgorU
L80e5TaRO/Ltx65yN1gMHhGJJnck9KvnzSp9atqUZNBGiMbYDPj8lN5NeyBadwdk
37uR/SvFrVRUWMVpgs7XF7q60C6YRnoGy+8qqmUSlwx3aYtFZpOfSyQYWkY5mT5j
3nsymm1ieFAQOd255K6X1oZW6GjwNCXa5MsUspK5gokPufjZQguGorRUMdBDmKa+
L9dvvMDG0I88027W3eTI0a2WI0D3BllizNuThacDHQEiLTIVGwd9VIDPYUJiz85W
7p88H6c8AozhUlcQiNK1o185HFCC+bcxlX0rHHRTLG+WfEWPMJ0Y4iewBvVa/IPZ
Dzje5mEvdOm5MeU1/BiyTA6fc5nN5CpVth5mu7stMnnVvEAjK+gttKqTMMcR4BWa
bzbREsrw97eqgGZcqH6T8X69Y9kEjgofdvs9uOlH4XlCD5CjJZNoylF8bzcwAFnO
jlDNPAHCJATKLxIH6SClwaVf74lRn6Jv1WEKwKX9zuAq8rN34mSVT4VauEvRSsP0
RKnkqbUl37MZcf/dpiqr3M2W0pXDArvPZpme42IBJkTtpgFH5jC5Cry9OipOPqNc
dFVR8sIgTQ4HLktVx4Fz
=3bSl
-----END PGP SIGNATURE-----



reply via email to

[Prev in Thread] Current Thread [Next in Thread]