[Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily conv

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily conv

From:	Dr. David Alan Gilbert
Subject:	[Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on
Date:	Tue, 09 May 2017 18:41:28 -0000

OK, it'll be interesting if you get something repeatable, but sometimes
these type of bugs go and hide in a dark corner until someone trips over
them in a more critical situation.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1689499

Title:
  copy-storage-all/inc does not easily converge with load going on

Status in QEMU:
  Incomplete

Bug description:
  Hi,
  for now this is more a report to discuss than a "bug", but I wanted to be 
sure if there are things I might overlook.

  I'm regularly testing the qemu's we have in Ubuntu which currently are
  2.0, 2.5, 2.6.1, 2.8 plus a bunch of patches. And for all sorts of
  verification upstream every now and then.

  I recently realized that the migration options around 
--copy-storage-[all/inc] seem to have got worse at converging on migration. 
Although it is not a hard commit that is to be found, it just seems more likely 
to occur the newer the qemu versions is. I assume that is partially due to 
guest performance optimization that keep it busy.
  To a user it appears as a hanging migration being locked up.

  But let me outline what actually happens:
  - Setup without shared storage
  - Migration using --copy-storage-all/--copy-storage-inc
  - Working fine with idle guests
  - If the guests is busy the migration does take like forever (1 vCPU that are 
busy with 1 CPU, 1 memory and one disk hogging processes)
  - statistically seems to trigger more likely on newer qemu's (might be a red 
herring)

  The background workloads are most trivial burners:
  - cpu: md5sum /dev/urandom
  - memory: stress-ng -m 1 --vm-keep --vm-bytes 256M
  - disk: while /bin/true; do dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M 
count=100; done

  We are talking about ~1-2 minutes on qemu 2.5 (4 tries x 3
  architectures) and 2-10+ hours on >=qemu 2.6.1.

  I say it is likely not a bug, but more a discussion as I can easily avoid 
hanging via either:
  - timeouts (--timeout, ...) to abort or suspend to migrate it
  - --auto-converge ( I had only one try, but it seemed to help by slowing down 
the load generators)

  So you might say "that is all as it should be, and the users can use
  the further options to mitigate" and I'm all fine with that. In that
  case the bug still serves as a "searchable" document of some kind for
  others triggering the same case. But if anything comes to your mind
  that need better handling around this case lets start to discuss more
  deeply about it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1689499/+subscriptions

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [Bug 1689499] [NEW] copy-storage-all/inc does not easily converge with load going on, ChristianEhrhardt, 2017/05/09
- [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on, Dr. David Alan Gilbert, 2017/05/09
  - Re: [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on, ChristianEhrhardt, 2017/05/09
- [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on, ChristianEhrhardt, 2017/05/09
- [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on, ChristianEhrhardt, 2017/05/09
- [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on, Dr. David Alan Gilbert <=

Prev by Date: [Qemu-devel] VhostUserRequest # 20
Next by Date: Re: [Qemu-devel] [PATCH 03/17] tests: remove alt num-int cases
Previous by thread: [Qemu-devel] [Bug 1689499] Re: copy-storage-all/inc does not easily converge with load going on
Next by thread: [Qemu-devel] [PATCH 0/3] target/s390x: misc patches
Index(es):
- Date
- Thread