From: David Hildenbrand
Subject: Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
Date: Thu, 29 Jul 2021 18:19:31 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 29.07.21 18:12, Peter Xu wrote:
> On Thu, Jul 29, 2021 at 10:14:47AM +0200, David Hildenbrand wrote:
>>>>>>> The thing is I still think this extra operation during sync() can be avoided
>>>>>>> by simply clearing the dirty log during bitmap init, then... why not? :)

>>>>>> I guess clearing the dirty log (especially in KVM) might be more expensive.

>>>>> If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
>>>>> fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.
>>>>>
>>>>> The other thing is imho having overhead during bitmap init is always better
>>>>> than having that during sync(). :)

>>>> Oh, right, so you're saying: after we set the dirty bmap to all ones and
>>>> exclude the discarded parts by setting the respective bits to 0, we simply
>>>> issue a clear of the whole area?
>>>>
>>>> For now I assumed we would have to clear per cb.
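
(Aside for readers following along: a minimal, self-contained sketch of the scheme being described here -- start with an all-ones migration bitmap, then punch out the discarded ranges. The type and helper names are invented for illustration and are not QEMU's actual API.)

#include <limits.h>
#include <stdint.h>
#include <string.h>

#define BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* One bit per guest page; bit set = page still has to be migrated. */
typedef struct {
    unsigned long *bits;
    uint64_t nr_pages;
} MigrationBitmap;

/* Step 1: assume everything is dirty and has to be sent. */
static void bitmap_set_all(MigrationBitmap *bm)
{
    memset(bm->bits, 0xff, BITMAP_LONGS(bm->nr_pages) * sizeof(unsigned long));
}

/* Step 2: punch out one discarded (e.g. virtio-mem unplugged) page range so
 * those pages are neither sent nor tracked as dirty. */
static void bitmap_clear_range(MigrationBitmap *bm, uint64_t first, uint64_t count)
{
    for (uint64_t page = first; page < first + count; page++) {
        bm->bits[page / BITS_PER_LONG] &= ~(1UL << (page % BITS_PER_LONG));
    }
}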

>>> Hmm, when I replied I thought we could pass in a bitmap to ->log_clear(), but
>>> I just remembered the memory API actually hides the bitmap interface..
>>>
>>> Resetting the whole region works, but it'll slow down migration start; more
>>> importantly, that happens with the mmu write lock held, so we'd lose most of
>>> the clear-log benefit for the initial round of migration and stall guest #pf
>>> handling in the meantime...
>>>
>>> Let's try to do that in the cb()s as you mentioned; I think that'll still be
>>> okay, because the clear-log block size is much larger (1gb), so 1tb is worst
>>> case ~1000 ioctls during bitmap init, slightly better than 250k calls during
>>> sync(), maybe? :)
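
(To make the "clear it from the cb()s" idea concrete, here is a self-contained toy sketch: one callback invocation per discarded range, which drops the pages from the migration bitmap and asks the accelerator to clear its dirty log for that range. All names here are invented stand-ins; in the series this would go through the RamDiscardManager callbacks and the migration_clear_memory_region_dirty_bitmap_range() helper discussed below.)

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A discarded guest-physical range, in pages. */
typedef struct {
    uint64_t first_page;
    uint64_t nr_pages;
} DiscardedRange;

/* Stand-in: would clear the corresponding bits in the migration bitmap. */
static void migration_bitmap_clear(uint64_t first_page, uint64_t nr_pages)
{
    printf("bitmap:    clear %llu pages at %llu\n",
           (unsigned long long)nr_pages, (unsigned long long)first_page);
}

/* Stand-in: would end up as (at most one per chunk) clear-dirty-log ioctl. */
static void dirty_log_clear_range(uint64_t first_page, uint64_t nr_pages)
{
    printf("dirty log: clear %llu pages at %llu\n",
           (unsigned long long)nr_pages, (unsigned long long)first_page);
}

/* The cb(): called once per discarded range while initializing bitmaps. */
static void discarded_range_cb(const DiscardedRange *r)
{
    migration_bitmap_clear(r->first_page, r->nr_pages);
    dirty_log_clear_range(r->first_page, r->nr_pages);
}

int main(void)
{
    const DiscardedRange ranges[] = { { 0, 512 }, { 4096, 1024 } };

    for (size_t i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
        discarded_range_cb(&ranges[i]);
    }
    return 0;
}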

>> Just to get it right, what you propose is calling
>> migration_clear_memory_region_dirty_bitmap_range() from each cb().

> Right.  We could provide a more complicated memory API for passing in a bitmap,
> but I think that would be overkill and tricky.

>> Due to the clear_bmap, we will end up clearing each chunk (e.g., 1 GiB) at
>> most once.
>>
>> But if our layout is fragmented, we can actually end up clearing all chunks
>> (1024 ioctls for 1 TiB), resulting in a slower migration start.
>>
>> Any gut feeling for how much slower migration start could be with largish
>> (e.g., 1 TiB) regions?
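
(To put a number on the worst case above: with clear_bmap-style bookkeeping of one bit per 1 GiB chunk, each chunk is cleared at most once, so even a maximally fragmented 1 TiB layout tops out at 1024 clear ioctls. A self-contained toy model, not QEMU code; the 2 MiB discard granularity below is an arbitrary assumption for illustration.)

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK_SIZE   (1ULL << 30)                 /* 1 GiB clear-log chunk */
#define REGION_SIZE  (1ULL << 40)                 /* 1 TiB region */
#define NR_CHUNKS    (REGION_SIZE / CHUNK_SIZE)   /* 1024 */

static bool chunk_needs_clear[NR_CHUNKS];
static uint64_t ioctls_issued;

/* Clear the dirty log for [start, start + size): issue at most one (fake)
 * ioctl per 1 GiB chunk, which is what the clear_bmap buys us. */
static void clear_dirty_log_range(uint64_t start, uint64_t size)
{
    for (uint64_t c = start / CHUNK_SIZE; c <= (start + size - 1) / CHUNK_SIZE; c++) {
        if (chunk_needs_clear[c]) {
            chunk_needs_clear[c] = false;
            ioctls_issued++;   /* stands in for one clear-dirty-log ioctl */
        }
    }
}

int main(void)
{
    for (uint64_t c = 0; c < NR_CHUNKS; c++) {
        chunk_needs_clear[c] = true;
    }

    /* Worst case: one small (2 MiB) discarded block lands in every chunk. */
    for (uint64_t off = 0; off < REGION_SIZE; off += CHUNK_SIZE) {
        clear_dirty_log_range(off, 2 * 1024 * 1024);
    }

    printf("chunks: %llu, clear ioctls issued: %llu\n",
           (unsigned long long)NR_CHUNKS, (unsigned long long)ioctls_issued);
    /* prints: chunks: 1024, clear ioctls issued: 1024 */
    return 0;
}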

> I have a vague memory of measuring KVM_GET_DIRTY_LOG taking ~10ms for 1g of
> guest mem, supposing that's mostly spent write-protecting the pages or clearing
> dirty bits in the EPT pgtables.  Then the worst case is ~1 second for 1tb.
>
> But note that it's still during the setup phase, so we should expect a somewhat
> larger setup time and a longer period during which migration stays in the SETUP
> state, but I think that's fine.  Reasons:
>
>    - We don't care too much about the guest dirtying pages during the setup
>      process because we haven't migrated anything yet; meanwhile we should not
>      block any other thread either (e.g., we don't hold the BQL).
>
>    - We don't block guest execution either.  Unlike KVM_GET_DIRTY_LOG without
>      CLEAR, we won't hold the mmu lock for a very long time but only per 1g
>      chunk, so guest page faults can still be serviced.  They'll be affected
>      somewhat, since we still run mmu-write-lock critical sections for each
>      single ioctl(), but we do that for 1gb at a time, so we yield the lock
>      frequently.


Please note that we are holding the iothread lock while setting up the bitmaps
and syncing the dirty log. I'll have to make sure that code runs outside of the
BQL, otherwise we'll block guest execution.

In the meantime I adjusted the code, but it still does the clearing under the
iothread lock, which is not what we want ... I'll have a look.
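
(For context, the setup path in question looks roughly like the sketch below -- simplified from migration/ram.c from memory, not verbatim. One way to avoid doing the clearing under the BQL would be to do it only after the lock has been dropped; migration_bitmap_clear_discarded_pages() is a placeholder name for whatever that helper ends up being called.)

static void ram_init_bitmaps(RAMState *rs)
{
    /* The BQL is needed (among other things) to start dirty logging. */
    qemu_mutex_lock_iothread();
    qemu_mutex_lock_ramlist();

    WITH_RCU_READ_LOCK_GUARD() {
        ram_list_init_bitmaps();            /* bitmaps start out all ones */
        memory_global_dirty_log_start();
        migration_bitmap_sync_precopy(rs);
    }
    qemu_mutex_unlock_ramlist();
    qemu_mutex_unlock_iothread();

    /*
     * Only now, outside the BQL, exclude the discarded ranges: clear their
     * bits in the migration bitmaps and issue the per-chunk dirty-log clears,
     * so guest execution is not blocked while the ioctls are running.
     */
    migration_bitmap_clear_discarded_pages(rs);
}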

--
Thanks,

David / dhildenb



