qemu-devel

Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscard


From: David Hildenbrand
Subject: Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
Date: Wed, 28 Jul 2021 21:46:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 28.07.21 21:42, Peter Xu wrote:
On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
Meanwhile, I still have no idea how much overhead the "loop" part could bring.
For a large virtio-mem region where plugged and unplugged memory are heavily
interleaved, it seems possible to me that it could take a while.  I have no
solid idea yet.

Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
size: We have 524288 blocks == bits. That's precisely a 64 KiB bitmap in
virtio-mem. In the worst case, every second bit would be clear
("discarded"). For each clear bit ("discarded"), we would have to clear 512
bits (64 bytes) in the dirty bitmap. That's storing 16 MiB (262144 blocks
x 64 bytes), i.e. half of the 32 MiB dirty bitmap.

So scanning 64 KiB, writing 16 MiB. Certainly not perfect, but I am not sure
if it will really matter doing that once on every bitmap sync. I guess the
bitmap syncing itself is much more expensive -- and not syncing the
discarded ranges (b) above) would make a bigger impact I guess.
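
The arithmetic above can be checked with a short script. This is only a sketch: the 4 KiB page size is an assumption (it is what makes one 2 MiB block correspond to 512 dirty bits), and all variable names are made up for illustration.

```python
KiB, MiB, TiB = 1 << 10, 1 << 20, 1 << 40

device_size = 1 * TiB
block_size = 2 * MiB
page_size = 4 * KiB  # assumption: 4 KiB target pages

# One bit per virtio-mem block.
blocks = device_size // block_size                  # 524288
virtio_mem_bitmap_bytes = blocks // 8               # 65536 (64 KiB)

# One dirty bit per page, so one block covers this many dirty bits/bytes.
dirty_bits_per_block = block_size // page_size      # 512
dirty_bytes_per_block = dirty_bits_per_block // 8   # 64

# Worst case discussed above: every second block is discarded.
discarded_blocks = blocks // 2                      # 262144
bytes_cleared = discarded_blocks * dirty_bytes_per_block

# Size of the complete dirty bitmap for the device, for comparison.
full_dirty_bitmap_bytes = device_size // page_size // 8

print("virtio-mem bitmap:", virtio_mem_bitmap_bytes // KiB, "KiB")
print("dirty bitmap bytes cleared:", bytes_cleared // MiB, "MiB")
print("full dirty bitmap:", full_dirty_bitmap_bytes // MiB, "MiB")
```

Under these assumptions the complete dirty bitmap for the device is 32 MiB, and clearing the bits for every second block touches half of it.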

I'm not worried about the memory size to be accessed as bitmaps; it's more
about the loop itself.  500K blocks/bits means the cb() can in the worst case
be called 500K/2=250K times, no matter what the hook is doing.

But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
be too harsh.  It's just that if it's very easy to do in bitmap init, then
it's still worth thinking about.
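
To make the callback count concrete, here is a minimal model of a replay loop that invokes a callback once per contiguous run of discarded blocks. It is a sketch only: `replay_discarded` and the list-of-booleans bitmap are hypothetical stand-ins, not the QEMU implementation.

```python
def replay_discarded(plugged, cb):
    """Call cb(first_block, nr_blocks) once per maximal run of clear bits.

    `plugged` is a sequence of booleans, one per block (True = plugged).
    Returns the number of callback invocations.
    """
    calls = 0
    i, n = 0, len(plugged)
    while i < n:
        if plugged[i]:
            i += 1
            continue
        start = i
        # Extend the run while blocks stay discarded.
        while i < n and not plugged[i]:
            i += 1
        cb(start, i - start)
        calls += 1
    return calls


NR_BLOCKS = 524288  # 1 TiB device with 2 MiB blocks

# Worst case: plugged and discarded blocks strictly alternate.
worst_case = [i % 2 == 0 for i in range(NR_BLOCKS)]
worst_calls = replay_discarded(worst_case, lambda start, count: None)

# Best case for the loop: one single discarded range.
best_calls = replay_discarded([False] * NR_BLOCKS, lambda start, count: None)

print(worst_calls, best_calls)
```

The number of calls depends on how fragmented the device is, not on its size: a fully discarded device needs a single call, while the alternating pattern needs one call per discarded block.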



The thing is I still think this extra operation during sync() can be avoided by
simply clearing the dirty log during bitmap init, then.. why not? :)

I guess clearing the dirty log (especially in KVM) might be more expensive.

If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.

The other thing is imho having overhead during bitmap init is always better
than having that during sync(). :)

Oh, right, so you're saying, after we set the dirty bmap to all ones and excluded the discarded parts, setting the respective bits to 0, we simply issue clearing of the whole area?

For now I assumed we would have to clear per cb.
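
The two variants being compared can be modelled roughly as follows. This is a sketch of one reading of the suggestion, not QEMU or KVM code: `FakeDirtyLog`, `init_clear_per_range` and `init_clear_once` are hypothetical names, the dirty bitmap is a plain Python list, and the fake log only records how many clear requests it receives.

```python
PAGES_PER_BLOCK = 512  # assumption: 2 MiB block, 4 KiB pages


class FakeDirtyLog:
    """Stand-in for the hypervisor dirty log; only records clear requests."""

    def __init__(self):
        self.clear_requests = []

    def clear(self, first_page, nr_pages):
        self.clear_requests.append((first_page, nr_pages))


def discarded_runs(plugged):
    """Yield (first_block, nr_blocks) for each maximal run of discarded blocks."""
    i, n = 0, len(plugged)
    while i < n:
        if plugged[i]:
            i += 1
            continue
        start = i
        while i < n and not plugged[i]:
            i += 1
        yield start, i - start


def exclude_discarded(plugged):
    """Start with an all-ones dirty bitmap, then zero the discarded parts."""
    dirty = [True] * (len(plugged) * PAGES_PER_BLOCK)
    for first_block, nr_blocks in discarded_runs(plugged):
        first_page = first_block * PAGES_PER_BLOCK
        nr_pages = nr_blocks * PAGES_PER_BLOCK
        dirty[first_page:first_page + nr_pages] = [False] * nr_pages
    return dirty


def init_clear_per_range(plugged, log):
    """Variant A: one clear request per discarded range (per callback)."""
    dirty = exclude_discarded(plugged)
    for first_block, nr_blocks in discarded_runs(plugged):
        log.clear(first_block * PAGES_PER_BLOCK, nr_blocks * PAGES_PER_BLOCK)
    return dirty


def init_clear_once(plugged, log):
    """Variant B: a single clear request covering the whole area."""
    dirty = exclude_discarded(plugged)
    log.clear(0, len(dirty))
    return dirty


plugged = [True, False] * 4  # small alternating example: 8 blocks
log_a, log_b = FakeDirtyLog(), FakeDirtyLog()
dirty_a = init_clear_per_range(plugged, log_a)
dirty_b = init_clear_once(plugged, log_b)
print(len(log_a.clear_requests), len(log_b.clear_requests))
```

Both variants end up with the same migration dirty bitmap; they differ only in how many clear requests are issued, which is the cost the discussion above is about.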


--
Thanks,

David / dhildenb



