qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling


From: Michael Galaxy
Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
Date: Thu, 16 May 2024 12:29:19 -0500
User-agent: Mozilla Thunderbird

These are very compelling results, no?

(40gbps cards, right? Are the cards active/active? or active/standby?)

- Michael

On 5/14/24 10:19, Yu Zhang wrote:
Hello Peter and all,

I did a comparison of the VM live-migration speeds between RDMA and
TCP/IP on our servers
and plotted the results to get an initial impression. Unfortunately,
the Ethernet NICs are not the
recent ones, therefore, it may not make much sense. I can do it on
servers with more recent Ethernet
NICs and keep you updated.

It seems that the benefits of RDMA becomes obviously when the VM has
large memory and is
running memory-intensive workload.

Best regards,
Yu Zhang @ IONOS Cloud

On Thu, May 9, 2024 at 4:14 PM Peter Xu <peterx@redhat.com> wrote:
On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
That's a good news to see the socket abstraction for RDMA!
When I was developed the series above, the most pain is the RDMA migration has 
no QIOChannel abstraction and i need to take a 'fake channel'
for it which is awkward in code implementation.
So, as far as I know, we can do this by
i. the first thing is that we need to evaluate the rsocket is good enough to 
satisfy our QIOChannel fundamental abstraction
ii. if it works right, then we will continue to see if it can give us 
opportunity to hide the detail of rdma protocol
     into rsocket by remove most of code in rdma.c and also some hack in 
migration main process.
iii. implement the advanced features like multi-fd and multi-uri for rdma 
migration.

Since I am not familiar with rsocket, I need some times to look at it and do 
some quick verify with rdma migration based on rsocket.
But, yes, I am willing to involved in this refactor work and to see if we can 
make this migration feature more better:)
Based on what we have now, it looks like we'd better halt the deprecation
process a bit, so I think we shouldn't need to rush it at least in 9.1
then, and we'll need to see how it goes on the refactoring.

It'll be perfect if rsocket works, otherwise supporting multifd with little
overhead / exported APIs would also be a good thing in general with
whatever approach.  And obviously all based on the facts that we can get
resources from companies to support this feature first.

Note that so far nobody yet compared with rdma v.s. nic perf, so I hope if
any of us can provide some test results please do so.  Many people are
saying RDMA is better, but I yet didn't see any numbers comparing it with
modern TCP networks.  I don't want to have old impressions floating around
even if things might have changed..  When we have consolidated results, we
should share them out and also reflect that in QEMU's migration docs when a
rdma document page is ready.

Chuan, please check the whole thread discussion, it may help to understand
what we are looking for on rdma migrations [1].  Meanwhile please feel free
to sync with Jinpu's team and see how to move forward with such a project.

[1] 
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/87frwatp7n.fsf@suse.de/__;!!GjvTz_vk!QnXDo1zSlYecz7JvJky4SOQ9I8V5MoGHbINdAQAzMJQ_yYg_8_BSUXz9kjvbSgFefhG0wi1j38KaC3g$

Thanks,

--
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]