From: Steven Sistare
Subject: Re: [PATCH V2 00/11] Live update: cpr-exec (reconnections)
Date: Tue, 20 Aug 2024 12:28:39 -0400
User-agent: Mozilla Thunderbird

On 8/13/2024 4:12 PM, Peter Xu wrote:
> On Wed, Aug 07, 2024 at 03:47:47PM -0400, Steven Sistare wrote:
>> On 8/4/2024 12:10 PM, Peter Xu wrote:
>>> On Sat, Jul 20, 2024 at 05:26:07PM -0400, Steven Sistare wrote:
>>>> On 7/18/2024 11:56 AM, Peter Xu wrote:
[...]
>>>>>> Lastly, there is no loss of connectivity to the guest, because chardev descriptors remain open and connected.
>>>>>
>>>>> Again, I raised the question on why this would matter, as after all mgmt app will need to coop with reconnections due to the fact they'll need to support a generic live migration, in which case reconnection is a must. So far it doesn't sound like a performance critical path, for example, to do the mgmt reconnects on the ports. So this might be an optimization that most mgmt apps may not care much?
>>>>
>>>> Perhaps. I view the chardev preservation as nice to have, but not essential. It does not appear in this series, other than in docs. It's easy to implement given the CPR foundation. I suggest we continue this discussion when I post the chardev series, so we can focus on the core functionality.
>>>
>>> It's just that it can affect our decision on choosing the way to go. For example, do we have someone from Libvirt or any mgmt layer can help justify this point? As I said, I thought most facilities for reconnection should be ready, but I could miss important facts in mgmt layers..
>>
>> I will more deeply study reconnects in the mgmt layer, run some experiments to see if it is seamless for the end user, and get back to you, but it will take some time.
See below. [...]
>>> Could I ask what management code you're working on? Why that management code doesn't need to already work out these problems with reconnections (like pre-CPR ways of live upgrade)?
>>
>> OCI - Oracle Cloud Infrastructure. Mgmt needs to manage reconnections for live migration, and perhaps I could leverage that code for live update, but happily I did not need to. Regardless, reconnection is the lesser issue. The bigger issue is resource management and the container environment. But I cannot justify that statement in detail without actually trying to implement cpr-transfer in OCI.
[...]
>> The use case is the same for both modes, but they are simply different transport methods for moving descriptors from old QEMU to new. The developer of the mgmt agent should be allowed to choose.
>
> It's out of my capability to review the mgmt impact on this one. This is all based on the idea that I think most mgmt apps supports reconnections pretty well. If that's the case, I'd definitely go for the transfer mode.
Closing the loop here on reconnections -- The managers I studied do not reconnect QEMU chardevs such as the guest console after live migration. In all cases, the old console goes dark and the user must manually reconnect to the console on the target. OCI does not auto reconnect. libvirt does not; one must reconnect through libvirtd on the target. kubevirt does not AFAICT; one must reconnect on the target using virtctl console.

Thus chardev preservation does offer an improved user experience in this regard. chardevs can be preserved using either cpr-exec or cpr-transfer. But, if QEMU runs in a containerized environment that has agents that proxy connections between QEMU chardevs and the outside world, then only cpr-exec (which preserves the existing container) preserves connections end-to-end. OCI has such agents. I believe kubevirt does also.

- Steve