qemu-devel


From: Fabiano Rosas
Subject: Re: [PATCH] migrate/ram: let ram_save_target_page_legacy() return if qemu file got error
Date: Tue, 15 Aug 2023 19:42:24 -0300

Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 15, 2023 at 09:35:19AM -0300, Fabiano Rosas wrote:
>> Guoyi Tu <tugy@chinatelecom.cn> writes:
>> 
>> > When the migration of a virtual machine using huge pages is cancelled,
>> > QEMU continues processing the current huge page even though the
>> > QEMUFile object has an error set. This processing, such as compression
>> > and encryption, consumes a lot of CPU resources, which may affect the
>> > performance of other VMs.
>> >
>> > To terminate the migration more quickly and minimize unnecessary
>> > resource usage, it is necessary to check the error status of the
>> > QEMUFile object at the beginning of ram_save_target_page_legacy()
>> > and return immediately if the QEMUFile has an error.
>> >
>> > Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
>> > ---
>> >   migration/ram.c | 4 ++++
>> >   1 file changed, 4 insertions(+)
>> >
>> > diff --git a/migration/ram.c b/migration/ram.c
>> > index 9040d66e61..3e2ebf3004 100644
>> > --- a/migration/ram.c
>> > +++ b/migration/ram.c
>> > @@ -2133,6 +2133,10 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
>> >       ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
>> >       int res;
>> >
>> > +    if (qemu_file_get_error(pss->pss_channel)) {
>> > +        return -1;
>> > +    }
>> 
>> Where was the error set? Is this from cancelling via QMP? Or something
>> from within ram_save_target_page_legacy? We should probably make the
>> check closer to where the error happens. At the very least moving the
>> check into the loop.
>
> Fabiano - I think it's in the loop (of all target pages within the same
> host page), and IIUC Guoyi mentioned it's part of cancelling.

Yep, I see that. I meant explicitly move the code into the loop. Feels a
bit weird to check the QEMUFile for errors first thing inside the
function when nothing around it should have touched the QEMUFile.
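
To make the suggestion concrete, here is a rough standalone sketch of
"check inside the loop". It is not the real migration code: MockChannel,
mock_channel_get_error(), mock_save_target_page() and mock_save_host_page()
are all hypothetical stand-ins for QEMUFile, qemu_file_get_error() and the
per-host-page loop, and the cancel is simulated by setting the error after
a chosen page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMUFile: only tracks a sticky error code. */
typedef struct {
    int last_error;  /* 0 means no error, negative means the channel failed */
    int pages_sent;  /* number of target pages actually processed */
} MockChannel;

/* Hypothetical analogue of qemu_file_get_error(). */
static int mock_channel_get_error(const MockChannel *ch)
{
    assert(ch != NULL);
    return ch->last_error;
}

/*
 * Hypothetical per-target-page work. To simulate a cancel arriving while a
 * huge page is being sent, the channel error is set right after the page
 * with index cancel_after has been processed (pass -1 for "never").
 */
static int mock_save_target_page(MockChannel *ch, int page, int cancel_after)
{
    ch->pages_sent++;            /* pretend to compress/encrypt/send */
    if (page == cancel_after) {
        ch->last_error = -1;     /* error becomes visible to later checks */
    }
    return 1;                    /* one page handled */
}

/*
 * Walk all target pages of one host page. The error check sits inside the
 * loop, so the remaining pages are skipped once the channel has failed.
 */
static int mock_save_host_page(MockChannel *ch, int npages, int cancel_after)
{
    int sent = 0;

    for (int page = 0; page < npages; page++) {
        if (mock_channel_get_error(ch)) {
            return -1;           /* stop early, do no further work */
        }
        sent += mock_save_target_page(ch, page, cancel_after);
    }
    return sent;
}
```

With 512 target pages per host page (a 2 MiB huge page and 4 KiB target
pages), calling mock_save_host_page(&ch, 512, 3) processes pages 0 to 3 and
then returns -1 instead of finishing the remaining 508 pages.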

About cancelling, QMP is not the only way to cancel. I was trying to
probe whether the cancelling itself is what causes the perceived issue
or if something else went wrong that caused the migration code to cancel
itself. We might be missing an error check somewhere else.


