Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG

From: Janosch Frank
Subject: Re: [Qemu-devel] Error handling for KVM_GET_DIRTY_LOG
Date: Mon, 20 Feb 2017 15:33:15 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 20.02.2017 14:46, Paolo Bonzini wrote:
> On 16/02/2017 15:51, Janosch Frank wrote:
>> While trying to fix a bug in the s390 migration code, I noticed that
>> QEMU ignores practically all errors returned from that VM ioctl. QEMU
>> behaves as specified in the KVM api and only processes -1 (-EPERM) as an
>> error.
>> Unfortunately the documentation is wrong/old and KVM may return -EFAULT,
>> -EINVAL, -ENOTSUPP (BookE) and -ENOENT. This bugs me, as I found a case
>> where I want to return -EFAULT because of guest memory problems and QEMU
>> will still happily migrate the VM.
> Guest memory problems should not return EFAULT, which corresponds to a
> wrong address passed to KVM_GET_DIRTY_LOG.  In fact, EFAULT is probably
> the only case where an assertion is warranted---just like you passed a
> wrong pointer to KVM_GET_DIRTY_LOG, who knows who else is going to get
> that pointer.
> ENOENT and EINVAL should not kill the source guest, though they should
> terminate migration.  But then I would like to know more about this
> case, because they should never happen unless KVMMemoryListener is buggy.
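To make the policy above concrete, here is a minimal sketch of how the ioctl result could be mapped to an action. The enum, its values and the function name are hypothetical and not taken from QEMU or KVM. Treating every other error code like ENOENT/EINVAL is also just an assumption on my side.

```c
#include <errno.h>

/* Hypothetical policy names; not part of QEMU or KVM. */
enum dirty_log_action {
    DIRTY_LOG_OK,             /* ioctl succeeded, the bitmap can be used */
    DIRTY_LOG_FAIL_MIGRATION, /* stop migrating, keep the source guest running */
    DIRTY_LOG_ABORT           /* bad pointer passed in, an assertion is warranted */
};

/*
 * Map the result of a KVM_GET_DIRTY_LOG call to an action.
 * 'ret' is the ioctl return value, 'err' is errno captured right after it.
 */
static enum dirty_log_action classify_dirty_log_result(int ret, int err)
{
    if (ret >= 0) {
        return DIRTY_LOG_OK;
    }

    switch (err) {
    case EFAULT:
        /* A wrong address was handed to the kernel: programming error. */
        return DIRTY_LOG_ABORT;
    case ENOENT:
    case EINVAL:
    default:
        /* Assumption: any other error only terminates the migration. */
        return DIRTY_LOG_FAIL_MIGRATION;
    }
}
```

A caller would capture errno immediately after the ioctl and pass both values in, so that the decision does not depend on the bare -1 return value alone.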

It is currently possible to start a hugetlbfs guest on s390 although we
don't have any huge page support. When QEMU starts the VM, it will get a
lot of errors back and pause the VM. When this VM is then migrated, the
host will do pte dirty handling on huge pages in

Running into such a huge page would be a guest memory error, so EINVAL
it is.

I'll post the patches shortly to give a bit more context.

> Paolo
>> I currently don't see a reason why we continue to migrate on EFAULT and
>> EINVAL. But returning -error from kvm_physical_sync_dirty_bitmap might
>> also be a bit hard, as it kills QEMU.
>> Do we want to fix this and, if so, how do we want it done?
>> If not, we at least have a definitive mail to point to when the next one
>> comes around. I also have a KVM patch to update the api documentation if
>> wanted (maybe we should dust that off a bit anyhow).
>> This was first brought up in 2009 [1] and was more or less
>> fixed and then reverted in 2014 [2].
>> The reason in [1] was that PPC hadn't settled yet on a valid return code.
>> In [2] it was too close to the v2 to handle it properly.
>> [1] https://lists.nongnu.org/archive/html/qemu-devel/2009-07/msg01772.html
>> [2] https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg01993.html
>> Cheers,
>> Janosch
