Re: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero prec
From: Christian Borntraeger
Subject: Re: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero precopy pages
Date: Fri, 28 Apr 2017 15:19:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 04/27/2017 03:47 PM, Andrea Arcangeli wrote:
> On Thu, Apr 27, 2017 at 08:44:03AM +0200, Christian Borntraeger wrote:
>> I have started instrumenting the kernel. I can see a set_pte_at for this address
>> and I see an (to be understood) invalidation shortly after that which explains
>> why I get a fault.
>
> Sounds great that you can see an invalidation shortly after, that is
> the real source of the problem. Can you get a stack trace of such
> invalidation?
>
> Thanks!
> Andrea
>
Finally got it. I had a test module in that guest, which triggered a storage key
operation. Normally we no longer use storage keys in Linux. Therefore KVM
disables storage key support and intercepts all storage key instructions, so
that the support can be enabled lazily. Not having to worry about storage keys
makes paging easier and faster.

When we enable storage keys, we must not use shared pages, as the storage key
is a property of the physical page frame (and not of the virtual page).
Therefore, this enablement makes mm_forbids_zeropage return true and removes
all existing zero page mappings.
(see commit 2faee8ff9dc6f4bfe46f6d2d110add858140fb20
 "s390/mm: prevent and break zero page mappings in case of storage keys")
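
To make the effect concrete, here is a toy model of what happens to the
mappings. It is only a sketch, not kernel code: ToyMM, ZERO_PFN and all method
names are made up for illustration, and the only behaviour taken from the
description above is that enabling storage keys forbids the shared zero page
and drops the zero page mappings that already exist.

```python
from dataclasses import dataclass, field

ZERO_PFN = 0  # hypothetical frame number standing in for the shared zero page


@dataclass
class ToyMM:
    """Toy address space: virtual page number -> physical frame number."""
    ptes: dict = field(default_factory=dict)
    uses_skeys: bool = False
    next_pfn: int = 1

    def forbids_zeropage(self):
        # Same idea as mm_forbids_zeropage: true once storage keys are enabled.
        return self.uses_skeys

    def _alloc(self):
        pfn = self.next_pfn
        self.next_pfn += 1
        return pfn

    def read_fault(self, vpn):
        # A read of an unmapped page may be backed by the shared zero page,
        # unless the address space forbids that.
        if vpn not in self.ptes:
            self.ptes[vpn] = self._alloc() if self.forbids_zeropage() else ZERO_PFN
        return self.ptes[vpn]

    def write_fault(self, vpn):
        # A write always ends up on a private frame.
        if self.ptes.get(vpn, ZERO_PFN) == ZERO_PFN:
            self.ptes[vpn] = self._alloc()
        return self.ptes[vpn]

    def enable_skey(self):
        # Enabling storage keys walks the page tables and removes every
        # mapping of the shared zero page; private frames are left alone.
        self.uses_skeys = True
        removed = [vpn for vpn, pfn in self.ptes.items() if pfn == ZERO_PFN]
        for vpn in removed:
            del self.ptes[vpn]
        return removed
```

In this model a page that was only ever read is unmapped again by
enable_skey(), so the next access to it faults, while a page that was written
keeps its private frame.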

In this case it was called while migrating the storage keys (via KVM ioctl),
resulting in zero page mappings going away (see qemu hw/s390x/s390-skeys.c):

Apr 28 14:48:43 s38lp08 kernel: ([<000000000011218a>] show_trace+0x62/0x78)
Apr 28 14:48:43 s38lp08 kernel: [<0000000000112278>] show_stack+0x68/0xe0
Apr 28 14:48:43 s38lp08 kernel: [<000000000066f82e>] dump_stack+0x7e/0xb0
Apr 28 14:48:43 s38lp08 kernel: [<0000000000123b2c>] ptep_xchg_direct+0x254/0x288
Apr 28 14:48:43 s38lp08 kernel: [<0000000000127cfe>] __s390_enable_skey+0x76/0xa0
Apr 28 14:48:43 s38lp08 kernel: [<00000000002e5278>] __walk_page_range+0x270/0x500
Apr 28 14:48:43 s38lp08 kernel: [<00000000002e5592>] walk_page_range+0x8a/0x148
Apr 28 14:48:43 s38lp08 kernel: [<0000000000127bc6>] s390_enable_skey+0x116/0x140
Apr 28 14:48:43 s38lp08 kernel: [<000000000013fd92>] kvm_arch_vm_ioctl+0x11ea/0x1c70
Apr 28 14:48:43 s38lp08 kernel: [<0000000000131aa2>] kvm_vm_ioctl+0xca/0x710
Apr 28 14:48:43 s38lp08 kernel: [<00000000003460e8>] do_vfs_ioctl+0xa8/0x608
Apr 28 14:48:43 s38lp08 kernel: [<00000000003466ec>] SyS_ioctl+0xa4/0xb8
Apr 28 14:48:43 s38lp08 kernel: [<0000000000923460>] system_call+0xc4/0x23c

As a result, a userfault on this virtual address will indeed go back to QEMU
and ask again for that page. QEMU then decides: "Oh, I have already transferred
that page, so I will not write it again" (even though the page was only detected
as a zero page and faulted in by reading from it).
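
The sequence can be sketched as a small state model. This is an illustration
under assumptions, not QEMU code: ToyPostcopy and its methods are hypothetical
names, and the model does not say which side of the migration keeps the
"already transferred" bookkeeping. It only shows why a fault on a page that is
marked as transferred but no longer mapped is never resolved.

```python
class ToyPostcopy:
    """Toy bookkeeping for pages during postcopy (hypothetical model)."""

    def __init__(self):
        self.transferred = set()  # pages the migration considers done
        self.mapped = set()       # pages currently mapped in the guest memory
        self.requests = []        # pages for which content was (re)requested

    def precopy_zero_page(self, page):
        # The page was detected as all-zero and only faulted in by reading,
        # so it is marked done and is backed by the shared zero page.
        self.transferred.add(page)
        self.mapped.add(page)

    def invalidate(self, page):
        # Something outside the migration code (here: storage key
        # enablement) removes the mapping again.
        self.mapped.discard(page)

    def userfault(self, page):
        if page in self.mapped:
            return "resolved"
        if page in self.transferred:
            # "I have already transferred that page": nothing is written
            # again, so the faulting access never completes.
            return "stuck"
        self.requests.append(page)
        return "requested"
```

A page that goes through precopy_zero_page() and then invalidate() ends up in
the "stuck" state on its next fault, which matches the hang described above.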

Several options:
- let postcopy not discard a page, even if it must already be there (patch from
  David)
- change s390-skeys to register_savevm_live and do the skey enablement
  very early (but this will be impossible for incoming data from old versions)
- let the kernel's s390_enable_skey actually fault the pages in (might show big
  memory consumption)
- let qemu hw/s390x/s390-skeys.c tell the migration code that pages might need
  retransmission
....