Re: [PATCH 0/4] target/arm: Improvement on memory error handling
From: Jonathan Cameron
Subject: Re: [PATCH 0/4] target/arm: Improvement on memory error handling
Date: Fri, 14 Feb 2025 09:53:53 +0000
On Fri, 14 Feb 2025 14:16:31 +1000
Gavin Shan <gshan@redhat.com> wrote:
> Currently, there is only one CPER buffer (entry), meaning only one
> memory error can be reported at a time. In extreme cases, multiple
> memory errors can be raised on different vCPUs. For example, a single
> memory error on a 64KB page of the host can result in 16 memory errors
> on 4KB pages of the guest. Unfortunately, the virtual machine is simply
> aborted by multiple concurrent memory errors, as the following call
> trace shows. A SEA exception is injected into the guest so that the
> CPER buffer can be claimed if the error is successfully pushed by
> acpi_ghes_memory_errors(); otherwise, abort() is triggered to crash
> the virtual machine.
>
>   kvm_vcpu_thread_fn
>     kvm_cpu_exec
>       kvm_arch_on_sigbus_vcpu
>         kvm_cpu_synchronize_state
>         acpi_ghes_memory_errors (a)
>         kvm_inject_arm_sea | abort
>
> It's arguable whether the virtual machine should be crashed in this
> case. The better behaviour would be to retry pushing the memory errors,
> keeping the virtual machine alive so that the administrator has a
> chance to step in, for example to dump important data, with luck.
> This series adds one more parameter to acpi_ghes_memory_errors() so
> that pushing the memory error is retried until it succeeds.
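
For readers following the thread, the retry idea described above can be sketched roughly as follows. This is a standalone toy model, not QEMU code: ErrorSlot, slot_try_record(), slot_poll_guest() and slot_record_with_retry() are hypothetical names, the guest acknowledgement is simulated with a counter, and the attempt limit is a choice made for this sketch (the cover letter describes retrying until the push succeeds).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a single-entry error buffer (not QEMU's real type). */
typedef struct {
    bool     busy;          /* true until the guest acknowledges the entry */
    uint64_t paddr;         /* physical address of the recorded error */
    unsigned polls_to_ack;  /* simulation only: polls left before the ack */
} ErrorSlot;

/* Try once: fails with -1 while the previous entry is unacknowledged. */
static int slot_try_record(ErrorSlot *slot, uint64_t paddr)
{
    assert(slot != NULL);
    if (slot->busy) {
        return -1;
    }
    slot->paddr = paddr;
    slot->busy = true;
    return 0;
}

/* Simulation only: stands in for letting the guest consume the entry. */
static void slot_poll_guest(ErrorSlot *slot)
{
    if (slot->busy && slot->polls_to_ack > 0 && --slot->polls_to_ack == 0) {
        slot->busy = false;
    }
}

/*
 * Retry the push instead of giving up on the first failure.
 * Returns the number of attempts used, or -1 if max_attempts ran out.
 */
static int slot_record_with_retry(ErrorSlot *slot, uint64_t paddr,
                                  unsigned max_attempts)
{
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (slot_try_record(slot, paddr) == 0) {
            return (int)attempt;
        }
        slot_poll_guest(slot);
    }
    return -1;
}
```

In this model a second error arriving while the slot is busy is recorded once the simulated guest acknowledges the first one, rather than failing immediately; only a slot that is never acknowledged exhausts the attempts.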
Hi Gavin,
+CC Mauro given:
https://lore.kernel.org/all/cover.1738345063.git.mchehab+huawei@kernel.org/
is more or less reviewed, subject to some requested patch reordering.
Whilst I haven't checked, it seems likely to clash with this series
(though it might just be some fuzz).
Jonathan
>
> Gavin Shan (4):
>   acpi/ghes: Make ghes_record_cper_errors() static
>   acpi/ghes: Use error_report() in ghes_record_cper_errors()
>   acpi/ghes: Allow retry to write CPER errors
>   target/arm: Retry pushing CPER error if necessary
>
>  hw/acpi/ghes-stub.c    |  3 ++-
>  hw/acpi/ghes.c         | 45 +++++++++++++++++++++---------------------
>  include/hw/acpi/ghes.h |  5 ++---
>  target/arm/kvm.c       | 31 +++++++++++++++++++++++------
>  4 files changed, 51 insertions(+), 33 deletions(-)
>