qemu-devel


Re: [Qemu-devel] Re: using which notification for guest about GHES err


From: James Morse
Subject: Re: [Qemu-devel] Re: using which notification for guest about GHES error
Date: Mon, 16 Oct 2017 17:59:41 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.6.0

Hi gengdongjiu, Igor,

On 16/10/17 15:33, gengdongjiu wrote:
>> On Mon, 16 Oct 2017 14:10:05 +0800
>> gengdongjiu <address@hidden> wrote:
>>> Now we use Qemu to create the APEI table and record CPER for the guest. After
>>> QEMU has recorded an asynchronous CPER error, we need to notify the guest
>>> using an interrupt or Polled notification.
>>> For the asynchronous error, I think using GPIO-signaled notification may be
>>> better in Qemu, and it is also what the APEI spec suggests.
>>> James worried that an old guest OS may not support GPIO or GSIV notification
>>> for GHES, because GPIO or GSIV notification has only been supported in the OS
>>> since about kernel version 4.10.
>>
>> How APEI support is fairly new on ARM (kernel), isn't it still in state of 
>> development?

The NMI-like notifications (SEA, SEI, SDEI) are still being worked on, but the
less exotic Polled and many-flavours-of-interrupt should have exactly the same
meaning/behaviour as on x86. (It should be possible to emulate/configure these
with common user-space code too.)


>> Do we really care about old guests in this case?

I think the scenario here is the host kernel has some RAS support, Qemu has RAS
support and has advertised its CPER regions via the HEST, but the guest doesn't
support RAS (booted via DT, wasn't configured for APEI, the kernel pre-dates the
support, etc.).

What should Qemu do in response to 'action optional' memory errors?

My suggestion is whatever action Qemu takes, it shouldn't kill a guest that
doesn't support RAS. Using NOTIFY_SEI for an action-optional memory error will
kill such a guest: a guest that doesn't know about NOTIFY_SEI will take this as
a fatal SError.


> How APEI support is new feature on ARM64, because it mainly exists in ARMv8.2 
> architecture.

ARMv8.2 isn't relevant here: The host kernel has some RAS support.
(My ARMv8.0 AMD Seattle has a HEST with NOTIFY_POLL entries).


> Maybe we cannot care too much about old guests.
> Because even if we use the old notification (such as polled notification), the
> guest OS may still not have APEI support, so it is still useless.

The aim is to not kill the guest with the notification. CPER records written to
the polled buffer for action-optional signals will be found by a guest that
supports RAS, and ignored by a guest that doesn't.

Similarly if we report Action-Required signals as Synchronous-External-Abort, we
could make these NOTIFY_SEA. A guest that has RAS support will find the CPER
records, a guest that doesn't will still do the right thing.
(I think we need more information from KVM to support this one)
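
The policy suggested above can be sketched roughly as follows. This is only an
illustration, not Qemu code: `HestNotify` and `pick_notification` are
hypothetical names, and the numeric values are the ACPI HEST notification type
encodings as I understand them.

```python
from enum import IntEnum


class HestNotify(IntEnum):
    """Subset of the ACPI HEST hardware error notification types."""
    POLLED = 0
    GPIO = 7
    SEA = 8   # ARMv8 Synchronous External Abort
    SEI = 9   # ARMv8 SError Interrupt


def pick_notification(action_required: bool) -> HestNotify:
    """Hypothetical policy helper: choose a notification that a guest
    without RAS support can tolerate.

    Action-optional errors only get a CPER record in the polled buffer,
    which a guest without RAS support never looks at. Action-required
    errors are reported as a synchronous external abort. NOTIFY_SEI is
    deliberately never chosen, because a guest that doesn't know about
    it treats the SError as fatal.
    """
    if action_required:
        return HestNotify.SEA
    return HestNotify.POLLED
```

With this, `pick_notification(False)` gives `HestNotify.POLLED` and
`pick_notification(True)` gives `HestNotify.SEA`.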


> I checked the patch history; the APEI support was only enabled recently.
>
> As we can see, APEI/GHES was only enabled on "2017-06-21 12:30:44 -0500", so
> older OS versions do not even have APEI support.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.14-rc5&id=c792e5e644fd8cd38b963fd3b38f6bf57c530966
> 
> James, what do you think about it?

I think you shouldn't expect host kernel, guest kernel and Qemu version to all
pair up nicely.


>> We'd like to stick to ACPI spec as much as possible and also to
>> http://infocenter.arm.com/help/topic/com.arm.doc.den0044b/DEN0044B_Server_Base_Boot_Requirements.pdf
>> which mandates GPIO in platform (QEMU)
>> "
>> 4.5 Hardware Requirements Imposed on the Platform by ACPI ...
>> Platforms compliant with this specification must provide the following 
>> GPIO-Signaled platform events:
>> ...
>> "
>>
>>> and suggested using Polled notification. Of the above two notifications,
>>> which do you think is better? Could you give us some suggestions? Thanks.

Which is better? Surely polled is simplest:

>> How is polling supposed to be implemented in QEMU?

(I'm not familiar with Qemu's internals, but,)
For any of the GHES notifications you have to reserve memory for CPER
records, advertise where they are to the guest via UEFI+ACPI and describe which
regions are notified by which method.
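
As a rough sketch of that bookkeeping (not Qemu code): the reserved memory is
carved into one buffer per error source, and each source records which
notification covers it. `ErrorSource`, `carve_sources`, the base address and
the 1000 ms interval are all made up for illustration; only the notification
type values come from the ACPI HEST definition.

```python
from dataclasses import dataclass

NOTIFY_POLLED, NOTIFY_SEA = 0, 8  # ACPI HEST notification type values


@dataclass(frozen=True)
class ErrorSource:
    """Hypothetical description of one generic hardware error source."""
    source_id: int
    notify_type: int
    status_block_addr: int      # guest-physical address of the CPER buffer
    status_block_len: int
    poll_interval_ms: int = 0   # only meaningful for polled sources


def carve_sources(base, block_len, notify_types, poll_interval_ms=1000):
    """Give each notification type its own buffer inside the reserved
    region starting at `base`, so the guest knows which buffer to check
    for which notification."""
    sources = []
    for i, ntype in enumerate(notify_types):
        sources.append(ErrorSource(
            source_id=i,
            notify_type=ntype,
            status_block_addr=base + i * block_len,
            status_block_len=block_len,
            poll_interval_ms=poll_interval_ms if ntype == NOTIFY_POLLED else 0,
        ))
    return sources


sources = carve_sources(0x4000_0000, 0x1000, [NOTIFY_POLLED, NOTIFY_SEA])
```

The resulting list is what the HEST entries handed to the guest would have to
describe.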

When Qemu takes a RAS signal it generates CPER records and 'does' the
notification. NOTIFY_POLL is the simplest, you don't do anything for the
notification. The guest is expected to read the interval value from the HEST and
check the buffer that often. Qemu just needs to generate the CPER records into
the appropriate location in guest memory.
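
A minimal sketch of that NOTIFY_POLL flow, with a `bytearray` standing in for
the reserved guest memory. The 20-byte header follows the ACPI Generic Error
Status Block layout as I understand it; `host_write_record` and `guest_poll`
are hypothetical names, the payload is treated as opaque bytes rather than
real Generic Error Data entries, and the guest clearing Block Status to
release the buffer is a simplification.

```python
import struct

# Generic Error Status Block header: Block Status, Raw Data Offset,
# Raw Data Length, Data Length, Error Severity (five little-endian u32).
HEADER = struct.Struct("<5I")
UNCORRECTABLE_VALID = 1 << 0
CORRECTABLE_VALID = 1 << 1
ENTRY_COUNT_SHIFT = 4
SEV_RECOVERABLE, SEV_FATAL, SEV_CORRECTED, SEV_NONE = 0, 1, 2, 3


def host_write_record(buf: bytearray, payload: bytes, severity: int) -> bool:
    """Qemu side for NOTIFY_POLL: fill the buffer and do nothing else."""
    (block_status,) = struct.unpack_from("<I", buf, 0)
    if block_status != 0:
        return False  # the guest has not consumed the previous record yet
    if HEADER.size + len(payload) > len(buf):
        raise ValueError("record does not fit in the reserved region")
    valid = CORRECTABLE_VALID if severity == SEV_CORRECTED else UNCORRECTABLE_VALID
    status = valid | (1 << ENTRY_COUNT_SHIFT)  # claim one error data entry
    # Payload first, header last, so a non-zero Block Status is only
    # visible once the record is complete.
    buf[HEADER.size:HEADER.size + len(payload)] = payload
    HEADER.pack_into(buf, 0, status, 0, 0, len(payload), severity)
    return True


def guest_poll(buf: bytearray):
    """Guest side: called once per poll interval read from the HEST."""
    status, _raw_off, _raw_len, data_len, severity = HEADER.unpack_from(buf, 0)
    if status == 0:
        return None  # nothing pending
    payload = bytes(buf[HEADER.size:HEADER.size + data_len])
    struct.pack_into("<I", buf, 0, 0)  # clear Block Status, buffer is free again
    return severity, payload
```

A guest without RAS support simply never calls anything like `guest_poll`, so
the record sits in the buffer and nothing else happens to the guest.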


Thanks,

James


>>> Below is the APEI spec text. The spec suggests using a GPIO interrupt or
>>> GPIO-signaled events on ARM64 [1]. I am not sure whether using Polled
>>> notification for GHES is reasonable.
>>> In Qemu, x86 does not use Polled notification; it mainly uses SCI. So far
>>> I have not found anyone using Polled notification in Qemu, and I do not know
>>> how much work implementing it would need. I have already implemented the
>>> GPIO-Signal notification using a GPIO pin.

>>> [1]
>>> HW-reduced ACPI platforms signal the error using a GPIO interrupt or
>>> another interrupt declared under a generic event device (Section 5.6.9). In
>>> the case of GPIO-signaled events, an _AEI object lists the appropriate GPIO
>>> pin, while for Interrupt-signaled events a _CRS object is used to list the
>>> interrupt:
>>>     • The OSPM evaluates the control method associated with this event as
>>>       indicated in Section 5.6.5.3 and Section 5.6.9.3.
>>>     • OSPM responds to this notification by checking the error status block
>>>       of all generic error sources with the GPIO-Signal notification or
>>>       Interrupt-signaled notification types to identify the source
>>>       reporting the error.
>>>



