qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and


From: Halil Pasic
Subject: Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH
Date: Thu, 7 Sep 2017 13:32:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0


On 09/07/2017 12:24 PM, Cornelia Huck wrote:
> On Thu, 7 Sep 2017 16:58:31 +0800
> Dong Jia Shi <address@hidden> wrote:
> 
>> * Halil Pasic <address@hidden> [2017-09-06 16:43:42 +0200]:
>>
>>>
>>>
>>> On 09/06/2017 04:20 PM, Cornelia Huck wrote:  
>>>> On Wed, 6 Sep 2017 14:25:13 +0200
>>>> Halil Pasic <address@hidden> wrote:
>>>>   
>>>>> We have basically two possibilities/options which ask for different
>>>>> handling:
>>>>> 1) EFAULT is due to a bug in the vfio-ccw implementation
>>>>> (can be QEMU or kernel).
>>>>> 2) EFAULT is due to buggy channel program.
>>>>>
>>>>> Option 2) is basically to be handled with a channel-program check and
>>>>> setting primary secondary and alert status. For reference see PoP page
>>>>> 15-59 ("Designation of Storage Area").  An exception may be an invalid
>>>>> channel program address in the ORB. There the channel-program check ain't
>>>>> explicitly stated (although) I would expect one. It may be implied by the
>>>>> things on page 15-59 though.
>>>>>
>>>>> Option 1) is however very similar to other we have figured out that the
>>>>> implementation is broken situations and should be handled consequently.
>>>>> The current state of the discussion is with a unit exception.
>>>>>
>>>>> Does that make sense?  
>>>>
>>>> I think the situation is slightly different here, though. For the orb
>>>> flags, we reject something out of hand because we have not implemented
>>>> it, and for that, unit exception sounds like a good fit. Processing
>>>> errors, however, are more similar to errors in the hardware, and as
>>>> such can probably be reported via something like equipment check.
>>>>   
>>>
>>> Noted. Let's see what Dong Jia has to say, before we continuing a
>>> discussion on something (option 1) what may be irrelevant anyway.
>>>   
>>>>>
>>>>> Now, Dong Jia, I need your help to figure out do we have option 1 or
>>>>> option 2 here? After quick look at the kernel code, it appears to me that
>>>>> I've seen both option 1 and option 2 (I'm afraid) -- but my assessment
>>>>> was really very superficial.  
>> There are three cases (all in the kernel) that generate a -EFAULT ret
>> code:
>> a. vfio_ccw_mdev_write: copy_from_user() fails.
>>   This is option 1.
>>
>> b. ccwchain_fetch_tic
>>   It's mostly likely that the vfio-ccw driver processed the ccw chains
>>   wrongly. (Actually I can not think of any other reason.)
> 
> Me neither, I'd consider hitting this a bug in the implementation.

Nod.

> 
>>   This is option 1.
>>
>> c. ccwchain_fetch_idal
>>   When we find that an IDAW contents an invalid address
>>   This is option 2.
>>
>>>>>
>>>>> I would expect option 2 to be handled differently (kernel provides the
>>>>> SCSW) though.
> 
> So we would do an equipment check for the first two ("equipment", i.e.
> the software, is malfunctioning) and use a more appropriate way for the
> malformed idaw?
> 

You have probably missed my previous email where I state something very
similar (MID  <address@hidden>).
There I say: if we change the kernel code, yes, if we don't I prefer a
program check.


Halil




reply via email to

[Prev in Thread] Current Thread [Next in Thread]