qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/5] Fix segmentation fault when qemu_signal_ini


From: Fei Li
Subject: Re: [Qemu-devel] [PATCH 1/5] Fix segmentation fault when qemu_signal_init fails
Date: Thu, 6 Sep 2018 14:31:39 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1



On 09/05/2018 10:38 PM, Fam Zheng wrote:
On Wed, 09/05 19:20, Fei Li wrote:

On 09/05/2018 04:36 PM, Daniel P. Berrangé wrote:
On Wed, Sep 05, 2018 at 12:17:24PM +0800, Fei Li wrote:
Thanks for the review! :)


On 09/04/2018 07:26 PM, Daniel P. Berrangé wrote:
On Tue, Sep 04, 2018 at 07:08:18PM +0800, Fei Li wrote:

... snip ...
            free(info);
            return -1;
        }
@@ -94,17 +97,21 @@ static int qemu_signalfd_compat(const sigset_t *mask)
        return fds[0];
    }
-int qemu_signalfd(const sigset_t *mask)
+int qemu_signalfd(const sigset_t *mask, Error **errp)
    {
-#if defined(CONFIG_SIGNALFD)
        int ret;
+    Error *local_err = NULL;
+#if defined(CONFIG_SIGNALFD)
        ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
        if (ret != -1) {
            qemu_set_cloexec(ret);
            return ret;
        }
    #endif
-
-    return qemu_signalfd_compat(mask);
+    ret = qemu_signalfd_compat(mask, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
Using a local_err is not required - you can just pass errp stright
to qemu_signalfd_compat() and then check

      if (ret < 0)
For the use of a local error object & error_propagate call, I'd like to
explain here. :)
In our code, the initial caller passes two kinds of Error to the call trace,
one is
something like &error_abort and &error_fatal, the other is NULL.

For the former, the exit() occurs in the functions where
error_handle_fatal() is called
(e.g. called by error_propagate/error_setg/...). The patch3: qemu_init_vcpu
is the case,
that means the system will exit in the final callee: qemu_thread_create(),
instead of
the initial caller pc_new_cpu(). In such case, I think propagating seems
more reasonable.
I don't really agree. It is preferrable to abort immediately at the deepest
place which raises the error. The stack trace will thus show the full call
chain leading upto the problem.
Sorry for the above example, it is not exactly correct: for the patch3 case,
the
system will exit in device_set_realized(), where the first error_propagate()
is called
if we pass errp directly, but not in the final callee.. Sorry for the
misleading.

For another example, its call trace:
qemu_thread_create(, NULL)
<= iothread_complete(, NULL)
<== user_creatable_complete(, NULL)
<=== object_new_with_propv(, errp)
<==== object_new_with_props(, errp) {... error_propagate(errp, local_err);
...}
<===== iothread_create(, &error_abort)
The exit occurs in object_new_with_props where the first error_propagate is
called.

Either the device_set_realized() or object_new_with_props() is a middle
caller, thus
we can only see the top half stack trace until where error_handle_fatal() is
called.

In other words, the exit() occurs neither in the final callee nor the
initial caller.
Sorry for the misleading example again..
This means using error_propagate can potentially lose the final callee in the
error_abort cases. That is why it's preferrable to just pass errp down the
calling chain when possible. The reason why object_new_with_propv uses
error_propagate is because both object_property_add_child and
user_creatable_complete return void, thus cannot flag success/failure to its
caller via their return values. To check whether they succeed,
object_new_with_propv wants a non-NULL err parameter. But like you said, errp
passed to object_new_with_propv may or may not be NULL, so a local_err local
variable is defined to cope with that.

Alternatively it could do this instead:

{

     ...

     if (errp) {
         object_property_add_child(parent, id, obj, errp);
         if (*errp) {
             goto error;
         }
     } else {
         Error *local_err = NULL;
         object_property_add_child(parent, id, obj, &local_err);
         if (local_err) {
             goto error;
         }
     }
     ...

}

This way if error_abort was passed and object_property_add_child failed, the
abort point would be in the innermost function. But this is boilerplate code so
it's not used.

On the contrary, using error_propagate when not necessary also means more lines
of code but gives less info on the call trace when aborted.

So I fully agree with Dan.

Fam
Got, thanks for the review. :)

Have a nice day
Fei
How do you think passing errp straightly for the latter case, and use a
local error object &
error_propagate for the former case? This is a distinct treatment, but would
shorten the code.
It is inappropriate to second-guess whether the caller is a passing in
NULL or &error_abort, or another Error object. What is passed in can
change at any time in the future.
ok.
We should only ever use a local error where the local method has a need
to look at the error contents before returning to the caller. Any other
case should just use the errp directly.

Regards,
Daniel
Have a nice day, thanks
Fei






reply via email to

[Prev in Thread] Current Thread [Next in Thread]