qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Error handling in realize() methods


From: Markus Armbruster
Subject: Re: [Qemu-devel] Error handling in realize() methods
Date: Thu, 10 Dec 2015 12:06:25 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Paolo Bonzini <address@hidden> writes:

> On 09/12/2015 10:30, Markus Armbruster wrote:
>> My current working assumption is that passing &error_fatal to
>> memory_region_init_ram() & friends is okay even in realize() methods and
>> their supporting code, except when the allocation can be large.
>
> I suspect a lot of memory_region_init_ram()s could be considered
> potentially large (at least in the 16-64 megabytes range).  Propagation
> of memory_region_init_ram() failures is easy enough, thanks to Error**,
> that we should just do it.

Propagating an out-of-memory error right in realize() is easy.  What's
not so easy is making realize() fail cleanly (all side effects undone;
we get that wrong in many places), and finding and propagating
out-of-memory errors hiding deeper in the call tree.

However, genuinely "large" allocations should be relatively few, and
handling them gracefully in hot-pluggable devices is probably feasible.

I doubt ensuring *all* allocations on behalf of a hot-pluggable device
are handled gracefully is a good use of our reseources, or even
feasible.

Likewise, graceful error handling for devices that cannot be hot-plugged
feels like a waste of resources we can ill afford.  I think we should
simply document their non-gracefulness by either setting hotpluggable =
false or cannot_instantiate_with_device_add_yet = true with a suitable
comment.

> Even if we don't, we should use &error_abort, not &error_fatal
> (programmer error---due to laziness---rather than user error).
> &error_fatal should really be restricted to code that is running very
> close to main().

"Very close to main" is a question of dynamic context.

Consider a device that can only be created during machine initialization
(cannot_instantiate_with_device_add_yet = true or hotpluggable = false).
&error_fatal is perfectly adequate there.  &error_abort would be
misleading, because when it fails, it's almost certainly because the
user tried to create too big a machine.

Now consider a hot-pluggable device.  Its recognized "large" allocations
all fail gracefully.  What about its other allocations?  Two kinds: the
ones visible in the device model code, and the ones hiding elsewhere,
which include "a few" of the 2300+ uses of GLib memory allocation.  The
latter exit().  Why should the former abort()?

Now use that hot-pluggable device during machine initialization.
abort() is again misleading.

Let's avoid a fruitless debate on when to exit() and when to abort() on
out-of-memory, and just stick to exit().  We don't need a core dump to
tell a developer to fix his lazy error handling.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]