Re: [Qemu-devel] Error handling in realize() methods

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Error handling in realize() methods

From:	Paolo Bonzini
Subject:	Re: [Qemu-devel] Error handling in realize() methods
Date:	Thu, 10 Dec 2015 12:26:28 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 10/12/2015 12:06, Markus Armbruster wrote:
> Paolo Bonzini <address@hidden> writes:
> 
>> On 09/12/2015 10:30, Markus Armbruster wrote:
>>> My current working assumption is that passing &error_fatal to
>>> memory_region_init_ram() & friends is okay even in realize() methods and
>>> their supporting code, except when the allocation can be large.
>>
>> I suspect a lot of memory_region_init_ram()s could be considered
>> potentially large (at least in the 16-64 megabytes range).  Propagation
>> of memory_region_init_ram() failures is easy enough, thanks to Error**,
>> that we should just do it.
> 
> Propagating an out-of-memory error right in realize() is easy.  What's
> not so easy is making realize() fail cleanly (all side effects undone;
> we get that wrong in many places), and finding and propagating
> out-of-memory errors hiding deeper in the call tree.

grep is your friend.  We're talking of a subset of these:

hw/block/onenand.c:    memory_region_init_ram(&s->ram, OBJECT(s), "onenand.ram",
hw/block/pflash_cfi01.c:    memory_region_init_rom_device(
hw/block/pflash_cfi02.c:    memory_region_init_rom_device(&pfl->orig_mem, 
OBJECT(pfl), pfl->be ?
hw/display/cg3.c:    memory_region_init_ram(&s->rom, obj, "cg3.prom", 
FCODE_MAX_ROM_SIZE,
hw/display/cg3.c:    memory_region_init_ram(&s->vram_mem, NULL, "cg3.vram", 
s->vram_size,
hw/display/g364fb.c:    memory_region_init_ram_ptr(&s->mem_vram, NULL, "vram",
hw/display/qxl.c:    memory_region_init_ram(&qxl->rom_bar, OBJECT(qxl), 
"qxl.vrom",
hw/display/qxl.c:    memory_region_init_ram(&qxl->vram_bar, OBJECT(qxl), 
"qxl.vram",
hw/display/qxl.c:    memory_region_init_ram(&qxl->vga.vram, OBJECT(dev), 
"qxl.vgavram",
hw/display/sm501.c:    memory_region_init_ram(&s->local_mem_region, NULL, 
"sm501.local",
hw/display/tc6393xb.c:    memory_region_init_ram(&s->vram, NULL, 
"tc6393xb.vram", 0x100000,
hw/display/tcx.c:    memory_region_init_ram(&s->rom, obj, "tcx.prom", 
FCODE_MAX_ROM_SIZE,
hw/display/tcx.c:    memory_region_init_ram(&s->vram_mem, OBJECT(s), "tcx.vram",
hw/display/vga.c:    memory_region_init_ram(&s->vram, obj, "vga.vram", 
s->vram_size,
hw/display/vmware_vga.c:    memory_region_init_ram(&s->fifo_ram, NULL, 
"vmsvga.fifo", s->fifo_size,
hw/dma/rc4030.c:    memory_region_init_rom_device(&s->dma_tt, o,
hw/i386/pci-assign-load-rom.c:    memory_region_init_ram(&dev->rom, owner, 
name, st.st_size, &error_abort);
hw/input/milkymist-softusb.c:    memory_region_init_ram(&s->pmem, OBJECT(s), 
"milkymist-softusb.pmem",
hw/input/milkymist-softusb.c:    memory_region_init_ram(&s->dmem, OBJECT(s), 
"milkymist-softusb.dmem",
hw/net/dp8393x.c:    memory_region_init_ram(&s->prom, OBJECT(dev),
hw/net/milkymist-minimac2.c:    memory_region_init_ram(&s->buffers, 
OBJECT(dev), "milkymist-minimac2.buffers",
hw/pci/pci.c:    memory_region_init_ram(&pdev->rom, OBJECT(pdev), name, size, 
&error_fatal);
hw/s390x/sclp.c:            memory_region_init_ram(standby_ram, NULL, id, 
this_subregion_size,

Some of them are in sysbus devices and can thus be ignored.  Some of the
others may indeed need introduction of additional error propagation or may
be non-trivial.

> I doubt ensuring *all* allocations on behalf of a hot-pluggable device
> are handled gracefully is a good use of our reseources, or even
> feasible.

I agree.  I don't think we have large allocations in device models,
other than from memory region allocation (e.g. VRAM).

> Consider a device that can only be created during machine initialization
> (cannot_instantiate_with_device_add_yet = true or hotpluggable = false).
> &error_fatal is perfectly adequate there.  &error_abort would be
> misleading, because when it fails, it's almost certainly because the
> user tried to create too big a machine.

I agree here.

> Now consider a hot-pluggable device.  Its recognized "large" allocations
> all fail gracefully.  What about its other allocations?  Two kinds: the
> ones visible in the device model code, and the ones hiding elsewhere,
> which include "a few" of the 2300+ uses of GLib memory allocation.  The
> latter exit().  Why should the former abort()?

We've already ruled that the latter simply won't happen, so we can say at
the same time that g_malloc() does not exit at all. :)

&error_fatal is a user error, &error_abort is an assertion failure.  This
is much closer to an assertion failure.

> Now use that hot-pluggable device during machine initialization.
> abort() is again misleading.

Perhaps, but---because it's hotpluggable---what happens during machine
initialization is much less important.  It doesn't lead to lost data.
An abort() is much more likely to catch the developers' and the users'
attention and much more likely to be reported (compare "huh, it went
away?!?" with "abrt is asking me to open a Fedora bug, interesting").

> Let's avoid a fruitless debate on when to exit() and when to abort() on
> out-of-memory, and just stick to exit().  We don't need a core dump to
> tell a developer to fix his lazy error handling.

Perhaps we do...

Paolo

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] Error handling in realize() methods, (continued)

Prev by Date: Re: [Qemu-devel] Error handling in realize() methods
Next by Date: Re: [Qemu-devel] [PATCH 00/16] pc: Eliminate struct PcGuestInfo
Previous by thread: Re: [Qemu-devel] Error handling in realize() methods
Next by thread: Re: [Qemu-devel] Error handling in realize() methods
Index(es):
- Date
- Thread