[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycl
From: |
Avi Kivity |
Subject: |
Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem |
Date: |
Sat, 01 Sep 2012 02:30:50 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0 |
On 09/01/2012 01:57 AM, Jan Kiszka wrote:
> On 2012-09-01 10:31, Avi Kivity wrote:
> > On 08/29/2012 10:49 AM, Jan Kiszka wrote:
> >>>
> >>> Let's experiment with refcounting MemoryRegion. We can move the entire
> >>> contents of MemoryRegion to MemoryRegionImpl, add a reference count (to
> >>> MemoryRegionImpl), and change MemoryRegion to contain a pointer to the
> >>> refcounted MemoryRegionImpl:
> >>>
> >>> struct MemoryRegion {
> >>> struct MemoryRegionImpl *impl;
> >>> };
> >>>
> >>> struct MemoryRegionImpl {
> >>> atomic int refs;
> >>> ...
> >>> };
> >>>
> >>> The memory core can then store MemoryRegion copies (with elevated
> >>> refcounts) instead of pointers. Devices can destroy MemoryRegions at
> >>> any time, the implementation will not go away. However, what of the
> >>> opaque stored in MemoryRegionImpl? It becomes a dangling pointer.
> >>>
> >>> One way out is to add a lock to MemoryRegionImpl. Dispatch takes the
> >>> lock, examines the ->enabled member, and bails out if it is cleared.
> >>> The (MemoryRegion, not MemoryRegionImpl) destructor also takes the lock,
> >>> clears ->enabled, releases the lock, and drops the reference.
> >>
> >> That means holding the MemoryRegionImpl lock across the handler call?
> >
> > Blech. As you said on the other side of this thread, we must not take a
> > coarse grained lock within a fine grained one, and
> > MemoryRegionImpl::lock is as fine as they get.
>
> Not sure what you compare here. MemoryRegionImpl::lock would be per
> memory region, so with finer scope than the BQL but with similar scope
> like a per-device lock.
The dispatch may acquire the bql (in fact most will, unless we convert
everything). A design that doesn't allow this is broken. Even the
device lock is more coarse grained (since it covers all regions) and is
taken in a deadlock scenario as region manipulation will be done under
the device lock. You say we can detect this, but I dislike it.
> >
> >>>
> >>> The advantage to this scheme is that all changes are localized to the
> >>> memory core, no need for a full sweep. It is a little complicated, but
> >>> we may be able to simplify it (or find another one).
> >>
> >> May work. We just need to detect if memory region tries to delete itself
> >> from its handler to prevent the deadlock.
> >
> > Those types of hacks are fragile. IMO it just demonstrates what I said
> > earlier (then tried to disprove with this): if we call an opaque's
> > method, we must refcount or otherwise lock that opaque. No way around it.
>
> But that still doesn't solve the problem that we need to lock down the
> *state* of the opaque during dispatch /wrt to memory region changes.
> Just ensuring its existence is not enough unless we declare memory
> region transactions to be asynchronous - and adapt all users.
I expect no users will need change.
Changing the region offset has no effect on dispatch (in fact the region
itself doesn't change). Changing subregions is fine. Disabling a
region concurrently with access is the only potential problem, but this
is rare, and I expect it to just work. We will have to audit all users,
yes.
> >>>> + wait for refcount(object) == 0 in deregister(object). That's what I'm
> >>>> proposing.
> >>>
> >>> Consider timer_del() called from a timer callback. It's often not doable.
> >>
> >> This is inherently synchronous already (when working against the same
> >> alarm timer backend). We can detect this in timer_del and skip waiting,
> >> as in the above scenario.
> >
> > It can always be broken. The timer callback takes the device lock to
> > update the device. The device mmio handler, holding the device lock,
> > takes the timer lock to timer_mod. Deadlock.
>
> Well, how is this solved in Linux? By waiting on the callback in
> hrtimer_cancel. Not by wait_on_magic_opaque (well, there is even no
> opaque in the hrtimer API).
I don't consider that busy loop an elegant solution.
The original proposal mirrors dcache. You can perform operations on a
dentry even after it is unlinked.
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, Avi Kivity, 2012/09/01
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, Avi Kivity, 2012/09/01
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, liu ping fan, 2012/09/03
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, liu ping fan, 2012/09/03
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, liu ping fan, 2012/09/05
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, Avi Kivity, 2012/09/05
- Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem, Jan Kiszka, 2012/09/05