qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycl


From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
Date: Sat, 01 Sep 2012 10:57:16 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2012-09-01 10:31, Avi Kivity wrote:
> On 08/29/2012 10:49 AM, Jan Kiszka wrote:
>>>
>>> Let's experiment with refcounting MemoryRegion.  We can move the entire
>>> contents of MemoryRegion to MemoryRegionImpl, add a reference count (to
>>> MemoryRegionImpl), and change MemoryRegion to contain a pointer to the
>>> refcounted MemoryRegionImpl:
>>>
>>> struct MemoryRegion {
>>>     struct MemoryRegionImpl *impl;
>>> };
>>>
>>> struct MemoryRegionImpl {
>>>     atomic int refs;
>>>     ...
>>> };
>>>
>>> The memory core can then store MemoryRegion copies (with elevated
>>> refcounts) instead of pointers.  Devices can destroy MemoryRegions at
>>> any time, the implementation will not go away.  However, what of the
>>> opaque stored in MemoryRegionImpl?  It becomes a dangling pointer.
>>>
>>> One way out is to add a lock to MemoryRegionImpl.  Dispatch takes the
>>> lock, examines the ->enabled member, and bails out if it is cleared. 
>>> The (MemoryRegion, not MemoryRegionImpl) destructor also takes the lock,
>>> clears ->enabled,  releases the lock, and drops the reference.
>>
>> That means holding the MemoryRegionImpl lock across the handler call?
> 
> Blech.  As you said on the other side of this thread, we must not take a
> coarse grained lock within a fine grained one, and
> MemoryRegionImpl::lock is as fine as they get.

Not sure what you compare here. MemoryRegionImpl::lock would be per
memory region, so with finer scope than the BQL but with similar scope
like a per-device lock.

> 
>>>
>>> The advantage to this scheme is that all changes are localized to the
>>> memory core, no need for a full sweep.  It is a little complicated, but
>>> we may be able to simplify it (or find another one).
>>
>> May work. We just need to detect if memory region tries to delete itself
>> from its handler to prevent the deadlock.
> 
> Those types of hacks are fragile.  IMO it just demonstrates what I said
> earlier (then tried to disprove with this): if we call an opaque's
> method, we must refcount or otherwise lock that opaque.  No way around it.

But that still doesn't solve the problem that we need to lock down the
*state* of the opaque during dispatch /wrt to memory region changes.
Just ensuring its existence is not enough unless we declare memory
region transactions to be asynchronous - and adapt all users.

> 
>>
>>>
>>>>
>>>>>
>>>>>> Besides
>>>>>> MMIO and PIO dispatching, it will haunt us for file or event handlers,
>>>>>> any kind of callbacks etc.
>>>>>>
>>>>>> Context A                               Context B
>>>>>> ---------                               ---------
>>>>>>                                         object = lookup()
>>>>>> deregister(object)
>>>>>> modify(object) -> invalid state
>>>>>> ...                                     use(object)
>>>>>> modify(object) -> valid state
>>>>>> register(object)
>>>>>>
>>>>>> And with "object" I'm not talking about QOM but any data structure.
>>>>>>
>>>>>
>>>>>
>>>>> Context B
>>>>> ---------
>>>>> rcu_read_lock()
>>>>> object = lookup()
>>>>> if (object) {
>>>>>     ref(object)
>>>>> }
>>>>> rcu_read_unlock()
>>>>>
>>>>> use(object)
>>>>>
>>>>> unref(object)
>>>>>
>>>>> (substitute locking scheme to conform to taste and/or dietary
>>>>> restrictions)   
>>>>>
>>>>
>>>> + wait for refcount(object) == 0 in deregister(object). That's what I'm
>>>> proposing.
>>>
>>> Consider timer_del() called from a timer callback.  It's often not doable.
>>
>> This is inherently synchronous already (when working against the same
>> alarm timer backend). We can detect this in timer_del and skip waiting,
>> as in the above scenario.
> 
> It can always be broken.  The timer callback takes the device lock to
> update the device.  The device mmio handler, holding the device lock,
> takes the timer lock to timer_mod.  Deadlock.

Well, how is this solved in Linux? By waiting on the callback in
hrtimer_cancel. Not by wait_on_magic_opaque (well, there is even no
opaque in the hrtimer API).

Jan

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]