|
From: | Chuang Xu |
Subject: | Re: [RFC v4 2/3] memory: add depth assert in address_space_to_flatview |
Date: | Thu, 12 Jan 2023 15:59:55 +0800 |
User-agent: | Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 |
Hi, Peter, Paolo, On 2023/1/10 δΈε10:45, Peter Xu wrote:
On Tue, Jan 10, 2023 at 12:09:41AM -0800, Chuang Xu wrote:Hi, Peter and Paolo,Hi, Chuang, Paolo,I'm sorry I didn't reply to your email in time. I was infected with COVID-19 two weeks ago, so I couldn't think about the problems discussed in your email for a long time.π· On 2023/1/4 δΈε1:43, Peter Xu wrote:Hi, Paolo, On Wed, Dec 28, 2022 at 09:27:50AM +0100, Paolo Bonzini wrote:Il ven 23 dic 2022, 16:54 Peter Xu ha scritto:This is not valid because the transaction could happen in *another*thread.In that case memory_region_transaction_depth() will be > 0, but RCU is needed.Do you mean the code is wrong, or the comment? Note that the code has checked rcu_read_locked() where introduced in patch 1, but maybesomethingelse was missed?The assertion is wrong. It will succeed even if RCU is unlocked in this thread but a transaction is in progress in another thread.IIUC this is the case where the context: (1) doesn't have RCU read lock held, and, (2) doesn't have BQL held. Is it safe at all to reference any flatview in such a context? The thing is I think the flatview pointer can be freed anytime if both locks arenottaken.Perhaps you can check (memory_region_transaction_depth() > 0 && !qemu_mutex_iothread_locked()) || rcu_read_locked() instead?What if one thread calls address_space_to_flatview() with BQL held butnotRCU read lock held? I assume it's a legal operation, but it seems to be able to trigger the assert already? Thanks,I'm not sure whether I understand the content of your discussion correctly, so here I want to explain my understanding firstly. From my perspective, Paolo thinks that when thread 1 is in a transaction, thread 2 will trigger the assertion when accessing the flatview without holding RCU read lock, although sometimes the thread 2's access to flatview is legal. So Paolo suggests checking (memory_region_transaction_depth() > 0 && !qemu_mutex_iothread_locked()) || rcu_read_locked() instead. And Peter thinks that as long as no thread holds the BQL or RCU read lock, the old flatview can be released (actually executed by the rcu thread with BQL held). When thread 1 is in a transaction, if thread 2 access the flatview with BQL held but not RCU read lock held, it's a legal operation. In this legal case, it seems that both my code and Paolo's code will trigger assertion.IIUC your original patch is fine in this case (BQL held, RCU not held), as long as depth==0. IMHO what we want to trap here is when BQL held (while RCU is not) and depth>0 which can cause unpredictable side effect of using obsolete flatview.
I don't quite understand the side effects of depth>0 when BQL is held (while RCU is not). In my perspective, both BQL and RCU can ensure that the flatview will not be released when the worker thread accesses the flatview, because before the rcu thread releases the flatview, it will make sure itself holding BQL and the worker thread not holding RCU. So, whether the depth is 0 or not, as long as BQL or RCU is held, the worker thread will not access the obsolete flatview (IIUC 'obsolete' means that flatview is released).
To summarize, the original check didn't consider BQL, and if to consider BQL I think it should be something like: /* Guarantees valid access to the flatview, either lock works */ assert(BQL_HELD() || RCU_HELD()); /* * Guarantees any BQL holder is not reading obsolete flatview (e.g. when * during vm load) */ if (BQL_HELD()) assert(depth==0); IIUC it can be merged into: assert((BQL_HELD() && depth==0) || RCU_HELD());
IMHO assert(BQL_HELD() || RCU_HELD()) is enough.. Or you think that once a mr transaction is in progress, the old flatview has been obsolete? If we regard flatview as obsolete when a mr transaction is in progress, How can RCU ensure that flatview is not obsolete? What does Paolo think of this check? Thanks!
I'm not sure if I have a good understanding of your emails? I think checking(memory_region_transaction_get_depth() == 0 || rcu_read_locked() || qemu_mutex_iothread_locked()) should cover the case you discussed.This seems still problematic too? Since the assert can pass even if neither BQL nor RCU is held (as long as depth==0). Thanks,
[Prev in Thread] | Current Thread | [Next in Thread] |