|
From: | Vladimir Sementsov-Ogievskiy |
Subject: | Re: What prevents discarding a cluster during rewrite? |
Date: | Wed, 24 Feb 2021 09:04:16 +0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 |
23.02.2021 19:29, Kevin Wolf wrote:
Am 23.02.2021 um 16:23 hat Vladimir Sementsov-Ogievskiy geschrieben:On 23.02.2021 13:35, Kevin Wolf wrote:Am 22.02.2021 um 22:42 hat Vladimir Sementsov-Ogievskiy geschrieben:23.02.2021 00:30, Vladimir Sementsov-Ogievskiy wrote:Thinking of how to prevent dereferencing to zero (and discard) of host cluster during flush of compressed cache (which I'm working on now), I have a question.. What prevents it for normal writes?I have no idea about why didn't it fail for years.. May be, I'm missing something?Ouch. I suppose the reason why we never ran into it is that afaik Linux drains the queue before sending discard requests. Of course, we still need to protect against this in QEMU. We can't just unref a cluster that is still in use.I have idea of fixing: increase the refcount of host cluster before write to data_file (it's important to increase refacount in same s->lock critical section where we get host_offset) and dereference it after write.. It should help. Any thoughts?It would cause metadata updates for rewrites. I don't know whether the intermediate value would ever be written to disk, but at least we'd rewrite the same refcounts as before. I don't think we want that.Hmm, if that can provoke extra refcount cache flush that's bad.. May be we need something like of additional "dynamic" refcounts, so that total_refcount = normal_refcount + dynamic_refcount.. And for in-flight clusters dynamic_refcount is > 0. We can store dynamic refcounts in GHashTable(cluster -> refcount).Do you think it's worth the complexity? The qcow2 driver is already relatively complicated today.
I started to implement it and actually it doesn't seem much more complicated than additional CoRwLock. So, I think, I will make two patches (refcounts vs CoRwMutex) and we'll compare.
Discard requests might be rare enough that not considering the cluster at all could work. We could then take a reader CoRwlock during most operations, and a writer lock for discard. Actually, maybe s->lock should be this CoRwlock, and instead of dropping it temporarily like we do now we would just upgrade and downgrade it as needed. Maybe this would allow finer grained locking in other places, too.In this case many operations will be blocked during data writing, like allocation of another cluster.. That doesn't seem good.You're right, that would be too restrictive.Separate CoRwLock looks more feasible.Maybe that then. Kevin
-- Best regards, Vladimir
[Prev in Thread] | Current Thread | [Next in Thread] |