Re: [Qemu-block] [PATCH v4 00/11] block: Rework bdrv_close_all()

From: Max Reitz
Subject: Re: [Qemu-block] [PATCH v4 00/11] block: Rework bdrv_close_all()
Date: Mon, 02 Mar 2015 11:03:12 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 2015-03-02 at 10:57, Kevin Wolf wrote:
> Am 02.03.2015 um 16:15 hat Max Reitz geschrieben:
>> On 2015-03-02 at 04:25, Kevin Wolf wrote:
>>> Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
>>>> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
>>>> which can lead to data corruption (see the iotest added in the final
>>>> patch of this series) and is most certainly very ugly.
>>>>
>>>> This series reworks bdrv_close_all() to instead eject the BDS trees from
>>>> all BlockBackends and then close the monitor-owned BDS trees, which are
>>>> the only BDSs without a BB. In effect, all BDSs are closed automatically
>>>> simply because their reference count drops to 0.
>>>>
>>>> The benefit over the approach taken in v1 and v2 is that in device
>>>> models we often cannot simply drop the reference to a BB because there
>>>> may be some user we forgot about. By ejecting the BDS trees from
>>>> the BB, the BB itself becomes unusable, but in a clean way (it will
>>>> return errors when accessed, but nothing will crash). Also, it is much
>>>> simpler (no reference tracking is necessary).
>>>>
>>>> The only disadvantage (that I can see) is that the BBs are leaked; but
>>>> this does not matter because the qemu process is about to exit anyway.
>>> I haven't looked at the actual patches yet, but just from this
>>> description and the diffstat: We need to make sure that the refcount
>>> really drops to 0.
>> If you mean the BBs: I tried that in v2, and it doesn't seem to work. I
>> had my fun tracing where the handles to the BBs for device models end up
>> and who might have access to them, but when I say "fun" I mean that
>> ironically.
>>
>> If you mean the BDSs: See below (below below).
> Yes, I mean BDSes.

>>> That is, if there are NBD servers, block jobs, etc.
>>> that hold an additional reference, we must make sure to stop them. It
>>> doesn't look like this series takes care of this, does it?
>> Everyone using BBs is fine; for instance, NBD servers are stopped
>> (they register a notifier for when the BDS tree is ejected from a
> Good point. Even if the NBD server were not stopped, it would still
> hold a reference only to the BB and not to the BDS, so the BDS would
> correctly be closed.

>> So block jobs are not handled, partly because I still haven't gotten
>> around to making them work on BBs instead of BDSs (and I probably
>> never will, although they really should), and partly because I missed
>> the bdrv_ref() in block_job_create().
>> I guess that means manually cancelling all block jobs in bdrv_close_all()...
> Even in the far future, will block jobs really always work on a BB? Even
> if they modify some node in the middle of the chain?

I think they should create an unnamed BB for that case.

But if we start to work on that, we'll have to create a BB for every block job, and that means supporting multiple BBs per BDS (if a block job is started on a root BDS).

> Anyway, for now, we probably need to cancel them, yes. Essentially you
> end up calling everything manually that had to register a notifier in
> the earlier versions.

Well, not absolutely everything, only everything that does not use a BB (and as far as I'm aware, those are only the monitor (which can hold references to BDSs without BBs) and block jobs).

>>> Hm... Perhaps we could even install an atexit handler that asserts that
>>> all BDSes are really gone in the end?
>> I had that for BBs in v1 and v2; it was just a function that was
>> called at the end of bdrv_close_all(). For that to work for BDSs,
>> however, we'd need a list of all BDSs and assert that it's empty. I
>> found that a bit too much.
> The list of all relevant BDSes should be the union of bdrv_states and
> bdrv_graph_states, and I think it doesn't have to be empty, but all of
> its entries have to be closed (bs->drv == NULL).

The idea of this series is that bdrv_close() should not be called outside of bdrv_delete() (and bdrv_open(), if that failed), so whenever a BDS is closed, it should be deleted.

> It's not 100% complete because we have these stupid QMP commands that
> address BDSes using filenames instead of node or BB names, but it's
> probably a useful subset to check.

> A simpler but uglier way would be a counter that's incremented every
> time a BDS is created and decremented every time one is deleted (or
> incremented once per refcount increment and decremented once per
> refcount decrement). We'd then assert that this counter is 0.
Ugh. Well, it does the job, so holding my nose while applying should


