[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 06/10] qemu-img: Prepare for locked images
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH 06/10] qemu-img: Prepare for locked images |
Date: |
Thu, 14 Jan 2016 14:07:12 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
Kevin Wolf <address@hidden> writes:
> Am 13.01.2016 um 09:44 hat Markus Armbruster geschrieben:
>> Kevin Wolf <address@hidden> writes:
>>
>> > Am 12.01.2016 um 16:20 hat Markus Armbruster geschrieben:
>> >> Kevin Wolf <address@hidden> writes:
>> >>
>> >> > Am 11.01.2016 um 16:49 hat Markus Armbruster geschrieben:
>> >> >> Eric Blake <address@hidden> writes:
>> >> >>
>> >> >> > On 12/22/2015 09:46 AM, Kevin Wolf wrote:
>> >> >> >> This patch extends qemu-img for working with locked
>> >> >> >> images. It prints a
>> >> >> >> helpful error message when trying to access a locked image
>> >> >> >> read-write,
>> >> >> >> and adds a 'qemu-img force-unlock' command as well as a
>> >> >> >> 'qemu-img check
>> >> >> >> -r all --force' option in order to override a lock left
>> >> >> >> behind after a
>> >> >> >> qemu crash.
>> >> >> >>
>> >> >> >> Signed-off-by: Kevin Wolf <address@hidden>
>> >> >> >> ---
>> >> >> >> include/block/block.h | 1 +
>> >> >> >> include/qapi/error.h | 1 +
>> >> >> >> qapi/common.json | 3 +-
>> >> >> >> qemu-img-cmds.hx | 10 ++++--
>> >> >> >> qemu-img.c | 96
>> >> >> >> +++++++++++++++++++++++++++++++++++++++++++--------
>> >> >> >> qemu-img.texi | 20 ++++++++++-
>> >> >> >> 6 files changed, 113 insertions(+), 18 deletions(-)
>> >> >> >>
>> >> >> >
>> >> >> >> +++ b/include/qapi/error.h
>> >> >> >> @@ -102,6 +102,7 @@ typedef enum ErrorClass {
>> >> >> >> ERROR_CLASS_DEVICE_NOT_ACTIVE =
>> >> >> >> QAPI_ERROR_CLASS_DEVICENOTACTIVE,
>> >> >> >> ERROR_CLASS_DEVICE_NOT_FOUND = QAPI_ERROR_CLASS_DEVICENOTFOUND,
>> >> >> >> ERROR_CLASS_KVM_MISSING_CAP = QAPI_ERROR_CLASS_KVMMISSINGCAP,
>> >> >> >> + ERROR_CLASS_IMAGE_FILE_LOCKED =
>> >> >> >> QAPI_ERROR_CLASS_IMAGEFILELOCKED,
>> >> >> >> } ErrorClass;
>> >> >> >
>> >> >> > Wow - a new ErrorClass. It's been a while since we could
>> >> >> > justify one of
>> >> >> > these, but I think you might have found a case.
>> >> >>
>> >> >> Spell out the rationale for the new ErrorClass, please.
>> >> >
>> >> > Action to be taken for this error class: Decide whether the lock is a
>> >> > leftover from a previous qemu run that ended in an unclean shutdown. If
>> >> > so, retry with overriding the lock.
>> >> >
>> >> > Currently used by qemu-img when ordered to override a lock. libvirt
>> >> > will need to do the same.
>> >>
>> >> Let's see whether I understand the intended use:
>> >>
>> >> open image
>> >> if open fails with ImageFileLocked:
>> >> guess whether the lock is stale
>> >> if guessing not stale:
>> >> error out
>> >> open image with lock override
>> >>
>> >> Correct?
>> >
>> > Yes. Where "guess" is more or less "check whether the management tool
>> > started qemu with this image, but didn't cleanly shut it down". This can
>> > guess wrong if, and only if, some other user used a different algorithm
>> > and forced an unlock even though the image didn't belong to them before
>> > the crash.
>> >
>> >> Obvious troublespots:
>> >>
>> >> 1. If you guess wrong, you destroy the image. No worse than before, so
>> >> okay, declare documentation problem.
>> >>
>> >> 2. TOCTTOU open to open with lock override
>> >> [...]
>> >>
>> >> 3. TOCTTOU within open (hypothetical, haven't read your code)
>> >> [...]
>> >
>> > Yes, these exist in theory. The question is what scenarios you want to
>> > protect against and whether improving the mechanism to cover these cases
>> > is worth the effort.
>> >
>> > The answer for what I wanted to protect is a manual action on an image
>> > that is already in use. The user isn't quick enough to manually let two
>> > processes open the same image at the same time, so I didn't consider
>> > that scenario relevant.
>> >
>> > But assuming that everyone (including the human user) follows the above
>> > protocol (force-unlock only what was yours before the crash), at least
>> > cases 1 and 2 don't happen anyway.
>>
>> "Force-unlock only what you locked yourself" is easier to stipulate than
>> to adhere to when the tools can't give you a hint on who did the
>> locking. This is particularly true when "you" is a human, with human
>> imperfect memory.
>>
>> I understand that this locking can't provide complete protection, and
>> merely aims to catch certain common accidents.
>>
>> However, to avoid a false sense of security, its limitations need to be
>> clearly documented. This very much includes the rule "force-unlock only
>> what you locked yourself". In my opinion, it should also include the
>> raciness.
>>
>> Sometimes, solving a problem is easier than documenting it.
>
> Maybe I'll just merge the migration fixes and shelve the rest of the
> series until I'm bored enough to implement the "real thing" with an
> incompatible feature flag, lock IDs with an autogenerated part and
> another part from the user, saved host name and PID, and a qcow2 driver
> that refuses to write anything to an image it doesn't hold the lock for
> even in corner cases. For now, I've already used more time for this than
> I intended (didn't expect all that live migration fun initially...).
>
> And after all, it's not my VMs that get corrupted.
>
> If you think that solving the problem "for real" is too easy to take
> shortcuts, I'll happily create a BZ and assign it to you if you'd like
> me to.
I'm not sure what to do with this message. Did I tick you off?
I took the trouble to understand what you're trying to do, in particular
the limitations. I also explored a possible alternative that might be
less limited. I didn't ask you to adopt this alternative. I didn't
call your patch a shortcut. I did point out that its limitations need
to be documented, and encouraged you to weigh documentation effort
against implementation effort. Additionally, I wondered how this
affects libvirt, and whether we need to cooperate more closely on
locking.
If that's review gone too far, then I fail to see why we bother with it.