qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Image probing: how it can be insecure, and what we coul


From: Max Reitz
Subject: Re: [Qemu-devel] Image probing: how it can be insecure, and what we could do about it
Date: Thu, 06 Nov 2014 13:53:35 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 2014-11-06 at 13:26, Markus Armbruster wrote:
Max Reitz <address@hidden> writes:

On 2014-11-04 at 19:45, Markus Armbruster wrote:
I'll try to explain all solutions fairly.  Isn't easy when you're as
biased towards one of them as I am.  Please bear with me.


= The trust boundary between image contents and meta-data =

A disk image consists of image contents and meta-data.

Example: all of a raw image's contents is image contents.  Leaves just
file name and attributes for meta-data.

Example: QCOW2 meta-data includes header, header extensions, L1 table,
L2 tables, ...  The meta-data defines where in the image the actual
contents is stored.

A guest can access the image contents, not the meta-data.

Image contents you've let an untrusted guest write is untrusted.

Therefore, there's a trust boundary between image contents and
meta-data.  QEMU has to trust image meta-data, but shouldn't trust image
contents.  The exact location of the trust boundary depends on the image
format.


= How we instruct QEMU what to trust =

By configuring QEMU to use an image, the user instructs QEMU to trust
the image's meta-data.

When the user's configuration specifies the image format explicitly, the
trust boundary is clear.

Else, the trust boundary is ambigous when more than one format is
possible.

QEMU resolves this ambiguity by picking the first format with the
highest "score".  Raw format is always possible, and always has the
lowest score.


= How this lets the guest escape isolation =

Unfortunately, this lets the guest shift the trust boundary and escape
isolation, as follows:

* Expose a raw image to the guest (whether you specify the format=raw or
    let QEMU guess it doesn't matter).  The complete contents becomes
    untrusted.

* Reuse the image *without* specifying the raw format.  QEMU guesses the
    format based on untrusted image contents.  Now QEMU guesses a format
    chosen by the guest, with meta-data chosen by the guest.  By
    controlling image meta-data, the malicious guest can access arbitrary
    files as QEMU, enlarge its storage, and more.  A non-malicious guest
    can accidentally DoS itself, by writing a pattern probing recognizes.
Thank you for bringing that to my attention. This means that I'm even
more in favor of using Kevin's patches because in fact they don't
break anything.
They break things differently.  The difference may or may not matter.

Example: innocent guest writes a recognized pattern.

   Now: next restart fails, guest DoSed itself until host operator gets
   around to adding format=raw to the configuration.  Consequence:
   downtime (probably lengthy), but no data corruption.

   With Kevin's patch: write fails, guest may or may not handle the
   failure gracefully.  Consequences can range from "guest complains to
   its logs (who cares)" via "guest stops whatever it's doing and refuses
   to continue until its hardware gets fixed (downtime as above)" to
   "data corruption".

You somehow seem convinced that writing to sector 0 is a completely normal operation. For x86, it isn't, though.

There are only a couple of programs which do that, I can only think of partitioning and setting up boot loaders. There's not a myriad of programs which would increase the probability of one both writing a recognizable pattern *and* not handling EPERM correctly.

I see the probability of both happening at the same time as extremely low, not least because there are only a handful of programs which access that sector.

Example: innocent guest first writes a recognized pattern, then
overwrites it with a non-recognized pattern.

   Now: works.

   With Kevin's patch: as above.

Not really, if the guest overwrites the data with a non-recognized pattern after EPERM it works as well. The difference here is that you won't have the intended data in the meantime.

Again, I'm not claiming the differences are serious in practice, only
that they exist.

True.

This is CVE-2008-2004.


= Aside: other trust boundaries =

Of course, this is not the only trust boundary that matters.  For
instance, there's normally one between your host and somebody else's
computers.  Telling QEMU to trust meta-data of some image you got "from
the internet" violates it.  There's nothing QEMU can do about that.


= Insecure usage is easy, secure usage is hard =

The oldest stratum of user interfaces doesn't let you specify the image
format.  Use of raw images with these is insecure by design.  These
interfaces are still recommended for human users.

Example of insecure usage: -hda foo.img, where foo.img is raw.

With the next generation of interfaces, specifying the image format is
optional.  Use of raw images with these is insecure by default.

Example of insecure usage: -drive file=foo.img,index=0,media=cdrom,
where foo.img is raw.  The -hda above is actually sugar for this.

Equivalent secure usage: add format=raw.

Note that specifying just the top image's format is not enough, you also
have to specify any backing images' formats.  QCOW2 can optionally store
the backing image format in the image.  The other COW formats can't.
Well, they can, with "json:". *cough*
Point coughingly taken.

Example of insecure usage: -hda bar.vmdk, where bar.vmdk is a VMDK image
with a raw backing file.
Yesterday I found out that doesn't seem possible. You apparently can
only use VMDK with VMDK backing files.
I figure you're referring to this code in vmdk_create():

         if (strcmp(bs->drv->format_name, "vmdk")) {
             bdrv_unref(bs);
             ret = -EINVAL;
             goto exit;
         }

Yes. Of course it does depend on probing, which figures, I guess.

                                        Other than that, we only have
qcow1 and qed as COW formats which should not be used anyway.
qemu-doc.texi calls them "old image format", and qemu-img.texi has them
under "Other", "for compatibility with older QEMU versions".  I guess we
could do better job telling users they "should not be used anyway".

Even in old stuff retained just for compatibility, we should make an
effort to plug security holes, make secure usage easy, and guide users
away from insecure usage.

Of course, I'm just pointing out that it seems to be only qcow1 which is fully subject to this. qcow2 is partially, as you're pointing out next.

Now back to the point I was trying to make in my original message.

Replacement example of insecure usage: -hda bar.qcow2, where bar.qcow2
is a QCOW2 image with a raw backing file and no backing image format,
i.e. created without "-o backing_format=".

Then the next paragraph applies:

Equivalent secure usage: Beats me.  Maybe there's a funky -drive
backing.whatever to specify the backing image's format.
See Kevin's reply for equivalent secure usage.

With the latest interface blockdev-add, specifying the format is
mandatory.  Secure, but not really suitable for humans.

Example of secure usage:

      { "execute": "blockdev-add",
        "arguments": {'options': {
            'driver': 'raw', 'id':'foo',
            'file': { 'driver': 'file', 'filename': 'foo.img' } } } }

Insecure usage is easy, secure usage is *hard*.  Even for sophisticated
users like libvirt developers.  Evidence: libvirt CVE-2010-2237,
CVE-2010-2238, CVE-2010-2239, and more that didn't get a CVE, like the
recent accidental probing in drive-mirror.


= How can we better guard the trust boundary in QEMU? =

The guest can violate the trust boundary only because

(a) QEMU supports both raw images and image formats, and

(b) QEMU guesses image format from raw image contents, and

(c) given a raw image, guests can change its contents and control a
future QEMU's format guess.

We can attack one ore more of these three conditions:
I'd like to attack more because any of these steps might be carried
out in another program which thus either becomes vulnerable itself
(which we don't really have to care about, but I'd like to either way)
or which makes qemu vulnerable.

Having an external program with (a) and (c), this makes qemu
vulnerable if we don't try to forbid (b) or at least make it work
better. Having an external program with (a) and (b), not doing
anything against (c) in qemu makes that external program vulnerable.

(a) Outlaw raw images

(b) Don't guess format from untrusted image contents

(c) Prevent "bad" guest writes

Nobody is seriously suggesting we do (a).  It's clearly too late for
that.  Let's explore the other two in more detail.
And thus I prefer to find and implement solutions for *both* (b) and (c).
Quoting myself: "We can attack one ore more of these three conditions".

Yes, and I referred to that by saying "I'd like to attack more".

== Don't guess format from untrusted image contents ==

Several variations of the theme.

Guessing only happens when the user doesn't specify a format, so the
simplest way to avoid it would be requiring users to always specify the
format.

PRO: Simple, plugs the hole.

CON: Breaks a lot of existing usage.  Bye -hda, hello extra typing.

Variation: command line option to opt out of probing completely.

PRO: Simple, plugs the hole.

CON: Insecure by default.

CON: In addition to opting out, you have to update your usage
accordingly.  I guess libvirt would do it anyway, to guard against
accidental probing once and for all.

Variation: Stefan proposed to make format= mandatory just for -drive.  I
guess he rather meant mandatory for anything but -hda and other selected
convenience interfaces.

PRO: No extra typing in the cases where it makes the most difference.

CON: Breaks existing usage in the other cases, extra typing.

CON: Doesn't fully plug the hole.  Users of convenience interfaces may
      remain insecure out of ignorance.  We could add a warning to guide
      them to more secure usage, but then that warning would annoy users
      who don't care for security (sometimes with reason), and we can't
      have that.  So we silently leave the users who would care if they
      knew insecure.

I proposed something less radical, namely to keep guessing the image
format, but base the guess on trusted meta-data only: file name and
attributes.
You actually want to completely abolish probing? I thought we just
wanted to use the filename as a second source of information and warn
if the contents and the extension don't seem to match; and in the
future, maybe make it an error (but we don't have to discuss that
second part now, I think).
Yes, I propose to ditch it completely, after a suitable grace period.  I
tried to make that quite clear in my PATCH RFC 2/2.

Here's why.

Now: 1. probe
      4. open, error out if meta-data is bad

With my proposed patch:
      1. probe
      2. guess from trusted meta-data
      3. warn unless 1 and 2 match
      4. open, error out if meta-data is bad

Now make the warning an error:
      1. probe
      2. guess from trusted meta-data
      3. error out unless 1 and 2 match
      4. open, error out if meta-data is bad

I figure the following is equivalent, but simpler:

      2. guess from trusted meta-data
      4. open, error out if meta-data is bad

because open should detect all the errors the previous step 3 caught.
If not, things are broken with explicit format=.

You're right, it seems be equivalent. One difference will probably be error messages ("Bad signature" vs. "Filename extension and format do not match").

The other difference is that you have a problem if you cannot distinguish between two formats by extensions, as we've seen already. .qcow could mean both qcow1 and qcow2; a similar problem appeared with .vhd. Opening the image using all possible formats seems bad to me. Better swap 1 and 2: Guess from commonly trusted metadata (i.e. the filename extension) and then probe all possible formats.

Block and character special files are raw.  For other
files, find the file name extension, and look up the format claiming it.

PRO: Plugs the hole.
You mean "plugs hole (b)".
What I (airily) call "the hole" is the scenario I described above: guest
escaping isolation by subverting qemu-system-*.

(b) is not a hole, it's a condition for "the hole".

Your "the hole" is only one hole. There are more holes. You only consider the case where there's only qemu, which is technically fine for discussion on qemu-devel, but I'd like to consider having other tools beside qemu as well.

You're right, though, it should be "prevents condition (b)".

I guess what you want to say is "attacking just condition (b) doesn't
plug some other holes I care about".  That's true.  There are indeed
other holes.

For instance, guest attacking QEMU's utility programs qemu-img, ...  Or
guest attacking other host software.  I'm not trying to discount any of
them.  But tangling all problems up into a hairball won't do us any
good.

We have two approaches and I think using both will help to plug more holes. I'm not saying we should consider every possible hole with every host configuration there may be. I'm just saying using only of the approaches clearly only plugs "the hole" if there's only qemu.

So your point that my analysis is narrow is taken.  In my defense, I can
say that my narrow analysis was difficult enough to write up, and
probably produced enough text to deter readers.

CON: Breaks existing usage when the new guess differs from the old
      guess.  Common usage should be fine:

      * -hda test.qcow2

        Fine as long as test.qcow2 is really QCOW2 (as it should!), and
        either specifies a backing format (as it arguably should), or the
        backing file name is sane.

      * -hda disk.img

        Fine as long as disk.img is really a disk image (as it should).

      * -hda /dev/mapper/vg0-virtdisk

        Fine as long as the logical volume is raw.

      Less common usage can break:

      * -hda nbd://localhost

        Socket provides no clue, so no guess.
nbd should be raw. If it isn't, you're most likely doing something
wrong. See https://bugzilla.redhat.com/show_bug.cgi?id=1090713 what
happens when you're doing it wrong.
Okay.

My RFC PATCH is too simplistic to exploit that, because it only looks at
file name extensions.  But "user asked for nbd protocol" is quite
obviously trusted meta-data.  We could have a less simplistic patch
putting it to use.  Or adopt the variation below.

      Weird usage can conceivably break hard:

      * -hdd disk.img

        Breaks hard when disk.img is actually QCOW2, the guest boots
        anyway from another drive, then proceeds to overwrite this one.
Then why not continue to do probing and using the extension as a safeguard?

Mitigation: lengthy transition period where we warn "this usage is
insecure, and we'll eventually break it; here's a hint on secure usage".

CON: We delay plugging the hole one more time.  But at least we no
longer expose our users to it silently.

Jeff pointed out that we want to retain probing in things like qemu-img
info.

Richard asked for a way for users to ask for insecure probing, say
format=insecure-probe.  I have no problem with giving users rope when
they ask for it.

Variation: if file name and attributes are unavailable or provide no
clue, guess raw.  Same PRO and CON as above, only it avoids breaking a
few more cases.  For instance, "-hda nbd://localhost" keeps working as
long as the server serves a raw image.
Which it should be.

Smells a bit like too much magic
to me.

My proposal replaces probing wholesale.  I like that because it results
in simple, predictable guessing.  Here's a hybrid approach: first guess
raw vs. non-raw based on trusted meta-data only, and if non-raw, probe.

Nothing the guest writes can affect the raw vs. non-raw decision.  Once
an image is raw, only the user can make it non-raw, by changing its name
or attributes.

Two variations: 1. guess raw without a clue, and 2. guess non-raw then.

Again, same PRO and CON as above, only it doesn't break when users give
their non-raw images weird names.

== Prevent "bad" guest writes ==

Again, several variations, but this time, only the last one is serious,
the others are just for illustration.

Fail guest writes to those parts of the image that probing may examine
Can fail only writes to the first few sectors (at worst) of raw images.

PRO: Plugs the hole.

CON: The virtual hardware is defective.  Breaks common guest software
that writes to the first few sectors, such as boot loaders and
partitioning tools.  Breaks guest software using the whole device, which
isn't common, but certainly not unheard of.

Variation: fail only writes of patterns that actually can make probing
guess something other than raw.

PRO: Still plugs the hole.
You mean "plugs hole (c)".
My reply to "plugs hole (b)" applies.

CON: Except when you upgrade to a version that recognizes more patterns.
Which is better than not plugging hole (c) at all.

CON: The virtual hardware is still defective, but the defects are
minimized.
As you pointed out to us it's already defective and I don't think
anybody ever noticed accidentally.
You're right in that probed raw is already defective, with defects
delayed to the next restart.  Preventing "bad" guest writes changes the
nature of the defects subtly, as I described above.

Sector 0 is rarely ever written. It's not that people write some recognizable sequences there all the time but don't notice because they are quickly overwritten again.

If nobody hit the problem accidentally until now, I'm certain that means that nobody ever wrote any recognizable sequence there while using qemu and raw (which is not too rare a combination).

This and the previous variation also extends them to non-probed raw.
The following variations avoid the extension.

We can hope that partition tables, boot sectors and such
won't match the patterns, so common guest software hopefully works.
It's worked in the past, that's good enough for me.

Guest software using the whole device still breaks, only now it breaks
later rather than sooner.

Variation: fail writes only on *probed* raw images.

CON: Doesn't fully plug the hole: mixing probed usage (user doesn't
specify format) with non-probed usage (user specifies format) remains
insecure.  The guest's write succeeds in non-probed usage, and the guest
escapes isolation in the next probed usage.

CON: The virtual hardware is still defective, but it now comes with a
"defective on/off" switch, factory default "defective on".  We could add
a warning to guide users to switch defective off but then that warning
would annoy people who don't care to switch it off (sometimes with
reason), and we can't have that.  So we leave users who would care if
they knew in the dark.

The two variations can be combined.  This is Kevin's proposal.

CON: Doesn't fully plug the hole: union of both variations' flaws.

CON: The virtual hardware is still defective: interesection of both
variations' defects.


= Conclusion =

This is *my* conclusion.  Yours may be different, and that's okay.  It's
business, not personal ;)

Secure usage should not be hard.

If we permit insecure usage for convenience or compatibility, we should
warn the user, unless he clearly asked for it.  Even if that warning
annoys Kevin and Max.
A warning does not annoy me as long as I know what it means.
Good :)

If you want to suppress it with configure
--reckless or something, no objections.

Same for defective virtual hardware.

Kevin's proposal to prevent "bad" guest writes has relatively small
compatibility issues.  It improves things from "insecure by default" to
"slightly defective by default" as long as you use raw images either
always probed or always non-probed.  It doesn't help at all when you
alternate probed and non-probed usage.  Improvement of sorts, but it
still fails my "secure usage should not be hard" requirement, and that
requirement is a hard one for me.

My proposal to ditch image contents probing entirely has more serious
compatibility issues.  In particular, we'd have to forgo sugared
convenience syntax for a number of less common things.  It definitely
needs a grace period where all usage we're going to break warns.  On the
up side, it will actually be secure by default when it's done.

If this is not acceptable, my second choice is actually the command line
option to opt out of probing completely.  This doesn't address "insecure
by default" (sadly), but it does at least satisfy my "secure usage
should not be hard" requirement.

If we should adopt Kevin's proposal against my objections, then I very
badly want the opt out option on top of it, opting out of both probing
and "bad" write prevention.

Thanks for hearing me out.
My conclusion: Don't ditch probing. It increases entropy, why would
you ditch probing? Just combine it with the extension and if both
don't seem to match, that's an error.
How does probing add value?

Compare

      1. probe
      2. guess from trusted meta-data
      3. error out unless 1 and 2 match
      4. open, error out if meta-data is bad

to just

      2. guess from trusted meta-data
      4. open, error out if meta-data is bad

Let P be the driver recommended by probe, and G be the driver
recommended by guess.

If P == G, same result: we open with the same driver.

Else, if open with G fails, equivalent result: error out in step 3
vs. error out in step 4.

Else, we have an odd case: one driver's probe accepts (P's), yet a
different driver's open succeeds (G's).

     If G's probe rejects: is this a bug?  Shouldn't open always fail
     when probe rejects?

     If G's probe accepts, then probing chose P over G.  Maybe it
     shouldn't have.  Or maybe the probing functions are imprecise.
     Anyway, keeping probing around makes this an error.  Should it be
     one?

Am I missing something?

No, see my reply above. I failed to consider that opening an image basically is advanced probing.

The only thing I really see which could be missing is the problem of having ambiguous filename extensions, and that problem can easily be solved by keeping probing.

Also, holes (b) and (c) are two different holes. We should fix
both. We should fix (b) so qemu isn't vulnerable and we should fix (c)
so qemu doesn't make other programs which do probe vulnerable.
Adjusting terminology, but hopefully not your intent: "the hole" isn't
the only hole that matters.  The conditions enable other holes.
Therefore, we should attack both (b) and (c).  Attacking (a) is not
practical.

Yes. that's what I meant. There are various holes and preventing only one of the conditions (b) and (c) only plugs one (or a subset) of them, and I can easily see unplugged holes which can be fixed by preventing both.

Points taken, except I think we could attack (a) if we really wanted.

Of course, but for that we'd either need some flat qcow2 mode or better support for an image format that does have such a flat mode.

Max

So, for fixing (b): Just use the extensions as a safeguard and issue a
warning for now. We can discuss about making it an error later.

And for fixing (c): As you pointed out, if guests wrote some
probe-matching pattern in the past, it would break qemu (which is what
we're trying to fix). Since noone ever said that some guest did that
by accident, I think we can safely assume that prohibiting such writes
will not hurt anyone in the future either; at least there are no
compatibility issues, so if someone notices a problem, he/she can just
explicitly specify the format and it'll work (which you should be
doing anyway, as we all know, though many of us, including me, don't
want to do it all the time).
Thanks for stating your conclusion concisely, it's really helpful.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]