From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v2 0/5] q35: Remove old machines and unused compat code
Date: Mon, 08 Feb 2016 12:59:56 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

"Michael S. Tsirkin" <address@hidden> writes:

> On Fri, Feb 05, 2016 at 09:55:08AM +0100, Markus Armbruster wrote:
>> "Michael S. Tsirkin" <address@hidden> writes:
>> > On Thu, Feb 04, 2016 at 03:16:17PM -0200, Eduardo Habkost wrote:
>> >> On Thu, Feb 04, 2016 at 06:01:50PM +0200, Michael S. Tsirkin wrote:
>> >> > On Sat, Jan 23, 2016 at 02:02:08PM -0200, Eduardo Habkost wrote:
>> >> > > This is another attempt to remove old q35 machine code. Now I am
>> >> > > also removing unused compat code to demonstrate the benefit of
>> >> > > throwing away the old code that nobody uses.
>> >> > 
>> >> > The same thing I said applies - we don't know that nobody uses old q35
>> >> > machine types.
>> >> > We do know we don't need to migrate to/from them,
>> >> > so we can drop compat code.
>> >> > But please add aliases so people can still start these machines.
>> >> 
>> >> If people use them, they can easily update their configurations.
>> >> I will copy and paste the reply Markus sent 4 months ago below.
>> >> 
>> >> On Mon, Sep 14, 2015 at 09:18:47AM +0200, Markus Armbruster wrote:
>> >> > We've been through this before, but we can go through it once more.
>> >> > Choices:
>> >> > 
>> >> > A. Remove old machine type
>> >> > 
>> >> >    A guest using it can't be started.  Easy to understand on the host.
>> >> >    An error message advising to switch to a newer machine type would be
>> >> >    a nice touch.
>> >> > 
>> >> >    This is a clean break in backward compatibility.  To be mentioned in
>> >> >    release notes, obviously.
>> >> > 
>> >> > B. Change old machine type in a guest-visible way
>> >> > 
>> >> >    Depending on the nature of the change and the guest, a guest using it
>> >> >    either doesn't notice, copes with it successfully, or fails in
>> >> >    guest-specific ways.  If the latter, the failure can be "guest
>> >> >    hangs", which is much harder to figure out than A.
>> >> > 
>> >> >    Unless we can *demonstrate* that nothing bad happens for all the
>> >> >    guests people actually use with the old machine types, this is a
>> >> >    different kind of backward compatibility break.
>> >> > 
>> >> >    Demonstrating this feels infeasible to me, but you're welcome to
>> >> >    try.
>> >> > 
>> >> > I could call the difference between the two a tradeoff, but since we've
>> >> > been through this before, I'll be more blunt: choosing B robs Peter (the
>> >> > guy with guests where badness happens) to pay Paul (the guy with guests
>> >> > that cope).  Paul is saved the inconvenience of having to read release
>> >> > notes or his logs, and change machine types.  Peter pays for that with
>> >> > figuring out WTF his guests are doing now.
>> >> > 
>> >> > As a user, I'd pick a clean break in backward compatibility over a hack
>> >> > that preserves effective compatibility when it works, but breaks it
>> >> > uncleanly when it doesn't.
>> >> > 
>> >> > As a developer, I'm insisting on it.
>> >> > 
>> >> > So, if you want B, the onus is on *you* to show us why nothing bad will
>> >> > happen.
>> >> > 
>> >
>> > I agree with the conclusion for option B.  But I think the correct
>> > solution is not A, it is to analyse changes, maybe even test, and show
>> > that nothing bad can happen.
>> What exactly are you proposing to do?
> I'm not sure. But if someone says "we drop machine X, people can use
> machine Y instead" then clearly, there should be some data
> included about how well does Y function in place of X:
> based on code analysis, testing, or both.

No, that's not what we say.  We say: machine type X is no longer
supported in new versions of QEMU.  You can either stay on older QEMU
versions, or you can migrate to newer virtual hardware.

Migrating to new hardware (virtual or physical) is a well understood
problem.  You can go with a fresh install on new hardware.  You can also
try to move your old disk or disk contents to new hardware.  The latter
usually works when the new hardware isn't too different.  But if it
breaks, you get to keep the pieces.

Since "stay on old QEMU or migrate to another machine type"
inconveniences users, we make an effort[*] to keep those machine types
alive that people actually use seriously.

The old q35 machine types are experimental and barely work.  There is no
evidence of non-experimental use.  Keeping them alive would therefore be
a misuse of scarce resources.

> Some ideas for things that seem rather safe based on code analysis:
> 1. We should be able to explicitly disable migration for old machine
> types that could not migrate historically so users don't try to migrate
> them now that they can.

Nobody should be migrating these machine types for the simple
reason that nobody should be using them.

> 2. Drop stuff that isn't guest OS visible, that we
> keep around for migration or old bios support.
> has_acpi_build - old bios support will allow removing the dsdt AML.
> option_rom_has_mr - not guest visible.
> Nothing is 100% safe - e.g. people that use their own bios will get
> bitten by removal of AML.

Keeping these types is a pointless waste of maintenance resources.
Messing with them additionally wastes development resources.

>>  Who should be doing it?
> Whoever's interested.

I'm having a hard time believing what I read, so I'm asking you to
correct my reading: as the PC maintainer, you demand that "somebody"
goes on an unbounded, ill-defined, "I'm not sure" research trip before
we can get rid of experimental machine types that barely worked and
nobody should be using.  Without any evidence of actual use, let alone
harm.  What am I misunderstanding?

>>  And what
>>  other work are you willing to sacrifice to get this task done?
> I don't much care about getting this done - most stuff has to
> be there for piix anyway.
>> > Because A suffers from exactly the same problem if people just blindly
>> > switch to a new machine type.
>> Users switch machine types for any number of reasons, good and bad.  You
>> can't stop the ones switching for bad reasons by keeping some barely
>> working machine type around forever.
> If you tell people to switch to a new machine type, then switching is
> not PEBKAC and you need a tested migration path that does not involve
> reinstalling the guest OS for most users.

You're missing my point entirely.

Switching machine types without understanding the risks is PEBKAC.

I'm not proposing to tell our users to blindly switch machine types.  I'm
proposing to tell users of the old q35 machine types (if they exist at
all, which I doubt) to migrate to a suitable new machine type.

[*] "Make an effort" because we lack the resources to systematically
catch regressions before they happen.  If you need stability, use a
downstream that does the necessary work.
