qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET regis


From: Jason Wang
Subject: Re: [PATCH v2 1/3] hw/net: e1000: Correct the initial value of VET register
Date: Wed, 14 Jul 2021 17:17:12 +0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.11.0


在 2021/7/14 下午5:05, Bin Meng 写道:
On Wed, Jul 14, 2021 at 4:40 PM Jason Wang <jasowang@redhat.com> wrote:

在 2021/7/14 下午2:04, Bin Meng 写道:
On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasowang@redhat.com> wrote:
在 2021/7/14 上午11:42, Bin Meng 写道:
On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasowang@redhat.com> wrote:
在 2021/7/13 下午5:11, Bin Meng 写道:
On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasowang@redhat.com> wrote:
在 2021/7/13 下午4:36, Bin Meng 写道:
On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasowang@redhat.com> wrote:
在 2021/7/13 上午7:06, Bin Meng 写道:
On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng.cn@gmail.com> wrote:
On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasowang@redhat.com> wrote:
在 2021/7/2 下午5:24, Bin Meng 写道:
From: Christina Wang <christina.wang@windriver.com>

The initial value of VLAN Ether Type (VET) register is 0x8100, as per
the manual and real hardware.

While Linux e1000 driver always writes VET register to 0x8100, it is
not always the case for everyone. Drivers relying on the reset value
of VET won't be able to transmit and receive VLAN frames in QEMU.

Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Christina Wang <christina.wang@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

(no changes since v1)

        hw/net/e1000.c | 2 ++
        1 file changed, 2 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 4f75b44cfc..20cbba6411 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -29,6 +29,7 @@
        #include "hw/pci/pci.h"
        #include "hw/qdev-properties.h"
        #include "migration/vmstate.h"
+#include "net/eth.h"
        #include "net/net.h"
        #include "net/checksum.h"
        #include "sysemu/sysemu.h"
@@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = {
            [MANC]    = E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
                        E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
                        E1000_MANC_RMCP_EN,
+    [VET]     = ETH_P_VLAN,
I wonder if we need a compat flag for this, since we change the behavior.

(See e1000_properties[])

No we don't need to since it does not break migration.
Ping?
I admit migration "works" but it doesn't mean it's not broken. It
changes the guest visible default value of VET register, so it may break
things silently for the guest.

For old machine types, we should stick the value to the one without this
fix.
Could you please propose a solution on how to handle such a scenario
in a generic way in QEMU? (+Peter)
Well, I think I've suggested you to have a look at how things is done in
for handling such compatibility in e1000_properties.


The POR reset value is wrong in QEMU and has carried forward the wrong
value for years, and correcting it to its right value needs to do
what?
We should stick to the wrong behavior for old machine types.

That's all.
So that means the following SD patch is also wrong (+Philippe) which
changes the default value of capability register.
http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joannekoong@gmail.com/
It should compat capareg for the old value for old machine types.
Yeah, it's already a property for the SD controller model but someone
views it as a bug because the model implements 64-bit but not
reporting it in the capability register.

Can we get some agreement among maintainers?
It's not about the agreement but about to have a stable ABI. I don't
know the case for sd but e1000 is used in various  and we work hard to
unbreak the migration compatibility among downstream versions. Git log
on e1000.c will tell you more.
Agreement or stable ABI, whatever we call, but we should be in some consistency.

IMHO maintainers should reach an agreement to some extent on how
compatibility should be achieved. I just found silly to add a property
to fix a real bug in the model, and we preserve the bug all over
releases.
That's the price for the stable ABI. See one of my recent fix -
d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy"
behavior to unbreak the migration.

But this series does not break the migration, as we discussed in the
previous thread.

It actually did,

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
      0 device: 20 cmask: ff wmask: 0 w1cmask:0

Since the register is RO.
Did you mean the VET register is read-only?


No, I meant for ATS.


I don't understand this as
the manual says it's RW and Linux driver programs it to 0x8100.


So ATS cap is read only so it can be validated by qemu.

VET is RW so qemu can't do similar checks.




I can find plenty of examples in the current QEMU tree that were
accepted that changed the bugous register behavior, but it was not
asked to add new properties to keep the bugos behavior.

e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF logic")
I guess it's simply because fsl_etsec is not used in any
distributions/production environments or the maintainer may just not
notice things like this.

But for e1000(e), we should stick to a stable ABI for consistency.
Otherwise it would be very tricky to fix them after we saw real issues.
We had learnt a lot during the past decade.

Okay, do we have such a kind of widely used device model list? And we
should document such a process that we should keep compatibility on
these devices as well.

I can only say for networking devices:

- e1000(e)
- rtl8139
- virtio-net

Please, at least have these documented as a development process.


See Peter's reply. We should take care of the devices that is involved in at least one versioned machine types.

If you wish, you can post a patch to clarify things in the doc.

Thanks



Regarding this VET register, do you know what guest relies on the POR
value which is zero?

I don't know.


Zero is not a valid ethernet VLAN type. I don't
think changing this will break any guests.

You might be right here, but it would be late if we find it breaks any one.

If it's not a lot of work, I tend to bother with compat stuffs for this.

Regards,
Bin





reply via email to

[Prev in Thread] Current Thread [Next in Thread]