qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio
Date: Wed, 16 Apr 2014 19:34:15 +0300

On Tue, Apr 15, 2014 at 03:22:56PM +0200, Greg Kurz wrote:
> On Tue, 15 Apr 2014 13:35:03 +0200
> Alexander Graf <address@hidden> wrote:
> 
> > On 04/15/2014 10:40 AM, Greg Kurz wrote:
> > > On Mon, 14 Apr 2014 15:08:23 +0200
> > > Alexander Graf <address@hidden> wrote:
> > >
> > >> On 14.04.14 14:55, Michael S. Tsirkin wrote:
> > >>> On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> > >>>> On 14.04.14 14:37, Michael S. Tsirkin wrote:
> > >>>>> On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> > >>>>>> On 14.04.14 14:24, Michael S. Tsirkin wrote:
> > >>>>>>> On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> > >>>>>>>> On 14.04.14 13:58, Greg Kurz wrote:
> > >>>>>>>>> From: Rusty Russell <address@hidden>
> > >>>>>>>>>
> > >>>>>>>>> virtio data structures are defined as "target endian", which 
> > >>>>>>>>> assumes
> > >>>>>>>>> that's a fixed value.  In fact, that actually means it's 
> > >>>>>>>>> platform-specific.
> > >>>>>>>>> The OASIS virtio 1.0 spec will fix this, by making it all little 
> > >>>>>>>>> endian.
> > >>>>>>>>>
> > >>>>>>>>> We introduce memory accessors to be used accross the virtio code 
> > >>>>>>>>> where
> > >>>>>>>>> needed. These accessors should support both legacy and 1.0 
> > >>>>>>>>> devices.
> > >>>>>>>>> A good way to do it is to introduce a per-device property to 
> > >>>>>>>>> store the
> > >>>>>>>>> endianness. We choose to set this flag at device reset time 
> > >>>>>>>>> because it
> > >>>>>>>>> is reasonnable to assume the endianness won't change unless we 
> > >>>>>>>>> reboot or
> > >>>>>>>>> kexec another kernel. And it is also reasonnable to assume the 
> > >>>>>>>>> new kernel
> > >>>>>>>>> will reset the devices before using them (otherwise it will 
> > >>>>>>>>> break).
> > >>>>>>>>>
> > >>>>>>>>> We reuse the virtio_is_big_endian() helper since it provides the 
> > >>>>>>>>> right
> > >>>>>>>>> value for legacy devices with most of the targets, that have fixed
> > >>>>>>>>> endianness. It can then be overriden to support endian-ambivalent 
> > >>>>>>>>> targets.
> > >>>>>>>>>
> > >>>>>>>>> To support migration, we need to set the flag in virtio_load() as 
> > >>>>>>>>> well.
> > >>>>>>>>>
> > >>>>>>>>> (a) One solution would be to add it to the stream, but it have 
> > >>>>>>>>> some
> > >>>>>>>>>       drawbacks:
> > >>>>>>>>> - since this only affects a few targets, the field should be put 
> > >>>>>>>>> into a
> > >>>>>>>>>     subsection
> > >>>>>>>>> - virtio migration code should be ported to vmstate to be able to 
> > >>>>>>>>> introduce
> > >>>>>>>>>     such a subsection
> > >>>>>>>>>
> > >>>>>>>>> (b) If we assume the following to be true:
> > >>>>>>>>> - target endianness falls under some cpu state
> > >>>>>>>>> - cpu state is always restored before virtio devices state 
> > >>>>>>>>> because they
> > >>>>>>>>>     get initialized in this order in main().
> > >>>>>>>>> Then an alternative is to rely on virtio_is_big_endian() again at
> > >>>>>>>>> load time. No need to mess around with the migration stream in 
> > >>>>>>>>> this case.
> > >>>>>>>>>
> > >>>>>>>>> This patch implements (b).
> > >>>>>>>>>
> > >>>>>>>>> Note that the tswap helpers are implemented in virtio.c so that
> > >>>>>>>>> virtio-access.h stays platform independant. Most of the virtio 
> > >>>>>>>>> code
> > >>>>>>>>> will be buildable under common-obj instead of obj then, and spare
> > >>>>>>>>> some cycles when building for multiple targets.
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: Rusty Russell <address@hidden>
> > >>>>>>>>> [ ldq_phys() API change,
> > >>>>>>>>>     relicensed virtio-access.h to GPLv2+ on Rusty's request,
> > >>>>>>>>>     introduce a per-device is_big_endian flag (supersedes 
> > >>>>>>>>> needs_byteswap global)
> > >>>>>>>>>     add VirtIODevice * arg to virtio helpers,
> > >>>>>>>>>     use the existing virtio_is_big_endian() helper,
> > >>>>>>>>>     virtio-pci: use the device is_big_endian flag,
> > >>>>>>>>>     introduce virtio tswap16 and tswap64 helpers,
> > >>>>>>>>>     move calls to tswap* out of virtio-access.h to make it 
> > >>>>>>>>> platform independant,
> > >>>>>>>>>     migration support,
> > >>>>>>>>>     Greg Kurz <address@hidden> ]
> > >>>>>>>>> Cc: Cédric Le Goater <address@hidden>
> > >>>>>>>>> Signed-off-by: Greg Kurz <address@hidden>
> > >>>>>>>>> ---
> > >>>>>>>>>
> > >>>>>>>>> Changes since v6:
> > >>>>>>>>> - merge the virtio_needs_byteswap() helper from v6 and existing
> > >>>>>>>>>     virtio_is_big_endian()
> > >>>>>>>>> - virtio-pci: now supports endianness changes
> > >>>>>>>>> - virtio-access.h fixes (target independant)
> > >>>>>>>>>
> > >>>>>>>>>    exec.c                            |    2 -
> > >>>>>>>>>    hw/virtio/Makefile.objs           |    2 -
> > >>>>>>>>>    hw/virtio/virtio-pci.c            |   11 +--
> > >>>>>>>>>    hw/virtio/virtio.c                |   35 +++++++++
> > >>>>>>>>>    include/hw/virtio/virtio-access.h |  138 
> > >>>>>>>>> +++++++++++++++++++++++++++++++++++++
> > >>>>>>>>>    include/hw/virtio/virtio.h        |    2 +
> > >>>>>>>>>    vl.c                              |    4 +
> > >>>>>>>>>    7 files changed, 185 insertions(+), 9 deletions(-)
> > >>>>>>>>>    create mode 100644 include/hw/virtio/virtio-access.h
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/exec.c b/exec.c
> > >>>>>>>>> index 91513c6..e6777d0 100644
> > >>>>>>>>> --- a/exec.c
> > >>>>>>>>> +++ b/exec.c
> > >>>>>>>>> @@ -42,6 +42,7 @@
> > >>>>>>>>>    #else /* !CONFIG_USER_ONLY */
> > >>>>>>>>>    #include "sysemu/xen-mapcache.h"
> > >>>>>>>>>    #include "trace.h"
> > >>>>>>>>> +#include "hw/virtio/virtio.h"
> > >>>>>>>>>    #endif
> > >>>>>>>>>    #include "exec/cpu-all.h"
> > >>>>>>>>> @@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> > >>>>>>>>> target_ulong addr,
> > >>>>>>>>>     * A helper function for the _utterly broken_ virtio device 
> > >>>>>>>>> model to find out if
> > >>>>>>>>>     * it's running on a big endian machine. Don't do this at home 
> > >>>>>>>>> kids!
> > >>>>>>>>>     */
> > >>>>>>>>> -bool virtio_is_big_endian(void);
> > >>>>>>>>>    bool virtio_is_big_endian(void)
> > >>>>>>>>>    {
> > >>>>>>>>>    #if defined(TARGET_WORDS_BIGENDIAN)
> > >>>>>>>>> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > >>>>>>>>> index 1ba53d9..68c3064 100644
> > >>>>>>>>> --- a/hw/virtio/Makefile.objs
> > >>>>>>>>> +++ b/hw/virtio/Makefile.objs
> > >>>>>>>>> @@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> > >>>>>>>>>    common-obj-y += virtio-mmio.o
> > >>>>>>>>>    common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> > >>>>>>>>> -obj-y += virtio.o virtio-balloon.o
> > >>>>>>>>> +obj-y += virtio.o virtio-balloon.o
> > >>>>>>>>>    obj-$(CONFIG_LINUX) += vhost.o
> > >>>>>>>>> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > >>>>>>>>> index ce97514..82a1689 100644
> > >>>>>>>>> --- a/hw/virtio/virtio-pci.c
> > >>>>>>>>> +++ b/hw/virtio/virtio-pci.c
> > >>>>>>>>> @@ -89,9 +89,6 @@
> > >>>>>>>>>    /* Flags track per-device state like workarounds for quirks in 
> > >>>>>>>>> older guests. */
> > >>>>>>>>>    #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
> > >>>>>>>>> -/* HACK for virtio to determine if it's running a big endian 
> > >>>>>>>>> guest */
> > >>>>>>>>> -bool virtio_is_big_endian(void);
> > >>>>>>>>> -
> > >>>>>>>>>    static void virtio_pci_bus_new(VirtioBusState *bus, size_t 
> > >>>>>>>>> bus_size,
> > >>>>>>>>>                                   VirtIOPCIProxy *dev);
> > >>>>>>>>> @@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void 
> > >>>>>>>>> *opaque, hwaddr addr,
> > >>>>>>>>>            break;
> > >>>>>>>>>        case 2:
> > >>>>>>>>>            val = virtio_config_readw(vdev, addr);
> > >>>>>>>>> -        if (virtio_is_big_endian()) {
> > >>>>>>>>> +        if (vdev->is_big_endian) {
> > >>>>>>>>>                val = bswap16(val);
> > >>>>>>>>>            }
> > >>>>>>>>>            break;
> > >>>>>>>>>        case 4:
> > >>>>>>>>>            val = virtio_config_readl(vdev, addr);
> > >>>>>>>>> -        if (virtio_is_big_endian()) {
> > >>>>>>>>> +        if (vdev->is_big_endian) {
> > >>>>>>>>>                val = bswap32(val);
> > >>>>>>>>>            }
> > >>>>>>>>>            break;
> > >>>>>>>>> @@ -443,13 +440,13 @@ static void virtio_pci_config_write(void 
> > >>>>>>>>> *opaque, hwaddr addr,
> > >>>>>>>>>            virtio_config_writeb(vdev, addr, val);
> > >>>>>>>>>            break;
> > >>>>>>>>>        case 2:
> > >>>>>>>>> -        if (virtio_is_big_endian()) {
> > >>>>>>>>> +        if (vdev->is_big_endian) {
> > >>>>>>>>>                val = bswap16(val);
> > >>>>>>>>>            }
> > >>>>>>>>>            virtio_config_writew(vdev, addr, val);
> > >>>>>>>>>            break;
> > >>>>>>>>>        case 4:
> > >>>>>>>>> -        if (virtio_is_big_endian()) {
> > >>>>>>>>> +        if (vdev->is_big_endian) {
> > >>>>>>>>>                val = bswap32(val);
> > >>>>>>>>>            }
> > >>>>>>>>>            virtio_config_writel(vdev, addr, val);
> > >>>>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > >>>>>>>>> index aeabf3a..bb646f0 100644
> > >>>>>>>>> --- a/hw/virtio/virtio.c
> > >>>>>>>>> +++ b/hw/virtio/virtio.c
> > >>>>>>>>> @@ -19,6 +19,7 @@
> > >>>>>>>>>    #include "hw/virtio/virtio.h"
> > >>>>>>>>>    #include "qemu/atomic.h"
> > >>>>>>>>>    #include "hw/virtio/virtio-bus.h"
> > >>>>>>>>> +#include "hw/virtio/virtio-access.h"
> > >>>>>>>>>    /*
> > >>>>>>>>>     * The alignment to use between consumer and producer parts of 
> > >>>>>>>>> vring.
> > >>>>>>>>> @@ -546,6 +547,8 @@ void virtio_reset(void *opaque)
> > >>>>>>>>>        virtio_set_status(vdev, 0);
> > >>>>>>>>> +    vdev->is_big_endian = virtio_is_big_endian();
> > >>>>>>>>> +
> > >>>>>>>>>        if (k->reset) {
> > >>>>>>>>>            k->reset(vdev);
> > >>>>>>>>>        }
> > >>>>>>>>> @@ -897,6 +900,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile 
> > >>>>>>>>> *f)
> > >>>>>>>>>        BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
> > >>>>>>>>>        VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> > >>>>>>>>> +    /* NOTE: we assume that endianness is a cpu state AND
> > >>>>>>>>> +     * cpu state is restored before virtio devices.
> > >>>>>>>>> +     */
> > >>>>>>>>> +    vdev->is_big_endian = virtio_is_big_endian();
> > >>>>>>>>> +
> > >>>>>>>>>        if (k->load_config) {
> > >>>>>>>>>            ret = k->load_config(qbus->parent, f);
> > >>>>>>>>>            if (ret)
> > >>>>>>>>> @@ -1153,6 +1161,33 @@ void 
> > >>>>>>>>> virtio_device_set_child_bus_name(VirtIODevice *vdev, char 
> > >>>>>>>>> *bus_name)
> > >>>>>>>>>        }
> > >>>>>>>>>    }
> > >>>>>>>>> +uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
> > >>>>>>>>> +{
> > >>>>>>>>> +    if (vdev->is_big_endian) {
> > >>>>>>>>> +        return tswap16(s);
> > >>>>>>>>> +    } else {
> > >>>>>>>>> +        return bswap16(tswap16(s));
> > >>>>>>>>> +    }
> > >>>>>>>> This looks pretty bogus. When virtio wants to do "tswap" what it
> > >>>>>>>> means is "give me a host endianness value in virtio endianness". 
> > >>>>>>>> How
> > >>>>>>>> about something like
> > >>>>>>>>
> > >>>>>>>> #ifdef HOST_WORDS_BIGENDIAN
> > >>>>>>>>     return vdev->is_big_endian ? s : bswap16(s);
> > >>>>>>>> #else
> > >>>>>>>>     return vdev->is_big_endian ? bswap16(s) : s;
> > >>>>>>>> #endif
> > >>>>>>>>
> > >>>>>>> Actually why doesn't this call virtio_is_big_endian?
> > >>>>>>> As it is, we get extra branches even if target endian-ness
> > >>>>>>> is fixed.
> > >>>>>> Because virtio_is_big_endian() returns the default endianness, not
> > >>>>>> the runtime endianness of a virtio device.
> > >>>> In fact, we should probably rename it accordingly.
> > >>>>
> > >>>>>>>> That should work pretty well inline as well, so you don't need to
> > >>>>>>>> compile virtio.c as target dependent.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Alex
> > >>>>>>> Yes but we'll still need to build two variants: fixed endian and
> > >>>>>>> dynamic endian platforms.
> > >>>>>>> Something along the lines of 32/64 bit split that we have?
> > >>>>>> Why bother? Always make it dynamic and don't change the per-device
> > >>>>>> variable, no? I'd be surprised if the performance difference is
> > >>>>>> measurable. The bulk of the data we transfer gets copied raw anyway.
> > >>>>>>
> > >>>>>>
> > >>>>>> Alex
> > >>>>> This will have to be measured and proved by whoever's proposing the
> > >>>>> patch, not by reviewers.  Platforms such as AMD which don't do
> > >>>>> prediction well would be especially interesting to test on.
> > >>>> Sure, Greg, can you do that? I'm sure Michael has test cases
> > >>>> available he can give you to measure performance on this.
> > >>> Measuring performance is hard ATM.
> > >>>
> > >>> Disk io stress when backed by raw ramdisk in host is usually the easiest
> > >>> way to stress userspace virtio.
> > >>> You need to make sure your baseline is repeateable though:
> > >>> pin down everything from cpu to hardware interrupts,
> > >>> disable power management, c states etc
> > >>> Just taking an average and hoping for the best doesn't cut it.
> > >>>
> > >>>
> > >>> We really should write a benchmark device,
> > >>> to measure just the transport overhead.
> > >>> For example, start with virtio-net but fuse TX and RX
> > >>> VQs together, so you'll get right back whatever you send out.
> > >>>
> > >>> So really, virtio is ATM target-specific and I think it's
> > >>> a good idea to get it working as such first,
> > >>> do extra cleanups like getting rid of target specific code
> > >>> separately.
> > >> How does it help when we keep it target specific? The only way to remove
> > >> any overhead from this refactoring would be to instead of
> > >>
> > >>     if (vdev->is_big_endian)
> > >>
> > >> write it as
> > >>
> > >>     if (virtio_is_big_endian_device(vdev))
> > >>
> > >> which we could define as
> > >>
> > >> static inline bool virtio_is_big_endian_device(VirtIODevice *vdev)
> > >> {
> > >> #if defined(TARGET_PPC)
> > >>     return vdev->is_big_endian
> > >> #elif defined(TARGET_WORDS_BIGENDIAN)
> > >>     return true;
> > >> #else
> > >>     return false;
> > >> #endif
> > >> }
> > >>
> > >> If it makes you happy to do it this way first and move virtio to
> > >> target-independent files later, we can certainly do this.
> > >>
> > >>
> > > This would end up defined in all virtio files with the consequence of
> > > hitting TARGET_WORDS_BIGENDIAN poison for the target independent ones.
> > > Should we kick them out of common-obj variables in makefiles as well ?
> > > Or would it be acceptable to have this helper not inlined ?
> > 
> > That would defeat the purpose - the reason to have the helper inlined is 
> > to remove the conditional branch for x86.
> > 
> 
> Sure but on the other hand, Peter does not like the idea of moving virtio
> to compiled-per-target...


Whenever I do touch hw/virtio/virtio.c I see:

  CC    alpha-softmmu/hw/virtio/virtio.o
  CC    arm-softmmu/hw/virtio/virtio.o
  LINK  alpha-softmmu/qemu-system-alpha
  LINK  arm-softmmu/qemu-system-arm
  CC    aarch64-softmmu/hw/virtio/virtio.o
  LINK  aarch64-softmmu/qemu-system-aarch64
  CC    i386-softmmu/hw/virtio/virtio.o
  LINK  i386-softmmu/qemu-system-i386
  CC    m68k-softmmu/hw/virtio/virtio.o
  LINK  m68k-softmmu/qemu-system-m68k
  CC    mips64el-softmmu/hw/virtio/virtio.o
  LINK  mips64el-softmmu/qemu-system-mips64el
  CC    mips64-softmmu/hw/virtio/virtio.o
  CC    mipsel-softmmu/hw/virtio/virtio.o
  LINK  mips64-softmmu/qemu-system-mips64
  LINK  mipsel-softmmu/qemu-system-mipsel
  CC    mips-softmmu/hw/virtio/virtio.o
  LINK  mips-softmmu/qemu-system-mips
  CC    ppcemb-softmmu/hw/virtio/virtio.o
  CC    ppc64-softmmu/hw/virtio/virtio.o
  LINK  ppcemb-softmmu/qemu-system-ppcemb
  LINK  ppc64-softmmu/qemu-system-ppc64
  CC    ppc-softmmu/hw/virtio/virtio.o
  LINK  ppc-softmmu/qemu-system-ppc
  CC    s390x-softmmu/hw/virtio/virtio.o
  LINK  s390x-softmmu/qemu-system-s390x
  CC    sh4eb-softmmu/hw/virtio/virtio.o
  LINK  sh4eb-softmmu/qemu-system-sh4eb
  CC    sh4-softmmu/hw/virtio/virtio.o
  LINK  sh4-softmmu/qemu-system-sh4
  CC    sparc64-softmmu/hw/virtio/virtio.o
  LINK  sparc64-softmmu/qemu-system-sparc64
  CC    x86_64-softmmu/hw/virtio/virtio.o
  LINK  x86_64-softmmu/qemu-system-x86_64

so it looks like virtio is currently compiled per-target.
So why isn't it reasonable to keep it per-target for
purpose of this enhancement?
What am I missing?

> these look like contradictory requests to me... :-\
> Unless I have missed something, we have to settle whether we favor 
> building/testing
> time or performance of non-{powerpc,arm} targets to have legacy virtio 
> supporting
> LE powerpc and BE arm...
> 
> > 
> > Alex
> > 
> 
> 
> 
> -- 
> Gregory Kurz                                     address@hidden
>                                                  address@hidden
> Software Engineer @ IBM/Meiosys                  http://www.ibm.com
> Tel +33 (0)562 165 496
> 
> "Anarchy is about taking complete responsibility for yourself."
>         Alan Moore.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]