qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [PATCH V14 0/5] VMXNET3 paravirtual NIC device implementati


From: Dmitry Fleytman
Subject: [Qemu-devel] [PATCH V14 0/5] VMXNET3 paravirtual NIC device implementation
Date: Sat, 9 Mar 2013 11:21:01 +0200

This set of patches implements VMWare VMXNET3 paravirtual NIC device.
The device supports of all the device features including offload capabilties,
VLANs and etc.
The device is tested on different OSes:
    Fedora 15
    Ubuntu 10.4
    Centos 6.2
    Windows 2008R2
    Windows 2008 64bit
    Windows 2008 32bit
    Windows 2003 64bit
    Windows 2003 32bit

Changes in V14:
   QOM-related fixes

Changes in V13:
   Licensing of vmxnet3.h changed to match original
   VMWare's header file license

Changes in V12:
   Reported-by: Andreas Farber <address@hidden>
   QOM-related fixes 

Changes in V11:
   Rebased to master HEAD to fix multiqueue compilation issues

Changes in V10:
   Reported-by: Stefan Hajnoczi <address@hidden>
   Issues reported by Stefan Hajnoczi fixed.

Changes in V9:
   Reported-by: Stefan Hajnoczi <address@hidden>
   Issues reported by Stefan Hajnoczi fixed.

Changes in V8:
   Reported-by: Stefan Hajnoczi <address@hidden>
   Issues reported by Stefan Hajnoczi reviewed and mostly fixed:

> +        }
> +        curr_src_off += src[i].iov_len;
> +    }
> +    return j;
> +}

The existing iov_copy() function provides equivalent functionality.  I
don't think iov_rebuild() is needed.

[DF] Done. Thanks, missed it.


> +            size -= len;
> +        }
> +        iovec_off += iov[i].iov_len;
> +    }
> +    return res;
> +}

Rename this net_checksum_add_iov() and place it in net/checksum.c,
then the new dependency on net from block can be dropped.

[DF] Done. 

> +vmw_shmem_read(hwaddr addr, void *buf, int len)
>  {
>      VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf);
>      cpu_physical_memory_read(addr, buf, len);
>  }

All changes to this file should be squashed with the previous patch.

[DF] Done

> +#ifdef VMXNET_DEBUG_SHMEM_ACCESS
> +#define VMW_SHPRN(fmt, ...)                                                  
>  \
> +    do {                                                                     
>  \
> +        printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__,      
>  \
> +            ## __VA_ARGS__);                                                 
>  \
> +    } while (0)
> +#else
> +#define VMW_SHPRN(fmt, ...) do {} while (0)
> +#endif

Please use QEMU tracing.  It eliminates all this boilerplate and
conditional compilation.  Tracing can be enabled/disabled at runtime
and works with SystemTap/DTrace.  See docs/tracing.txt.

[DF] We'd like to stick with compile time logic in this case becase of 2 
reasons:
[DF] 1. These printouts are intended for reverse engineering/development only 
and there is
[DF]    no need to enable them at run time
[DF] 2. There is a big number of printouts, all driver-device communication is 
traced,
[DF]    they hit performance even on strongest x86 in case of run-time logic


> +struct eth_header {
> +    uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> +    uint8_t  h_source[ETH_ALEN]; /* source ether addr    */
> +    uint16_t h_proto;            /* packet type ID field */
> +};

Looks like it's copy-pasted stuff from /usr/include/linux/if_*.h,
/usr/include/netinet/*.h, and friends.  If the system-wide headers are
included names will collide for some of the macros at least.

Did you check if the slirp/ definitions can be reused?

[DF] Yes, you are right. This is copy-pasted from different places.
[DF] Slips definishing do not fully cover our needs.


I'd rather we import network header definitions once in a generic
place into the source tree.  That way vmxnet and other components
don't need to redefine these structs.

[DF] Exaclty! Our intention is to create generic header with network 
definitions and make everyone use it.
[DF] We can move our header to some shared place if you want, however I'd do it 
in parallel with cleanup
[DF] of similar definitions in existing code and this is a big change that os 
out of scope of these patches.

> + 
> *===========================================================================*/

Is this huge comment box a sign that the code should be split into a
foo_tx.c and an foo_rx.c file?

[DF] As for me this file is not that big to be splitted (<800 lines), however 
I'll do this if you insist :)

> +size_t vmxnet_tx_pkt_send(VmxnetTxPktH pkt, NetClientState *vc)

'vc' is an old name that was used for VLANClientState.  The struct has
since been renamed to NetClientState and the rest of QEMU uses 'nc'
instead of 'vc'.

[DF] Fixed. Thanks.

> +/* tx module context handle */
> +typedef void *VmxnetTxPktH;

Forward-declaring the struct is nicer:

typedef struct VmxnetTxPkt VmxnetTxPkt;

The definition of VmxnetTxPkt is still hidden from the caller but you
avoid the void* and casting.  In vmxnet_pkt.c define using:

struct VmxnetTxPkt {
    ...
};

[DF] Agree, fixed. Thanks.


> diff --git a/hw/vmxnet_utils.h b/hw/vmxnet_utils.h
> index 7fd9a01..fac3b7b 100644
> --- a/hw/vmxnet_utils.h
> +++ b/hw/vmxnet_utils.h

Please squash these fixes into the previous patch.


[DF] Oops, my bad. Fixed.


> +        uint8_t msix_used;
> +        /* Whether MSI support was installed successfully */
> +        uint8_t msi_used;

These two fields should be bool.

> +        /* Whether automatic interrupts masking enabled */
> +        uint8_t auto_int_masking;

bool

[DF] Done.


> +static inline void vmxnet3_flush_shmem_changes(void)
> +{
> +    /*
> +     * Flush shared memory changes
> +     * Needed before sending interrupt to guest to ensure
> +     * it gets consistent memory state
> +     */
> +    smp_wmb();
> +}

It's useful to document why a memory barrier is being used in each
instance.  Therefore hiding smp_wmb() inside a wrapper function isn't
great.

[DF] Fixed.

Also, it's suspicious that smb_wmb() is used but no other barriers are
used.  What about a read memory barrier when accessing shared memory
written by the guest?

[DF] VMWARE interface build a in safe way - you always read out data buffer 
(packet descriptor) with memcopy,
[DF] and then check its last byte to see whether it was fully filled by driver.
[DF] In this case device doesn't need read barriers. Drivers need write 
barriers of course to secure
[DF] proper order of writes.

Changes in V7:

   Reported-by: Michael S. Tsirkin <address@hidden>
   Issues reported by Michael S. Tsirkin reviewed and mostly fixed:

File vmware_utils.h:

...

> +static inline void
> +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value)
> +{
> +    VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value);
> +    stq_le_phys(addr, value);
> +}
> +

Pls remove these wrappers.  These are just memory stores. Our codebase
is too large as it is without every driver wrapping all standard calls.

[DF] Idea behind this macros is to have printout in each of them
[DF] Printouts are really needed when reverse-engineering windows guest drivers
[DF] Since there is no windows drivers code this is the only way to understand
[DF] sequences they use

> +/* MACROS for simplification of operations on array-style registers */

UPPERCASE ABUSE

[DF] Fixed

> +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize)                 \
> +    (((addr + 1) > (base)) && ((addr) < (base) + (cnt) * (regsize)))


Same as range_covers_byte(base, cnt * regsize, addr)?

[DF] Fixed, thanks

> +
> +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize)                  \
> +    (((addr) - (base)) / (regsize))
> +

Above two macros is all that's left. No objection but it does not say
what they do - want to add minimal documentation?
And please prefix with VMWARE_ or something.

[DF] Prefix added, macros documented

File vmxnet_utils.h (and related with the same problems):

...
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef _VMXNET_UTILS_H_
> +#define _VMXNET_UTILS_H_

Please do not start identifiers with _ followed by an uppercase
lattters.

[DF] Fixed

...
> +
> +struct eth_header {
> +    uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> +    uint8_t  h_source[ETH_ALEN]; /* source ether addr    */
> +    uint16_t h_proto;            /* packet type ID field */
> +};
> +

And fix struct definitions to
1. follow qemu coding style
[DF] It's not clear what's wrong with the coding style. Please, elaborate.
2. start with vmxnet
[DF] Since this is generic network definitions with no device specifics
[DF] I'm not sure it makes sense to add vmxnet prefix.
[DF] I'd say it makes sense to put them into some generic header files under 
"net" directory instead
[DF] and clean up other devices that have their local definitions of the same 
structures.
[DF] Please, advise.


I also don't really understand why are these
functions split out - vmxnet is the only user, no?

[DF] Currently vmxnet3 is the only user, hovewer we have vmxnet2 implementation 
as well (not submitted yet)
[DF] and it uses the same header

File vmxnet_pkt.h:

...
> +
> +/* tx module context handle */
> +typedef void *VmxnetTxPktH;

This gets you zero type safety and makes debugging impossible.
Just use pointers like everyone does.

[DF] Handle type is added to hide VmxnetTxPkt structure into .c header
[DF] for better encapsulation.
[DF] Should we drop it anyway?

Files vmxnet3.*:

...
> +
> +    if (zero_region) {
> +        vmw_shmem_set(pa, 0, size*cell_size);

spaces around *

[DF] Fixed

...
> +#define vmxnet3_ring_dump(macro, ring_name, ridx, r)                         
> \
> +    macro("%s#%d: base %" PRIx64 " size %lu cell_size %lu gen %d next %lu",  
> \
> +          (ring_name), (ridx),                                               
> \
> +          (r)->pa, (r)->size, (r)->cell_size, (r)->gen, (r)->next)

make macros upper case

[DF] Fixed

...
> +static inline void vmxnet3_flush_shmem_changes(void)
> +{
> +    /*
> +     * Flush shared memory changes
> +     * Needed before transferring control to guest

what does 'transferring control to guest' mean?
[DF] Changed to 
[DF]    /*
[DF]     * Flush shared memory changes
[DF]     * Needed before sending interrupt to guest to ensure
[DF]     * it gets consistent memory state
[DF]     */

...
> +     */
> +    smp_wmb();
> +}

Don't use wrappers like this. They just hide bugs. For example
it's not helpful before an interrupt in the function below.

[DF] I guess you are talking about vmxnet3_complete_packet()
[DF] Strictly speaking barrier is a must because we change shared memory in 
[DF] vmxnet3_complete_packet()

[DF] And the wrapper is a good thing because its name explains its effect
[DF] in a formal way as opposed to comments

...
> +    switch (status) {
> +    case VMXNET3_PKT_STATUS_OK: {

don't put {} around cases: they align incorrectly
if it's too big move to a function.

[DF] Fixed

...
> +static bool
> +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx)
> +{
> +    size_t bytes_sent = 0;
> +    bool res = true;

why = true? don't initialize just because.

[DF] Fixed

...
> +/*
> + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding
> + * standards in sense of types and defines used.
> + * Since we didn't want to change VMWARE code, following set of typedefs
> + * and defines needed to compile these headers with QEMU introduced.
> + */

No need for this now.
You can export headers and put them under linux-headers.

[DF] Not sure it is possible because the header as-is is not stand-alone and 
won't compile
[DF] without changes. We extracted definitions we use from their header and 
dropped unused
[DF] and kernel-specific stuff.
[DF} Please, advise.

...

> +        if (VMXNET3_OM_TSO == s->offload_mode) {

Don't do Yoda style like this

[DF] "Yoda" style removed everywhere


Changes in V6:
   Fixed most of problems pointed out by Michael S. Tsirkin
   The only issue still open is creation of shared place
   with generic network structures and functions. Currently
   all generic network code introduced by VMXNET3 resides in
   vmxnet_utils.c/h files. It could be moved to some shared location however
   we believe it is a matter of separate refactoring as there are a lot of 
copy-pasted
   definitions in almost every device and code cleanup efforts requred in order
   to create truly shared codebase.

     Reported-by: Michael S. Tsirkin <address@hidden>

   Implemented suggestions by Anthony Liguori 

     Reported-by: Anthony Liguori <address@hidden>

   Fixed incorrect checksum caclulation for some packets in SW offloads mode

     Reported-by: Gerhard Wiesinger <address@hidden>

Changes in V5:
   MSI-X save/load implemented in the device instead of pci bus as
   suggested by Michael S. Tsirkin

     Reported-by: Michael S. Tsirkin <address@hidden>

   Patches regrouped as suggested by Paolo Bonzini

     Reported-by: Paolo Bonzini <address@hidden>

Changes in V4:
   Fixed a few problems uncovered by NETIO test suit
   Assertion on failure to initialize MSI/MSI-X replaced with warning 
   message and fallback to Legacy/MSI respectively   

     Reported-by: Gerhard Wiesinger <address@hidden>

   Various coding style adjustments and patch split-up as suggested by Anthony 
Liguori
     
     Reported-by: Anthony Liguori <address@hidden>

   Live migration support added

Changes in V3:
   Fixed crash when net device that is used as network fronted has no
   virtio HDR support.
   Task offloads emulation for cases when net device that is used as 
   network fronted has no virtio HDR support.

     Reported-by: Gerhard Wiesinger  <address@hidden>

Changes in V2:
   License text changed accoring to community suggestions
   Standard license header from GPLv2+ - licensed QEMU files used

Dmitry Fleytman (5):
  Checksum-related utility functions
  net: iovec checksum calculator
  Common definitions for VMWARE devices
  Packet abstraction for VMWARE network devices
  VMXNET3 device implementation

 default-configs/pci.mak |    1 +
 hw/Makefile.objs        |    2 +
 hw/pci/pci.h            |    1 +
 hw/vmware_utils.h       |  143 +++
 hw/vmxnet3.c            | 2461 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/vmxnet3.h            |  760 +++++++++++++++
 hw/vmxnet_debug.h       |  115 +++
 hw/vmxnet_rx_pkt.c      |  187 ++++
 hw/vmxnet_rx_pkt.h      |  174 ++++
 hw/vmxnet_tx_pkt.c      |  567 +++++++++++
 hw/vmxnet_tx_pkt.h      |  148 +++
 include/net/checksum.h  |   26 +-
 include/net/eth.h       |  347 +++++++
 net/Makefile.objs       |    1 +
 net/checksum.c          |   42 +-
 net/eth.c               |  217 +++++
 16 files changed, 5185 insertions(+), 7 deletions(-)
 create mode 100644 hw/vmware_utils.h
 create mode 100644 hw/vmxnet3.c
 create mode 100644 hw/vmxnet3.h
 create mode 100644 hw/vmxnet_debug.h
 create mode 100644 hw/vmxnet_rx_pkt.c
 create mode 100644 hw/vmxnet_rx_pkt.h
 create mode 100644 hw/vmxnet_tx_pkt.c
 create mode 100644 hw/vmxnet_tx_pkt.h
 create mode 100644 include/net/eth.h
 create mode 100644 net/eth.c

-- 
1.8.1.2




reply via email to

[Prev in Thread] Current Thread [Next in Thread]