From: Stuart Yoder
Subject: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Fri, 9 Sep 2011 08:11:54 -0500

Based on the discussions over the last couple of weeks
I have updated the device fd file layout proposal and
tried to specify it a bit more formally.


1.  Overview

  This specification describes the layout of device files
  used in the context of vfio, which gives user space
  direct access to I/O devices that have been bound to vfio.

  When a device fd is opened and read, offset 0x0 contains
  a fixed-size header followed by a number of variable-length
  records that describe different characteristics
  of the device: addressable regions, interrupts, etc.

  0x0  +-------------+-------------+
       |         magic             | u32  // identifies this as a vfio device
       +---------------------------+      // file and identifies the type of bus
       |         version           | u32  // specifies the version of this
       +---------------------------+
       |         flags             | u32  // encodes any flags
       +---------------------------+
       |  dev info record 0        |
       |    type                   | u32   // type of record
       |    rec_len                | u32   // length in bytes of record
       |                           |          (including record header)
       |    flags                  | u32   // type specific flags
       |    ...content...          |       // record content, which could
       +---------------------------+       // include sub-records
       |  dev info record 1        |
       +---------------------------+
       |  dev info record N        |
       +---------------------------+

  The device info records following the file header may have
  the following record types, each with content encoded in
  a record-specific way:

                     | type |
  Record             | num  | Description
  -------------------+------+------------------------------------------
  REGION             |  1   | describes an addressable range for the device
  DTPATH             |  2   | describes the device tree path for the device
  DTINDEX            |  3   | describes the index into the related device tree
                     |      |   property (reg, ranges, interrupts, interrupt-map)
  INTERRUPT          |  4   | describes an interrupt for the device
  PCI_CONFIG_SPACE   |  5   | property identifying a region as PCI config space
  PCI_BAR_INDEX      |  6   | describes the BAR index for a PCI region
  PHYS_ADDR          |  7   | describes the physical address of the region
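
  To make the record layout concrete, here is a rough sketch of how
  user space might walk the top-level records once the file contents
  have been read into a buffer.  Assumptions not stated in this
  proposal: the header is exactly 12 bytes with no padding, every
  record begins with a u32 type followed by a u32 length, and all
  fields are in host byte order.  The names devfd_walk, devfd_rec_fn,
  DEVFD_HEADER_SIZE and DEVFD_REC_MIN are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEVFD_HEADER_SIZE 12u /* magic + version + flags (assumed unpadded) */
#define DEVFD_REC_MIN      8u /* type + rec_len, common to every record */

/* Hypothetical callback, invoked once per top-level record. */
typedef void (*devfd_rec_fn)(uint32_t type, const uint8_t *rec,
                             uint32_t rec_len, void *ctx);

/* Read a host-order u32 from a possibly unaligned position. */
static uint32_t get_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/*
 * Walk the records that follow the header in buf[0..len).
 * Returns the number of records visited, or -1 if a record
 * is truncated or its length field is implausible.
 */
int devfd_walk(const uint8_t *buf, size_t len, devfd_rec_fn fn, void *ctx)
{
    size_t pos = DEVFD_HEADER_SIZE;
    int count = 0;

    assert(buf != NULL);
    if (len < DEVFD_HEADER_SIZE)
        return -1;

    while (pos < len) {
        uint32_t type, rec_len;

        if (len - pos < DEVFD_REC_MIN)
            return -1;
        type = get_u32(buf + pos);
        rec_len = get_u32(buf + pos + 4);

        /* rec_len includes the record header, so it can never be
         * smaller than the common prefix or run past the buffer. */
        if (rec_len < DEVFD_REC_MIN || rec_len > len - pos)
            return -1;

        if (fn)
            fn(type, buf + pos, rec_len, ctx);
        pos += rec_len;
        count++;
    }
    return count;
}
```

  Because rec_len covers the whole record, a reader that does not
  understand a given type can still skip over it, which is what makes
  the format extensible.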

2. Header

The header is located at offset 0x0 in the device fd
and has the following format:

    struct devfd_header {
        __u32 magic;
        __u32 version;
        __u32 flags;
    };

    The 'magic' field contains a magic value that identifies
    the type of bus the device is on.  Valid values are:

        0x70636900   // "pci" - PCI device
        0x64740000   // "dt" - device tree (system bus)
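
The two magic values are just the ASCII bus names packed into the
high-order bytes of the u32.  A small sketch of how a reader might
classify a device from the magic field follows; the enum, the macro
names and both function names are invented here for illustration
and are not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEVFD_MAGIC_PCI 0x70636900u /* 'p' 'c' 'i' '\0' */
#define DEVFD_MAGIC_DT  0x64740000u /* 'd' 't' '\0' '\0' */

/* Hypothetical bus classification derived from the magic field. */
enum devfd_bus {
    DEVFD_BUS_UNKNOWN = 0,
    DEVFD_BUS_PCI,
    DEVFD_BUS_DT
};

enum devfd_bus devfd_bus_from_magic(uint32_t magic)
{
    switch (magic) {
    case DEVFD_MAGIC_PCI:
        return DEVFD_BUS_PCI;
    case DEVFD_MAGIC_DT:
        return DEVFD_BUS_DT;
    default:
        return DEVFD_BUS_UNKNOWN;
    }
}

/*
 * Unpack the magic into a printable tag, most significant byte
 * first.  'out' must have room for 5 bytes (4 characters + NUL).
 */
void devfd_magic_tag(uint32_t magic, char out[5])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (char)((magic >> (24 - 8 * i)) & 0xff);
    out[4] = '\0';
}
```

Passing 0x70636900 to devfd_magic_tag() yields the string "pci",
which can be handy in debug output.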

3. Region

  A REGION record describes an addressable region for the device.

    struct devfd_region {
        __u32 type;   // must be 0x1
        __u32 record_len;
        __u32 flags;
        __u64 offset; // seek offset to region from beginning
                      // of file
        __u64 len;    // length of the region
    };
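
  Since 'offset' is a seek offset from the beginning of the file,
  one plausible way for user space to access a region is pread() at
  offset + n.  The helper below is only a sketch under that
  assumption; devfd_region_read is a made-up name, and whether the
  device fd would actually service reads this way is up to the
  eventual vfio implementation.

```c
#define _XOPEN_SOURCE 700 /* for pread() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read 'count' bytes starting 'off' bytes into a region.
 * region_offset and region_len are the 'offset' and 'len' fields
 * taken from the REGION record.  Returns the pread() result, or
 * -1 with errno set to EINVAL if the access would leave the region.
 */
ssize_t devfd_region_read(int fd, uint64_t region_offset,
                          uint64_t region_len, uint64_t off,
                          void *buf, size_t count)
{
    assert(buf != NULL);

    /* Written so that neither comparison can overflow. */
    if (off > region_len || count > region_len - off) {
        errno = EINVAL;
        return -1;
    }
    return pread(fd, buf, count, (off_t)(region_offset + off));
}
```

  Checking the access against 'len' before issuing the read keeps
  a buggy caller from wandering into a neighboring region.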

  The 'flags' field supports one flag:


4. Device Tree Path (DTPATH)

  A DTPATH record is a sub-record of a REGION and describes
  the path to a device tree node for the region.

    struct devfd_dtpath {
        __u32 type;   // must be 0x2
        __u32 record_len;
        char path[];  // device tree path (field name assumed)
    };

5. Device Tree Index (DTINDEX)

  A DTINDEX record is a sub-record of a REGION and specifies
  the index into the resource list encoded in the associated
  device tree property: "reg", "ranges", "interrupts", or
  "interrupt-map".

    struct devfd_dtindex {
        __u32 type;   // must be 0x3
        __u32 record_len;
        __u32 prop_type;
        __u32 prop_index;  // index into the resource list
    };

    prop_type must have one of the following values:
       1   // "reg" property
       2   // "ranges" property
       3   // "interrupts" property
       4   // "interrupt-map" property

    Note: prop_index is not the byte offset into the property,
    but the logical index.
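
    To illustrate the difference for the "reg" case: in a device
    tree, each "reg" entry occupies (#address-cells + #size-cells)
    32-bit cells, with both counts taken from the parent node.  A
    consumer that needs the byte offset could derive it from
    prop_index roughly as below.  The function name is hypothetical
    and only the "reg" property is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of logical entry 'prop_index' inside a "reg"
 * property.  address_cells and size_cells are the parent node's
 * #address-cells and #size-cells values.
 */
size_t dt_reg_entry_offset(uint32_t prop_index, uint32_t address_cells,
                           uint32_t size_cells)
{
    /* A "reg" entry with zero cells would be meaningless. */
    assert(address_cells + size_cells > 0);

    return (size_t)prop_index * (address_cells + size_cells)
           * sizeof(uint32_t);
}
```

    For example, with #address-cells = 2 and #size-cells = 1 each
    entry is 12 bytes, so prop_index 2 starts at byte offset 24.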

6. Interrupts (INTERRUPT)

  An INTERRUPT record describes one of a device's interrupts.
  The handle field is an argument to VFIO_DEVICE_GET_IRQ_FD
  which user space can use to receive device interrupts.

    struct devfd_interrupts {
        __u32 type;   // must be 0x4
        __u32 record_len;
        __u32 flags;
        __u32 handle;  // parameter to VFIO_DEVICE_GET_IRQ_FD
    };

7.  PCI Config Space (PCI_CONFIG_SPACE)

    A PCI_CONFIG_SPACE record is a sub-record of a REGION record
    and identifies the region as PCI configuration space.

    struct devfd_cfgspace {
        __u32 type;   // must be 0x5
        __u32 record_len;
        __u32 flags;
    };

8.  PCI Bar Index (PCI_BAR_INDEX)

    A PCI_BAR_INDEX record is a sub-record of a REGION record
    and identifies the PCI BAR index for the region.

    struct devfd_barindex {
        __u32 type;   // must be 0x6
        __u32 record_len;
        __u32 flags;
        __u32 bar_index;
    };

9.  Physical Address (PHYS_ADDR)

    A PHYS_ADDR record is a sub-record of a REGION record
    and specifies the physical address of the region.

    struct devfd_physaddr {
        __u32 type;   // must be 0x7
        __u32 record_len;
        __u32 flags;
        __u64 phys_addr;
    };


