qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH 3/4] block: VHDX block driver support


From: Jeff Cody
Subject: Re: [Qemu-devel] [RFC PATCH 3/4] block: VHDX block driver support
Date: Tue, 19 Feb 2013 08:26:35 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Feb 19, 2013 at 10:20:40AM +0100, Stefan Hajnoczi wrote:
> On Mon, Feb 18, 2013 at 06:03:31PM -0500, Jeff Cody wrote:
> > This is preliminary and beta support for the VHDX image format,
> > as specified in:
> >    "VHDX Format Specification v0.95", published 4/12/2012
> >     https://www.microsoft.com/en-us/download/details.aspx?id=29681
> > 
> > This RFC patch has the following limitations
> >     - read-only (no write support yet)
> >     - does not support differencing files
> >     - does not make use of iovec or coroutines
> >     - may likely be broken in novel and interesting ways
> > 
> > I've doen read testing from a dynamic, 127GB VHDX image created
> > using Hyper-V.  The image was partitioned with a F18 install,
> > and the partitions could be mounted and files read.
> > 
> > As this is still under active development, and an RFC, you can
> > assume that any debug print statements will be removed prior
> > to formal patch submission.
> > 
> > Signed-off-by: Jeff Cody <address@hidden>
> > ---
> >  block/vhdx.c | 814 
> > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 814 insertions(+)
> >  create mode 100644 block/vhdx.c
> > 
> > diff --git a/block/vhdx.c b/block/vhdx.c
> > new file mode 100644
> > index 0000000..96f8fc7
> > --- /dev/null
> > +++ b/block/vhdx.c
> > @@ -0,0 +1,814 @@
> > +/*
> > + * Block driver for Hyper-V VHDX Images
> > + *
> > + * Copyright (c) 2013 Red Hat, Inc.,
> > + *
> > + * Authors:
> > + *  Jeff Cody <address@hidden>
> > + *
> > + *  This is based on the "VHDX Format Specification v0.95", published 
> > 4/12/2012
> > + *  by Microsoft:
> > + *      https://www.microsoft.com/en-us/download/details.aspx?id=29681
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU General Public License as published by the 
> > Free
> > + * Software Foundation; either version 2 of the License, or (at your 
> > option)
> > + * any later version.
> > + *
> > + */
> > +
> > +#include "qemu-common.h"
> > +#include "block/block_int.h"
> > +#include "qemu/module.h"
> > +#include "migration/migration.h"
> > +#if defined(CONFIG_UUID)
> > +#include <uuid/uuid.h>
> > +#endif
> 
> uuid is not used in this patch?
>

It is not indeed, these are all MS-specific UIDs.  Deleted!


> > +#include "qemu/crc32c.h"
> > +#include "block/vhdx.h"
> > +
> > +#define vhdx_nop(x) do { (void)(x); } while (0)
> > +
> > +/* Help macros to copy data from file buffers to header
> > + * structures, with proper endianness.  These help avoid
> > + * using packed structs */
> 
> Why not use packed structs?  Then you don't need memcpy/endianness
> macros.  Abusing *_to_cpu() to go both "to" and "from" CPU endianness
> is ugly.

Packed structs are (IMO) worse, and I think should be avoided wherever
possible.  Depending on the platform and structure, it can lead to
unaligned data access, which is either just slower (x86) or worse
(e.g. some arm variants).  Maybe these are all defined well in the
spec, so as to be mostly alignment proof from most architectures, but
I figured it just as well to avoid their usage since we have to touch
each field for alignment (well, I guess only on BE architectures in
the end).

These macros are just copying the headers from disk, so they are not
going "from" CPU endianness, just "to".

> 
> > +/* Do not use directly, see macros below */
> > +#define _hdr_copy(item, buf, size, offset, to_cpu) do { \
> > +    memcpy((item), (buf)+(offset), (size));        \
> > +    to_cpu((item));                                \
> > +    (offset) += (size); } while (0)
> > +
> > +/* for all of these, buf should be a uint8_t buffer */
> > +
> > +/* copy 16-bit header field */
> > +#define hdr_copy16(item, buf, offset) do { \
> > +    _hdr_copy((item), (buf), 2, (offset), (le16_to_cpus)); } while (0)
> > +
> > +/* copy 32-bit header field */
> > +#define hdr_copy32(item, buf, offset) do { \
> > +    _hdr_copy((item), (buf), 4, (offset), (le32_to_cpus)); } while (0)
> > +
> > +/* copy 64-bit header field */
> > +#define hdr_copy64(item, buf, offset) do { \
> > +    _hdr_copy((item), (buf), 8, (offset), (le64_to_cpus)); } while (0)
> > +
> > +/* copy variable-length header field, no endian swapping */
> > +#define hdr_copy(item, buf, size, offset) do { \
> > +    _hdr_copy((item), (buf), (size), (offset), vhdx_nop); } while (0)
> > +
> > +/* copies a defined msguid field, with correct endianness
> > + * a msguid entry has 3 data types with endianness sensitivity,
> > + * followed by a byte array */
> > +#define hdr_copy_guid(item, buf, offset) do {        \
> > +        hdr_copy32(&(item).data1, (buf), (offset));  \
> > +        hdr_copy16(&(item).data2, (buf), (offset));  \
> > +        hdr_copy16(&(item).data3, (buf), (offset));  \
> > +        hdr_copy(&(item).data4, (buf), sizeof((item).data4), (offset)); \
> > +        } while (0)
> > +
> > +
> > +/* Several metadata and region table data entries are identified by
> > + * guids in  a MS-specific GUID format. */
> > +
> > +
> > +/* ------- Known Region Table GUIDs ---------------------- */
> > +static const ms_guid bat_guid =      { .data1 = 0x2dc27766,
> > +                                       .data2 = 0xf623,
> > +                                       .data3 = 0x4200,
> > +                                       .data4 = { 0x9d, 0x64, 0x11, 0x5e,
> > +                                                  0x9b, 0xfd, 0x4a, 0x08} 
> > };
> > +
> > +static const ms_guid metadata_guid = { .data1 = 0x8b7ca206,
> > +                                       .data2 = 0x4790,
> > +                                       .data3 = 0x4b9a,
> > +                                       .data4 = { 0xb8, 0xfe, 0x57, 0x5f,
> > +                                                  0x05, 0x0f, 0x88, 0x6e} 
> > };
> > +
> > +
> > +
> > +/* ------- Known Metadata Entry GUIDs ---------------------- */
> > +static const ms_guid file_param_guid =   { .data1 = 0xcaa16737,
> > +                                           .data2 = 0xfa36,
> > +                                           .data3 = 0x4d43,
> > +                                           .data4 = { 0xb3, 0xb6, 0x33, 
> > 0xf0,
> > +                                                      0xaa, 0x44, 0xe7, 
> > 0x6b} };
> > +
> > +static const ms_guid virtual_size_guid = { .data1 = 0x2FA54224,
> > +                                           .data2 = 0xcd1b,
> > +                                           .data3 = 0x4876,
> > +                                           .data4 = { 0xb2, 0x11, 0x5d, 
> > 0xbe,
> > +                                                      0xd8, 0x3b, 0xf4, 
> > 0xb8} };
> > +
> > +static const ms_guid page83_guid =       { .data1 = 0xbeca12ab,
> > +                                           .data2 = 0xb2e6,
> > +                                           .data3 = 0x4523,
> > +                                           .data4 = { 0x93, 0xef, 0xc3, 
> > 0x09,
> > +                                                      0xe0, 0x00, 0xc7, 
> > 0x46} };
> > +
> > +static const ms_guid logical_sector_guid = {.data1 = 0x8141bf1d,
> > +                                            .data2 = 0xa96f,
> > +                                            .data3 = 0x4709,
> > +                                           .data4 = { 0xba, 0x47, 0xf2, 
> > 0x33,
> > +                                                      0xa8, 0xfa, 0xab, 
> > 0x5f} };
> > +
> > +static const ms_guid phys_sector_guid =  { .data1 = 0xcda348c7,
> > +                                           .data2 = 0x445d,
> > +                                           .data3 = 0x4471,
> > +                                           .data4 = { 0x9c, 0xc9, 0xe9, 
> > 0x88,
> > +                                                      0x52, 0x51, 0xc5, 
> > 0x56} };
> > +
> > +static const ms_guid parent_locator_guid = {.data1 = 0xa8d35f2d,
> > +                                            .data2 = 0xb30b,
> > +                                            .data3 = 0x454d,
> > +                                           .data4 = { 0xab, 0xf7, 0xd3, 
> > 0xd8,
> > +                                                      0x48, 0x34, 0xab, 
> > 0x0c} };
> > +
> > +
> > +
> > +#define META_FILE_PARAMETER_PRESENT      0x01
> > +#define META_VIRTUAL_DISK_SIZE_PRESENT   0x02
> > +#define META_PAGE_83_PRESENT             0x04
> > +#define META_LOGICAL_SECTOR_SIZE_PRESENT 0x08
> > +#define META_PHYS_SECTOR_SIZE_PRESENT    0x10
> > +#define META_PARENT_LOCATOR_PRESENT      0x20
> > +
> > +#define META_ALL_PRESENT    \
> > +    (META_FILE_PARAMETER_PRESENT | META_VIRTUAL_DISK_SIZE_PRESENT | \
> > +     META_PAGE_83_PRESENT | META_LOGICAL_SECTOR_SIZE_PRESENT | \
> > +     META_PHYS_SECTOR_SIZE_PRESENT)
> > +
> > +typedef struct vhdx_metadata_entries {
> > +    vhdx_metadata_table_entry file_parameters_entry;
> > +    vhdx_metadata_table_entry virtual_disk_size_entry;
> > +    vhdx_metadata_table_entry page83_data_entry;
> > +    vhdx_metadata_table_entry logical_sector_size_entry;
> > +    vhdx_metadata_table_entry phys_sector_size_entry;
> > +    vhdx_metadata_table_entry parent_locator_entry;
> > +    uint16_t present;
> > +} vhdx_metadata_entries;
> > +
> > +
> > +typedef struct BDRVVHDXState {
> > +    CoMutex lock;
> > +
> > +    int curr_header;
> > +    vhdx_header * headers[2];
> > +
> > +    vhdx_region_table_header rt;
> > +    vhdx_region_table_entry bat_rt;         /* region table for the BAT */
> > +    vhdx_region_table_entry metadata_rt;    /* region table for the 
> > metadata */
> > +    vhdx_region_table_entry *unknown_rt;
> > +    unsigned int unknown_rt_size;
> > +
> > +    vhdx_metadata_table_header  metadata_hdr;
> > +    vhdx_metadata_entries metadata_entries;
> > +
> > +    vhdx_file_parameters params;
> > +    uint32_t block_size_bits;
> > +    uint32_t sectors_per_block;
> > +    uint32_t sectors_per_block_bits;
> > +
> > +    uint64_t virtual_disk_size;
> > +    uint32_t logical_sector_size;
> > +    uint32_t physical_sector_size;
> > +
> > +    uint64_t chunk_ratio;
> > +    uint32_t chunk_ratio_bits;
> > +    uint32_t logical_sector_size_bits;
> > +
> > +    uint32_t bat_entries;
> > +    vhdx_bat_entry *bat;
> > +
> > +    /* TODO */
> > +
> > +} BDRVVHDXState;
> > +
> > +/* CRC-32C, Castagnoli polynomial, code 0x11EDC6F41 */
> > +static uint32_t vhdx_checksum(uint8_t *buf, size_t size)
> > +{
> > +    uint32_t chksum;
> > +    chksum =  crc32c(0xffffffff, buf, size);
> > +
> > +    return chksum;
> > +}
> > +
> > +/* validates the checksum of a region of table entries, substituting zero
> > + * in for the in-place checksum field of the region.
> > + * buf: buffer to compute crc32c over,
> > + * size: size of the buffer to checksum
> > + * crc_offset: offset into buf that contains the existing 4-byte checksum
> > + */
> > +static int vhdx_validate_checksum(uint8_t *buf, size_t size, int 
> > crc_offset)
> > +{
> > +    uint32_t crc_orig;
> > +    uint32_t crc;
> > +
> > +    assert(buf != NULL);
> > +    assert(size > (crc_offset+4));
> > +
> > +    memcpy(&crc_orig, buf+crc_offset, sizeof(crc_orig));
> > +    memset(buf+crc_offset, 0, sizeof(crc_orig));
> > +
> > +    crc = vhdx_checksum(buf, size);
> > +
> > +    memcpy(buf+crc_offset, &crc_orig, sizeof(crc_orig));
> > +
> > +    crc_orig = le32_to_cpu(crc_orig);
> > +    return crc == crc_orig ? 0 : 1;
> 
> This function should return bool.  Then you can also drop the tristate
> operator (although callers need to invert their logic so true ==
> success and false == failure).

OK.  I could just invert the logic here (i.e. return crc != crc_orig),
if we want to keep success as 0/false and failure as true/!0.
> 
> > +}
> > +
> > +/*
> > + * Per the MS VHDX Specification, for every VHDX file:
> > + *      - The header section is fixed size - 1 MB
> > + *      - The header section is always the first "object"
> > + *      - The first 64KB of the header is the File Identifier
> > + *      - The first uint64 (8 bytes) is the VHDX Signature ("vhdxfile")
> > + *      - The following 512 bytes constitute a UTF-16 string identifiying 
> > the
> > + *        software that created the file, and is optional and diagnostic 
> > only.
> > + *
> > + *  Therefore, we probe by looking for the vhdxfile signature "vhdxfile"
> > + */
> > +static int vhdx_probe(const uint8_t *buf, int buf_size, const char 
> > *filename)
> > +{
> > +    if (buf_size >= 8 && !strncmp((char *)buf, "vhdxfile", 8)) {
> 
> Dropping const here to fit within 80-char line limit?  Declare a local
> variable instead to keep const.

OK.

> 
> > +        return 100;
> > +    }
> > +    return 0;
> > +}
> > +
> > +
> > +static void vhdx_fill_header(vhdx_header *h, uint8_t *buffer)
> > +{
> > +    int offset = 0;
> > +    assert(h != NULL);
> > +    assert(buffer != NULL);
> > +
> > +    /* use memcpy to avoid unaligned data read */
> > +    hdr_copy32(&h->signature,       buffer, offset);
> > +    hdr_copy32(&h->checksum,        buffer, offset);
> > +    hdr_copy64(&h->sequence_number, buffer, offset);
> > +
> > +    hdr_copy_guid(h->file_write_guid, buffer, offset);
> > +    hdr_copy_guid(h->data_write_guid, buffer, offset);
> > +    hdr_copy_guid(h->log_guid, buffer, offset);
> > +
> > +    hdr_copy16(&h->log_version,     buffer,  offset);
> > +    hdr_copy16(&h->version,         buffer,  offset);
> > +    hdr_copy32(&h->log_length,      buffer,  offset);
> > +    hdr_copy64(&h->log_offset,      buffer,  offset);
> > +}
> > +
> > +
> > +/* opens the specified header block from the VHDX file header section */
> > +static int vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s)
> > +{
> > +    int ret = 0;
> > +    vhdx_header *header1;
> > +    vhdx_header *header2;
> > +    uint64_t h1_seq = 0;
> > +    uint64_t h2_seq = 0;
> > +    uint8_t *buffer;
> > +
> > +    header1 = g_malloc(sizeof(vhdx_header));
> > +    header2 = g_malloc(sizeof(vhdx_header));
> > +
> > +    buffer = g_malloc(VHDX_HEADER_SIZE);
> > +
> > +    s->headers[0] = header1;
> > +    s->headers[1] = header2;
> > +
> > +    ret = bdrv_pread(bs->file, VHDX_HEADER1_OFFSET, buffer, 
> > VHDX_HEADER_SIZE);
> > +    if (ret < 0) {
> > +        goto fail;
> > +    }
> > +    vhdx_fill_header(header1, buffer);
> > +
> > +    if (vhdx_validate_checksum(buffer, VHDX_HEADER_SIZE, 4) == 0 &&
> > +        header1->signature == VHDX_HDR_MAGIC) {
> > +        printf("header1 is valid!\n");
> > +        h1_seq = header1->sequence_number;
> > +    }
> > +
> > +    ret = bdrv_pread(bs->file, VHDX_HEADER2_OFFSET, buffer, 
> > VHDX_HEADER_SIZE);
> > +    if (ret < 0) {
> > +        goto fail;
> > +    }
> > +    vhdx_fill_header(header2, buffer);
> > +
> > +    if (vhdx_validate_checksum(buffer, VHDX_HEADER_SIZE, 4) == 0 &&
> > +        header2->signature == VHDX_HDR_MAGIC) {
> > +        printf("header2 is valid!\n");
> > +        h2_seq = header2->sequence_number;
> > +    }
> > +
> > +    if (h1_seq > h2_seq) {
> > +        s->curr_header = 0;
> > +    } else if (h2_seq > h1_seq) {
> > +        s->curr_header = 1;
> > +    } else {
> > +        printf("NO VALID HEADER\n");
> > +        ret = -1;
> 
> Missing goto fail?

Oops... thanks.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]