[Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence

From: Ashijeet Acharya
Subject: [Qemu-devel] [PATCH v2 0/7] Refactor DMG driver to have chunk size independence
Date: Thu, 27 Apr 2017 13:36:30 +0530

Previously posted versions of this series:
v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg04641.html

This series provides chunk size independence for the DMG driver, to prevent
denial-of-service when the user accesses untrusted files.

This task is mentioned on the public block ToDo
Here -> http://wiki.qemu.org/ToDo/Block/DmgChunkSizeIndependence

Patch 1 introduces a new data structure to aid caching of random access points
within a compressed stream.

Patch 2 is an extension of patch 1 and introduces a new function to
initialize/update/reset our cached random access point.
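As a rough illustration of what patches 1 and 2 describe, here is a minimal
sketch of a cached random access point with initialize/update/reset helpers.
All names (AccessPoint, access_point_*) and fields are hypothetical and are not
the definitions added to block/dmg.h; the sketch only assumes that a read can
resume from the cache when it targets the same chunk at or beyond the cached
output offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache of one random access point inside a compressed chunk. */
typedef struct AccessPoint {
    int32_t chunk;        /* index of the chunk being decoded, -1 if none */
    uint64_t in_offset;   /* compressed bytes consumed so far */
    uint64_t out_offset;  /* uncompressed bytes produced so far */
} AccessPoint;

/* Reset: forget any cached position. */
static void access_point_reset(AccessPoint *ap)
{
    ap->chunk = -1;
    ap->in_offset = 0;
    ap->out_offset = 0;
}

/* Initialize: start decoding a new chunk from its beginning. */
static void access_point_start(AccessPoint *ap, int32_t chunk)
{
    assert(chunk >= 0);
    ap->chunk = chunk;
    ap->in_offset = 0;
    ap->out_offset = 0;
}

/* Update: record progress after one decompression step. */
static void access_point_advance(AccessPoint *ap,
                                 uint64_t in_bytes, uint64_t out_bytes)
{
    assert(ap->chunk >= 0);
    ap->in_offset += in_bytes;
    ap->out_offset += out_bytes;
}

/* A request can resume from the cache only if it is in the same chunk
 * and does not lie before the cached output position. */
static bool access_point_usable(const AccessPoint *ap,
                                int32_t chunk, uint64_t want_out)
{
    return ap->chunk == chunk && ap->out_offset <= want_out;
}
```

A caller would reset the cache whenever access_point_usable() returns false and
restart decoding of the requested chunk from its first byte.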

Patch 3 limits the output buffer size to a max of 2MB to keep QEMU from
allocating huge amounts of memory.
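The effect of such a cap can be sketched as follows. This is not the code from
the series: MAX_OUT_BUF, next_out_len() and passes_needed() are hypothetical
names, and "2MB" is assumed here to mean 2 MiB. The point is only that a large
chunk is produced in several bounded steps instead of one large allocation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap: 2 MiB of uncompressed output per decompression step. */
#define MAX_OUT_BUF (2ULL * 1024 * 1024)

/* Number of output bytes to produce in the next step, given the chunk's
 * total uncompressed length and how much has been produced already. */
static uint64_t next_out_len(uint64_t chunk_len, uint64_t done)
{
    assert(done <= chunk_len);
    uint64_t remaining = chunk_len - done;
    return remaining < MAX_OUT_BUF ? remaining : MAX_OUT_BUF;
}

/* Number of bounded steps needed to produce the whole chunk. */
static uint64_t passes_needed(uint64_t chunk_len)
{
    uint64_t done = 0, passes = 0;
    while (done < chunk_len) {
        done += next_out_len(chunk_len, done);
        passes++;
    }
    return passes;
}
```

With this scheme the buffer never grows beyond the cap, regardless of the chunk
size recorded in the image.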

Patch 4 is a simple preparatory patch to aid handling of various types of
chunks.

Patches 5 & 6 help to handle various types of chunks.

Patch 7 simply refactors dmg_co_preadv() to read multiple sectors at once.

Patch 8 finally removes the error messages QEMU used to throw when an image with
chunk sizes above 64MB was accessed by the user.

->Testing procedure:
Convert a DMG file to raw format using the "qemu-img convert" tool without these
patches applied.
Next convert the same image again after applying these patches.
Compare the two images using the "qemu-img compare" tool to check if they are
identical.
You can pick up any DMG image from the collection present
Here -> https://lists.gnu.org/archive/html/qemu-devel/2014-12/msg03606.html

->Important note:
These patches assume that the terms "chunk" and "block" are synonyms of each
other when we talk about bz2 compressed streams. Thus according to the bz2
docs[1], the max uncompressed size of a chunk/block can reach 46MB, which is
less than the previously allowed size of 64MB, so we can continue decompressing
the whole chunk/block at once instead of partial decompression, just like we do
now.

This limitation is forced by the fact that bz2 compressed streams do not allow
random access midway through a chunk/block, as the BZ2_bzDecompress() API looks
for the magic key "BZh" before starting decompression.[2] This magic key is
present only at the start of every chunk/block, and since our cached random
access points need not point to the start of a chunk/block, decompression
fails with the error value BZ_DATA_ERROR_MAGIC.[3]
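To make the failure mode concrete, the check below sketches the kind of header
test that rejects a mid-stream resume. It is not libbz2's implementation; the
function name is hypothetical. It relies only on the bzip2 stream header being
the bytes "BZh" followed by a block-size digit from '1' to '9'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if buf begins with a bzip2 stream header: "BZh" followed
 * by a block-size digit '1'..'9'. A decoder handed bytes from the
 * middle of a stream would not find this header. */
static bool looks_like_bz2_stream_start(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 4) {
        return false;
    }
    return buf[0] == 'B' && buf[1] == 'Z' && buf[2] == 'h' &&
           buf[3] >= '1' && buf[3] <= '9';
}
```

A cached access point that lands past the header fails this test, which is why
the bz2 case keeps decompressing from the start of the chunk/block.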

[1] https://en.wikipedia.org/wiki/Bzip2#File_format
[2] https://blastedbio.blogspot.in/2011/11/random-access-to-bzip2.html
[3] http://linux.math.tifr.res.in/manuals/html/manual_3.html#SEC17

Special thanks to Peter Wu for helping me understand and tackle the bz2
compressed chunks.

Changes in v2:
- limit the buffer size to 2MB after fixing the buffering problems (john/fam)

Ashijeet Acharya (7):
  dmg: Introduce a new struct to cache random access points
  dmg: New function to help us cache random access point
  dmg: Refactor and prepare dmg_read_chunk() to cache random access
  dmg: Handle zlib compressed chunks
  dmg: Handle bz2 compressed/raw/zeroed chunks
  dmg: Refactor dmg_co_preadv() to start reading multiple sectors
  dmg: Limit the output buffer size to a max of 2MB

 block/dmg.c | 214 +++++++++++++++++++++++++++++++++++++++---------------------
 block/dmg.h |  10 +++
 2 files changed, 148 insertions(+), 76 deletions(-)


