btrfs/ZFS and file read hooks

From: Colin Watson
Subject: btrfs/ZFS and file read hooks
Date: Fri, 4 Feb 2011 17:53:00 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

This report relates to the 'butter' branch of GRUB, not yet merged into

When your environment block (normally /boot/grub/grubenv) is on btrfs,
save_env will fail with "sparse files not allowed".  This error message
is misleading: save_env can't save the environment block because btrfs
doesn't implement read hooks, so it doesn't know where on disk to
write.  I expect that similar issues affect people trying to
install GRUB to a partition containing a btrfs filesystem, and I also
expect that GRUB's ZFS implementation has similar problems.

I looked into fixing this, but it was sufficiently non-trivial that I
thought it best to ask here.  Read hooks are currently designed for the
simple case where a file spans a number of blocks on a single disk, with
each byte of the file being in exactly one block.  With btrfs, a file
may span blocks on multiple disks, and may be redundantly spread across
multiple physical blocks.  Since we're using a filesystem abstraction
rather than a disk abstraction (which is correct, I think), this means
that any calling code that uses a file's disk directly may have to
become aware of multi-device filesystems.

There are the following users of file read hooks right now, with their

 * grub-core/commands/blocklist.c: general idea is that this prints a
   blocklist you can use to refer to a file in "raw" ways such as
   chainloading, so maybe this ought to print any of the blocklists you
   might use?
 * grub-core/commands/loadenv.c: save_env needs to be able to write back
   to all of the redundant blocks occupied by the environment block file
 * grub-core/commands/testload.c: trivial, just prints progress dots
 * util/grub-setup.c: needs a single linear blocklist from a single disk
   that it can encode into a boot sector installed on that same disk

As a strawman, we might simply extend the read hook interface with an
extra argument passing the disk being read from (note that in this case
the "disk" may actually be a partition).  This would allow callers that
need it to keep track of which disk each sector/offset/length triplet
refers to.  The callers would then do the following:

 * blocklist: keep a list for each disk and print them separately,
   perhaps one per line
 * loadenv: keep a list for each disk, and write back to each list that
   doesn't fail the malformedness/sparseness tests
 * testload: probably no changes required
 * grub-setup: ignore anything not referring to the disk it's installing
   the boot sector to, and check that the result is consistent
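
To make the strawman concrete, here is a rough sketch of what a caller
might do with a disk-aware hook.  Everything named sketch_* is
hypothetical and is not GRUB's actual interface; in particular the
'data' pointer for caller state is my own assumption, added only so the
example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not GRUB's types. */
struct sketch_disk
{
  const char *name;
};

struct sketch_extent
{
  const struct sketch_disk *disk;   /* disk (or partition) that was read */
  uint64_t sector;
  unsigned offset;
  unsigned length;
};

#define SKETCH_MAX_EXTENTS 16

struct sketch_blocklist
{
  struct sketch_extent extents[SKETCH_MAX_EXTENTS];
  size_t count;
  int overflow;                     /* set if more extents arrived than fit */
};

/* Extended read hook: the usual sector/offset/length triplet plus the
   disk it refers to.  The 'data' argument is an assumption of this sketch. */
static void
sketch_record_extent (const struct sketch_disk *disk, uint64_t sector,
                      unsigned offset, unsigned length, void *data)
{
  struct sketch_blocklist *list = data;

  if (list->count >= SKETCH_MAX_EXTENTS)
    {
      list->overflow = 1;
      return;
    }
  list->extents[list->count].disk = disk;
  list->extents[list->count].sector = sector;
  list->extents[list->count].offset = offset;
  list->extents[list->count].length = length;
  list->count++;
}

/* grub-setup style filter: count only the extents on the target disk,
   ignoring anything that refers to another device. */
static size_t
sketch_count_on_disk (const struct sketch_blocklist *list,
                      const struct sketch_disk *target)
{
  size_t i, n = 0;

  for (i = 0; i < list->count; i++)
    if (list->extents[i].disk == target)
      n++;
  return n;
}
```

blocklist and loadenv would walk the same recorded list grouped by
disk rather than filtering it down to one.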

The btrfs and ZFS implementations could then just set the disk read hook
as normal when reading part of a file from a disk.  The remaining
complication is that we probably don't want to always read multiple
copies of a file's data, as it would be significantly slower when
reading large files such as the kernel; but blocklist, loadenv, and
grub-setup would need to either force a complete multi-device read, or
do a "fake" read that merely calls the disk read hook (but I think that
would actually be more intrusive to the existing code).  The simplest
answer would probably be to add an extra 'read_all' member alongside
read_hook in 'struct grub_file' that causes multi-device filesystem
implementations to call grub_disk_read on all copies of a block; they
would only emit a failure if none of the copies of a block could be
successfully read.
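
A minimal sketch of the 'read_all' behaviour I have in mind follows.
The types are hypothetical stand-ins rather than the real
'struct grub_file', and the 'readable' flag simply simulates whether
grub_disk_read would have succeeded on that copy.

```c
#include <stddef.h>

/* One physical copy of a logical block (hypothetical type). */
struct sketch_copy
{
  int disk_id;
  unsigned long long sector;
  int readable;                 /* simulates grub_disk_read succeeding */
};

typedef void (*sketch_hook_t) (int disk_id, unsigned long long sector,
                               void *data);

/* Stand-in for the relevant members of a file handle. */
struct sketch_file
{
  sketch_hook_t read_hook;
  void *hook_data;
  int read_all;                 /* non-zero: visit every copy of a block */
};

/* Read one logical block.  Returns 0 if at least one copy was read,
   -1 if none of the copies could be read. */
static int
sketch_read_block (const struct sketch_file *file,
                   const struct sketch_copy *copies, size_t ncopies)
{
  size_t i;
  int ok = 0;

  for (i = 0; i < ncopies; i++)
    {
      if (!copies[i].readable)
        continue;               /* this copy failed; try the next one */
      if (file->read_hook)
        file->read_hook (copies[i].disk_id, copies[i].sector,
                         file->hook_data);
      ok = 1;
      if (!file->read_all)
        break;                  /* normal reads stop at the first good copy */
    }
  return ok ? 0 : -1;
}

/* Example hook that just counts how many copies were visited. */
static void
sketch_count_hook (int disk_id, unsigned long long sector, void *data)
{
  (void) disk_id;
  (void) sector;
  ++*(int *) data;
}
```

With read_all unset, a large read such as the kernel touches only one
copy per block; blocklist, loadenv, and grub-setup would set it so that
the hook sees every copy.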

There's a further obstacle, in that btrfs and ZFS have on-disk
checksums, and of course there are things like compression and even
encryption.  We probably ought to call the read hook only in cases where
we know how to write back to the file.  I'm not sure exactly what this
will involve yet.

Before I start implementing this, does that sound like a reasonable plan
of attack?

Colin Watson                                       address@hidden
