Re: RFC: du reports a 1.2PB file on a 1TB btrfs disk
From: Kaz Kylheku (Coreutils)
Subject: Re: RFC: du reports a 1.2PB file on a 1TB btrfs disk
Date: Wed, 11 Mar 2020 04:29:24 -0700
User-agent: Roundcube Webmail/0.9.2
On 2020-03-10 21:31, Jim Meyering wrote:
> On Tue, Mar 10, 2020 at 12:24 PM Kaz Kylheku (Coreutils)
> <address@hidden> wrote:
>> On 2020-03-10 11:52, Jim Meyering wrote:
>>> Otherwise, du provides no way of seeing how much of the actual disk
>>> space is being used by such FS-compressed files.
>>
>> If you stat the file, what are the values of st_size, st_blksize and
>> st_blocks?
>
> That particular file is long gone, but I've just created a 1.8T file
> on a 700G file system.
> Before I began this experiment, "Avail" was 524G, so it appears to
> occupy about 60G actual space.

Sorry; forget I mentioned st_blksize; I forgot that st_blocks is
measured in 512 byte blocks regardless of st_blksize.

> FTR, I created the file by running this:
>   yes $(printf '%065535d\n' 0) > big
>
> $ stat big
> File: big
> Size: 1957123607586 Blocks: 3822507048 IO Block: 4096 regular file

So here, the Blocks value (coming from st_blocks) doesn't tell us
anything the size doesn't; if we multiply it by 512, we get
1957123608576, which is just the size rounded up to a 4096 byte block.
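As a quick check of that arithmetic, here is a minimal shell sketch using
the two numbers from the stat output above. The variable names are only
illustrative, and the 4096 is taken from the "IO Block" field:

```sh
size=1957123607586      # st_size from the stat output
blocks=3822507048       # st_blocks, counted in 512-byte units
allocated=$((blocks * 512))
rounded=$(( (size + 4095) / 4096 * 4096 ))   # size rounded up to 4096
echo "$allocated $rounded"
# prints: 1957123608576 1957123608576
```

The two values agree, so in this case st_blocks carries no more
information than st_size does.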
The underlying FS can use the st_blocks value to indicate the actual
storage. For instance, if I do this on ext4:
# dd of=file seek=$((1024 * 1024)) count=1 if=/dev/zero
Then:
# du -h file
12K file
# du --apparent-size -h file
513M file
The 12K disk usage figure comes from the st_blocks information in the
stat structure (24 * 512 = 12288 bytes); the apparent size is st_size:
# stat file
File: `file'
Size: 536871424 Blocks: 24 IO Block: 4096 regular file
Device: 902h/2306d Inode: 1624448 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-11 04:22:26.000000000 -0700
Modify: 2020-03-11 04:22:26.000000000 -0700
Change: 2020-03-11 04:22:26.000000000 -0700
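Both du figures can be reproduced by hand from this stat output. A small
sketch, assuming only that st_blocks is in 512-byte units and that du -h
rounds up to the next unit (which is consistent with the 513M shown);
the variable names are illustrative:

```sh
size=536871424   # st_size from the stat output
blocks=24        # st_blocks from the stat output
# dd wrote one 512-byte block after seeking past 1024*1024 blocks:
echo "expected size: $(( (1024 * 1024 + 1) * 512 ))"      # 536871424
# What plain du reports, from st_blocks:
echo "disk usage:    $(( blocks * 512 / 1024 ))K"          # 12K
# What du --apparent-size reports, from st_size, rounded up to MiB:
echo "apparent size: $(( (size + 1048575) / 1048576 ))M"   # 513M
```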
The issue you are seeing here is that btrfs should probably be
publishing an st_blocks value that matches the actual storage,
accounting for sparseness and compression, and not just a repetition
of the size, rounded up to a block and quoted in 512 byte units.
The fidelity of the du output is only as good as what is in stat.
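To look at the two numbers side by side for any file, something like the
following can be used. This is only a sketch: alloc_vs_apparent is a
made-up helper name, and it assumes GNU stat, where %b is st_blocks, %B
is the size in bytes of each %b unit, and %s is st_size:

```sh
# Print allocated bytes (st_blocks based) next to apparent bytes (st_size).
# Assumes GNU stat format sequences %b, %B and %s.
alloc_vs_apparent() {
    stat -c '%b %B %s' -- "$1" | {
        read -r blocks unit size
        echo "allocated=$((blocks * unit)) apparent=$size"
    }
}

# Example: alloc_vs_apparent big
```

If the file system just echoes the rounded-up size in st_blocks, the two
values will be nearly identical no matter how little is actually stored.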