bug#16361: compile cache confused about file identity

From: Mark H Weaver
Subject: bug#16361: compile cache confused about file identity
Date: Wed, 01 Oct 2014 15:22:58 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

tags 16361 + notabug wontfix
close 16361

Zefram <address@hidden> writes:

> The automatic cache of compiled versions of scripts in guile-2.0.9
> identifies scripts mainly by name, and partially by mtime.  This is not
> actually sufficient: it is easily misled by a pathname that refers to
> different files at different times.  Test case:
> $ echo '(display "aaa\n")' >t13
> $ echo '(display "bbb\n")' >t14
> $ guile-2.0 t13
> ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
> ;;;       or pass the --no-auto-compile argument to disable.
> ;;; compiling /home/zefram/usr/guile/t13
> ;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
> aaa
> $ mv t14 t13
> $ guile-2.0 t13
> aaa
> You can see that the mtime is not fully used here: the cache is misapplied
> even if there is a delay of seconds between the creations of the two
> script files.  The cache's mtime check will only notice a mismatch if
> the script currently seen under the supplied name was modified later
> than when the previous script was *compiled*.
> Obviously, in this test case the cache could trivially distinguish the
> two script files by looking at the inode numbers.  On its own the inode
> number isn't sufficient, but exact match on device, inode number, and
> mtime would be far superior to the current behaviour, only going wrong
> in the presence of deliberate timestamp manipulation.  As a bonus, if
> the cache were actually *keyed* by inode number and device, rather than
> by pathname, it would retain the caching of compilation across renamings
> of the script.
> Or, even better, the cache could be keyed by a cryptographic hash of the
> file contents.  This would be immune even to timestamp manipulation, and
> would preserve the cached compilation even across the script being copied
> to a fresh file or being edited and reverted.  This would be a cache
> worthy of the name.  The only downside is the expense of computing the
> hash, but I expect this is small compared to the expense of compilation.
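
For readers following the thread, the failure mode in the report can be modelled in a few lines. This is only a sketch of the behaviour as described above, not Guile's actual loader code: the function names `cache_is_fresh`, `identity`, and `demo` are hypothetical, and the rule "the compiled file is fresh unless the source was modified after it" is an assumption taken from the report's own wording.

```python
import os
import tempfile


def cache_is_fresh(source, compiled):
    """Hypothetical mtime-only check: the cached file counts as fresh
    unless the source was modified after the cached file was written."""
    return os.stat(compiled).st_mtime_ns >= os.stat(source).st_mtime_ns


def identity(path):
    """The (device, inode, mtime) triple suggested in the report."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_mtime_ns)


def demo():
    """Replay the t13/t14 test case with pinned timestamps."""
    with tempfile.TemporaryDirectory() as d:
        t13 = os.path.join(d, "t13")
        t14 = os.path.join(d, "t14")
        go = os.path.join(d, "t13.go")  # stand-in for the cached .go file

        for path, text in ((t13, "aaa"), (t14, "bbb")):
            with open(path, "w") as f:
                f.write(text)

        # Pin mtimes (in nanoseconds) so the outcome does not depend on timing:
        # t13 written first, t14 a second later, compilation after both.
        os.utime(t13, ns=(1_000_000_000, 1_000_000_000))
        os.utime(t14, ns=(2_000_000_000, 2_000_000_000))
        with open(go, "w") as f:
            f.write("compiled form of aaa")
        os.utime(go, ns=(3_000_000_000, 3_000_000_000))

        recorded = identity(t13)  # what an identity-aware cache would store
        os.replace(t14, t13)      # equivalent of `mv t14 t13`

        return cache_is_fresh(t13, go), identity(t13) == recorded


if __name__ == "__main__":
    print(demo())
```

Here `demo()` returns `(True, False)`: the mtime-only check still treats the stale compiled file as fresh after the rename, while the recorded identity triple no longer matches the file now found under that name.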

You could make the same complaint about 'make', 'rsync', or any number
of other programs.  It's true that a cryptographic hash would be more
robust, but it would also be considerably more expensive in the common
case where the .go file is already in the cache.

I don't think it's worth paying this cost every time a .go file is
loaded, to guard against the unlikely scenario you outlined above.
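
To make the trade-off concrete, here is a rough sketch of the two kinds of key. It is an illustration under stated assumptions rather than anything Guile does: `mtime_key` and `content_key` are hypothetical names, and SHA-256 is chosen only as an example of a cryptographic hash. The point is that the mtime key needs a single `stat` call, whereas the content key has to read every byte of the source on each load, even when the cached .go file is already valid.

```python
import hashlib
import os
import tempfile


def mtime_key(path):
    """Metadata only: one stat call, independent of file size."""
    return os.stat(path).st_mtime_ns


def content_key(path, chunk_size=65536):
    """Cryptographic hash of the contents: reads the whole file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_keys():
    """Hash an original, a byte-identical copy, and a different script."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.scm")
        copy = os.path.join(d, "copy.scm")
        other = os.path.join(d, "other.scm")
        for path, text in (
            (original, '(display "aaa\\n")\n'),
            (copy, '(display "aaa\\n")\n'),
            (other, '(display "bbb\\n")\n'),
        ):
            with open(path, "w") as f:
                f.write(text)
        same = content_key(original) == content_key(copy)
        different = content_key(original) != content_key(other)
        return same, different


if __name__ == "__main__":
    print(compare_keys())
```

A content key does survive copying and edit-then-revert, as the report says, but its cost grows with the size of the source file and is paid on every load, which is the expense being weighed here.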

The mtime check is a widely used and accepted practice.

I'm closing this ticket.

