Re: [Gnu-arch-users] Re: tla1.2 on cygwin

From: Stefan Monnier
Subject: Re: [Gnu-arch-users] Re: tla1.2 on cygwin
Date: 14 Mar 2004 13:05:35 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

>> I'd rather assume that if ctime and size have not changed, then the file
>> hasn't changed, even though it's possibly incorrect.
> You meant mtime, right? :-)

Actually no: I meant ctime.
mtime can be tweaked with touch, so you can't rely on it if you want to
be safe.  But admittedly, CVS relies exclusively on mtime (not even the
size) and problems related to that have been extremely rare.
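
To make the ctime-versus-mtime point concrete, here is a minimal sketch in
Python (not tla's code; `Sig`, `signature`, `probably_unchanged` and
`restore_mtime` are hypothetical names, and POSIX semantics for st_ctime are
assumed).  It records ctime and size, and shows that mtime can be put back by
hand while ctime cannot be set through the same call:

```python
import os
from typing import NamedTuple


class Sig(NamedTuple):
    """Hypothetical per-file signature: inode change time plus size."""
    ctime_ns: int
    size: int


def signature(path: str) -> Sig:
    # Assumes POSIX, where st_ctime is the inode change time.
    st = os.stat(path)
    return Sig(st.st_ctime_ns, st.st_size)


def probably_unchanged(path: str, recorded: Sig) -> bool:
    # Heuristic only: equal ctime and size is taken to mean "not modified",
    # which can be wrong.
    return signature(path) == recorded


def restore_mtime(path: str, mtime_ns: int) -> None:
    # Roughly what `touch` allows: set mtime to an arbitrary value.
    # os.utime offers no way to set ctime; the kernel updates it itself.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, mtime_ns))
```

A checker that looks only at mtime would be fooled by `restore_mtime`; one
that compares `Sig` values would additionally need the size to match and
ctime to be left untouched.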

> But what if the mtime and size haven't changed, but the file is an utterly
> different file?

That can happen while keeping the inode constant as well.

> In fact, renames on arch-managed files are very hard to detect without
> looking at inodes, since all of them are created within a few seconds
> of each other.

Looking at the ,,inode-sigs files, you can easily find all the files with
the same mtime and size, so maybe you could then do something clever, like
"assume there's no swap" (which means that as long as no file has been added
to or gone missing from the list of files with the same mtime and the same
size, you know there has been no change).
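
A rough sketch of that "assume there's no swap" heuristic, in Python.  The
input layout (a mapping from path to an (mtime, size) pair) and the names
`group_by_key` and `suspect_paths` are assumptions made for illustration;
this does not parse the real ,,inode-sigs format:

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set, Tuple

Key = Tuple[int, int]  # (mtime, size)


def group_by_key(entries: Dict[str, Key]) -> Dict[Key, FrozenSet[str]]:
    # Collect the set of paths sharing each (mtime, size) pair.
    groups = defaultdict(set)
    for path, key in entries.items():
        groups[key].add(path)
    return {key: frozenset(paths) for key, paths in groups.items()}


def suspect_paths(recorded: Dict[str, Key], current: Dict[str, Key]) -> Set[str]:
    # A path is suspect if it was added to or removed from any group.
    # If every group has the same members as before, nothing is reported:
    # two files swapping names within one group go unnoticed by design.
    old, new = group_by_key(recorded), group_by_key(current)
    suspects: Set[str] = set()
    for key in old.keys() | new.keys():
        suspects |= old.get(key, frozenset()) ^ new.get(key, frozenset())
    return suspects
```

For example, renaming `b` to `d` without touching it reports both `b` and
`d`, while an unchanged tree reports nothing.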

But in any case, I don't care that much about checking inodes as long as
it's only a performance tweak.  I do care about the compulsory check of
inodes and device number on revlibs because it ruins them on systems where
device numbers or inodes aren't constant.

>>> But I'd hate to go without inode signatures in libraries, because
>>> detecting corruption is essential.
>> Detecting corruption is *not* essential when I want to look
>> at `tla file-diffs' or `tla changes'.

> All of the "changes" and "file diffs" will produce faulty output if the
> basis for comparison is corrupt.

Sure.  Corruption can and does happen without changing any inode number,
mtime, or size.  We had better start by deciding which kinds of corruption
we want to protect ourselves from; otherwise "corruption" is a moot argument,
since you'd then have to constantly check everything (tla itself, the libc
you're linking against, and so on) and lock everything to avoid race
conditions: the revlib could get corrupted after you checked the inode but
before you read the file.

>> It's even less essential to detect corruption of file BAR when I do
>> `tla file-diffs FOO'.
> I haven't looked at the code, but I imagine we only look at the reference
> version of FOO.

I doubt it.  On a small project, `tla file-diffs' is essentially
instantaneous, whereas on an Emacs tree, it takes several seconds.
Another data point:

   emacs/work-0% tla file-diffs INSTALL.CVS
   * auto-adding address@hidden/emacs--monnier--0--patch-18 to greedy revision library /part/00/Tmp/monnier/archlib
   * found immediate ancestor revision in library 
   * patching for this revision (address@hidden/emacs--monnier--0--patch-18)
   emacs/work-0% tla file-diffs INSTALL.CVS
   emacs/work-0% tla my-revision-library 
   emacs/work-0% rm 
   emacs/work-0% tla file-diffs INSTALL.CVS
   corrupt library (failed inode signature validation)
       archive: address@hidden
       revision: emacs--monnier--0--patch-18
   You should remove this revision from your library.

>> And if we really want to detect corruption, how about MD5 instead of inodes?
> Performing an MD5sum is much slower than a stat().  It would make "tla
> changes" orders of magnitude slower.

Why would it be?  If you're not going to read the file, why would you check
its `stat' for corruption?  You only need to compute the MD5 on files that
you actually read, so it shouldn't cost much.
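
As a sketch of what checking only the files you read could look like (Python;
`read_verified` and the idea of a stored hex digest per file are assumptions
for illustration, not anything tla provides):

```python
import hashlib


def read_verified(path: str, expected_md5_hex: str) -> bytes:
    # The file has to be read anyway, so the digest is computed over the
    # very bytes that will be used.  No separate pass over the tree.
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.md5(data).hexdigest()
    if actual != expected_md5_hex:
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    return data
```

Because the checksum covers the bytes actually returned, this also sidesteps
the gap between checking a file and reading it.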

