lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Uniquification of '....'


From: Vadim Zeitlin
Subject: Re: [lmi] Uniquification of '....'
Date: Fri, 30 Apr 2021 00:52:15 +0200

On Thu, 29 Apr 2021 22:28:31 +0000 Greg Chicares <gchicares@sbcglobal.net> 
wrote:

GC> On 4/28/21 12:17 PM, Greg Chicares wrote:
GC> > On 4/26/21 9:53 PM, Vadim Zeitlin wrote:
GC> >> 
GC> >>  The branch "std-fs" of https://github.com/vadz/lmi.git repository
GC> > I get the following unit-test error for both msw builds:
GC> > 
GC> > Running path_utility_test:
GC> > 
GC> > ???? test failed:   '...-20210428T115328Z.' == '....'
GC> > [file /opt/lmi/src/lmi/path_utility_test.cpp, line 302]
GC> 
GC> Now that I think about this, I wonder if that failure indicates
GC> a need to rework unique_filepath().

 I don't think there is any _need_ to rework it, AFAICS it still works as
fine as it ever did, but it could probably indeed be improved.

GC> The motivation is that if a user has 'foo.xls' open in 'excel', then
GC> that file is locked (under native msw) and cannot be altered by any
GC> other process; therefore, when lmi wants to overwrite it, it cannot,
GC> and must therefore write something like 'foo-20210428T115328Z.xls'.
GC> This is a real problem, and this solution has worked well for
GC> a long time.

 Yes, and I don't think we have to change it.

GC> To determine whether an existing file can be replaced,
GC> unique_filepath() does this:
GC> 
GC>         fs::remove(filepath);
GC>         LMI_ASSERT(!fs::exists(filepath));
GC> 
GC> and if that throws, then the timestamp is appended to the
GC> name of the new file.
GC> 
GC> My first thought is that deleting '....' seems potentially
GC> catastrophic on an OS where that's shorthand for something like
GC>   .\\..\\..

 Just for the record, fs::remove() can never be that catastrophic, as it
only removes a single file or an empty directory. You'd need to use
fs::remove_all() to have a chance to do something really awful.

GC> Maybe we can be sure that, today at least, this function is
GC> called only with a 'filepath' generated by lmi, so that can't
GC> ever happen; but I feel less sure of tomorrow. Maybe the
GC> triple-dot shorthand is only an old, old thing:
GC>   https://devblogs.microsoft.com/oldnewthing/20160202-00/?p=92953
GC> that ms hasn't supported in this century; and maybe the 'wine'
GC> maintainers would be absolutely sure not to do it even as a
GC> matter of backward compatibility; sure, maybe. But, having
GC> considered this special case in a unit test, shouldn't we
GC> guard against it in unique_filepath()?

 There are too many special cases to guard against from and this one
doesn't seem worse (or better) than any other.

GC> E.g.:
GC> 
GC>   if(!has_path(original_filepath))
GC>     throw "absurdity encountered!";
GC> 
GC> What would happen if we, unintentionally of course, executed
GC>   fs::remove("/");
GC> ?

 The only thing that would happen is that you'd receive an error or an
exception.

GC> Here's a second thought. Reconsidering the two lines quoted
GC> above, before "and if that throws", do we really need a
GC> try...catch here in this code now?

 No, we definitely could improve this. I did think about it but, as often,
decided to keep the changes as minimal as possible for now.

GC> I forget what the boost implementation did, but now we have this
GC> overload [fs.op.remove]
GC> 
GC>   bool remove(const path& p, error_code& ec) noexcept;
GC>   Postconditions: exists(symlink_status(p)) is false.
GC> 
GC> that doesn't throw, and in the unlikely case that I truly
GC> understand symlink_status(), the postcondition would seem
GC> to make this assertion
GC>         LMI_ASSERT(!fs::exists(filepath));
GC> pointless. If so, could we replace "try...catch" with "if"?

 Yes.

GC> And, even if we write code that a conforming implementation
GC> must execute as intended, do we yet have faith that any
GC> std::filesystem implementation is correct in all details?
GC> E.g., might a library author just check static permissions
GC> to see whether the file is erasable (ignoring an actual
GC> filesystem's locking), instead of verifying that the
GC> erased file truly does not exist?

 A library can have any bugs in it, of course, but I wouldn't expect a
fs::remove() implementation not removing the file and not returning an
error to be very well-received, so I think we'd just have to assume that
the function is not completely broken because if it is, what could we
possibly do, anyhow?

GC> And how would that work
GC> on msw anyway, where users are almost certainly running
GC> "anti-malware" software that may keep a handle to a file
GC> open longer than our software or the OS anticipates?

 It would fail if the file is locked, just as the OS API it ends by calling
does.

GC> Still, we could do this:
GC> 
GC> -    try
GC> -        {
GC> -        fs::remove(filepath);
GC> -        LMI_ASSERT(!fs::exists(filepath));
GC> +    if(fs::remove(filepath), !fs::exists(filepath))
GC> 
GC> or s/,/&&/ if the comma operator seems passé.

 I wouldn't even bother checking for existence but just write

        std::error_code ec;
        if(fs::remove(path, ec))

because if remove() succeeds, the file must not exist -- unless it was
recreated in the meanwhile, of course, but the same could happen between
the call to fs::exists() and the next statement too, so we don't gain
anything by checking for this.

 If you really wanted some extra safety (but against what, exactly?), you
could start by doing

        auto const status = fs::status(path, ec);
        if(!ec || !fs::is_regular_file(status))
            {
            ... error about using it with a non-file ...
            }

before calling fs::remove().

 AFAICS, putting aside symlink complications, this should be completely
fine. Please let me know if you'd like me to produce a patch along these
lines or if you're going to update the function yourself -- which might be
preferable because I'm not completely sure I understand your requirements
for it and so my patch might not satisfy them.

 Thanks in advance,
VZ

Attachment: pgp2vWF2ioLDv.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]