[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug#42023] [PATCH] database: register-items: reduce transaction scope.

From: Caleb Ristvedt
Subject: [bug#42023] [PATCH] database: register-items: reduce transaction scope.
Date: Sat, 08 Aug 2020 23:13:35 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Ludovic Courtès <> writes:

> Hi reepca,
> Did you have time to look into this (see below)?  I still see a lot of
> contention on /var/guix/db/db.sqlite-shm on berlin and I feel like these
> changes would be welcome.  :-)

Apologies for the long delay, I think I'm caught up on this issue
now. Patch series detailed below.

> Thanks,
> Ludo’.

>>> So general "fiddling" can't happen, but GC can. The responsibility for
>>> both locking and registering as temporary roots falls on the caller,
>>> currently, as I believe it should. We may want to document this
>>> responsibility in the docstring for register-items, though.
>> Yes, it would be good to add a sentence to document it.

Patch attached.

>>> So currently finalize-store-file is safe from clobbering by another
>>> process attempting to finalize to the same path, but not safe from
>>> garbage collection between when the temporary store file is renamed and
>>> when it is registered. It needs an add-temp-root prior to renaming.
>> Ah, so we’d need to do that before applying the patch that reduces the
>> scope of the transaction, right?

AFAIK the only code path that actually uses finalize-store-file
currently is from the build hook / offload thingy, and it turns out that
the outputs of derivations should already be registered as temp roots by
the daemon process that launched the offload process (specifically
registered in haveDerivation() in nix/libstore/ So this is
technically not currently necessary before or after the scope-reducing
patch. But it makes sense to guarantee that the register-items
invocation will only happen on items that are protected from garbage
collection, so I've put a patch, right before the documenting of that
requirement, that modifies finalize-store-file to always add the file
being finalized as a temp root for the extent of the register-items

>>> Also, replace-with-link doesn't check whether the "containing directory"
>>> is the store like optimisePath_() does, so in theory if another process
>>> tried to make a permanent change to the store's permissions it could be
>>> clobbered when replace-with-link restores the permissions. I don't know
>>> of any instance where this could happen currently, but it's something to
>>> keep in mind.
>> I guess we should also avoid changing permissions on /gnu/store, it
>> would be wiser.

While rebasing my changes I noticed that the current implementation of
this uses (%store-directory) from (guix build utils), which may not
correspond to the #:store keyword argument of 'deduplicate'. So I added
a patch that propagates the store through to replace-with-link and from
there to with-writable-file.

>>> diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm
>>> index 6784ee0b92..b6d94e49c2 100644
>>> --- a/guix/store/deduplication.scm
>>> +++ b/guix/store/deduplication.scm
>>> @@ -161,26 +161,44 @@ under STORE."
>>>                    (scandir* path))
>>>          (let ((link-file (string-append links-directory "/"
>>>                                          (bytevector->nix-base32-string 
>>> hash))))
>>> -          (if (file-exists? link-file)
>>> -              (replace-with-link link-file path
>>> -                                 #:swap-directory links-directory)
>>> -              (catch 'system-error
>>> -                (lambda ()
>>> -                  (link path link-file))
>>> -                (lambda args
>>> -                  (let ((errno (system-error-errno args)))
>>> -                    (cond ((= errno EEXIST)
>>> -                           ;; Someone else put an entry for PATH in
>>> -                           ;; LINKS-DIRECTORY before we could.  Let's use 
>>> it.
>>> -                           (replace-with-link path link-file
>>> -                                              #:swap-directory 
>>> links-directory))
>>> -                          ((= errno ENOSPC)
>>> -                           ;; There's not enough room in the directory 
>>> index for
>>> -                           ;; more entries in .links, but that's fine: we 
>>> can
>>> -                           ;; just stop.
>>> -                           #f)
>>> -                          ((= errno EMLINK)
>>> -                           ;; PATH has reached the maximum number of 
>>> links, but
>>> -                           ;; that's OK: we just can't deduplicate it more.
>>> -                           #f)
>>> -                          (else (apply throw args)))))))))))
>>> +          (let retry ()
>>> +            (if (file-exists? link-file)
>>> +                (catch 'system-error
>>> +                  (lambda ()
>>> +                    (replace-with-link link-file path
>>> +                                       #:swap-directory links-directory))
>>> +                  (lambda args
>>> +                    (if (and (= (system-error-errno args) ENOENT)
>>> +                             ;; ENOENT can be produced because:
>>> +                             ;; - LINK-FILE has missing directory 
>>> components
>>> +                             ;; - LINKS-DIRECTORY has missing directory
>>> +                             ;;   components
>>> +                             ;; - PATH has missing directory components
>>> +                             ;;
>>> +                             ;; the last two are errors, the first just
>>> +                             ;; requires a retry.
>>> +                             (file-exists? (dirname path))
>>> +                             (file-exists? links-directory))
>>> +                        (retry)
>>> +                        (apply throw args))))
>> I feel like there are TOCTTOU issues here and redundant ‘stat’ calls,
>> plus the risk of catching a ‘system-error’ coming from “somewhere else.”
>> What about baking this logic directly in ‘replace-with-link’, and
>> replacing ‘file-exists?’ calls by 'system-error handling?

Patch attached. I've renamed replace-with-link to canonicalize-with-link
since it may now have to create the target link. Unfortunately there are
places where 'file-exists?' error handling is necessary, simply because
of ambiguity in errnos. For example, link(oldpath, newpath) can return
ENOENT, but there's no way of knowing from that alone whether this is
because oldpath has missing directories or newpath has missing
directories or the directory components of oldpath are all present but
the file itself is missing (the case we need to be able to recognize and
retry in case of).

Also, I tried removing the first check for whether the canonical link
exists in favor of error handling like you suggested, but this actually
messes up the hard-coded number of link-invocations expected in
tests/store-deduplication.scm - it expects a single link invocation when
the canonical link already exists, but the error-handling approach uses
two - one to discover it exists, and another to create the temporary
link. I didn't know how to reconcile the testing strategy with this
behavior, so I kept the behavior of first using a 'file-exists?' call to
test for the existence of the canonical link.

All tests pass except for tests/packages.scm, which failed even without
the patches.

- reepca

Attachment: 0001-.dir-locals.el-fix-call-with-retrying-transaction-in.patch
Description: Text Data

Attachment: 0002-deduplication-pass-store-directory-to-replace-with-l.patch
Description: Text Data

Attachment: 0003-deduplication-retry-on-ENOENT.patch
Description: Text Data

Attachment: 0004-nar-ensure-finalization-target-is-a-temp-root-during.patch
Description: Text Data

Attachment: 0005-database-document-extra-registration-requirements.patch
Description: Text Data

Attachment: 0006-database-register-items-reduce-transaction-scope.patch
Description: Text Data

Attachment: signature.asc
Description: PGP signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]