[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Potential for a race condition with latest-repository-commit
From: |
Christopher Baines |
Subject: |
Re: Potential for a race condition with latest-repository-commit |
Date: |
Sun, 29 Mar 2020 15:10:54 +0100 |
User-agent: |
mu4e 1.2.0; emacs 26.3 |
Ludovic Courtès <address@hidden> writes:
> Hi,
>
> Christopher Baines <address@hidden> skribis:
>
>> So one of the things that's currently restricted to doing by one job at
>> a time in the Guix Data Service is running latest-repository-commit from
>> the (guix git) module.
>>
>> Previously this was more of a problem for the Guix Data Service, as a
>> large section of the work for loading information about a revision was
>> serialised to avoid the potential for contention over the cached
>> checkout this procedure uses. There's still a lock used now, but I
>> realised when looking in to this further that it's only necessary to
>> lock around this specific call, not the larger section that was
>> restricted previously.
>>
>> I think that add-to-store which this procedure uses isn't atomic for a
>> directory, so there's a risk of weird results if the repository is
>> changed after the required revision is checked out. While it isn't
>> common to run guix pull twice, I think this could happen there if you
>> ran guix pull concurrently for the same repository, but two different
>> profiles. I added a sleep call just prior to the add-to-store call in
>> latest-repository-commit, and tested running guix pull twice at roughly
>> the same time, with different branches and profiles, and I did see both
>> profiles then reflecting a single branch, so one profile was mistakenly
>> referring to the wrong branch.
>
> Yes, we’ve also seen this problem with ‘static-web-site-service’¹,
> whereby if several instances would try to pull from, say,
> guix-artwork.git, we’d get non-deterministic results. We worked around
> it by using different cache directories…
>
> ¹
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/modules/sysadmin/web.scm
>
>> Is this something that is worth guarding against? Maybe
>> latest-repository-commit could double check the Git repo state after
>> add-to-store completes, and raise an error if it's different to what it
>> expects. Or perhaps individual worktrees could be used for each process,
>> which would hopefully avoid the race condition entirely.
>
> It think it’d be worth guarding against it, yes. What we could do is
> lock the cache directory, with something like ‘with-file-lock’.
>
> WDYT?
That sounds good to me :)
signature.asc
Description: PGP signature