From: Maxime Devos
Subject: [bug#58035] sync-before-registering is false, possibly the cause of empty files in the store
Date: Tue, 4 Oct 2022 16:04:29 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.12.0



On 04-10-2022 09:52, Ludovic Courtès wrote:
> Hi,
>
[...]
>> However, currently sync-before-registering is set to 'false' AFAICT.
>> I think this might be the cause of bugs like
>> <https://issues.guix.gnu.org/58013> (‘Can't use "guix pull"’), and
>> maybe <https://issues.guix.gnu.org/57838> (‘failing to boot, probably
>> due to guix gc’).
>
> It might be a factor, combined with the fact that the file system was
> not properly unmounted (power outage or similar).
>
> However, calling sync(2) for each store item is going to be expensive.
> Recursive fsync/fdatasync calls are also likely to be too expensive (see
> <https://issues.guix.gnu.org/55707> for a concrete example of the cost
> on a spinning disk).
>
> Thoughts?

Debian uses fsync (going by https://wiki.debian.org/Teams/Dpkg/FAQ), and even though, according to that FAQ, dpkg can be slow, in my experience it wasn't too bad. Also, having to investigate store corruption and work out how to fix it is a form of slowness too, especially when the fix fails, or when you don't have the technical expertise and consequently need to reinstall (losing old work that wasn't backed up).

'sync' seems relatively inexpensive to me, compared to the time required for building a package and even more inexpensive compared to the cost of debugging store corruption:

antipode@antipode ~$ time sync

real    0m0,230s
user    0m0,004s
sys     0m0,047s
antipode@antipode ~$
antipode@antipode ~$ time sync

real    0m0,045s
user    0m0,003s
sys     0m0,014s
antipode@antipode ~$ time sync

real    0m0,044s
user    0m0,004s
sys     0m0,012s

Or, after a download:

$ time "guix build download"
real    0m50,681s
user    0m3,856s
sys     0m0,198s
$ sync
# I forgot to properly time this one, but < 0.5 sec
# Don't have numbers on the time required for debugging corruption.

(On an SSD.)
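For anyone who wants comparable numbers on their own machine, here is a minimal sketch that times a few consecutive syncs from Python. The helper name `time_sync` is made up for illustration; `os.sync()` is the standard-library wrapper around sync(2). The results depend heavily on the disk and on how much dirty data is pending, so treat them as rough indications only.

```python
import os
import time

def time_sync(runs=3):
    """Return the wall-clock durations (in seconds) of `runs` consecutive syncs."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        os.sync()  # ask the kernel to flush all dirty buffers, like sync(1)
        durations.append(time.perf_counter() - start)
    return durations

if __name__ == "__main__":
    for seconds in time_sync():
        print(f"sync took {seconds:.3f}s")
```

As in the shell transcript above, the first call is likely to be the slowest, since later calls find little left to flush.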

Also, the situation is unlike 55707 -- we don't need to call 'fsync' or 'sync' after building each store item or writing each line of a store item file; we only need to do it before registering the item in the database and returning it to the user -- in some sense, the 'fsync' can be done sort-of asynchronously.

For example, if "guix build" asks for foo.drv to be built, and it depends on bar.drv and baz.drv, then the daemon can build bar.drv, baz.drv and foo.drv (without fsyncing or registering in the database).

Once everything is built, the daemon could then fsync the new store items and, after the fsyncing completes, register them in the database. On the speed, I would like to note that:

  (*) if the store items that were made were small, then fsync'ing them
      should be pretty fast, as there isn't much to sync (at least in
      theory; I think I read about some limitation in the ext3
      implementation where 'fsync' is essentially 'sync' or something
      like that?  Don't know if that's still the case, though.)

  (*) if the store items were sufficiently large (say, a bunch more than
      Linux is willing to buffer), then at some point Linux will have
      flushed most of them anyway.  I don't have a clue what heuristics
      it uses though (except for 'no more than there is free RAM' :p).

  (*) In theory, if a file is already written to disk (implicitly as
      part of some heuristic, or by an explicit 'fsync'), 'fsync'
      should be about zero cost.  Also, for a reasonable implementation
      of 'fsync', I would expect the OS to take the opportunity to
      write some other files as well (if the disk is seeking anyways,
      it might as well write some stuff while it's moving
      to the right position and such).

(This requires changes to the daemon of course).
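To make 'recursive fsync of a store item' concrete, here is a minimal Python sketch (not the daemon's actual code, which is C++). The names `fsync_path` and `fsync_tree` are hypothetical, and it assumes a Linux-like system where a directory can be opened read-only and fsync'ed.

```python
import os

def fsync_path(path):
    """fsync a single regular file or directory, given its path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def fsync_tree(root):
    """Recursively fsync all regular files and directories under `root`.

    Returns the number of fsync calls made.  Symlinks and special files
    are skipped, since they have no file data of their own to flush here.
    """
    if not os.path.isdir(root):
        fsync_path(root)  # the store item is a single file
        return 1
    count = 0
    # Bottom-up, so that a directory is synced after its contents.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            fsync_path(full)
            count += 1
        fsync_path(dirpath)  # persist the directory entries themselves
        count += 1
    return count
```

A real implementation would presumably also fsync the parent directory (the store itself) so that the new entry is durable; that is left out here to keep the sketch short.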

Another difference with 55707 is that the write/fsync pattern is very different -- in 55707, it's like

   write a small line (after waiting for the previous fsync to complete)
   fsync the file
   repeat very often

whereas with 'recursive fsync before registering (without other changes)', it's like

   write files (number and size vary)
   recursive fsync the store item (note: as written elsewhere, the cost
     of a recursive fsync should in theory be a lot less than the sum of
     the fsync costs of the individual files, as the kernel likely takes
     the opportunity to write some other stuff anyways)
   wait for fsync to complete
   repeat for the next store item (much less frequent than in the
     previous case (*))

and with 'recursive fsync before registering, and delay the registering where possible':

   write files for a store item
   repeat for other store items
   fsync the new files (good chance they were flushed to disk already
     when the store items are large)
   wait for fsync to complete
   repeat with next call to "guix build" / "guix shell" ...

-- 'fsync' calls should be much less frequent than in 55707, and the fsyncs that are done would be mostly batched.
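The 'delay the registering where possible' pattern can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the class name `DeferredRegistrar`, the table name `valid_paths`, the use of SQLite as a stand-in for the store database, and the simplification that each store item is a single regular file.

```python
import os
import sqlite3

class DeferredRegistrar:
    """Toy model: collect built items, fsync them in a batch, then register."""

    def __init__(self, db):
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS valid_paths (path TEXT PRIMARY KEY)")
        self.pending = []

    def built(self, path):
        """Note that `path` was built; nothing is synced or registered yet."""
        self.pending.append(path)

    def flush(self):
        """fsync every pending item and only afterwards register them all."""
        for path in self.pending:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)  # cheap if the kernel already wrote it back
            finally:
                os.close(fd)
        with self.db:  # a single transaction for the whole batch
            self.db.executemany(
                "INSERT OR IGNORE INTO valid_paths (path) VALUES (?)",
                [(p,) for p in self.pending])
        registered = len(self.pending)
        self.pending.clear()
        return registered
```

The point of the ordering in `flush` is that no path becomes visible in the database before its contents have been fsync'ed, while the builds themselves never wait on the disk.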

(*) I'm not considering things like the 'can compute derivation' linter -- that linter would in theory be slowed down a lot, but I don't see a reason why that linter would need a daemon to talk to.

Summarised, the three main points are that:

  * the fsync can be delayed for a while
  * there's a good chance that delayed fsyncs are done automatically
    by the kernel in the background (making the later explicit 'fsync'
    mostly free)
  * the total time for doing multiple fsyncs should be much less
    than the sum of the times of doing individual fsyncs

(This is currently all theoretical)

Greetings,
Maxime.
