lilypond-devel

Re: Using 'libfaketime' for reproducible builds


From: Werner LEMBERG
Subject: Re: Using 'libfaketime' for reproducible builds
Date: Mon, 28 Dec 2020 11:40:53 +0100 (CET)

> I definitely consider intercepting various syscalls by means of
> LD_PRELOADing more intrusive than setting a single environment
> variable that was invented for the purpose of setting timestamps.
> Just think of a new shiny syscall that might add a new source of
> non-reproducibility.

Which 'new shiny syscall' would influence the creation of PDFs, a
format specified by international standards?  I think this is a
straw-man argument.

I dare say that the ghostscript interface changes of the last few
years are far more numerous (look at the LilyPond commits Masamichi
had to implement) than changes to the time interfaces (of which,
AFAIK, there have been none for a long time, but I'm not an expert)...

> 1) Strip non-determinism from the generated PDF. This is even
>    mentioned at https://reproducible-builds.org/docs/timestamps/ -
>    before discussing libfaketime which spends more than half of the
>    paragraph mentioning possible issues.  [...]

This is what I started with; see the attached experimental changes.
However, I stopped working on it since it will always remain a partial
solution, because ...

> This probably leaves the UUIDs (is that the issue you mention
> above?)  which can be overridden using -sDocumentUUID and
> -sInstanceUUID.

... there is one additional field called `/ID` in (some) PDF output
files that apparently holds a randomly generated value.  I've
contacted some gs people to get more info on that.
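
To see whether a given output file carries such a field, a minimal
Python sketch along the following lines could help.  It assumes that
the entry is written in the usual trailer form, an array of two hex
strings; the function names are made up for illustration and are not
part of LilyPond or ghostscript.

```python
import re

# Matches an entry such as:  /ID [<hex1> <hex2>]
# Assumption: both identifiers are written as hex strings.
ID_PATTERN = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f]*)>\s*<([0-9A-Fa-f]*)>\s*\]"
)


def find_pdf_ids(data: bytes) -> list[tuple[str, str]]:
    """Return every (first, second) identifier pair found in raw PDF bytes."""
    return [
        (first.decode("ascii").lower(), second.decode("ascii").lower())
        for first, second in ID_PATTERN.findall(data)
    ]


def read_pdf_ids(path: str) -> list[tuple[str, str]]:
    """Convenience wrapper (hypothetical name) that scans a file on disk."""
    with open(path, "rb") as handle:
        return find_pdf_ids(handle.read())
```

Running `read_pdf_ids` on the output of two otherwise identical builds
and comparing the results would show whether `/ID` is one of the
differing parts.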

It also seems that ghostscript's creation and insertion of subsetted
fonts depends on the system time.  To me this looks like a gs bug.
During my tests a lot of PDFs (even with the above experimental
changes) had exactly this problem: the subsetted fonts were not
identical in spite of completely identical source fonts.  This means
that stripping the metadata can't circumvent it.  Using 'libfaketime',
this issue magically disappears.
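
A check of this kind can be sketched in Python as follows: build the
same document twice, then look for the first byte at which the two
results differ.  This is only an illustration of the comparison, not
the procedure used for the tests above; the function names are
hypothetical.

```python
from typing import Optional


def first_difference(data_a: bytes, data_b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (byte_a, byte_b) in enumerate(zip(data_a, data_b)):
        if byte_a != byte_b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(data_a) != len(data_b):
        return min(len(data_a), len(data_b))
    return None


def compare_files(path_a: str, path_b: str) -> Optional[int]:
    """Compare two build outputs on disk (hypothetical helper)."""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        return first_difference(file_a.read(), file_b.read())
```

An offset that falls inside an embedded font stream rather than in the
metadata would point to the subsetting problem described above.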

> Setting a constant time using libfaketime will result in the same
> UUID for all generated PDFs, so it can't get worse; but I think it
> would be desirable to do better than that and compute a "unique" ID
> based on the input file, maybe as simple as the hash of the file
> path.

Well, UUIDs as used by ghostscript are based on both the time and hash
values, which means that we actually *do* get unique UUIDs, with the
restriction that the first 12 digits of the UUID are a fixed value
because of the frozen time.  In other words, this is not a reason to
reject the use of 'libfaketime'.
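
For comparison, the quoted idea of deriving the ID from the file path
could be sketched like this in Python.  It uses a name-based (SHA-1)
UUID from the standard library; the choice of namespace and the
function name are assumptions made for illustration, and nothing like
this exists in the LilyPond build system.

```python
import uuid


def document_uuid_for(path: str) -> str:
    """Derive a stable UUID string from a file path.

    uuid5 is a name-based UUID: the same path always yields the same
    value, and different paths yield different values.
    """
    # Assumption: NAMESPACE_URL is an arbitrary but fixed namespace.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, path))
```

The resulting string has the usual 8-4-4-4-12 form, so it could be
passed as the value of the -sDocumentUUID option mentioned in the
quoted text.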


    Werner
diff --git a/Documentation/GNUmakefile b/Documentation/GNUmakefile
index a8c96dcbdb..412cc866ef 100644
--- a/Documentation/GNUmakefile
+++ b/Documentation/GNUmakefile
@@ -213,11 +213,13 @@ ifeq ($(USE_EXTRACTPDFMARK),yes)
                  -dAutoRotatePages=/None \
                  -dPrinted=false \
                  -sOutputFile=$@ \
+                 -sDocumentUUID="00000000-0000-0000-0000-000000000000" \
                  -c "30000000 setvmthreshold" \
                  -I $(top-build-dir)/out-fonts \
                  -I $(top-build-dir)/out-fonts/Font \
                  $(outdir)/$*.pdfmark \
-                 $(outdir)/$*.tmp.pdf
+                 $(outdir)/$*.tmp.pdf \
+                 $(top-src-dir)/Documentation/no-pdf-dates.ps
        rm $(outdir)/$*.tmp.pdf
 else
        mv $(outdir)/$*.tmp.pdf $@
@@ -677,8 +679,10 @@ $(outdir)/%.pdf: %.eps
            -dNOPAUSE \
            -dBATCH \
            -sOutputFile=$@ \
+           -sDocumentUUID="00000000-0000-0000-0000-000000000000" \
            -dEPSCrop \
-           -f $<
+           $< \
+           $(top-src-dir)/Documentation/no-pdf-dates.ps
 
 # ly-examples/
 $(outdir)/%.png: %.ly

Attachment: no-pdf-dates.ps
Description: PostScript document

