guix-patches

[bug#34223] Fixing timestamps in archives.


From: Ludovic Courtès
Subject: [bug#34223] Fixing timestamps in archives.
Date: Sat, 16 Feb 2019 23:35:50 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Hi Tim,

Sorry for the delay!

Tim Gesthuizen <address@hidden> writes:

> as discussed before I have looked into the problems of timestamps in the
> zip files.
> I looked at the way this is solved in ant-build-system with jar files
> and thought that this could be done in a more elegant way.
> Because of this I wrote a simple frontend for LibArchive in C that
> repacks archives and sets their timestamps to zero and disables
> compression as it is done in the ant-build-system.
> Creative as I am the program is called repack.
> You find a git repository attached with the history of the repack program.
> The attached patches add repack to Guix and use it for pwsafe and the
> ant-build-system.

Nice work!  It’s great that libarchive doesn’t need to actually extract
the zip file to operate on it.

Overall I think the approach of factorizing archive-timestamp-resetting
in one place and using it everywhere (‘ant-build-system’ and all) is the
right thing to do.

However, I’m not sure whether we should introduce a new program for this
purpose.  I believe ‘strip-nondeterminism’¹ (in Perl) by fellow
Reproducible Builds hackers also addresses this problem, so it may be
wiser to use it.

But really, since (guix build utils) already implements a significant
subset of ‘strip-nondeterminism’, it would be even better if we could
avoid shelling out to a C or Perl program.

I played a bit with this idea and, as an example, the attached file
allows you to traverse the list of entries in a zip file (it uses
‘guile-bytestructures’).  Specifically, you can get the list of file
names in a zip file by running:

  (call-with-input-file "something.zip"
    (lambda (port)
      (fold-entries cons '() port)))

Resetting timestamps should be just as simple.
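
To make that concrete, here is a minimal sketch, not part of the
attached module.  It uses only (rnrs bytevectors) rather than
‘guile-bytestructures’, works on a bytevector holding the whole zip
file, and assumes a well-formed archive without data descriptors or
zip64 records.  The name ‘reset-timestamps!’ and the choice of
1980-01-01 00:00:00 (the earliest date the DOS format can represent) as
the replacement value are assumptions made for illustration:

```scheme
(use-modules (rnrs bytevectors))

;; Replacement values (an assumption, not taken from the thread).
(define %dos-time 0)        ;00:00:00
(define %dos-date #x21)     ;1980-01-01: year 0, month 1, day 1

(define (u16 bv offset)
  (bytevector-u16-ref bv offset (endianness little)))

(define (u32 bv offset)
  (bytevector-u32-ref bv offset (endianness little)))

(define (reset-timestamps! bv)
  "Overwrite the modification time and date of every local file header and
central directory record in BV, a bytevector holding a whole zip file.
Return the number of headers patched."
  (define (patch! offset)
    ;; The 16-bit time field is immediately followed by the date field.
    (bytevector-u16-set! bv offset %dos-time (endianness little))
    (bytevector-u16-set! bv (+ offset 2) %dos-date (endianness little)))
  (let loop ((offset 0) (count 0))
    (if (> (+ offset 4) (bytevector-length bv))
        count
        (case (u32 bv offset)
          ((#x04034b50)                     ;local file header, 30 bytes
           (patch! (+ offset 10))
           (loop (+ offset 30
                    (u16 bv (+ offset 26))  ;file name length
                    (u16 bv (+ offset 28))  ;extra field length
                    (u32 bv (+ offset 18))) ;compressed size
                 (+ count 1)))
          ((#x02014b50)                     ;central directory record, 46 bytes
           (patch! (+ offset 12))
           (loop (+ offset 46
                    (u16 bv (+ offset 28))  ;file name length
                    (u16 bv (+ offset 30))  ;extra field length
                    (u16 bv (+ offset 32))) ;comment length
                 (+ count 1)))
          (else count)))))                  ;end of central directory, or unknown
```

One could read the file with ‘get-bytevector-all’, call the procedure,
and write the result back with ‘put-bytevector’.  Both the local headers
and the central directory carry a copy of the timestamp, which is why
the sketch patches both.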

How about taking this route?

Thanks,
Ludo’.

¹ https://salsa.debian.org/reproducible-builds/strip-nondeterminism

(define-module (guix zip)
  #:use-module (rnrs bytevectors)
  #:use-module (rnrs io ports)
  #:use-module (bytestructures guile)
  #:use-module (ice-9 match)
  #:export (fold-entries))

(define <file-header>
  ;; File header, see
  ;; <https://en.wikipedia.org/wiki/Zip_(file_format)#File_headers>.
  (bs:struct #t                                   ;packed
             `((signature ,uint32le)
               (version-needed ,uint16le)
               (flags ,uint16le)
               (compression ,uint16le)
               (modification-time ,uint16le)
               (modification-date ,uint16le)
               (crc32 ,uint32le)
               (compressed-size ,uint32le)
               (uncompressed-size ,uint32le)
               (file-name-length ,uint16le)
               (extra-field-length ,uint16le))))

(define-bytestructure-accessors <file-header>
  file-header-unwrap file-header-ref set-file-header!)

(define (fold-entries proc seed port)
  "Fold PROC over the names of all the entries in the zip file at PORT.  PROC
is called as (PROC NAME RESULT), with SEED as the initial RESULT."
  (let loop ((result seed))
    (match (get-bytevector-n port (bytestructure-descriptor-size
                                   <file-header>))
      ((? bytevector? bv)
       (match (file-header-ref bv signature)
         (#x04034b50                              ;local file header
          (let* ((len  (file-header-ref bv file-name-length))
                 (name (utf8->string (get-bytevector-n port len))))
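            ;; Skip the extra field and the compressed data.  This relies on
            ;; COMPRESSED-SIZE being set in the local header, which is not the
            ;; case for entries written with a data descriptor (flag bit 3).
            (set-port-position! port
                                (+ (file-header-ref bv extra-field-length)
                                   (file-header-ref bv compressed-size)
                                   (port-position port)))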
            (loop (proc name result))))
         (#x02014b50                               ;central directory record
          result)
         (#x06054b50                          ;end of central directory record
          result)))
      ((? eof-object?)
       result))))
