guix-devel

Re: Experiment in generating multi-layer Docker images with guix pack


From: Christopher Baines
Subject: Re: Experiment in generating multi-layer Docker images with guix pack
Date: Thu, 26 Mar 2020 20:15:09 +0000
User-agent: mu4e 1.2.0; emacs 26.3

Ludovic Courtès <address@hidden> writes:

> Christopher Baines <address@hidden> skribis:
>
>> I think it could be useful to support multiple different strategies for
>> generating layers for Docker images, with different trade-offs. This approach
>> using two layers should make the resulting images more efficient to use in
>> the case where like the guile example above, where the packages you run guix
>> pack with have exactly matching inputs.
>
> Did you read <https://grahamc.com/blog/nix-and-layered-docker-images>?
> They came up with a pretty smart algorithm that would be worth copying.

I'm aware of it, but I haven't read it in detail yet.

>> As well as these behaviour changes, these patches also modify the
>> implementation. Rather than having some build side code that's used in the
>> pack and vm module gexpressions, these patches introduce two new record types:
>> <docker-image-layer> and <docker-image>. This at least structures the
>> derivations so that each layer is represented by a derivation, and then
>> there's a derivation for the image itself, which is a little more efficient
>> in terms of computation.
>
> Nice.
>
> I think a layering algorithm like Graham Christensen’s above requires
> knowledge of the reference graph, meaning that layering can only be
> computed on the build side, using #:references-graphs.  In that case, it
> could be that you can’t have a host-side <docker-image-layer> record.

As I understand it, you only have to do the computation on the build
side if you're restricted to doing a single set of builds. If you first
build the store items you want to put in the image, then look at their
references and compute the derivation for building the image, you can
do this kind of computation on the client side.
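
To make that concrete, here is a rough sketch in Python (not Guix code) of
what the client-side computation could look like once the references of the
built store items are known. Everything in it is an assumption for
illustration: the references are modelled as a plain dictionary, the item
names are made up, and the two-layer split (requested packages in one layer,
the rest of their closure in another) is only one possible strategy.

```python
from collections import deque


def closure(roots, references):
    """Return every item reachable from ROOTS through REFERENCES.

    REFERENCES is a hypothetical stand-in for the reference graph: a dict
    mapping each store item to the items it refers to.
    """
    seen = set()
    queue = deque(roots)
    while queue:
        item = queue.popleft()
        if item in seen:
            continue
        seen.add(item)
        queue.extend(references.get(item, ()))
    return seen


def two_layers(roots, references):
    """Split the closure of ROOTS into [dependencies, requested items].

    Assumption: packages with exactly matching inputs end up with an
    identical first layer, which is what allows it to be shared.
    """
    everything = closure(roots, references)
    requested = set(roots)
    return [sorted(everything - requested), sorted(requested)]


# Made-up reference graph, loosely modelled on a guile closure.
references = {
    "guile": ["glibc", "libgc"],
    "libgc": ["glibc"],
    "glibc": [],
}
layers = two_layers(["guile"], references)
```

The point is only that nothing here needs to run inside a derivation, as long
as the store items have already been built and their references can be read.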

But yeah, this is important to work out, as how image generation should
work, and what behaviours we want should define the structure of the
code.

I went with records to represent layers partly because I'm familiar
with them, but also because they allow for easier manipulation of layers
on the client side. Representing different layers as different
derivations also allows them to potentially be built in parallel,
although I'm not sure how beneficial this might be.

Related to this, at the moment Docker V1 images can be generated. It
would be good in the future to also support Docker V2 images and OCI
images. All three container formats use a layered approach to managing
the files, but they are all different (as far as I'm aware).

In my mind there are three architectural approaches:

 - Image generation entirely on the build side

   - The layers and the image are constructed through one derivation
   - The code for building images is in a module available at build time
   - Different approaches for layering are implemented in the module
     available at build time, and parameters are passed in as
     data/gexpressions

 - Image generation entirely on the client side

   - Each layer is a derivation, and the image is an additional
     derivation that takes the layers as an input
   - The code for building images is inside gexp compilers for the
     record types representing the images and layers
   - Different approaches for layering manipulate the layer records on
     the client side

 - Image generation can be done on both the build and client side

   - Depending on the parameters, the layers and image can be a single
     derivation, or one for each layer, and another for the image
   - The code for building images is in a module available at build
     time, and this is also used by gexp compilers
   - Different approaches for layering have the option of either being
     on the build side, or the client side
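
As a toy model of how these options differ in the shape of the derivation
graph, here is a small Python sketch. It is not Guix code: the Derivation
class, the plan_image function and the per_layer_derivations parameter are
all hypothetical names used only to show the one-derivation versus
one-derivation-per-layer structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    """Hypothetical stand-in for a derivation: a name plus its inputs."""
    name: str
    inputs: tuple = ()


def plan_image(layers, per_layer_derivations):
    """Return the derivations needed to build an image from LAYERS.

    With per_layer_derivations false, layers and image come from a single
    derivation (the build-side approach). With it true, each layer gets its
    own derivation and the image derivation takes the layers as inputs
    (the client-side approach). The third approach amounts to choosing
    between the two through a parameter like this one.
    """
    if not per_layer_derivations:
        items = tuple(item for layer in layers for item in layer)
        return [Derivation("image", items)]
    layer_drvs = [Derivation(f"layer-{i}", tuple(layer))
                  for i, layer in enumerate(layers)]
    image = Derivation("image", tuple(d.name for d in layer_drvs))
    return layer_drvs + [image]


layers = [["glibc", "libgc"], ["guile"]]
single = plan_image(layers, per_layer_derivations=False)
split = plan_image(layers, per_layer_derivations=True)
```

In the split case the layer derivations do not depend on each other, which is
what would let them be built in parallel.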

What are people's thoughts?

Thanks,

Chris


