guix-devel

Re: [PATCH 0/15] Add preliminary support for Linux containers


From: Thompson, David
Subject: Re: [PATCH 0/15] Add preliminary support for Linux containers
Date: Wed, 8 Jul 2015 09:00:33 -0400

On Wed, Jul 8, 2015 at 8:46 AM, Ludovic Courtès <address@hidden> wrote:
> "Thompson, David" <address@hidden> skribis:
>
>> On Tue, Jul 7, 2015 at 6:28 AM, Ludovic Courtès <address@hidden> wrote:
>
> [...]
>
>>>>       (lambda ()
>>>>         (sethostname "guix-0.8.3"))
>>>
>>> Surprisingly, calling ‘getpid’ in the thunk returns the PID of the
>>> parent (I was expecting it to return 1.)  Not sure why that is the
>>> case.  I’m still amazed that this works as non-root, BTW.
>>
>> The first process created inside the PID namespace gets the honor of
>> being PID 1, not the process created with the 'clone' call.
>>
>> For more information, see: https://lwn.net/Articles/532748/
>
> To me, the thunk above is just like ‘childFunc’ in
> <https://lwn.net/Articles/533492/>–i.e., it’s the procedure that ‘clone’
> calls in the first child process of the new PID name space.
>
> What am I missing?

It's non-intuitive because PID namespaces are given special treatment.
The cloned process is like PID 1 in the sense that if it forks, the
new process is PID 2.  However, if you call 'getpid' in the cloned
process, it returns the PID in the context of the parent PID
namespace, whereas you were expecting PID 1.

In that example from LWN, 'childFunc' calls 'execvp', and *that* new
program runs as PID 1 (and 'getpid' agrees).  This is the usual
pattern I see in all container implementations:  The process created
by 'clone' sets up the environment and then execs the real init
system.

Is it clearer now?
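
One way to check this empirically, as a minimal sketch: kernels 4.1 and
later list a process's PID in every PID namespace it belongs to on the
'NSpid' line of /proc/<pid>/status, outermost first.  The helper names
below are made up for illustration.  Keep in mind that glibc before
2.25 cached the result of 'getpid', and getpid(2) warns that the cache
can be stale after a raw clone system call, so the value a language
binding reports may not be the kernel's view.

```python
def parse_nspid(status_text):
    """Return the PIDs found on the 'NSpid:' line of a status dump.

    The list runs from the outermost PID namespace visible through this
    procfs mount to the innermost namespace the process lives in.
    Returns [] when the line is absent (kernels older than 4.1).
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # Fields after the label are whitespace-separated integers.
            return [int(field) for field in line.split()[1:]]
    return []


def nspid(pid="self"):
    """Read /proc/<pid>/status and return its NSpid list (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_nspid(status_file.read())
```

Calling nspid(child_pid) from the parent, with the parent's /proc
mounted, should give a two-element list for a process that sits one PID
namespace down, the second element being its PID inside the container.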

>>> There’s an issue when the parent’s Guile is not mapped into the
>>> container’s file system: ‘use-modules’ forms and auto-loading will fail.
>>> For instance, I did (use-modules (ice-9 ftw)) in the parent and called
>>> ‘scandir’ in the child, but that failed because of an attempt to
>>> auto-load (ice-9 i18n), which is unavailable in the container.
>>
>> Hmm, I don't know of a way to deal with that other than the user being
>> careful to bind-mount in the Guile modules they need.
>
> Right.  Maybe the best we can do is to add a word of caution in the
> docstring or something.

Okay, I will do that.

>> Hmm, there are various reasons that EINVAL could be thrown.  Could you
>> readlink "those" files, that is /proc/<pid-outside-container>/ns/user
>> and /proc/<pid-inside-container>/ns/user, and tell me if the contents
>> are the same?  They shouldn't be, but this will eliminate one of the
>> possible causes of EINVAL.
>
> It turns out I was targeting the wrong PID.

Glad it's not totally broken on machines other than mine. :)
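
For the record, the readlink comparison suggested above can be
scripted.  This is only a sketch: the function names are hypothetical,
and the 'readlink' parameter exists purely so the logic can be
exercised without a real /proc.  Each /proc/<pid>/ns/<kind> entry is a
symlink whose target looks like 'user:[4026531837]'; two processes are
in the same namespace exactly when the targets are equal.

```python
import os


def namespace_id(pid, kind="user", readlink=os.readlink):
    """Return the target of /proc/<pid>/ns/<kind>, e.g. 'user:[4026531837]'."""
    return readlink(f"/proc/{pid}/ns/{kind}")


def same_namespace(pid_a, pid_b, kind="user", readlink=os.readlink):
    """True when both processes are in the same namespace of this kind."""
    return (namespace_id(pid_a, kind, readlink)
            == namespace_id(pid_b, kind, readlink))
```

With the PID outside the container and the PID inside it,
same_namespace(outside, inside) is expected to be false for a working
user namespace; a true result would point at one of the EINVAL causes.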

>>> Also, I think we should add --expose and --share as for ‘guix system’,
>>> though that can come later.
>>
>> Yes, I also really want that, but it's a task for another time.
>
> Sure.
>
>>>> Here's how you build it:
>>>>
>>>>     guix system container container.scm
>>>
>>> Very neat.  I wonder if that should automatically override the
>>> ‘file-systems’ field to be ‘%container-file-systems’, so that one can
>>> reuse existing OS declarations unmodified.  WDYT?
>>
>> This would be a better user experience, for sure.  I thought about
>> this, but I don't know how to do it in a way that isn't surprising or
>> just broken.  Ideas?
>
> IMO it’d be fine to simply override the subset of ‘file-systems’ that
> clashes with ‘%container-file-systems’, similar to what
> ‘virtualized-operating-system’ does in (gnu system vm).

I will implement that.
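
Roughly, the override I have in mind looks like the following.  It is
written in Python only to show the logic; the dictionaries and the
'mount-point' key are stand-ins for the real file system records, not
the Guix API, and the function name is invented.

```python
def override_file_systems(declared, container_defaults):
    """Replace declared file systems that clash with container defaults.

    A clash means the same mount point.  Non-clashing user entries are
    kept in their original order, followed by every container default.
    """
    reserved = {fs["mount-point"] for fs in container_defaults}
    kept = [fs for fs in declared if fs["mount-point"] not in reserved]
    return kept + list(container_defaults)
```

An OS declaration that mounts "/" from a disk partition would have that
entry dropped in favour of the container's own "/", while something
like "/home" would pass through untouched.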

Thanks!

- Dave


