cgroupfs, /hurd/proc and subhurds
From: Justus Winter
Subject: cgroupfs, /hurd/proc and subhurds
Date: Mon, 16 Sep 2013 16:00:25 +0200
User-agent: alot/0.3.4
Hi :)
This mail discusses my recent attempts at creating a cgroupfs, along
with the related problems and issues I have encountered so far.
Problem statement
=================
Linux has a feature called cgroups. It collects processes (threads)
into groups, and so-called controllers can then be used to restrict
the use of various resources (such as CPU time or memory) on a
per-group basis. A notable property of cgroups is that a process cannot
escape its group, and any children of a process are born into the same
group as the parent.
In order to create a cgroupfs on Hurd, one has to make the same
guarantee. Currently the parental relationship of processes is a
Hurd-only concept, established by the parent process calling
proc_child. This is not robust enough, as a process can create a new
task using task_create and never claim ownership of it.
cgroup documentation:
https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
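To make the gap concrete, here is a minimal sketch in C. It is a toy
model and not Hurd code: toy_task_create and toy_proc_child are
hypothetical stand-ins for task_create and proc_child, and the table
only mimics the idea that ancestry (and thus group membership) is
recorded when the parent volunteers it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16
#define NO_PARENT (-1)

/* Hypothetical stand-in for what a proc server knows about one task. */
struct toy_task
{
  int parent;   /* index of the parent, or NO_PARENT if nobody claimed it */
  int group;    /* toy cgroup id; 0 is the root group */
};

static struct toy_task tasks[MAX_TASKS];
static int ntasks;

/* Models task_create: a task comes into existence, but the toy proc
   table learns nothing about who created it.  */
int
toy_task_create (void)
{
  assert (ntasks < MAX_TASKS);
  tasks[ntasks].parent = NO_PARENT;
  tasks[ntasks].group = 0;
  return ntasks++;
}

/* Models proc_child: the parent voluntarily claims the child, and only
   then does the child inherit the parent's group.  */
void
toy_proc_child (int parent, int child)
{
  assert (parent >= 0 && parent < ntasks);
  assert (child >= 0 && child < ntasks);
  assert (tasks[child].parent == NO_PARENT);   /* only once per task */
  tasks[child].parent = parent;
  tasks[child].group = tasks[parent].group;
}
```

In this model a task that is created but never passed to
toy_proc_child stays in group 0 regardless of which group its creator
is in, which is exactly the escape that cgroup semantics forbid.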
Related problems
----------------
The proc server has to be privileged (it needs the host-priv port) so
that it can query the kernel for new processes (see add_tasks in
proc/mgt.c). It does that mainly when a process triggers a lookup
such as proc_task2proc for a newly created task. Ironically, at that
point the task port for the new process is already known, yet
add_tasks still requests *all* task ports from the kernel, only to
mach_port_deallocate the ones it already knows and to add the new
tasks (in my tests there was always just one, namely the newly created
task whose port was already known). I believe the scan is done in the
lookup path mainly so that it happens periodically and any task will
eventually get noticed (but see below). I do not know how expensive
this is in practice, but it seems very wasteful and unnecessary to me.
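The cost pattern can be sketched as follows. This is a simplified
model in C, not the code from proc/mgt.c: toy_add_tasks and
struct scan_result are hypothetical names, task ports are plain
integers, and the "deallocate" step is only counted rather than
performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one full scan (hypothetical bookkeeping). */
struct scan_result
{
  size_t added;     /* tasks that were really new */
  size_t dropped;   /* duplicate rights that had to be released again */
};

static bool
toy_is_known (const int *known, size_t n_known, int task)
{
  for (size_t i = 0; i < n_known; i++)
    if (known[i] == task)
      return true;
  return false;
}

/* Walk over every task the kernel reported.  Known tasks cost one
   release each (mach_port_deallocate in the real server), unknown
   ones are appended to the table.  */
struct scan_result
toy_add_tasks (const int *all, size_t n_all,
               int *known, size_t *n_known, size_t cap)
{
  struct scan_result r = { 0, 0 };
  for (size_t i = 0; i < n_all; i++)
    {
      if (toy_is_known (known, *n_known, all[i]))
        r.dropped++;
      else
        {
          assert (*n_known < cap);
          known[(*n_known)++] = all[i];
          r.added++;
        }
    }
  return r;
}
```

With N tasks on the system and a single new one, each scan in this
model performs N - 1 releases in order to learn about one task.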
So there are three related issues:
1. Non-root users cannot start sub-hurds:
https://savannah.gnu.org/bugs/?17341
2. Because the proc server discovers all task ports, any sub-hurd gets
a handle to every task running on the system, so root users inside a
sub-hurd can interfere with the operation of the parent hurd, which
is undesirable from an isolation point of view. This is also
related to 1. from a security point of view.
3. add_tasks is unnecessary and potentially wasteful.
/hurd/proc is a dark corner indeed
----------------------------------
The routine description of proc_child looks harmless enough:
    /* Declare that a task is a child of the caller.  The task's state
       will then inherit from the caller.  This call can be made only once
       per task.  */
    routine proc_child (
            process: process_t;
            child: task_t);
But dragons are lurking here. sysdeps/mach/hurd/fork.c explains this
best:
    /* Register the child with the proc server.  It is important that
       this be that last thing we do before starting the child thread
       running.  Once proc_child has been done for the task, it appears
       as a POSIX.1 process.  Any errors we get must be detected before
       this point, and the child must have a message port so it responds
       to POSIX.1 signals.  */
    if (err = __USEPORT (PROC, __proc_child (port, newtask)))
      LOSE;
So proc_child not only declares that the newly created task is one's
child, but it also indicates that the process is all set up and ready
to receive POSIX signals. Sure enough, all hell broke loose when I
tried to use my shiny new notification system (see below) to supply
the parental relations instead of waiting for someone to call
proc_child.
Proposed solution
=================
I propose a notification based system to fix all of the above issues:
1. The topmost /hurd/proc server registers for notifications with the
kernel. The notifications are sent when a new task is created and
carry the task ports of both the parent and the newly created task.
Only one process can register for these notifications and it has to
present the host_priv port to authenticate itself. This keeps the
kernel implementation quite tiny and unobtrusive.
2. Anyone can register for process change notifications with the proc
server. These notifications carry the PID and PPID of newly created
processes (and of ones that died) and allow cgroupfs to implement the
cgroup semantics. This information could also be obtained by
polling the proc server and diffing the results, so it should not
be necessary to restrict the usage of this interface.
3. A proc server running in a sub-hurd can register with the topmost
proc server for new task notifications, just as the topmost proc
server registers with the kernel. The proc servers only need to be a
little bit sub-hurd aware: because of the robust parental
relationship of the tasks (*not* processes) provided by the kernel,
the topmost proc server can track which sub-hurd a task belongs to
and notify the appropriate proc server.
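Point 2 argues that the notifications reveal nothing that polling and
diffing would not. A minimal sketch of such a diff in C follows. The
names toy_diff_pids and struct toy_event are hypothetical, and the
sketch assumes that each poll yields a sorted, duplicate-free array of
PIDs; it does not talk to a real proc server.

```c
#include <assert.h>
#include <stddef.h>

enum toy_event_kind { TOY_CREATED, TOY_DIED };

struct toy_event
{
  enum toy_event_kind kind;
  int pid;
};

/* Compare two PID snapshots, both sorted ascending without duplicates,
   and write one event per PID that appears in only one of them.
   Returns the number of events written (at most n_old + n_new).  */
size_t
toy_diff_pids (const int *old, size_t n_old,
               const int *new_, size_t n_new,
               struct toy_event *out, size_t cap)
{
  size_t i = 0, j = 0, n = 0;
  while (i < n_old || j < n_new)
    {
      if (j == n_new || (i < n_old && old[i] < new_[j]))
        {
          /* Present before, missing now.  */
          assert (n < cap);
          out[n++] = (struct toy_event) { TOY_DIED, old[i++] };
        }
      else if (i == n_old || new_[j] < old[i])
        {
          /* Missing before, present now.  */
          assert (n < cap);
          out[n++] = (struct toy_event) { TOY_CREATED, new_[j++] };
        }
      else
        {
          /* Same PID in both snapshots: no event.  */
          i++;
          j++;
        }
    }
  return n;
}
```

A poller built this way can miss a process that is created and dies
between two polls, so the notifications are still the more reliable
source; the point is only that they expose no additional information.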
Implementation
==============
I've created a proof of concept implementation for points 1. and
2. I'll send it as follow-ups to this mail. It contains:
* A general purpose notification library libhurdnotify.
* A port of /hurd/init to libhurdnotify.
* The new process notifications.
* New task notifications in gnumach. The proc server registers for
those and they arrive, though nothing useful is done with them yet.
The cgroupfs repository can be found here:
http://darnassus.sceen.net/gitweb/teythoon/cgroupfs.git/
The state of cgroupfs is described in my last blog post:
https://teythoon.cryptobitch.de/posts/cgroupfs-is-as-cgroupy-as-it-gets/
I appreciate your input,
Justus