qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Virtio-fs] [PATCH for-5.1 3/3] virtiofsd: probe unshare(CLONE_FS) a


From: Stefan Hajnoczi
Subject: Re: [Virtio-fs] [PATCH for-5.1 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error
Date: Thu, 23 Jul 2020 16:19:29 +0100

On Thu, Jul 23, 2020 at 09:56:03AM -0400, Vivek Goyal wrote:
> On Thu, Jul 23, 2020 at 01:50:35PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Jul 23, 2020 at 01:46:11PM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Jul 22, 2020 at 06:03:28PM +0100, Daniel P. Berrangé wrote:
> > > > On Wed, Jul 22, 2020 at 02:02:06PM +0100, Stefan Hajnoczi wrote:
> > > > > An assertion failure is raised during request processing if
> > > > > unshare(CLONE_FS) fails. Implement a probe at startup so the problem 
> > > > > can
> > > > > be detected right away.
> > > > > 
> > > > > Unfortunately Docker/Moby does not include unshare in the seccomp.json
> > > > > list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always
> > > > > include unshare (e.g. podman is unaffected):
> > > > > https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json
> > > > > 
> > > > > Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if 
> > > > > the
> > > > > default seccomp.json is missing unshare.
> > > > > 
> > > > > Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
> > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > ---
> > > > >  tools/virtiofsd/fuse_virtio.c | 13 +++++++++++++
> > > > >  1 file changed, 13 insertions(+)
> > > > > 
> > > > > diff --git a/tools/virtiofsd/fuse_virtio.c 
> > > > > b/tools/virtiofsd/fuse_virtio.c
> > > > > index 3b6d16a041..ebeb352514 100644
> > > > > --- a/tools/virtiofsd/fuse_virtio.c
> > > > > +++ b/tools/virtiofsd/fuse_virtio.c
> > > > > @@ -949,6 +949,19 @@ int virtio_session_mount(struct fuse_session *se)
> > > > >  {
> > > > >      int ret;
> > > > >  
> > > > > +    /*
> > > > > +     * Test that unshare(CLONE_FS) works. fv_queue_worker() will 
> > > > > need it. It's
> > > > > +     * an unprivileged system call but some Docker/Moby versions are 
> > > > > known to
> > > > > +     * reject it via seccomp when CAP_SYS_ADMIN is not given.
> > > > > +     */
> > > > > +    ret = unshare(CLONE_FS);
> > > > > +    if (ret == -1 && errno == EPERM) {
> > > > > +        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_FS) failed with EPERM. 
> > > > > If "
> > > > > +                "running in a container please check that the 
> > > > > container "
> > > > > +                "runtime seccomp policy allows unshare.\n");
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > 
> > > > This describes the unshare() call as a "probe" and a "test", but that's
> > > > misleading IMHO. A "probe" / "test" implies that after it has completed,
> > > > there's no lingering side-effect, which isn't the case here.
> > > > 
> > > > This is actively changing the process' namespace environment in the
> > > > success case, and not putting it back how it was originally.
> > > > 
> > > > May be this is in fact OK, but if so I think the commit message and
> > > > comment should explain/justify what its fine to have this lingering
> > > > side-effect.
> > > > 
> > > > If we want to avoid the side-effect then we need to fork() and run
> > > > unshare() in the child, and use a check of exit status of the child
> > > > to determine the result.
> > > 
> > > Thanks for pointing this out. I'll add a comment explaining that
> > > virtiofsd is single-threaded at this point. No other threads share the
> > > file system attributes so the call has no observable side-effects.
> > 
> > Also, if we do an unshare() here, do we still need the unshare() later
> > on in the code ?
> 
> I think so. That unshare() later is to isolate one thread from other.

Yes. Each thread that calls unshare(CLONE_FS) gets its own current
working directory and other file system attributes.

The purpose of the call is to allow worker threads to change current
working directory without affecting each other.

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]