[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RFC: linux-user: preserving argv[0] of the original binary in context of

From: Michael Tokarev
Subject: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
Date: Sun, 14 Feb 2021 18:25:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1


As known for a long time, qemu's linux-user, when invoked in context of 
binfmt-misc mechanism,
does not preserve the original argv[0] element, so some software which relies 
on argv[0] is
not functioning under qemu-user.  When run this way, argv[0] of the program 
being run under
qemu-user points to this qemu-user binary, instead of being what has been used 
to spawn the
original binary.

There's an interpreter flag in binfmt handling in recent kernels, P, or 
preserve, which meant
to pass 3 extra first arguments to the binfmt interpeter, - namely, the path to 
itself, the argv[0] as used when spawning the original binary, and the actual 
path of the
said binary. But qemu-user/main does not handle this situation, - it should be 
prepared for
this unusual way of being invoked.

There are several hackish solutions exists on this theme used by downstreams, 
which introduces
a wrapper program especially for binfmt registration and nothing else, uses 
this P flag, and
uses -argv0 qemu-user argument to pass all the necessary information to 

But these wrappers introduce a different issue: since the wrapper executes the 
qemu binary,
we can't use F binfmt-misc flag anymore without copying the qemu-user binary 
inside any
foreign chroot being used with it.

So the possible solution is to made qemu-user aware of this in-kernel binfmt 
which I implemented for Debian for now, as a guinea pig :)

Since the original problem is the proper handling of programs which depend on 
their own
name in argv[0], the proposed solution is also based on the program name, - 
this time
it is the name under which qemu-user binary is called.

I introduced a special name for qemu-user binaries to be used _only_ for binfmt 
This is, in my case, /usr/libexec/qemu-binfmt/foo-binfmt-P - where "foo" is the 
name, and "-binfmt-P" is the literal suffix. This name is just a (sym)link to 
- just an alternative name for qemu-foo, nothing more.

And added a patch for linux-user/main.c which checks suffix of its argv[0], and 
if it ends
up on the literal "-binfmt-P", we assume we're being called from in-kernel 
subsystem with the P flag enabled, which means first 3 args are: our own name, 
the original
argv[0], and the intended binary's path, and the rest are the arguments for the 
binary to

At first I thought it is a hackish approach, and mentioned that in a comment in 
the code,
but the more I think about it, the more I like it, and the more it makes sense.

Here's the patch I used in Debian (it is not intended for upstream for now), - 
for comments,
what do you think? At least it seems like a good step in the right direction, 
finally.. :)

And we have another issue still, in the same field. Some programs executes 
itself by using
/proc/self/exe. This does not work under linux-user too, since this link, 
again, points to
the qemu binary, not the original binary being run. But this is a different 



Subject: [PATCH, HACK]: linux-user: handle binfmt-misc P flag as a separate exe 
From: Michael Tokarev <mjt@tls.msk.ru>
Date: Sat, 13 Feb 2021 13:57:52 +0300

A hackish way to distinguish the case when qemu-user binary is executed
using in-kernel binfmt-misc subsystem with P flag (preserve argv).
We register binfmt interpreter under name 
(which is just a symlink to ../../bin/qemu-foo), and if run like that,
qemu-user binary will "know" it should interpret argv[1] & argv[2]
in a special way.

diff --git a/linux-user/main.c b/linux-user/main.c
index 24d1eb73ad..5596dab9be 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -560,6 +560,27 @@ static int parse_args(int argc, char **argv)

+    /* HACK alert.
+     * when run as an interpreter using kernel's binfmt-misc mechanism,
+     * we have to know where are we (our own binary), where's the binary being 
+     * and what it's argv[0] element.
+     * Only with the P interpreter flag kernel passes all 3 elements as first 
3 argv[],
+     * but we can't distinguish if we were run with or without this P flag.
+     * So we register a special name with binfmt-misc system, a name which 
ends up
+     * in "-binfmt-P", and if our argv[0] ends up with that, we assume we were 
+     * from kernel's binfmt with P flag and our first 3 args are from kernel.
+     */
+    if (strlen(argv[0]) > sizeof("binfmt-P") &&
+        strcmp(argv[0] + strlen(argv[0]) - sizeof("binfmt-P"), "-binfmt-P") == 
0) {
+        if (argc < 3) {
+            (void) fprintf(stderr, "qemu: %s has to be run using kernel binfmt-misc 
subsystem\n", argv[0]);
+            exit(EXIT_FAILURE);
+        }
+        handle_arg_argv0(argv[1]);
+        exec_path = argv[2];
+        return 2;
+    }
     optind = 1;
     for (;;) {
         if (optind >= argc) {

reply via email to

[Prev in Thread] Current Thread [Next in Thread]