qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/3] vl: sanity check cpu topology


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH 2/3] vl: sanity check cpu topology
Date: Tue, 11 Nov 2014 13:48:00 -0200
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Nov 11, 2014 at 03:37:11PM +0100, Andrew Jones wrote:
> On Tue, Nov 11, 2014 at 10:41:00AM -0200, Eduardo Habkost wrote:
> > On Fri, Nov 07, 2014 at 05:04:39PM +0100, Andrew Jones wrote:
> > > smp_parse allows partial or complete cpu topology to be given.
> > > In either case there may be inconsistencies in the input which
> > > are currently not sounding any alarms. In some cases the input
> > > is even being silently corrected. We shouldn't do this. Add
> > > warnings when input isn't adding up right, and even abort when
> > > the complete cpu topology has been input, but isn't correct.
> > > 
> > > Signed-off-by: Andrew Jones <address@hidden>
> > 
> > So, we are fixing bugs and changing behavior on two different cases
> > here:
> > 
> > 1) when all options are provided and they aren't enough for smp_cpus;
> > 2) when one option was missing, but the existing calculation was
> >    incorrect because of division truncation.
> 
>   3) when threads were provided, but incorrect, we silently changed
>      it. I thought you wanted to fix this one right now too.

True. When I described (1) I was seeing (3) as a subset of it, as
threads is silently changed only if core and sockets were also provided
and cores*sockets*threads != smp_cpus.

So, yes, I want to fix (1) and (3) on 2.2. I am not sure about (2).

> 
> > 
> > I don't think we need to keep compatibility on (1) because the user is
> > obviously providing an invalid configuration. That's why I suggested we
> > implemented it in 2.2. And it is safer because we won't be silently
> > changing behavior: QEMU is going to abort and the mistake will be easily
> > detected.
> > 
> > But (2) is fixing a QEMU bug, not user error. The user may be unaware of
> > the bug, and will get a silent ABI change once upgrading to a newer
> > QEMU.
> 
> We can keep it rounding down, unless the result is zero, as the current
> code does. How about keeping the new warning? Nah, let's drop it. Who
> actually cares about warnings anyway...

A warning would be interesting for (2), maybe. I just would like to have
it addressed in a separate patch, so we can fix (3) and (1) in 2.2
before we decide about (2).

> 
> > 
> > I suggest fixing only (1) by now and keeping the behavior for (2) on
> > QEMU 2.2. Something like:
> > 
> >     if (sockets == 0) {
> >         /* keep existing code for sockets == 0 */
> >     } else if (cores == 0) {
> >         /* keep existing code for cores == 0 */
> >     } else if (threads == 0) {
> >         /* keep existing code for threads == 0 */
> 
> This doesn't exist with current code. Adding an 'if (threads == 0)'
> case is fix (3).

Yes, I was talking about the existing code that's inside the "else"
branch (and would change threads silently even if it was already set),
that would fix (3) too.

> 
> >     } else {
> >         /* new code: */
> >         if (sockets * cores * threads < cpus) {
> >             fprintf(stderr, "cpu topology: error: "
> >                     "sockets (%u) * cores (%u) * threads (%u) < smp_cpus 
> > (%u)\n",
> >                     sockets, cores, threads, cpus);
> >             exit(1);
> >         }
> >     }
> > 
> > 
> 
> Below is a v2 I can post if it looks good to you.
> 
> From: Andrew Jones <address@hidden>
> Date: Fri, 7 Nov 2014 15:45:07 +0100
> Subject: [PATCH v2] vl: sanity check cpu topology
> 
> smp_parse allows partial or complete cpu topology to be given.
> In either case there may be inconsistencies in the input which
> are currently not sounding any alarms. In some cases the input
> is even being silently corrected. Stop silently adjusting input
> and abort when the complete cpu topology has been input, but
> isn't correct.

I don't think that's accurate: you are not aborting only when the
complete CPU topology has been input, but also when QEMU automatic
calculation is incorrect due to truncation.

> 
> Signed-off-by: Andrew Jones <address@hidden>
> ---
>  vl.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index 9d9855092ab4a..e686fd21e266f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1288,16 +1288,39 @@ static void smp_parse(QemuOpts *opts)
>              if (cores == 0) {
>                  threads = threads > 0 ? threads : 1;
>                  cores = cpus / (sockets * threads);
> -            } else {
> -                threads = cpus / (cores * sockets);
> +                if (cpus % (sockets * threads)) {
> +                    /* The calculation resulted in a fractional number, so we
> +                     * need to adjust it. The below adjustment is wrong, it
> +                     * should be '+= 1', but we need to keep it this way for
> +                     * compatibility.
> +                     */
> +                    cores = cores > 0 ? cores : 1;
> +                }
> +            } else if (threads == 0) {
> +                threads = cpus / (sockets * cores);
> +                if (cpus % (sockets * cores)) {
> +                    /* The calculation resulted in a fractional number, so we
> +                     * need to adjust it. The below adjustment is wrong, it
> +                     * should be '+= 1', but we need to keep it this way for
> +                     * compatibility.
> +                     */
> +                    threads = threads > 0 ? threads : 1;
> +                }

You are fixing a (subset of) 2 while fixing 1 and 3, again. Can we
address (2) in a separate patch?

Untested patch that fixes only (1) & (3) below:

Signed-off-by: Eduardo Habkost <address@hidden>
---
diff --git a/vl.c b/vl.c
index f4a6e5e..536c7d3 100644
--- a/vl.c
+++ b/vl.c
@@ -1284,13 +1284,17 @@ static void smp_parse(QemuOpts *opts)
             if (cpus == 0) {
                 cpus = cores * threads * sockets;
             }
-        } else {
-            if (cores == 0) {
-                threads = threads > 0 ? threads : 1;
-                cores = cpus / (sockets * threads);
-            } else {
-                threads = cpus / (cores * sockets);
-            }
+        } else if (cores == 0) {
+            threads = threads > 0 ? threads : 1;
+            cores = cpus / (sockets * threads);
+        } else if (threads ==0) {
+            threads = cpus / (cores * sockets);
+        } else if (sockets * cores * threads < cpus) {
+            fprintf(stderr, "cpu topology: error: "
+                    "sockets (%u) * cores (%u) * threads (%u) < "
+                    "smp_cpus (%u)\n",
+                    sockets, cores, threads, cpus);
+            exit(1);
         }
 
         max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);


>              }
>          }
>  
> +        if (sockets * cores * threads < cpus) {
> +            fprintf(stderr, "cpu topology: error: "
> +                    "sockets (%u) * cores (%u) * threads (%u) < smp_cpus 
> (%u)\n",
> +                    sockets, cores, threads, cpus);
> +            exit(1);
> +        }
> +
>          max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);
>  
>          smp_cpus = cpus;
> -        smp_cores = cores > 0 ? cores : 1;
> -        smp_threads = threads > 0 ? threads : 1;
> +        smp_cores = cores;
> +        smp_threads = threads;
>  
>      }
>  
> -- 
> 1.9.3
> 


-- 
Eduardo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]