qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH V5 02/12] NUMA: split -numa option


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH V5 02/12] NUMA: split -numa option
Date: Wed, 17 Jul 2013 13:00:46 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130621 Thunderbird/17.0.7

I'm reviewing this with respect to opts-visitor usage:

On 07/17/13 11:29, Wanlong Gao wrote:
> Change -numa option like following as Paolo suggested:
>     -numa node,nodeid=0,cpus=0-1 \
>     -numa mem,nodeid=0,size=1G
> 
> This new option will make later coming memory hotplug better.
> And this new option is implemented using OptsVisitor.
> 
> Signed-off-by: Wanlong Gao <address@hidden>
> ---
>  Makefile.target         |   2 +-
>  include/sysemu/sysemu.h |   3 +
>  numa.c                  | 164 
> ++++++++++++++++++++++++++++++++++++++++++++++++
>  qemu-options.hx         |   6 +-
>  vl.c                    | 107 +++----------------------------
>  5 files changed, 182 insertions(+), 100 deletions(-)
>  create mode 100644 numa.c
> 
> diff --git a/Makefile.target b/Makefile.target
> index 9a49852..7e1fddf 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -113,7 +113,7 @@ endif #CONFIG_BSD_USER
>  #########################################################
>  # System emulator target
>  ifdef CONFIG_SOFTMMU
> -obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
> +obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
>  obj-y += qtest.o
>  obj-y += hw/
>  obj-$(CONFIG_FDT) += device_tree.o
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 3caeb66..cf8e6e5 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -132,6 +132,9 @@ extern QEMUClock *rtc_clock;
>  extern int nb_numa_nodes;
>  extern uint64_t node_mem[MAX_NODES];
>  extern unsigned long *node_cpumask[MAX_NODES];
> +extern QemuOptsList qemu_numa_opts;
> +int numa_init_func(QemuOpts *opts, void *opaque);
> +void set_numa_nodes(void);
>  
>  #define MAX_OPTION_ROMS 16
>  typedef struct QEMUOptionRom {
> diff --git a/numa.c b/numa.c
> new file mode 100644
> index 0000000..da68c4b
> --- /dev/null
> +++ b/numa.c
> @@ -0,0 +1,164 @@
> +/*
> + * QEMU System Emulator
> + *
> + * Copyright (c) 2013 Fujitsu Ltd.
> + * Author: Wanlong Gao <address@hidden>
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "sysemu/sysemu.h"
> +#include "qemu/bitmap.h"
> +#include "qapi-visit.h"
> +#include "qapi/opts-visitor.h"
> +#include "qapi/dealloc-visitor.h"
> +
> +QemuOptsList qemu_numa_opts = {
> +    .name = "numa",
> +    .implied_opt_name = "type",
> +    .head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
> +    .desc = { { 0 } } /* validated with OptsVisitor */
> +};

Looks OK.

> +
> +static int numa_node_parse_cpus(int nodenr, const char *cpus)
> +{
> +    char *endptr;
> +    unsigned long long value, endvalue;
> +
> +    /* Empty CPU range strings will be considered valid, they will simply
> +     * not set any bit in the CPU bitmap.
> +     */
> +    if (!*cpus) {
> +        return 0;
> +    }

I think this can crash, can't it? All option arguments (struct members)
you introduce in 01/12 are optional. For scalar types (boolean, int etc)
you can usually simply check the member itself (field XXXX will have
value 0 if has_XXXX is false), but strings, sub-structs etc. are
generated as pointers, and are NULL if has_XXXX is false.

IOW numa_node_parse() may pass in a NULL "cpus" if ",cpus=..." is
completely absent.

> +
> +    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
> +        goto error;
> +    }
> +    if (*endptr == '-') {
> +        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
> +            goto error;
> +        }
> +    } else if (*endptr == '\0') {
> +        endvalue = value;
> +    } else {
> +        goto error;
> +    }
> +
> +    if (endvalue >= MAX_CPUMASK_BITS) {
> +        endvalue = MAX_CPUMASK_BITS - 1;
> +        fprintf(stderr,
> +            "qemu: NUMA: A max of %d VCPUs are supported\n",
> +             MAX_CPUMASK_BITS);
> +    }
> +
> +    if (endvalue < value) {
> +        goto error;
> +    }
> +
> +    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
> +    return 0;
> +
> +error:
> +    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
> +    return -1;
> +}

The function looks OK to me otherwise.

BTW are discontiguous CPU maps unsupported on purpose?

Hm, I think that's actually supported, the user can pass

  -numa node,nodeid=1,cpus=1-2 -numa node,nodeid=1,cpus=5-6

and the bitmask will be the union.

If that's the case, opts-visitor would support the following notation
too (you'd have to turn "cpus" into a list of strings):

  -numa node,nodeid=1,cpus=1-2,cpus=5-6

But I'm not sure if that's the intent and/or if it would be worth the churn.

> +
> +static int numa_node_parse(NumaNodeOptions *opts)
> +{
> +    uint64_t nodenr;
> +    const char *cpus = NULL;
> +
> +    nodenr = opts->nodeid;
> +    if (nodenr >= MAX_NODES) {
> +        fprintf(stderr, "qemu: Max number of NUMA nodes reached: %d\n",
> +                (int)nodenr);
> +        return -1;
> +    }
> +
> +    cpus = opts->cpus;
> +    return numa_node_parse_cpus(nodenr, cpus);
> +}

"nodeid" can be negative (has type 'int' in the json, corresponding to
int64_t in generated C code). I guess casting it to uint64_t here
catches it in practice, but you could also change the type to something
unsigned in the json.

Also, what should happen if nodeid was not specified? (Nodenr will have
value 0.) If that's intended, maybe it should be documented in the json.

> +
> +static int numa_mem_parse(NumaMemOptions *opts)
> +{
> +    uint64_t nodenr, mem_size;
> +
> +    nodenr = opts->nodeid;
> +    if (nodenr >= MAX_NODES) {
> +        fprintf(stderr, "qemu: Max number of NUMA nodes reached: %d\n",
> +                (int)nodenr);

PRIu64 and no casting would be more fortunate I believe (or, if you
change the nodeid type in the json to uint32 for example, that format
specified).

> +        return -1;
> +    }
> +
> +    mem_size = opts->size;
> +    node_mem[nodenr] = mem_size;
> +
> +    return 0;
> +}
> +
> +int numa_init_func(QemuOpts *opts, void *opaque)
> +{
> +    NumaOptions *object = NULL;
> +    Error *err = NULL;
> +    int ret = 0;
> +
> +    {
> +        OptsVisitor *ov = opts_visitor_new(opts);
> +        visit_type_NumaOptions(opts_get_visitor(ov), &object, NULL, &err);
> +        opts_visitor_cleanup(ov);
> +    }
> +
> +    if (error_is_set(&err)) {
> +        fprintf(stderr, "qemu: %s\n", error_get_pretty(err));
> +        error_free(err);
> +        ret = -1;
> +        goto error;
> +    }
> +
> +    switch (object->kind) {
> +    case NUMA_OPTIONS_KIND_NODE:
> +        if (nb_numa_nodes >= MAX_NODES) {
> +            fprintf(stderr, "qemu: too many NUMA nodes\n");
> +            ret = -1;
> +            goto error;
> +        }
> +        nb_numa_nodes++;
> +        ret = numa_node_parse(object->node);
> +        break;

This assumes that nodeid values are unique.

> +    case NUMA_OPTIONS_KIND_MEM:
> +        ret = numa_mem_parse(object->mem);
> +        break;
> +    default:
> +        fprintf(stderr, "qemu: Invalid NUMA options type.\n");
> +        ret = -1;
> +        goto error;
> +    }

Small nit: this last "goto error" is useless.

> +
> +error:
> +    if (object) {
> +        QapiDeallocVisitor *dv = qapi_dealloc_visitor_new();
> +        visit_type_NumaOptions(qapi_dealloc_get_visitor(dv),
> +                               &object, NULL, NULL);
> +        qapi_dealloc_visitor_cleanup(dv);
> +    }
> +
> +    return ret;
> +}
> +

I think the general logic is OK.

> diff --git a/qemu-options.hx b/qemu-options.hx
> index 4e98b4f..7ec4486 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -95,11 +95,13 @@ specifies the maximum number of hotpluggable CPUs.
>  ETEXI
>  
>  DEF("numa", HAS_ARG, QEMU_OPTION_numa,
> -    "-numa node[,mem=size][,cpus=cpu[-cpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
> +    "-numa node[,nodeid=node][,cpus=cpu[-cpu]]\n"
> +    "-numa mem[,nodeid=node][,size=size]\n"
> +    , QEMU_ARCH_ALL)
>  STEXI
>  @item -numa @var{opts}
>  @findex -numa
> -Simulate a multi node NUMA system. If mem and cpus are omitted, resources
> +Simulate a multi node NUMA system. If @var{size} and @var{cpus} are omitted, 
> resources
>  are split equally.
>  ETEXI
>  
> diff --git a/vl.c b/vl.c
> index 25b8f2f..d3e6d8c 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1330,102 +1330,6 @@ char *get_boot_devices_list(size_t *size)
>      return list;
>  }
>  
> -static void numa_node_parse_cpus(int nodenr, const char *cpus)
> -{
> -    char *endptr;
> -    unsigned long long value, endvalue;
> -
> -    /* Empty CPU range strings will be considered valid, they will simply
> -     * not set any bit in the CPU bitmap.
> -     */
> -    if (!*cpus) {
> -        return;
> -    }
> -
> -    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
> -        goto error;
> -    }
> -    if (*endptr == '-') {
> -        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
> -            goto error;
> -        }
> -    } else if (*endptr == '\0') {
> -        endvalue = value;
> -    } else {
> -        goto error;
> -    }
> -
> -    if (endvalue >= MAX_CPUMASK_BITS) {
> -        endvalue = MAX_CPUMASK_BITS - 1;
> -        fprintf(stderr,
> -            "qemu: NUMA: A max of %d VCPUs are supported\n",
> -             MAX_CPUMASK_BITS);
> -    }
> -
> -    if (endvalue < value) {
> -        goto error;
> -    }
> -
> -    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
> -    return;
> -
> -error:
> -    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
> -    exit(1);
> -}
> -
> -static void numa_add(const char *optarg)
> -{
> -    char option[128];
> -    char *endptr;
> -    unsigned long long nodenr;
> -
> -    optarg = get_opt_name(option, 128, optarg, ',');
> -    if (*optarg == ',') {
> -        optarg++;
> -    }
> -    if (!strcmp(option, "node")) {
> -
> -        if (nb_numa_nodes >= MAX_NODES) {
> -            fprintf(stderr, "qemu: too many NUMA nodes\n");
> -            exit(1);
> -        }
> -
> -        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
> -            nodenr = nb_numa_nodes;
> -        } else {
> -            if (parse_uint_full(option, &nodenr, 10) < 0) {
> -                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
> -                exit(1);
> -            }
> -        }
> -
> -        if (nodenr >= MAX_NODES) {
> -            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
> -            exit(1);
> -        }
> -
> -        if (get_param_value(option, 128, "mem", optarg) == 0) {
> -            node_mem[nodenr] = 0;
> -        } else {
> -            int64_t sval;
> -            sval = strtosz(option, &endptr);
> -            if (sval < 0 || *endptr) {
> -                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
> -                exit(1);
> -            }
> -            node_mem[nodenr] = sval;
> -        }
> -        if (get_param_value(option, 128, "cpus", optarg) != 0) {
> -            numa_node_parse_cpus(nodenr, option);
> -        }
> -        nb_numa_nodes++;
> -    } else {
> -        fprintf(stderr, "Invalid -numa option: %s\n", option);
> -        exit(1);
> -    }
> -}
> -
>  static QemuOptsList qemu_smp_opts = {
>      .name = "smp-opts",
>      .implied_opt_name = "cpus",
> @@ -2961,6 +2865,7 @@ int main(int argc, char **argv, char **envp)
>      qemu_add_opts(&qemu_tpmdev_opts);
>      qemu_add_opts(&qemu_realtime_opts);
>      qemu_add_opts(&qemu_msg_opts);
> +    qemu_add_opts(&qemu_numa_opts);
>  
>      runstate_init();
>  
> @@ -3147,7 +3052,10 @@ int main(int argc, char **argv, char **envp)
>                  }
>                  break;
>              case QEMU_OPTION_numa:
> -                numa_add(optarg);
> +                opts = qemu_opts_parse(qemu_find_opts("numa"), optarg, 1);
> +                if (!opts) {
> +                    exit(1);
> +                }
>                  break;
>              case QEMU_OPTION_display:
>                  display_type = select_display(optarg);
> @@ -4226,6 +4134,11 @@ int main(int argc, char **argv, char **envp)
>  
>      register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL);
>  
> +    if (qemu_opts_foreach(qemu_find_opts("numa"), numa_init_func,
> +                          NULL, 1) != 0) {
> +        exit(1);
> +    }
> +
>      if (nb_numa_nodes > 0) {
>          int i;
>  
> 

Please feel free to ignore any of the above notes if you disagree with them.

Would it be OK with you if I skipped the rest of the series? :)

Thanks,
Laszlo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]