From: Daniel P. Berrangé
Subject: Re: [PATCH v2 2/4] virtio-scsi: default num_queues to -smp N
Date: Mon, 3 Feb 2020 10:57:44 +0000
User-agent: Mutt/1.13.3 (2020-01-12)

On Mon, Feb 03, 2020 at 11:25:29AM +0100, Sergio Lopez wrote:
> On Thu, Jan 30, 2020 at 10:52:35AM +0000, Stefan Hajnoczi wrote:
> > On Thu, Jan 30, 2020 at 01:29:16AM +0100, Paolo Bonzini wrote:
> > > On 29/01/20 16:44, Stefan Hajnoczi wrote:
> > > > On Mon, Jan 27, 2020 at 02:10:31PM +0100, Cornelia Huck wrote:
> > > >> On Fri, 24 Jan 2020 10:01:57 +0000
> > > >> Stefan Hajnoczi <address@hidden> wrote:
> > > >>> @@ -47,10 +48,15 @@ static void vhost_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> > > >>>  {
> > > >>>      VHostSCSIPCI *dev = VHOST_SCSI_PCI(vpci_dev);
> > > >>>      DeviceState *vdev = DEVICE(&dev->vdev);
> > > >>> -    VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
> > > >>> +    VirtIOSCSIConf *conf = &dev->vdev.parent_obj.parent_obj.conf;
> > > >>> +
> > > >>> +    /* 1:1 vq to vcpu mapping is ideal because it avoids IPIs */
> > > >>> +    if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) {
> > > >>> +        conf->num_queues = current_machine->smp.cpus;
> > > >> This now maps the request vqs 1:1 to the vcpus. What about the fixed
> > > >> vqs? If they don't really matter, amend the comment to explain that?
> > > > The fixed vqs don't matter.  They are typically not involved in the data
> > > > path, only the control path where performance doesn't matter.
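
For context, a rough sketch of the virtio-scsi virtqueue layout as the
virtio spec defines it (the names below are illustrative, not taken
from the patch): only the request queues carry I/O, which is why they
are the only ones worth scaling with the vcpu count.

    enum {
        VQ_CONTROL      = 0,  /* task management / control requests */
        VQ_EVENT        = 1,  /* hotplug and other device events */
        VQ_REQUEST_BASE = 2,  /* queues 2..(num_queues + 1) carry I/O */
    };
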
> > > 
> > > Should we put a limit on the number of vCPUs?  For anything above ~128
> > > the guest is probably not going to be disk or network bound.
> > 
> > Michael Tsirkin pointed out there's a hard limit of VIRTIO_QUEUE_MAX
> > (1024).  We need to at least stay under that limit.
> > 
> > Should the guest have >128 virtqueues?  Each virtqueue requires guest
> > RAM and 2 host eventfds.  Eventually these resource requirements will
> > become a scalability problem, but how do we choose a hard limit and what
> > happens to guest performance above that limit?
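
To make the arithmetic concrete, here is a minimal sketch (a standalone
illustration, not the patch itself) of how an automatic default could be
clamped so the two fixed queues plus one request queue per vcpu stay
under the transport limit; the constants mirror QEMU's values but are
restated here as assumptions so the snippet stands alone.

    #include <stdio.h>

    #define VIRTIO_QUEUE_MAX      1024  /* per-device hard limit (assumed value) */
    #define VIRTIO_SCSI_FIXED_VQS 2     /* control + event queues (assumed) */

    /* 1:1 request-vq to vcpu mapping, capped so the total stays legal */
    static unsigned int auto_num_queues(unsigned int smp_cpus)
    {
        unsigned int max_req = VIRTIO_QUEUE_MAX - VIRTIO_SCSI_FIXED_VQS;

        return smp_cpus < max_req ? smp_cpus : max_req;
    }

    int main(void)
    {
        printf("%u\n", auto_num_queues(8));     /* 8: small guests unchanged */
        printf("%u\n", auto_num_queues(2048));  /* 1022: clamped below the limit */
        return 0;
    }
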
> 
> From the UX perspective, I think it's safer to use a rather low upper
> limit for the automatic configuration.
> 
> Users of large VMs (>=32 vCPUs) aiming for optimal performance already
> need to manually tune (or rely on software to do that for them) other
> aspects of the VM, like vNUMA, IOThreads and CPU pinning, so I don't
> think we should focus on this group.

Whether they're tuning manually, or relying on software to tune for
them, we (QEMU maintainers) still need to provide credible guidance
on what to do about tuning for large CPU counts. Without clear info
from QEMU, it just descends into hearsay and guesswork, both of which
leave QEMU looking bad.

So I think we need to, at the very least, make a clear statement here
about what tuning approach should be applied when the vCPU count gets
high, and probably even apply that as the default out-of-the-box
approach.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



