[Qemu-devel] Re: [PATCH-RFC 13/13] virtio-net: connect to vhost net back
From: Michael S. Tsirkin
Subject: [Qemu-devel] Re: [PATCH-RFC 13/13] virtio-net: connect to vhost net backend
Date: Mon, 25 Jan 2010 22:27:11 +0200
User-agent: Mutt/1.5.19 (2009-01-05)
On Mon, Jan 25, 2010 at 01:58:08PM -0600, Anthony Liguori wrote:
> On 01/11/2010 11:23 AM, Michael S. Tsirkin wrote:
>> start/stop backend on driver start/stop
>>
>> Signed-off-by: Michael S. Tsirkin <address@hidden>
>> ---
>> hw/virtio-net.c | 40 ++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 40 insertions(+), 0 deletions(-)
>>
>> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
>> index c2a389f..99169e1 100644
>> --- a/hw/virtio-net.c
>> +++ b/hw/virtio-net.c
>> @@ -17,6 +17,7 @@
>> #include "net/tap.h"
>> #include "qemu-timer.h"
>> #include "virtio-net.h"
>> +#include "vhost_net.h"
>>
>> #define VIRTIO_NET_VM_VERSION 11
>>
>> @@ -47,6 +48,7 @@ typedef struct VirtIONet
>> uint8_t nomulti;
>> uint8_t nouni;
>> uint8_t nobcast;
>> + uint8_t vhost_started;
>> struct {
>> int in_use;
>> int first_multi;
>> @@ -114,6 +116,10 @@ static void virtio_net_reset(VirtIODevice *vdev)
>> n->nomulti = 0;
>> n->nouni = 0;
>> n->nobcast = 0;
>> + if (n->vhost_started) {
>> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
>> + n->vhost_started = 0;
>> + }
>>
>> /* Flush any MAC and VLAN filter table state */
>> n->mac_table.in_use = 0;
>> @@ -820,6 +826,36 @@ static NetClientInfo net_virtio_info = {
>> .link_status_changed = virtio_net_set_link_status,
>> };
>>
>> +static void virtio_net_set_status(struct VirtIODevice *vdev)
>> +{
>> + VirtIONet *n = to_virtio_net(vdev);
>> + if (!n->nic->nc.peer) {
>> + return;
>> + }
>> + if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) {
>> + return;
>> + }
>> +
>> + if (!tap_get_vhost_net(n->nic->nc.peer)) {
>> + return;
>> + }
>> + if (!!n->vhost_started == !!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
>> + return;
>> + }
>> + if (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) {
>> + int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev);
>> + if (r < 0) {
>> + fprintf(stderr, "unable to start vhost net: "
>> + "falling back on userspace virtio\n");
>> + } else {
>> + n->vhost_started = 1;
>> + }
>> + } else {
>> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
>> + n->vhost_started = 0;
>> + }
>> +}
>> +
>>
>
> This function does a number of bad things. It makes virtio-net have
> specific knowledge of backends (like tap) and then has virtio-net pass
> device specific state (vdev) to a network backend.
>
> Ultimately, the following things need to happen:
>
> 1) when a driver is ready to begin operating:
> a) virtio-net needs to tell vhost the location of the ring in physical
> memory
> b) virtio-net needs to tell vhost about any notification it receives
> (allowing kvm to shortcut userspace)
> c) virtio-net needs to tell vhost about which irq is being used
> (allowing kvm to shortcut userspace)
>
> What this suggests is that we need an API for the network backends to do
> the following:
>
> 1) probe whether ring passthrough is supported
> 2) set the *virtual* address of the ring elements
> 3) hand the backend some sort of notification object for sending and
> receiving notifications
> 4) stop ring passthrough
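The four-point backend API proposed above could be sketched as an ops table the frontend drives without knowing what sits behind it. This is purely illustrative; the names (`RingPassthroughOps`, `DummyBackend`, and the functions below) are hypothetical, not actual QEMU code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table mirroring the four points above. */
typedef struct RingPassthroughOps {
    int (*probe)(void *backend);                    /* 1) passthrough supported? */
    int (*set_ring)(void *backend, void *ring_va);  /* 2) virtual address of ring */
    int (*set_notifier)(void *backend, int fd);     /* 3) notification object */
    void (*stop)(void *backend);                    /* 4) stop ring passthrough */
} RingPassthroughOps;

/* Dummy backend used only to exercise the interface. */
typedef struct DummyBackend {
    int started;
    void *ring;
    int notify_fd;
} DummyBackend;

static int dummy_probe(void *opaque)
{
    (void)opaque;
    return 1; /* pretend passthrough is available */
}

static int dummy_set_ring(void *opaque, void *ring_va)
{
    DummyBackend *b = opaque;
    b->ring = ring_va;
    b->started = 1;
    return 0;
}

static int dummy_set_notifier(void *opaque, int fd)
{
    DummyBackend *b = opaque;
    b->notify_fd = fd;
    return 0;
}

static void dummy_stop(void *opaque)
{
    DummyBackend *b = opaque;
    b->started = 0;
}

static const RingPassthroughOps dummy_ops = {
    .probe = dummy_probe,
    .set_ring = dummy_set_ring,
    .set_notifier = dummy_set_notifier,
    .stop = dummy_stop,
};

/* Frontend side: no backend-specific knowledge, only the ops table. */
static int frontend_start(const RingPassthroughOps *ops, void *backend,
                          void *ring_va, int fd)
{
    if (!ops->probe(backend)) {
        return -1;
    }
    ops->set_ring(backend, ring_va);
    ops->set_notifier(backend, fd);
    return 0;
}
```

The point of the sketch is the direction of dependency: the frontend hands addresses and a notifier down through a generic interface, rather than the backend reaching into device state.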
Look at how vnet_hdr is set up: the frontend probes the backend and has
'if (backend has vnet header) blabla' logic. vhost is really very similar,
so I would like to do it in the same way.
Generally I do not believe in abstractions that have only one
implementation behind them: you only know how to abstract an interface
after you have more than one implementation. So whoever writes another
frontend that can use vhost will be in a better position to add
infrastructure to abstract both that new thing and virtio.
> vhost should not need any direct knowledge of the device. This
> interface should be enough to communicate the required data. I think
> the only bit that is a little fuzzy and perhaps non-obvious for the
> current patches is the notification object. I would expect it to look
> something like:
>
> typedef struct IOEvent {
>     int type;
>     void (*notify)(IOEvent *);
>     void (*on_notify)(IOEvent *, void (*cb)(IOEvent *, void *));
> } IOEvent;
>
> And then we would have:
>
> typedef struct KVMIOEvent {
>     IOEvent event = {.type = KVM_IO_EVENT};
>     int fd;
> } KVMIOEvent;
>
> if (kvm_enabled()) in virtio-net, we would allocate a KVMIOEvent for the
> PIO notification and hand that to vhost. vhost would check event.type
> and if it's KVM_IO_EVENT, downcast and use that to get at the file
> descriptor.
Since we don't have any other IOEvents, I just put the fd
in the generic structure directly. If other types surface
we'll see how to generalize it.
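The check-the-tag-then-downcast pattern discussed above relies on embedding the base struct as the first member, so a pointer to it can be cast back to the containing type. A minimal sketch, with all names hypothetical:

```c
#include <assert.h>

/* Hypothetical tagged event types; not actual QEMU/KVM identifiers. */
enum { KVM_IO_EVENT = 1 };

typedef struct IOEvent {
    int type;  /* tag identifying the concrete event type */
} IOEvent;

typedef struct KVMIOEvent {
    IOEvent event;  /* must be the first member so the downcast is valid */
    int fd;         /* eventfd for kernel-side notification */
} KVMIOEvent;

/* Consumer side (vhost in the discussion): check the tag, then downcast. */
static int event_get_fd(IOEvent *ev)
{
    if (ev->type == KVM_IO_EVENT) {
        return ((KVMIOEvent *)ev)->fd;
    }
    return -1; /* unknown event type: no fd available */
}
```

As the reply notes, with only one concrete event type in existence, storing the fd directly in the generic structure avoids this indirection entirely.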
> The KVMIOEvent should be created, not in the set status callback, but in
> the PCI map callback. And in the PCI map callback, cpu_single_env
> should be passed to a kvm specific function to create this KVMIOEvent
> object that contains the created eventfd() that's handed to kvm via
> ioctl.
So this specific thing does not work very well, because with irqchip we
want to bypass notification and send the irq directly from the kernel.
So I created a structure, but it does not have callbacks,
only the fd.
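The fd being passed around here is an eventfd. Its semantics are simple: writes accumulate a counter, and a read returns and resets it, which is what lets KVM signal vhost without a userspace round trip. A minimal Linux sketch of just the fd semantics (the KVM ioctl plumbing, e.g. KVM_IOEVENTFD, is omitted):

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Producer side: signal the eventfd by adding to its counter. */
static int notify(int efd)
{
    uint64_t one = 1;
    return write(efd, &one, sizeof(one)) == sizeof(one) ? 0 : -1;
}

/* Consumer side: read returns the accumulated count and resets it
 * (in the default, non-semaphore mode). */
static uint64_t consume(int efd)
{
    uint64_t count = 0;
    if (read(efd, &count, sizeof(count)) != sizeof(count)) {
        return 0;
    }
    return count;
}
```

Two notifications before a read coalesce into a single wakeup with a count of 2, which is exactly the batching behavior that makes eventfd a good fit for ring kicks.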
> It doesn't have to be exactly like this, but the point of all of this is
> that these KVM specific mechanisms (which are really implementation
> details) should not be pervasive throughout the QEMU interfaces. There
> should also be strong separation between the vhost-net code and the
> virtio-net device.
>
> Regards,
>
> Anthony Liguori
>
I don't see the point of this last idea. vhost is a virtio accelerator,
not a generic network backend. Whoever wants to use vhost for
non-virtio frontends will have to add infrastructure for this
separation, but I do not believe it's practical or desirable.
>
>> VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>> {
>> VirtIONet *n;
>> @@ -835,6 +871,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>> n->vdev.set_features = virtio_net_set_features;
>> n->vdev.bad_features = virtio_net_bad_features;
>> n->vdev.reset = virtio_net_reset;
>> + n->vdev.set_status = virtio_net_set_status;
>> n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
>> n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx);
>> n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl);
>> @@ -864,6 +901,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
>> void virtio_net_exit(VirtIODevice *vdev)
>> {
>> VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
>> + if (n->vhost_started) {
>> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
>> + }
>>
>> qemu_purge_queued_packets(&n->nic->nc);
>>
>>