
Re: [Qemu-devel] [PATCH] net: add raw backend


From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH] net: add raw backend
Date: Wed, 15 Jul 2009 23:06:21 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

Jamie Lokier wrote:
> Or Gerlitz wrote:
>> Jamie Lokier wrote:
>>> The problem is simply that what the guest sends goes out on the network
>>> and is not looped back to the host network stack, and vice versa. So if your 
>>> host is 192.168.1.1 and is running a DNS server (say), and the guest is 
>>> 192.168.1.2, when the guest sends queries to 192.168.1.1 the host won't 
>>> see those queries.  Same if you're running an FTP server on the host and 
>>> the guest wants to connect to it, etc. It also means multiple guests can't 
>>> see each other, for the same reason. So it's much less useful than 
>>> bridging, where the guests and host can all see each other and connect to 
>>> each other.
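
For readers coming to the thread cold: the raw backend under discussion
binds a packet socket directly to the NIC, and frames written to it go
straight out on the wire without passing through the host IP stack, which
is exactly why the host at 192.168.1.1 never sees the guest's queries. A
minimal sketch of opening such a socket in C; the interface name eth0 is
an assumption, and error handling is trimmed:

    #include <arpa/inet.h>       /* htons */
    #include <linux/if_ether.h>  /* ETH_P_ALL */
    #include <linux/if_packet.h> /* struct sockaddr_ll */
    #include <net/if.h>          /* if_nametoindex */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>

    int main(void)
    {
        struct sockaddr_ll sll;
        unsigned char frame[2048];
        ssize_t n;

        /* A raw socket bound to the physical NIC: frames sent here hit
         * the wire directly and are never looped back to the host's
         * own IP stack. Requires CAP_NET_RAW, i.e. root. */
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0) {
            perror("socket");
            return 1;
        }

        memset(&sll, 0, sizeof(sll));
        sll.sll_family   = AF_PACKET;
        sll.sll_protocol = htons(ETH_P_ALL);
        sll.sll_ifindex  = if_nametoindex("eth0");
        if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
            perror("bind");
            return 1;
        }

        n = recv(fd, frame, sizeof(frame), 0); /* one raw Ethernet frame */
        printf("got %zd bytes\n", n);
        return 0;
    }
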
>> I wasn't sure whether your example refers to the case where 
>> networking uses a bridge or NAT. If it's a bridge, then through which 
>> bridge interface does the packet reach the host stack? Say you have a 
>> bridge whose attached interfaces are tap1(VM1), tap2(VM2) and eth0(NIC); 
>> in your example, did you mean that the host IP address is assigned to the 
>> bridge interface? Or were you referring to a NAT-based scheme?
> 
> When using a bridge, you set the IP address on the bridge itself (for
> example, br0).  DHCP runs on the bridge itself, as does the rest of
> the Linux host stack, although you can use raw sockets on the other
> interfaces.
> 
> But reading and controlling the hardware is done on the interfaces.
> 
> So if you have some program like NetworkManager which checks if you
> have a wire plugged into eth0, it has to read eth0 to get the wire
> status, but it has to run DHCP on br0.
> 
> Those programs don't generally have that option, which makes bridges
> difficult to use for VMs in a transparent way.
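
To make the split concrete: a tool in NetworkManager's position has to
read link state from the physical interface while running DHCP on the
bridge. A minimal sketch in C, assuming the names used above (eth0
carrying the wire, br0 carrying the IP) and the standard Linux sysfs
layout:

    #include <stdio.h>

    /* Report whether a wire is plugged into the given interface.
     * Note that we must ask eth0, even though all IP configuration,
     * DHCP included, has to happen on br0. */
    static int carrier_up(const char *ifname)
    {
        char path[128];
        int up = -1;
        FILE *f;

        snprintf(path, sizeof(path), "/sys/class/net/%s/carrier", ifname);
        f = fopen(path, "r");
        if (!f)
            return -1;            /* no such interface, or it is down */
        if (fscanf(f, "%d", &up) != 1)
            up = -1;
        fclose(f);
        return up;                /* 1 = link up, 0 = no carrier */
    }

    int main(void)
    {
        printf("eth0 carrier: %d\n", carrier_up("eth0"));
        return 0;
    }
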
> 
> I wasn't referring to NAT, but you can use NAT with a bridge on Linux;
> it's called brouting :-)
> 
>>> Unfortunately, bridging is a pain to set up, if your host has any 
>>> complicated or automatic network configuration already.
> 
>> As you said, bridging requires more configuration
> 
> A bridge is quite simple to configure.  Unfortunately, because Linux
> requires all the IP configuration to live on the bridge device, but
> network device control to stay on the underlying network device,
> bridges don't work well with automatic configuration tools.
> 
> If you could apply host IP configuration to the network device and
> still have a bridge, that would be perfect.  You would just create
> br0, add tap1(VM1), tap2(VM2) and eth0(NIC), and everything would work
> perfectly.
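
To back up "quite simple": the whole setup described above fits in a few
ioctls, which is essentially what brctl does under the hood. A minimal C
sketch, assuming br0, tap1, tap2 and eth0 as above, that the tap devices
already exist, and with error handling trimmed; it needs root:

    #include <linux/sockios.h>  /* SIOCBRADDBR, SIOCBRADDIF */
    #include <net/if.h>         /* struct ifreq, if_nametoindex */
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>

    /* Attach one interface to the bridge, as "brctl addif" does. */
    static int bridge_addif(int fd, const char *br, const char *port)
    {
        struct ifreq ifr;

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, br, IFNAMSIZ - 1);
        ifr.ifr_ifindex = if_nametoindex(port);
        return ioctl(fd, SIOCBRADDIF, &ifr);
    }

    int main(void)
    {
        int fd = socket(AF_LOCAL, SOCK_STREAM, 0);

        ioctl(fd, SIOCBRADDBR, "br0");   /* brctl addbr br0 */
        bridge_addif(fd, "br0", "tap1"); /* VM1 */
        bridge_addif(fd, "br0", "tap2"); /* VM2 */
        bridge_addif(fd, "br0", "eth0"); /* the NIC */
        /* IP configuration (DHCP etc.) must then happen on br0. */
        return 0;
    }
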
> 
>> but no less important, the performance (packets per second and CPU
>> utilization) one can get with bridge+tap is much lower than what you
>> get with the raw mode approach.
> 
> Have you measured it?
> 
>> All in all, it's clear that with this approach VM/VM and VM/Host
>> communication would have to get switched either at the NIC (e.g.
>> SR-IOV capable NICs supporting a virtual bridge) or at an external
>> switch and make a U-turn.
> 
> Unfortunately that's usually impossible.  Most switches don't do
> U-turns, and a lot of simple networks don't have any switches except a
> home router.
> 
>> There are a bunch of reasons why people would 
>> like to do that, among them a performance boost,
> 
> No, it makes performance _much_ worse if packets leave the
> host, do a U-turn and come back on the same link.  Much better to use
> a bridge inside the host.  Probably ten times faster, because the host's
> internal networking is much faster than a typical gigabit link :-)
> 
>> the ability to shape, 
>> manage and monitor VM/VM traffic in external switches and more.
> 
> That could be useful, but I think it's probably quite unusual for
> someone to want to shape traffic between a VM and its own host.  Also,
> if you want to do that, you can do it inside the host.
> 
> Sometimes it would be useful to send it outside the host and U-turn,
> but not very often; only for diagnostics, I would think.  And even that
> can be done with Linux bridges, using VLANs :-)
> 
>>> It would be really nice to find a way which has the advantages of both.  
>>> Either by adding a different bridging mode to Linux, where host interfaces 
>>> can be configured for IP and the bridge hangs off the host interface, or 
>>> by a modified tap interface, or by an alternative
>>> pcap/packet-like interface which forwards packets in a similar way to 
>>> bridging.  
> 
>> It seems that this will not yield the performance improvement we can 
>> get by going directly to the NIC.
> 
> If you don't need any host<->VM networking, maybe a raw packet socket
> is faster.
> 
> But are you sure it's faster?
> I'd want to see measurements before I believe it.
> 
> If you need any host<->VM networking, most of the time the packet
> socket isn't an option at all.  Not many switches will 'U-turn'
> packets as you suggest.

FWIW, the fastest local VM<->VM bridge I've happened to measure so far
was using qemu's -net socket,listen/connect, i.e. a plain local IP or
Unix domain socket between two qemu instances. No tap devices, no
in-kernel bridges involved. But this picture may change once we have
some in-kernel virtio-net backend.
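
For reference, roughly what such a setup looks like on the command line;
the image names are placeholders, and the second NIC is given a distinct
MAC because both instances would otherwise use the same default:

    qemu -hda vm1.img -net nic -net socket,listen=:1234
    qemu -hda vm2.img -net nic,macaddr=52:54:00:12:34:57 \
         -net socket,connect=127.0.0.1:1234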

Jan
