Re: Networking design proposal

bug-hurd

From:	Michal 'hramrach' Suchanek
Subject:	Re: Networking design proposal
Date:	Mon, 11 Nov 2002 17:07:07 +0100
User-agent:	Mutt/1.4i
On Wed, Nov 06, 2002 at 02:52:25PM +0100, Niels Möller wrote:
> Olivier Péningault <peningault@free.fr> writes:
> 
> > You didn't understand correctly. Layer 2 translator performs ethernet +
> > arp, not ip !
If you do not do IP in L2, how can you tell which L3 gets the packet?
> 
> I think it's unclear to talk about a "layer 2" translator. To me,
> "layer 2" is naturally an interface, not an action or a translation.
> So you need to say explicitly what the interfaces are on each side of
> the translator.
Or perhaps L2 is glue that encapsulates some L1 and presents a new interface?
> 
> From your description, my guess is that you mean that on one end, the
> translator talks to some device driver to send and receive ethernet
> frames. And on the other end, it sends and receives ip packets to the
> upper layer. If so, it's about the same thing as what I called "layer
On the other hand, it can present some interface for sending/receiving data
but add/strip/interpret the headers internally.
> 
> > __________________   _____________________  ___________________
> > | L3 translator  |   |  L3 translator    |  | L3 translator   |
> > | 192.168.1.1/32 |   | 192.168.2.1/32    |  | 2001::1/128 ;)  |
> > ------------------   ---------------------  -------------------
> >  | Mach              || Mach     | Mach       | Mach
> >  | port              || port     | port       | port
> >  | 1                 || 2 5      | 3          | 4
> > ______________________________  _______________________________
> > | L2 trans on /dev/eth0      |  | L2 trans on /dev/eth1       |
> > |Registered L3 trans:        |  |Registered L3 trans:         |
> > | + 1 192.168.1.1/32 0x800   |  | + 3 192.168.2.1/32 0x800    |
> > | + 2 192.168.2.0/24 0x800   |  | + 4 2001::1/128    0x86DD   |
> > | + 0.0.0.0/0        0x800   |  |                             |
> > ------------------------------  -------------------------------
> >                |                               |
> > -------------------------------------------------------------------
aka L1?
> > 
> >  K  E  R  N  E  L    W /    D  E  V  I  C  E    D  R  I  V  E  R  S
> > 
> > --------------------------------------------------------------------
> 
> Looks reasonable to me. I'd do some details a little differently, but
> I think it's basically right.
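
To make the registration table in the diagram concrete, here is a minimal sketch of how the L2 translator on /dev/eth0 might pick a Mach port for an incoming frame. It assumes longest-prefix matching within one ethertype, which the diagram does not state. The class and method names are hypothetical, and port 5 for the default route is a guess, since the diagram leaves that entry unnumbered.

```python
import ipaddress


class L2Demux:
    """Hypothetical table of (mach_port, prefix, ethertype) registrations."""

    def __init__(self):
        self.entries = []

    def register(self, port, prefix, ethertype):
        self.entries.append((port, ipaddress.ip_network(prefix), ethertype))

    def lookup(self, ethertype, dst):
        """Return the port registered for the most specific matching prefix."""
        addr = ipaddress.ip_address(dst)
        best = None
        for port, net, etype in self.entries:
            if etype != ethertype or net.version != addr.version:
                continue
            if addr in net and (best is None or net.prefixlen > best[1].prefixlen):
                best = (port, net)
        return None if best is None else best[0]


# Registrations copied from the eth0 box in the diagram.
eth0 = L2Demux()
eth0.register(1, "192.168.1.1/32", 0x0800)
eth0.register(2, "192.168.2.0/24", 0x0800)
eth0.register(5, "0.0.0.0/0", 0x0800)  # port number is an assumption
```

With this table a frame for 192.168.2.7 goes to port 2, and anything else carrying IPv4 falls through to the default entry.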
> 
> > > Such an icmp service makes sense if one has several independent
> > > processes doing transport that talk to the same interface. But I'm not
> > > sure that makes sense; I think it's better to have at most one
> > > transport process per ip number, to get easy management of port numbers.
> > This would require a lot of memory !
> 
> I don't understand this comment.
> 
> > > Are there any icmp messages that you can process without knowledge of
> > > transport level state?
> > No, but think about ISO-CLNP. It is a stand alone layer 3 protocol, data
> > transmission (a la ip) and control (a la icmp) are implemented in this
> > protocol. If icmp is implemented with layer 4 protocols, you won't be
> > able to implement protocols built like CLNP.
> 
> You sure can. Consider a hardware bridge that talks plain IP (e.g over
> ethernet) on one port, and IP over CLNP on another. If CLNP requires
> that icmp packets are encapsulated differently from other ip packets,
Why? They are just packets. IMHO there are two ways to push a connection
through an incompatible link. You either encapsulate the connection, wrapping
all packets in an additional header and unpacking them unchanged on the
other end, or you make a proxy that translates control packets into control
packets and data into data, both at packet level.
But I have no idea how this supports implementing icmp in layer 4. Icmp is
just an addition to ip that makes the protocol complete.
> then the bridge will have to look into the IP packets it receives and
> do the right thing. (I don't know CLNP, but that's the way it has to
> work if you want to use it with IP and interoperate with any other
> link technology). A hurd CLNP driver could do the same thing, just
> looking into the ip packets to figure out how to transmit them. And I
> don't think we need any tweaks in the interfaces to optimize for the
> CLNP case.
> 
> > I want to have ip and icmp to run together, because they provide the two
> > services of the layer 3 : data transmission, and control.
> 
> That sentence is very strange to me. To me, layer 3 is only one single
> and quite primitive service: transport of IP datagrams between nodes
> in the network. No more, no less. (And I don't think about the routing
> that layer 3 has to do under the hood as a service: A layer 3 user will
> ask layer 3 "please deliver this IP packet", it won't ask "please
> figure out the route for this packet" or "please figure out the path
> MTU to this remote address").
> 
> > If you receive an icmp packet, you'll have enough information (ip
> > header + 8 bytes of the layer 4 packet) to know to who you will
> > notify this error.
> 
> You also need to understand some of layer 4. If you have a received
> icmp packet and a bunch of clients, then you need some more
> information about your clients, like "client 1 sends tcp-packets from
> address 1.2.3.4" or "client 2 sends udp-packets from address 5.6.7.8,
> port 47".
But that is why they register their IPs with L2, isn't it? You need the 
same for an ordinary ip packet.
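
A rough sketch of that lookup, assuming the usual ICMP error layout (8-byte ICMP header, then the offending ip header plus the first 8 bytes of its payload). The client table and the function names are made up for illustration; the two entries mirror the "client 1" and "client 2" examples quoted above.

```python
import socket
import struct


def ipv4_header(src, dst, proto, payload_len):
    # Minimal 20-byte IPv4 header; the checksum is left at 0 for brevity.
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len, 0, 0, 64,
                       proto, 0, socket.inet_aton(src), socket.inet_aton(dst))


def icmp_error_origin(icmp):
    """Return (protocol, source address, source port) of the quoted datagram."""
    inner = icmp[8:]                     # skip the 8-byte ICMP header
    ihl = (inner[0] & 0x0F) * 4          # length of the quoted ip header
    proto = inner[9]
    src = socket.inet_ntoa(inner[12:16])
    sport = None
    if proto in (6, 17):                 # tcp and udp both start with the source port
        sport = struct.unpack("!H", inner[ihl:ihl + 2])[0]
    return proto, src, sport


# Hypothetical registrations; None means "any source port".
clients = {
    (6, "1.2.3.4", None): "client 1",
    (17, "5.6.7.8", 47): "client 2",
}


def client_for(icmp):
    proto, src, sport = icmp_error_origin(icmp)
    return clients.get((proto, src, sport)) or clients.get((proto, src, None))


# A port-unreachable error quoting a udp datagram sent from 5.6.7.8 port 47.
udp = struct.pack("!HHHH", 47, 53, 8, 0)
icmp_udp = struct.pack("!BBHI", 3, 3, 0, 0) + ipv4_header("5.6.7.8", "9.9.9.9", 17, len(udp)) + udp

# The same error type quoting the first 8 bytes of a tcp segment from 1.2.3.4.
tcp = struct.pack("!HHI", 1234, 80, 0)
icmp_tcp = struct.pack("!BBHI", 3, 3, 0, 0) + ipv4_header("1.2.3.4", "9.9.9.9", 6, len(tcp)) + tcp
```

The key used here (protocol, source address, source port) is the same information needed to deliver an ordinary incoming packet, only read from the quoted header instead of the outer one.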
> 
> You could have a subscription interface where clients specify to the
> layer 3 code what an icmp packet should look like in order to be
> interesting. But then it's more general to have a subscription
> interface where a client can specify what an *ip* packet should look like
> in order to be interesting. And then the layer 3 code need not know
> anything in particular about icmp.
> 
> How powerful rules do we need? I can think of at least four levels of power:
> 
> 1. IP address only. Say what ip-addresses you want packets for.
> 2. IP address and upperlevel protocol code.
> 3. IP address, upperlevel protocol code, prefix of upperlevel packet.
> 4. Regexp matched against full ip packet.
> 
> With (1), we can really only have at most one transport server for
> each ip-address. (2) is quite useless, as it's not powerful enough to
> distinguish between icmp messages for different transport servers. I
> hope (3) is probably powerful enough to handle several transport
> servers, icmp messages, ipsec security associations, etc.
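
For what it's worth, a rule of kind (3) could look roughly like the sketch below. It assumes IPv4 only and a mask/value pair over the first bytes of the upper-level packet. The Rule class and the ipv4() helper are hypothetical names, not an existing interface.

```python
import socket
import struct
from dataclasses import dataclass


@dataclass
class Rule:
    dst: str            # destination ip address, dotted quad
    proto: int          # upper-level protocol code (1 = icmp, 6 = tcp, 17 = udp)
    mask: bytes = b""   # which bits of the upper-level prefix matter
    value: bytes = b""  # expected value of those bits

    def matches(self, packet: bytes) -> bool:
        ihl = (packet[0] & 0x0F) * 4
        if packet[9] != self.proto:
            return False
        if packet[16:20] != socket.inet_aton(self.dst):
            return False
        payload = packet[ihl:ihl + len(self.mask)]
        if len(payload) < len(self.mask):
            return False
        return all((p & m) == (v & m)
                   for p, m, v in zip(payload, self.mask, self.value))


def ipv4(src, dst, proto, payload):
    # Minimal 20-byte header without options; checksum left at 0 for brevity.
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload), 0, 0, 64,
                         proto, 0, socket.inet_aton(src), socket.inet_aton(dst))
    return header + payload


# "udp packets for 192.168.1.1, destination port 47": bytes 2-3 of the udp header.
rule = Rule("192.168.1.1", 17, mask=b"\x00\x00\xff\xff", value=b"\x00\x00\x00\x2f")
pkt47 = ipv4("10.0.0.1", "192.168.1.1", 17, struct.pack("!HHHH", 1000, 47, 8, 0))
pkt48 = ipv4("10.0.0.1", "192.168.1.1", 17, struct.pack("!HHHH", 1000, 48, 8, 0))
```

A rule with an empty mask degenerates to kind (2), address plus protocol code.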
> 
> > > That sounds odd. Port number space is a part of layer 4, not layer 3.
> > >
> > I know it. but, I thought about the way it had to be implemented.
> > - in the layer 4. If you run (at least) 2 layer 4 translators, I've
> > found race conditions that will disturb the service.
> 
> I think it's a reasonable restriction that you can't have two
> transport programs/translators do the same protocol (udp or tcp) on
> the same ip address. That should solve the problem, I think.
> 
> And if you later want to get rid of that restriction, you need to
> figure out some service and/or protocol that lets two processes
> share port number space. But that should still be responsibility of
> the (multiple) transport servers, and it can be implemented without
> any modifications to any components in the stack above or below the
> transport server.
> 
> > - in the layer 3 translator. This doesn't respect ISO layers, BUT :
> >  * L3 translators will only know L-4 translators want to get a number,
> > L3 translator will not know it is a network port. Only a number !!!
> 
> It also has to manage separate namespaces, port numbers are local to
> an ip-address and protocol.
But you already need to specify which ip you are interested in, and the ip
is protocol local too. Adding one last number to say exactly what you want
may make things simpler. On the other hand, layer 4 may need only one
port (pair) per ip on the 3/4 interface, as opposed to a port per socket.
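
Wherever the allocation ends up living, the bookkeeping itself is small. A minimal sketch, assuming one namespace per (ip address, protocol) pair as described above; the NumberAllocator name and its range defaults are invented for the example.

```python
class NumberAllocator:
    """Hands out numbers from independent namespaces, e.g. (ip, protocol)."""

    def __init__(self, low=1024, high=65535):
        self.low, self.high = low, high
        self.used = {}  # namespace -> set of allocated numbers

    def allocate(self, namespace, wanted=None):
        taken = self.used.setdefault(namespace, set())
        if wanted is not None:
            # The caller asked for a specific number (like bind() to a port).
            if wanted in taken:
                raise ValueError("number %d already in use" % wanted)
            taken.add(wanted)
            return wanted
        # Otherwise hand out the lowest free number in the range.
        for n in range(self.low, self.high + 1):
            if n not in taken:
                taken.add(n)
                return n
        raise RuntimeError("namespace exhausted")

    def release(self, namespace, number):
        self.used.get(namespace, set()).discard(number)


alloc = NumberAllocator(low=1024, high=1025)
first = alloc.allocate(("192.168.1.1", "tcp"))
second = alloc.allocate(("192.168.1.1", "tcp"))
other = alloc.allocate(("192.168.1.1", "udp"))
```

Nothing in it knows that the numbers are network ports. The race condition mentioned above goes away only if every transport server asks the same allocator instance.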
> 
> >  * no race condition is possible (it is useful, AFAIK)
> >  * less rpc calls.
> 
> I don't see what you gain by putting it into layer 3 rather than in a
> general "number allocation" server. I really think the allocation and
> management belongs in the transport server.
> 
> > This is a way of avoiding too much mach port allocation/release. For
> > tcp, we need a mach port (and a network port) for each session.
> 
> The program that communicates with the transport service should have
> one port per open socket, just like for any other open files. In the
> transport server's end, that port should be associated with a sending
> address and port (and that applies to *both* udp and tcp), and for tcp
> also with the address and port at the remote end.
This sounds reasonable.
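
As a sketch of the state that would sit behind each such port in the transport server: every open socket has a local address and port, and a connected tcp socket has the remote pair as well. The integer handle below stands in for a Mach port, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port number)


@dataclass
class Socket:
    proto: str                          # "tcp" or "udp"
    local: Endpoint
    remote: Optional[Endpoint] = None   # set only for connected sockets


class TransportServer:
    def __init__(self):
        self.sockets = {}      # handle (stand-in for a Mach port) -> Socket
        self.next_handle = 1

    def open(self, proto, local, remote=None):
        handle = self.next_handle
        self.next_handle += 1
        self.sockets[handle] = Socket(proto, local, remote)
        return handle

    def demux(self, proto, local, remote):
        """Prefer a socket connected to this remote end, else an unconnected one."""
        fallback = None
        for handle, sock in self.sockets.items():
            if sock.proto != proto or sock.local != local:
                continue
            if sock.remote == remote:
                return handle
            if sock.remote is None:
                fallback = handle
        return fallback


server = TransportServer()
listener = server.open("tcp", ("192.168.1.1", 80))
session = server.open("tcp", ("192.168.1.1", 80), ("10.0.0.9", 40000))
```

A segment from 10.0.0.9:40000 reaches the session socket, while any other peer talking to port 80 reaches the listener.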
> 
> For communication between the transport server and the ip (layer 3)
> server, you should not need more than one or two ports. The way I see
> it, the transport server should give the ip server complete ip
> datagrams, with source and destination addresses and port numbers
> already filled in. There may be other calls to figure out available
> addresses, source and destination address selection, etc, but nothing
> that has to be done for every packet.
This will be needed for routing. But for packets that belong to
connections to/from the local machine you are usually not interested in
protocol-specific details. But it may be easier to cope with them than to
design a protocol-independent abstraction ;)

> 
> > This is beautiful, here is what I think :
> > 
> >   +---------------------+
> >   | random posix socket |
> >   |    applications     |
> >   +---------------------+
> > --------------------------- The standard socket API
> >   +--------------+
> >   | glibc glue   | (with some help of L-4 and L3 translators. as you said
> >   +--------------+  socket.defs will be split here, in -lsocket)
> > --------------------------- Layer 4+ interface
> >   +----------------------+ 
> >   | transport protocols  |
> >   +----------------------+
> > --------------------------- Layer 3/4 interface
> >   +----------------------------------+
> >   | layer 3 data transmission +      |
> >   | control + <<numbers>> allocation |
> >   +----------------------------------+
> > --------------------------- Layer 2/3 interface
> >   +--------------------+
> >   | layer 2 stuff      |
> >   +--------------------+
> > --------------------------- Kernel/user land
> >   +--------------+
> >   | network card |
> >   +--------------+
> > --------------------------- Physical interface
> 
> > Please think about it. But take your time to answer, I won't be there in
> > the next days. :)
> 
> I think my main remaining objections to this model concerns "layer 3/4
> interface". I want this interface to be only read and write ip packets
> (plus some for configuration of addresses, etc). I want to move the
> responsibility for "control + <<numbers>> allocation" up one level.
> The transport server should be the only component that knows details
> about port numbers.
I do not completely agree with this. An ip address is just a few numbers, and
a port is another number. You may choose to interpret some numbers in one
layer and other numbers elsewhere, but I do not see any reason for this, as
the numbers are all available at the same level.
As for generating icmp packets: it is not very consistent if one program
generates host-unreachable and another connection-refused. But with your
separation of ips from ports, the knowledge about ip and port is divided
unnecessarily.
> 
> You also put some routing into layer 2, I'd prefer to move that up one
> level, either into the "layer 3" block, or into a separate
> process that talks to the "layer 2" block, independently from the
> layer 3 block that doesn't handle forwarded traffic.
That sounds good. Routing processes and firewalls should be able to
register some addresses with L2 and junk/forward the packets as they wish.
This brings another problem: a routing process probably wants _all_ packets
that are not received by somebody else, and an http proxy could want
all packets to port 80 whatever the destination address. Firewalls would
want even finer granularity if they aren't going to be implemented as
a proxy at 3/4 or somesuch.
There should be some order in which the rules for delivering packets
are evaluated,
i.e. driver <-> L2 <-> L3 <-> firewall - L3' <-> L4, router
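
One way to get such an order, sketched under the assumption that each registration carries an explicit priority and the first matching rule wins. The priorities, the receiver names and the dict-based packet are all invented for the example.

```python
class Dispatcher:
    def __init__(self):
        self.rules = []  # (priority, receiver, predicate)

    def register(self, priority, receiver, predicate):
        self.rules.append((priority, receiver, predicate))
        # Lower priority numbers are evaluated first; sort() is stable, so
        # equal priorities keep their registration order.
        self.rules.sort(key=lambda rule: rule[0])

    def deliver(self, packet):
        for _priority, receiver, predicate in self.rules:
            if predicate(packet):
                return receiver
        return None  # nobody wants it: drop


LOCAL = {"192.168.1.1"}

d = Dispatcher()
d.register(100, "router", lambda p: True)  # catch-all, evaluated last
d.register(10, "http-proxy", lambda p: p["proto"] == "tcp" and p["dport"] == 80)
d.register(20, "local-l3", lambda p: p["dst"] in LOCAL)
```

Here the proxy sees port 80 traffic even when it is addressed to the local machine. Swapping the two priorities would give the local stack precedence instead, which is exactly the kind of policy the evaluation order has to settle.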

-- 
Michal Suchanek
hramrach@centrum.cz