Re: [lwip-users] Recent tcp_rexmit() changes
From: Sam Jansen
Date: Tue, 27 Jul 2004 13:34:09 +1200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616

K.J. Mansley wrote:
> On Sun, 2004-07-25 at 20:46, Karl Jeacle wrote:
>> The timeout would be OK, and as expected, if it happened just once, but a
>> timeout is taking place at each RTT... the sender is stalled for 500ms at
>> a time instead of one RTT at a time. I hope I am making some sense!
>
> Yes, I can see your problem. You'll end up with retransmitted segments
> (sent every 500ms) interleaving with new segments (when an ACK for a
> retransmitted one is received), and a lot of timeouts will be necessary
> to recover. I think, although can't be sure, that this is the intended
> behaviour. Complex cases such as this are rarely used when illustrating
> protocols though, so if anyone knows otherwise, or has a few minutes to
> see how a Linux or BSD stack behaves in this scenario, I'd be very
> interested to hear from them.

After a retransmit timeout, the sender should be in slow start. lwIP
seems to miss this fundamental point. RFC 2001 states:
"Therefore, after retransmitting the dropped segment the TCP sender uses
the slow start algorithm to increase the window from 1 full-sized
segment to the new value of ssthresh, at which point congestion
avoidance again takes over."
Because this functionality is missing, lwIP behaves poorly when loss is
heavy enough that fast retransmit cannot recover and a timeout occurs.
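
To make the RFC 2001 behaviour quoted above concrete, here is a minimal sketch of the window bookkeeping around a retransmit timeout. The struct and function names (cc_state, cc_on_rto, cc_on_new_ack) are hypothetical and are not lwIP's; the floor of two segments on ssthresh follows RFC 2001, and the congestion avoidance increment is the usual mss*mss/cwnd approximation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified congestion state; NOT lwIP's struct tcp_pcb. */
struct cc_state {
  uint32_t cwnd;     /* congestion window, in bytes */
  uint32_t ssthresh; /* slow start threshold, in bytes */
  uint32_t snd_wnd;  /* window advertised by the receiver, in bytes */
  uint32_t mss;      /* maximum segment size, in bytes */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Retransmit timeout: remember half of the effective window in ssthresh
 * (at least two segments) and restart from one segment, i.e. slow start. */
void cc_on_rto(struct cc_state *s)
{
  uint32_t eff_wnd = min_u32(s->cwnd, s->snd_wnd);

  assert(s->mss > 0);
  s->ssthresh = eff_wnd / 2;
  if (s->ssthresh < 2 * s->mss) {
    s->ssthresh = 2 * s->mss;
  }
  s->cwnd = s->mss;
}

/* ACK for new data: grow by one segment per ACK while below ssthresh
 * (slow start), then by roughly one segment per RTT (congestion avoidance). */
void cc_on_new_ack(struct cc_state *s)
{
  if (s->cwnd < s->ssthresh) {
    s->cwnd += s->mss;
  } else {
    uint32_t inc = s->mss * s->mss / s->cwnd;
    s->cwnd += inc > 0 ? inc : 1;
  }
}
```

With cwnd at one mss when the retransmission goes out, only one segment leaves the sender at first, and the window then grows with each ACK until it reaches ssthresh.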
> That is a possibility, and although I don't like dumping the whole
> unacked queue on the unsent queue just in case it's necessary, it would
> solve your problem. My only worry is that it might result (if we're not
> careful) in a large number of segments being put on the network as a
> result of a loss, which is completely the opposite of what the sender
> should be doing.
I see why you might be concerned; the behaviour I saw before the fast
retransmit fix was exactly that. However, this will not happen if
lwIP enters slow start correctly. I have implemented this behaviour, and
lwIP now behaves well enough during loss.
Changes I made:
* Added a tcp_rexmit_rto function that is a clone of the old tcp_rexmit
function
* Made sure this function was called AFTER cwnd is set to 1 mss in
tcp_slowtmr
* Uncommented the old code which allowed acks to ack data in the unsent
queue
* Made a small modification to tcp_output which checks the sequence
number of the packet just sent. This was needed because a packet sent
with a fast retransmit would end up on the end of the unacked queue,
even though it should be at the start of the queue.
Attached is a diff. I made it against the stable version, but it looks
like it applies fine to HEAD.
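
As an illustration of the last change in the list above, here is a small stand-alone sketch of keeping a segment queue ordered by sequence number with a wraparound-safe comparison. It is not the attached patch: the patch only checks whether the segment belongs at the head of the unacked queue, while this sketch walks the list to find the insertion point. The names struct seg, seq_lt and seg_insert_sorted are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal segment; lwIP's struct tcp_seg carries much more. */
struct seg {
  struct seg *next;
  uint32_t seqno; /* sequence number in host byte order */
};

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static int seq_lt(uint32_t a, uint32_t b)
{
  return (int32_t)(a - b) < 0;
}

/* Insert s into a list sorted by sequence number and return the new head.
 * A segment that is older than everything queued ends up at the head. */
struct seg *seg_insert_sorted(struct seg *head, struct seg *s)
{
  struct seg **pp = &head;

  assert(s != NULL);
  /* Skip every queued segment that is not after s. */
  while (*pp != NULL && !seq_lt(s->seqno, (*pp)->seqno)) {
    pp = &(*pp)->next;
  }
  s->next = *pp;
  *pp = s;
  return head;
}
```

The cast to a signed type is what makes the comparison work across the 2^32 wrap, so a segment with sequence number 0x10 still sorts after one with 0xFFFFFFF0.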
> I think this is the best way to proceed, and I believe it's designed to
> solve almost exactly this problem. I don't think it will involve too
> much work, so may have a look later today. Perhaps if I do get
> something coded up you'd be willing to test/debug it for us?

I'm not certain SACK is all it is thought to be. My own research in this
area, as well as recent analysis, suggests that SACK almost never
makes a difference. However, there are other strategies you can employ
to improve throughput.
Consider this: FreeBSD does not implement SACK (though I have heard it
is making its way into -CURRENT) and OpenBSD does. However, FreeBSD
vastly outperforms OpenBSD, even in lossy situations. I've also measured
Linux (2.4.20 at the time) and found no real difference in throughput with
or without SACK in laboratory tests.
The most interesting research I have seen that improves throughput under
loss is the (somewhat) recent Westwood congestion control algorithm, which
has been part of Linux since 2.4.26 (and 2.6.something). I've found it's
quite an improvement on its own, though with the help of SACK it's even
better.
See:
http://www-ictserv.poliba.it/mascolo/tcp%20westwood/homeW.htm
and
http://www-ictserv.poliba.it/mascolo/tcp%20westwood/Tech_Rep_07_03_S.pdf
for information on Westwood.
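
For readers who have not met Westwood, a rough sketch of its central idea: after a loss, ssthresh is set from an estimate of the bandwidth-delay product rather than by halving the window. This is a simplification under stated assumptions and is not the Linux implementation. The bandwidth estimate (bwe_bytes_per_sec) and minimum RTT (rtt_min_ms) are taken as given inputs, the ACK-rate filter that produces the estimate is omitted, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender state, in bytes. */
struct ww_state {
  uint32_t cwnd;
  uint32_t ssthresh;
  uint32_t mss;
};

/* Estimated bandwidth-delay product in bytes, with a floor of two segments.
 * bwe_bytes_per_sec and rtt_min_ms are assumed to be measured elsewhere. */
static uint32_t ww_target(const struct ww_state *s,
                          uint32_t bwe_bytes_per_sec, uint32_t rtt_min_ms)
{
  uint64_t bdp = (uint64_t)bwe_bytes_per_sec * rtt_min_ms / 1000u;
  uint32_t min_thresh = 2u * s->mss;

  assert(s->mss > 0);
  if (bdp > UINT32_MAX) {
    bdp = UINT32_MAX;
  }
  return bdp < min_thresh ? min_thresh : (uint32_t)bdp;
}

/* Three duplicate ACKs: shrink the window to the estimated pipe size. */
void ww_on_dupacks(struct ww_state *s,
                   uint32_t bwe_bytes_per_sec, uint32_t rtt_min_ms)
{
  s->ssthresh = ww_target(s, bwe_bytes_per_sec, rtt_min_ms);
  if (s->cwnd > s->ssthresh) {
    s->cwnd = s->ssthresh;
  }
}

/* Retransmit timeout: same ssthresh, but restart from one segment. */
void ww_on_rto(struct ww_state *s,
               uint32_t bwe_bytes_per_sec, uint32_t rtt_min_ms)
{
  s->ssthresh = ww_target(s, bwe_bytes_per_sec, rtt_min_ms);
  s->cwnd = s->mss;
}
```

The point of the design is that a loss not caused by congestion leaves the bandwidth estimate high, so the window is cut less than a blind halving would cut it.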
Failing that, perusing the sources of FreeBSD shows all sorts of tricks
they use to improve performance at little cost.
--
Sam Jansen address@hidden
Wand Network Research Group http://www.wand.net.nz/~stj2
Index: src/core/tcp.c
===================================================================
RCS file: /home/stj2/cvs/nsc/lwip/src/core/tcp.c,v
retrieving revision 1.1
diff -u -r1.1 tcp.c
--- src/core/tcp.c 1 Jun 2004 20:54:22 -0000 1.1
+++ src/core/tcp.c 27 Jul 2004 01:26:50 -0000
@@ -610,7 +610,6 @@
if (pcb->state != SYN_SENT) {
pcb->rto = ((pcb->sa >> 3) + pcb->sv) << tcp_backoff[pcb->nrtx];
}
- tcp_rexmit(pcb);
/* Reduce congestion window and ssthresh. */
eff_wnd = LWIP_MIN(pcb->cwnd, pcb->snd_wnd);
pcb->ssthresh = eff_wnd >> 1;
@@ -620,6 +619,9 @@
pcb->cwnd = pcb->mss;
LWIP_DEBUGF(TCP_CWND_DEBUG, ("tcp_slowtmr: cwnd %u ssthresh %u\n",
pcb->cwnd, pcb->ssthresh));
+
+ /* The following needs to be called AFTER cwnd is set to one mss - STJ */
+ tcp_rexmit_rto(pcb);
}
}
/* Check if this PCB has stayed too long in FIN-WAIT-2 */
Index: src/core/tcp_out.c
===================================================================
RCS file: /home/stj2/cvs/nsc/lwip/src/core/tcp_out.c,v
retrieving revision 1.2
diff -u -r1.2 tcp_out.c
--- src/core/tcp_out.c 16 Jul 2004 06:03:25 -0000 1.2
+++ src/core/tcp_out.c 27 Jul 2004 00:57:47 -0000
@@ -462,8 +462,16 @@
pcb->unacked = seg;
useg = seg;
} else {
- useg->next = seg;
- useg = useg->next;
+ /* In the case of fast retransmit, the packet should not go to the end
+ * of the unacked queue, but rather at the start. We need to check for
+ * this case. -STJ Jul 27, 2004 */
+ if (TCP_SEQ_LT(ntohl(seg->tcphdr->seqno), ntohl(useg->tcphdr->seqno))) {
+ seg->next = pcb->unacked;
+ pcb->unacked = seg;
+ } else {
+ useg->next = seg;
+ useg = useg->next;
+ }
}
} else {
tcp_seg_free(seg);
@@ -566,6 +574,33 @@
ip_output(p, local_ip, remote_ip, TCP_TTL, 0, IP_PROTO_TCP);
pbuf_free(p);
LWIP_DEBUGF(TCP_RST_DEBUG, ("tcp_rst: seqno %lu ackno %lu.\n", seqno, ackno));
+}
+
+void
+tcp_rexmit_rto(struct tcp_pcb *pcb)
+{
+ struct tcp_seg *seg;
+
+ if (pcb->unacked == NULL) {
+ return;
+ }
+
+ /* Move all unacked segments to the unsent queue. */
+ for (seg = pcb->unacked; seg->next != NULL; seg = seg->next);
+ seg->next = pcb->unsent;
+ pcb->unsent = pcb->unacked;
+ pcb->unacked = NULL;
+
+ pcb->snd_nxt = ntohl(pcb->unsent->tcphdr->seqno);
+
+ ++pcb->nrtx;
+
+ /* Don't take any rtt measurements after retransmitting. */
+ pcb->rttest = 0;
+
+ /* Do the actual retransmission. */
+ tcp_output(pcb);
+
}
void
Index: src/core/tcp_in.c
===================================================================
RCS file: /home/stj2/cvs/nsc/lwip/src/core/tcp_in.c,v
retrieving revision 1.2
diff -u -r1.2 tcp_in.c
--- src/core/tcp_in.c 16 Jul 2004 06:03:25 -0000 1.2
+++ src/core/tcp_in.c 27 Jul 2004 01:31:20 -0000
@@ -817,8 +817,11 @@
in fact have been sent once. */
/* KJM 13th July 2004
I don't think is is necessary as we no longer move all unacked
- segments on the unsent queue when performing retransmit */
- /*
+ segments on the unsent queue when performing retransmit
+
+ STJ 27 July 2004
+ Actually we need to again!
+ */
while (pcb->unsent != NULL &&
TCP_SEQ_LEQ(ntohl(pcb->unsent->tcphdr->seqno) +
TCP_TCPLEN(pcb->unsent),
ackno) &&
@@ -843,7 +846,6 @@
pcb->snd_nxt = htonl(pcb->unsent->tcphdr->seqno);
}
}
- */
/* End of ACK for new data processing. */