From: | Josef Bacik |
Subject: | Re: [PATCH] tcp: ack when we get an OOO/lost packet |
Date: | Tue, 18 Aug 2015 10:58:57 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 |
On 08/17/2015 05:38 AM, Andrei Borzenkov wrote:
On Thu, Aug 13, 2015 at 4:59 PM, Josef Bacik <address@hidden> wrote:

On 08/13/2015 04:19 AM, Andrei Borzenkov wrote:

On Wed, Aug 12, 2015 at 6:16 PM, Josef Bacik <address@hidden> wrote:

While adding tcp window scaling support I was finding that I'd get some packet loss or reordering when transferring from large distances and grub would just timeout. This is because we weren't ack'ing when we got our OOO packet, so the sender didn't know it needed to retransmit anything, so eventually it would fill the window and stop transmitting, and we'd time out. Fix this by ACK'ing when we don't find our next sequence numbered packet. With this fix I no longer time out. Thanks,

I have a feeling that your description is misleading. Patch simply sends duplicated ACK, but partner does not know what has been received and what has not, so it must wait for ACK timeout anyway before retransmitting. What this patch may fix would be lost ACK packet *from* GRUB, by increasing rate of ACK packets it sends. Do you have packet trace for timeout case, ideally from both sides simultaneously?

The way linux works is that if you get <configurable amount> of DUP ack's it triggers a retransmit. I only have traces from the server since tcpdump doesn't work in grub (or if it does I don't know how to do it). The server is definitely getting all of the ACK's,
(Sorry was traveling for Linux Plumbers.)
In the packet trace you sent me there was almost certainly ACK loss for the segment 20801001-20805881 (frame 19244). Note that here recovery was rather fast: the server started retransmission after ~0.5 sec. It is unlikely to be a lost packet from the server: the next ACK from GRUB received by the server was for 20803441, which means GRUB actually got at least the initial half of this segment. Unfortunately some packets are missing in the capture (even packets *from* the server), which makes it harder to interpret. After this the server went down to a 512 segment size and everything went more or less well, until frame 19949. Here the server behavior is rather interesting. It starts retransmission with an initial timeout of ~6 sec, even though it received quite a lot of DUP ACKs, and doubles it every time until it hits the GRUB timeout (~34 seconds).
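To make the timing above concrete, here is a rough sketch of how many timer-driven retransmissions fit before the receiver gives up, assuming the timeout simply doubles after each unanswered attempt. The function name is made up for illustration, and the 6 s and 34 s figures are only the approximate values read off the trace; real stacks also clamp and re-estimate the timeout, which this ignores.

```c
#include <assert.h>

/* Hypothetical helper, not taken from GRUB or the Linux kernel.
 * Counts retransmissions that fire at or before deadline_s, assuming the
 * timer starts at initial_rto_s and doubles after every firing. */
static int retransmits_before_deadline(unsigned initial_rto_s,
                                       unsigned deadline_s)
{
    assert(initial_rto_s > 0);

    unsigned elapsed = 0;          /* seconds since the segment was sent */
    unsigned rto = initial_rto_s;  /* current retransmission timeout */
    int count = 0;

    while (elapsed + rto <= deadline_s) {
        elapsed += rto;  /* timer fires, segment is retransmitted */
        rto *= 2;        /* exponential backoff */
        count++;
    }
    return count;
}
```

With a 6 s initial timeout the timer fires at 6 s and 18 s, and the third attempt would come at 42 s, so a receiver that gives up after about 34 s sees only two retransmissions. Under the same assumptions a 101 s deadline would allow four.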
Yeah that's the normal re-transmission timeout. This tcpdump was on a non-patched grub. We only sent 3 dup acks, we have the dup ack counter stuff set to like 13 or something like that so we have to get a lot before it triggers the dup ack retransmit logic.
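The duplicate-ACK logic mentioned here can be sketched roughly as below. This is only an illustration of the counting idea, not the Linux implementation: the struct, the function name, and the simplification that any ACK number different from the last one counts as new data (no sequence wraparound, no SACK) are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection sender state; not a real kernel struct. */
struct dupack_state {
    uint32_t last_ack;   /* most recent cumulative ACK number seen */
    unsigned dup_count;  /* consecutive duplicates of last_ack */
    unsigned threshold;  /* duplicates needed to trigger a retransmit */
};

/* Feed one incoming ACK number. Returns true exactly when the duplicate
 * count reaches the threshold, i.e. when the sender should retransmit the
 * segment starting at last_ack without waiting for the timer. */
static bool on_ack(struct dupack_state *s, uint32_t ack)
{
    assert(s && s->threshold > 0);

    if (ack != s->last_ack) {
        s->last_ack = ack;   /* treated as new data acknowledged: reset */
        s->dup_count = 0;
        return false;
    }
    s->dup_count++;
    return s->dup_count == s->threshold;
}
```

With the threshold at 13, three duplicate ACKs leave the counter at 3 and nothing is retransmitted early, so the sender falls back to the retransmission timer. With a threshold of 3 the third duplicate would trigger the retransmit.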
Note the difference in behavior between the former and the latter. Did you try to ask on Linux networking list why they are so different?
I'll run it by our networking guys when they show up.
OTOH GRUB probably times out too early. The initial TCP RFC suggests a 5 minute general timeout and RFC1122 at least 100 seconds. It would be interesting to increase the connection timeout to see if it recovers. You could try bumping GRUB_NET_TRIES to 82, which results in a timeout slightly over 101 sec. Also it seems that the huge window may aggravate the issue. According to the trace, 10K is enough to fill the pipe and you set it to 1M. It would be interesting to see the same with the default window size.
Oh yeah the problem doesn't happen with a normal window size, it's only with the giant window size. I'm not sure where you are getting the 10k number, believe me if I could have gotten around this by just jacking up the normal window size I would have done it. When I set it to the max (64k I think?) I get a transfer rate of around 200 kb/s, which is not fast enough to pull down our 250mb image. With the 1mb window I get 5.5 mb/s, so there is a real benefit to the giant window. Thanks,
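The reason the window size matters so much on a long path is that a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. A minimal sketch of that bound follows; the function name is invented and the 300 ms round-trip time in the note after it is purely a hypothetical value, since the thread does not state the actual RTT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on throughput in bytes per second for
 * a given receive window and round-trip time. Ignores loss, slow start
 * and link capacity, all of which can only lower the real figure. */
static uint64_t max_bytes_per_sec(uint64_t window_bytes, uint32_t rtt_ms)
{
    assert(rtt_ms > 0);
    return window_bytes * 1000u / rtt_ms;
}
```

At an assumed 300 ms RTT, a 65535-byte window caps out at 218450 bytes/s, while a 1 MiB window (1048576 bytes) raises the ceiling to 3495253 bytes/s, a factor of 16. The measured numbers above will not match this exactly, but it shows why no amount of tuning within the unscaled 64k limit could reach the rate seen with the 1mb window.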
Josef