
Re: [lwip-users] Network Loss causes network to go down


From: Jan Menzel
Subject: Re: [lwip-users] Network Loss causes network to go down
Date: Thu, 2 Apr 2020 00:16:40 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

Hi Dan!
        It is a fundamental property of TCP and UDP that there is no way to
tell a broken path from a healthy one on which the other side simply had
nothing to say, unless you build such a check into your protocol. Cable
detection in the PHY might help, but only a little: most devices talk to
peers that are multiple hops away, and a problem anywhere between the first
hop and the remote side is not covered by it.
        I once had a project where an lwIP-driven device had to send a small
amount of data every few minutes to a remote server located on the other
side of the world. These devices had nobody to maintain them, for example
by watching status LEDs or pressing a reset button.
        We finally solved the issue with a combination of measures:
- Enable timeouts for receive and transmit so that neither can block
forever. That is what we observed first: the transmit loop was blocked
because too much data was unacknowledged.
- Use a watchdog/ping command in the protocol to detect whether the
connection to the remote side is still valid/active. That is the only way
to be sure the remote side is still reachable.
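
The watchdog/ping idea can be sketched as a small timer state machine that is independent of the network stack. This is only a minimal sketch under assumptions: all names (heartbeat_t, hb_init, hb_on_receive, hb_poll) are hypothetical and not part of lwIP, timestamps are millisecond ticks from whatever clock the application has, and the actual sending of the ping is left to the caller.

```c
#include <stdint.h>

/* Hypothetical application-level heartbeat; nothing here is an lwIP API. */
typedef enum { HB_IDLE, HB_SEND_PING, HB_PEER_LOST } hb_action_t;

typedef struct {
    uint32_t ping_interval_ms; /* silence after which a ping is sent */
    uint32_t dead_after_ms;    /* silence after which the peer is given up */
    uint32_t last_rx_ms;       /* time anything was last received */
    uint32_t last_ping_ms;     /* time a ping was last requested */
} heartbeat_t;

void hb_init(heartbeat_t *hb, uint32_t now_ms,
             uint32_t ping_interval_ms, uint32_t dead_after_ms)
{
    hb->ping_interval_ms = ping_interval_ms;
    hb->dead_after_ms = dead_after_ms;
    hb->last_rx_ms = now_ms;
    hb->last_ping_ms = now_ms;
}

/* Call whenever valid data (including a ping reply) arrives. */
void hb_on_receive(heartbeat_t *hb, uint32_t now_ms)
{
    hb->last_rx_ms = now_ms;
}

/* Call periodically; tells the caller what to do next. */
hb_action_t hb_poll(heartbeat_t *hb, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    uint32_t silent_ms = now_ms - hb->last_rx_ms;

    if (silent_ms >= hb->dead_after_ms) {
        return HB_PEER_LOST;
    }
    if (silent_ms >= hb->ping_interval_ms &&
        (uint32_t)(now_ms - hb->last_ping_ms) >= hb->ping_interval_ms) {
        hb->last_ping_ms = now_ms;
        return HB_SEND_PING;
    }
    return HB_IDLE;
}
```

On HB_SEND_PING the application would send its own protocol-level ping; on HB_PEER_LOST it would tear the connection down and reconnect rather than wait for TCP retransmissions to give up. Combined with send and receive timeouts, this keeps the main loop from blocking on a dead path.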
        The F427 has lots of RAM. You might give lwIP more memory to allow
more than one connection.
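
As a rough illustration of where such settings live, here is a possible lwipopts.h fragment. The option names are standard lwIP configuration macros, but every value is a placeholder I picked for illustration, not something tuned or taken from this thread, and the timeout options only matter if the socket or netconn API is used (an assumption about the application).

```c
/* lwipopts.h fragment: illustrative values only, not tuned for any board. */

/* Allow blocking socket/netconn calls to time out instead of waiting forever.
 * The timeout itself is then set per connection by the application. */
#define LWIP_SO_RCVTIMEO    1
#define LWIP_SO_SNDTIMEO    1

/* Room for more than one simultaneous TCP connection. */
#define MEMP_NUM_TCP_PCB    4
#define MEMP_NUM_NETCONN    4   /* only relevant with the netconn/socket API */

/* Larger heap and pbuf pool, so one stalled connection is less likely
 * to exhaust the buffers needed by the others. */
#define MEM_SIZE            (32 * 1024)
#define PBUF_POOL_SIZE      24
```

lwIP's own sanity checks may require related options (for example the TCP segment pool and send queue length) to be adjusted together, so any real values need to be verified against the build.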

        Jan

On 01.04.2020 00:45, Bomsta, Dan wrote:
> Thanks Trampas!  You went down the same road I started on with the
> ability to detect the disconnection.  I was told we cannot change the
> hardware, so I am trying to find the correct location in lwIP to “clean
> up” when there is no route.  Just haven’t found it yet.
> 
>  
> 
> The correct solution is using the carrier sense pin.
> 
>  
> 
> Thanks again,
> 
> Dan
> 
>  
> 
> *From:*lwip-users
> <lwip-users-bounces+dan.bomsta=address@hidden> *On Behalf Of
> *Trampas Stern
> *Sent:* Tuesday, March 31, 2020 5:11 PM
> *To:* Mailing list for lwIP users <address@hidden>
> *Subject:* Re: [lwip-users] Network Loss causes network to go down
> 
>  
> 
> Dan,
> 
>  
> 
> I had been using an RMII PHY and noticed similar issues.  Specifically, I
> could not detect cable connected or disconnected using the RMII; the
> link activity bit does not work for cable connected/disconnected.  I was
> shocked to find that RMII did not provide this functionality.  Sure, the
> LEDs blink, but no bits in the RMII register map change when the cable is
> inserted or removed.
> 
>  
> 
> We had problems where timeouts in setting up the netif/lwIP were
> causing watchdog timeouts if the cable was unplugged on boot. Then, when
> the cable was removed and reinserted (not a normal case for our device), we
> were having issues similar to yours.  We could not find a way to know if
> the cable was connected or not, except by redesigning the hardware.  So we
> are redesigning the hardware to use hardware cable connection detection
> using the PHY carrier sense pin.   The plan was to try to shut down lwIP and
> restart netif and lwIP when the cable was plugged back in.
> 
>  
> 
> Trampas
> 
>  
> 
>  
> 
> On Tue, Mar 31, 2020 at 5:51 PM Bomsta, Dan <address@hidden
> <mailto:address@hidden>> wrote:
> 
>     I am working on an issue with our device.  First, here are the
>     particulars:
> 
>     Micro:  STM32F427
> 
>     LwIP: 2.0.3
> 
>     WolfSSL: 3.15.3
> 
>     Microchip managed switch KSZ8863RLLI.
> 
>      
> 
>     If one removes the Ethernet cable from our device (typically during
>     heavy traffic) the network is not usable for many minutes upon
>     reconnecting the Ethernet cable.  I have been printing WolfSSL debug
>     messages and Lwip debug messages without solving the issue yet.
> 
>      
> 
>     No, we do not have the ability to “hardware” determine when the
>     cable is unplugged.  Our system has these devices “daisy chained” so
>     one would need to ping to determine if the network is available. 
>     The last good trace I got is this (then my laptop crashed)
> 
>      
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: polling application
> 
>     b
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     [512942] Link up? 0
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: polling application
> 
>     b
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: polling application
> 
>     b
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     [514966] Link up? 0
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: polling application
> 
>     b
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     [517070] Link up? 0
> 
>     tcp_slowtmr: processing active pcb
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     tcp_slowtmr: polling application
> 
>     b
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     ip4_route: No route to 172.20.0.14
> 
>     data=0x0
> 
>     =0x0)
> 
>     sum)
> 
>     4ac (0x14ac, 0x14ac, 0x64080000)
> 
>     )
> 
>     x64080000)
> 
>     ac, 0x14ac, 0x64080000)
> 
>     JSH˛ÓdĘÚ@¨0Ď$ü‡ouóćq
> 
>     T
> 
>      
> 
>     Any ideas would be greatly appreciated.
> 
>      
> 
>     Dan
> 
>     _______________________________________________
>     lwip-users mailing list
>     address@hidden <mailto:address@hidden>
>     https://lists.nongnu.org/mailman/listinfo/lwip-users
> 
> 
> 
