lwip-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lwip-users] TCP bandwidth limited by rate of ACKs


From: Mason
Subject: Re: [lwip-users] TCP bandwidth limited by rate of ACKs
Date: Wed, 12 Oct 2011 18:01:40 +0200
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110928 Firefox/7.0.1 SeaMonkey/2.4.1

Kieran Mansley wrote:

> Mason wrote:
> 
>> In the UDP test, any frame larger than 1500 bytes was
>> not passed to lwip. The bogus ACK was sent directly
>> from the RX context instead.
> 
> OK, I'm not at all surprised that there is a big difference between the
> performance of that test, and the TCP performance you first measured.
> 
> In the TCP case using the sockets API, you have:
> 1) 2 copies in each direction (app to/from lwIP, lwIP to/from your
> driver's buffer descriptors)
> 2) all the TCP protocol processing (which is probably not a big deal)
> 3) 2 context switches in each direction (driver to/from lwIP, lwIP
> to/from app)
> 
> This is going to put a big dent in the throughput.   You can avoid a lot
> of that by using the raw lwIP API.  If you have to stick with the
> sockets API you may be stuck with relatively low throughput.

I didn't want to believe Simon when he wrote "From reading the wireshark
capture, it seems that the receiver simply is not fast enough."

So I profiled the system, and, indeed, the 450-MHz CPU is running
full tilt (cf. the profile analysis of the STB receiving 3500 MB
of data over TCP, at the end of this message.)

I need to make sense of the time spent inside _md_kernel_intr_mask,
but I fear I don't have much wiggle-room, aside from using the raw API.

Regards,

Mason


OS21 profile analysis for main.out
(16 bytes per bucket, sampled at 1000 Hz, system wide profile)

 515.859 seconds (100.00%) :  Total duration

Task breakdown:
 218.810 seconds ( 42.42%) :  tcpip_thread
 144.978 seconds ( 28.10%) :  rxapp
 121.828 seconds ( 23.62%) :  RxTask
   0.476 seconds (  0.09%) :  Idle Task
   0.350 seconds (  0.07%) :  STLAYER-GFX/CUR
   0.120 seconds (  0.02%) :  STLAYER-GFX/CUR
   0.092 seconds (  0.02%) :  STDVM_ServiceTa
   0.071 seconds (  0.01%) :  STLAYER-GFX/CUR
   0.041 seconds (  0.01%) :  phy: link

Interrupt level breakdown:
  27.996 seconds (  5.43%) :  Interrupt level 8
   0.968 seconds (  0.19%) :  Interrupt level 13
   0.085 seconds (  0.02%) :  Interrupt level 15

Symbolic breakdown:
 261.467 seconds ( 50.69%) :  _md_kernel_intr_mask 
  38.134 seconds (  7.39%) :  memcpy 
  14.321 seconds (  2.78%) :  stxmac_dma_isr 
  13.661 seconds (  2.65%) :  tcp_input 
   9.797 seconds (  1.90%) :  _md_timer_system_time 
   7.401 seconds (  1.43%) :  _st40_kernel_interrupt_handler600 
   7.133 seconds (  1.38%) :  tcp_receive 
   6.450 seconds (  1.25%) :  _malloc_r 
   5.746 seconds (  1.11%) :  lwip_recvfrom 
   5.469 seconds (  1.06%) :  stxmac_dma_rx_start 
   4.537 seconds (  0.88%) :  _free_r 
   4.333 seconds (  0.84%) :  mutex_release 
   4.216 seconds (  0.82%) :  event_callback 
   4.189 seconds (  0.81%) :  ip_input 
   3.878 seconds (  0.75%) :  stxmac_dma_tx_start 
   3.572 seconds (  0.69%) :  _md_kernel_intr_mask_all 
   3.331 seconds (  0.65%) :  mutex_lock 
   3.315 seconds (  0.64%) :  __divdi3 
   3.016 seconds (  0.58%) :  device_ioctl 
   2.950 seconds (  0.57%) :  _os21_scheduler_lock 
   2.887 seconds (  0.56%) :  ip_output_if 
   2.830 seconds (  0.55%) :  _message_q_remove 
   2.806 seconds (  0.54%) :  __udiv_qrnnd_16 
   2.799 seconds (  0.54%) :  pbuf_alloc 
   2.746 seconds (  0.53%) :  sys_timeouts_mbox_fetch 
   2.701 seconds (  0.52%) :  semaphore_wait 
   2.700 seconds (  0.52%) :  interrupt_unmask 
   2.685 seconds (  0.52%) :  semaphore_wait_timeout 
   2.628 seconds (  0.51%) :  __libc_lock_init_complete_recursive 
   2.514 seconds (  0.49%) :  sys_arch_mbox_fetch 
   2.373 seconds (  0.46%) :  osplus_physical_address 
   2.308 seconds (  0.45%) :  pbuf_header 
   2.190 seconds (  0.42%) :  _message_q_add 
   2.137 seconds (  0.41%) :  _os21_task_is_valid 
   2.121 seconds (  0.41%) :  ethernet_input 
   2.101 seconds (  0.41%) :  recv_tcp 
   1.838 seconds (  0.36%) :  ethernet_api_ioctl 
   1.746 seconds (  0.34%) :  _os21_interrupt_handler 
   1.697 seconds (  0.33%) :  cache_purge_data 
   1.688 seconds (  0.33%) :  lwip_standard_chksum 
   1.654 seconds (  0.32%) :  _st40_cache_purge_data_work_1 
   1.582 seconds (  0.31%) :  __malloc_lock 
   1.551 seconds (  0.30%) :  profile_start_all 
   1.549 seconds (  0.30%) :  pbuf_copy_partial 
   1.485 seconds (  0.29%) :  __malloc_unlock 
   1.479 seconds (  0.29%) :  tcp_output 
   1.450 seconds (  0.28%) :  message_receive_timeout 
   1.444 seconds (  0.28%) :  ip4_addr_isbroadcast 
   1.414 seconds (  0.27%) :  ethernet_ops_do_async_tx 
   1.413 seconds (  0.27%) :  netconn_recv_data 
   1.335 seconds (  0.26%) :  etharp_output 
   1.321 seconds (  0.26%) :  do_recv 
   1.291 seconds (  0.25%) :  tcp_output_alloc_header 
   1.178 seconds (  0.23%) :  sys_arch_unprotect 
   1.175 seconds (  0.23%) :  _os21_scheduler_unlock 
   1.162 seconds (  0.23%) :  inet_chksum_pseudo 
   1.105 seconds (  0.21%) :  sys_arch_protect 
   1.090 seconds (  0.21%) :  semaphore_signal 
   1.086 seconds (  0.21%) :  sys_arch_sem_wait 
   1.073 seconds (  0.21%) :  sys_mbox_trypost 
   1.032 seconds (  0.20%) :  _os21_is_active 
   1.011 seconds (  0.20%) :  locate_handle 
   1.011 seconds (  0.20%) :  rx_task 
   0.982 seconds (  0.19%) :  tcp_recved 
   0.950 seconds (  0.18%) :  release_handle 
   0.947 seconds (  0.18%) :  __libc_lock_release_recursive 
   0.934 seconds (  0.18%) :  message_claim_timeout 
   0.922 seconds (  0.18%) :  tcp_parseopt 
   0.908 seconds (  0.18%) :  message_send 
   0.899 seconds (  0.17%) :  claim_handle 
   0.885 seconds (  0.17%) :  inet_chksum 
   0.881 seconds (  0.17%) :  ip_output 
   0.875 seconds (  0.17%) :  interrupt_mask 
   0.787 seconds (  0.15%) :  __libc_lock_acquire_recursive 
   0.786 seconds (  0.15%) :  pbuf_cat 
   0.760 seconds (  0.15%) :  time_now_set 
   0.728 seconds (  0.14%) :  tcpip_input 
   0.712 seconds (  0.14%) :  _os21_scheduler_exception_context 
   0.680 seconds (  0.13%) :  handle_frame 
   0.660 seconds (  0.13%) :  pbuf_free 
   0.621 seconds (  0.12%) :  osplus_handle 
   0.620 seconds (  0.12%) :  raw_input 
   0.607 seconds (  0.12%) :  icmp_input 
   0.584 seconds (  0.11%) :  _os21_vmem_util_min_page_size 
   0.584 seconds (  0.11%) :  message_release 
   0.568 seconds (  0.11%) :  ethernet_ops_do_async_rx 
   0.564 seconds (  0.11%) :  _md_timer_profiling_hz 
   0.563 seconds (  0.11%) :  netconn_recved 
   0.553 seconds (  0.11%) :  tcpip_thread 
   0.543 seconds (  0.11%) :  _os21_callback_intr_enter 
   0.530 seconds (  0.10%) :  malloc 
   0.510 seconds (  0.10%) :  pbuf_pool_is_empty 
   0.485 seconds (  0.09%) :  task_unlock 
   0.474 seconds (  0.09%) :  lwip_htons 
   0.473 seconds (  0.09%) :  netconn_recv_tcp_pbuf 
   0.435 seconds (  0.08%) :  lwip_read 
   0.434 seconds (  0.08%) :  sys_sem_signal 
   0.424 seconds (  0.08%) :  tcp_send_empty_ack 
   0.420 seconds (  0.08%) :  _os21_scheduler_pri 
   0.401 seconds (  0.08%) :  pbuf_take 
   0.396 seconds (  0.08%) :  low_level_output 
   0.387 seconds (  0.08%) :  lwip_close 
   0.375 seconds (  0.07%) :  task_lock 
   0.350 seconds (  0.07%) :  inet_chksum_pseudo_partial 
   0.340 seconds (  0.07%) :  wrapper 
   0.329 seconds (  0.06%) :  ip_route 
   0.319 seconds (  0.06%) :  tcp_update_rcv_ann_wnd 
   0.259 seconds (  0.05%) :  _scheduler_idle 
   0.253 seconds (  0.05%) :  release_buffer 
   0.252 seconds (  0.05%) :  mac_link_callback 
   0.234 seconds (  0.05%) :  lwip_ntohs 
   0.233 seconds (  0.05%) :  vmem_min_page_size 
   0.217 seconds (  0.04%) :  netconn_getaddr 
   0.217 seconds (  0.04%) :  _md_kernel_text_info 
   0.208 seconds (  0.04%) :  stxmac_dma_rx_abort 
   0.206 seconds (  0.04%) :  tcpip_apimsg 
   0.202 seconds (  0.04%) :  virtlay_NotifyDisplaySettings 
   0.196 seconds (  0.04%) :  lwip_ntohl 
   0.148 seconds (  0.03%) :  pbuf_realloc 
   0.139 seconds (  0.03%) :  _os21_kernel_state 
   0.135 seconds (  0.03%) :  stvtg_InterruptHandler 
   0.134 seconds (  0.03%) :  free 
   0.125 seconds (  0.02%) :  stevt_CallBothNotifyCB 
   0.124 seconds (  0.02%) :  _os21_callback_intr_exit 
   0.093 seconds (  0.02%) :  netconn_gethostbyname 
   0.089 seconds (  0.02%) :  etharp_send_ip 
   0.077 seconds (  0.01%) :  time_plus 
   0.073 seconds (  0.01%) :  sys_mbox_post 
   0.068 seconds (  0.01%) :  STEVT_NotifyWithSize 
   0.061 seconds (  0.01%) :  lwip_htonl 
   0.053 seconds (  0.01%) :  semaphore_value 
   0.051 seconds (  0.01%) :  VDP_IsUpdateNeeded 
   0.048 seconds (  0.01%) :  sys_arch_mbox_tryfetch 
   0.042 seconds (  0.01%) :  time_ticks_per_sec 
   0.042 seconds (  0.01%) :  device_check_args 
   0.042 seconds (  0.01%) :  LayerVideo_VSync 
   0.040 seconds (  0.01%) :  vdp_NotifyDisplaySettings 
   0.039 seconds (  0.01%) :  stlayer_VSyncCallback 
   0.033 seconds (  0.01%) :  STOS_SemaphoreSignal 
   0.028 seconds (  0.01%) :  ipaddr_ntoa_r 
   0.025 seconds (  0.00%) :  stevt_CheckID 
   0.024 seconds (  0.00%) :  task_context 
   0.023 seconds (  0.00%) :  EventChange 
   0.021 seconds (  0.00%) :  FastHardwareManagerTask 
   0.016 seconds (  0.00%) :  STDVMi_ServiceTask 
   0.015 seconds (  0.00%) :  lwip_recv 
   0.012 seconds (  0.00%) :  stxmac_phy_busy_wait 
   0.012 seconds (  0.00%) :  STOS_SemaphoreWaitTimeOut 
   0.010 seconds (  0.00%) :  VIRTLAY_IsUpdateNeeded 
   0.010 seconds (  0.00%) :  stlayer_HardwareManagerTerm 
   0.008 seconds (  0.00%) :  _malloc_trim_r 
   0.007 seconds (  0.00%) :  etharp_query 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]