Hello,
I am using lwip-1.1.0 and the socket api that is based on the netconn api. I
found really much possible race conditions in the files api_lib.c and api_msg.c.
I played a bit with task prioritys, and when the TCP/IP thread has the same or
a lower priority than the application thread, many transmission errors happen
(e.g. some pieces of data were sent two times - resulting in corrupted html
pages with my http server test program). But this transmission errors disappear
when the priority of the TCP/IP thread is higher than the priority of the
application thread.
But there still remains one transmission error problem. In really seldom cases
(every 30 minutes with a continuous TCP data stream of 30MBit/s on my 400MHz
PowerPC board) up to 64kb garbage data is transmitted by lwIP.
After looking closer to api_lib.c, I have found the problem in the function
netconn_write. The problem there is that a variable that is shared between the TCP
thread and the user thread is accessed three times without synchronization of the two
threads. The variable I am talking about is pcb->snd_buf, that is accessed by
using the macro call tcp_sndbuf(conn->pcb.tcp). The function netconn_write tries
to figure out how much space is still there in the send buffer, and transmitts the
data piece by piece to ensure the send buffer does not overflow. The maximum block
size is determined by this code:
if (size > tcp_sndbuf(conn->pcb.tcp)) {
/* We cannot send more than one send buffer's worth of data at a
time. */
len = tcp_sndbuf(conn->pcb.tcp);
} else {
len = size;
}
The data transmission error happens when the thread is interrupted just after
the if() and before the len = tcp_sndbuf() line. When the sendbuffer gets
updated by the TCP thread, and suddenly the sendbuffer is bigger than size, len
is set to a value bigger than size. When you now look a bit further into
netconn_write, you will find the line size -= len; that updates the
remaining number of bytes to transmit. But if len is bigger than size, size
will overflow (it is a 16 bit unsigned integer). In the worst case, lwIP will
now send 64kb of garbage data.
I have made a patch for this particular problem, and for a race condition
problem with the semaphore in the same function.
Regards,
Dennis Kuschel
--- lwip-1.1.0/src/api/api_lib.c Mon Sep 20 17:58:02 2004
+++ lwip-1.1.0/src/api/api_lib.c Tue Mar 01 18:16:30 2005
@@ -621,7 +621,8 @@
netconn_write(struct netconn *conn, void *dataptr, u16_t size, u8_t copy)
{
struct api_msg *msg;
- u16_t len;
+ u16_t len, sndbuf;
+ sys_sem_t sem;
if (conn == NULL) {
return ERR_VAL;
@@ -651,16 +652,16 @@
msg->msg.msg.w.copy = copy;
if (conn->type == NETCONN_TCP) {
- if (tcp_sndbuf(conn->pcb.tcp) == 0) {
+ while ((sndbuf = tcp_sndbuf(conn->pcb.tcp)) == 0) {
sys_sem_wait(conn->sem);
if (conn->err != ERR_OK) {
goto ret;
}
}
- if (size > tcp_sndbuf(conn->pcb.tcp)) {
+ if (size > sndbuf) {
/* We cannot send more than one send buffer's worth of data at a
time. */
- len = tcp_sndbuf(conn->pcb.tcp);
+ len = sndbuf;
} else {
len = size;
}
@@ -685,9 +686,9 @@
ret:
memp_free(MEMP_API_MSG, msg);
conn->state = NETCONN_NONE;
- if (conn->sem != SYS_SEM_NULL) {
- sys_sem_free(conn->sem);
+ if ((sem = conn->sem) != SYS_SEM_NULL) {
conn->sem = SYS_SEM_NULL;
+ sys_sem_free(sem);
}
return conn->err;
_______________________________________________
lwip-users mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/lwip-users