
From: GNU bug Tracking System
Subject: [Emacs-bug-tracker] bug#5173: marked as done (23.1.50; interrupted connect() not handled properly)
Date: Thu, 25 Mar 2010 09:01:02 +0000

Your message dated Thu, 25 Mar 2010 17:59:58 +0900
with message-id <address@hidden>
and subject line 
has caused the GNU bug report #5173,
regarding 23.1.50; interrupted connect() not handled properly
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact address@hidden
immediately.)


-- 
5173: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=5173
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: 23.1.50; interrupted connect() not handled properly
Date: Thu, 10 Dec 2009 00:07:25 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)

Interrupts during connect() in make-network-process aren't handled
properly.  The following recipe to reproduce the problem is rather
complicated.  You'll need:

- Qemu
- a kernel with tuntap support (/dev/net/tun)
- tunctl (from uml-utilities)
- a Linux image for Qemu. If you don't have one,
  use http://www.nongnu.org/qemu/linux-0.2.img.bz2
- netcat

The problem occurs during connect().  To make this window longer and
more controllable, we will use Qemu so that we can stop and resume the
(virtual) TCP stack.  Also note that we are dealing with signals here,
and that strace or gdb would interfere with the problem.

* Prepare Qemu

Our goal here is to start Qemu with a virtual network interface
like so:

 qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no

Before we can do that we need to create the qtap0 device:

 sudo tunctl -u USER -t qtap0 # replace USER with your user 
 sudo ifconfig qtap0 192.168.255.1 up

192.168.255.1 will most likely work for you but any non-conflicting IP
address will do.

We'll also need netcat on the virtual machine.  So let's copy it to the
image:

  mkdir img
  sudo mount -o loop linux-0.2.img img
  sudo cp -L /bin/netcat img/usr/bin
  sudo umount img

Now try 
  
  qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no

This should boot Linux and present you with a shell.
In the shell, configure the network device like so:
  
  sh-2.05b# ifconfig eth0 192.168.255.2

Make sure that you can ping that device from the host system with

  ping 192.168.255.2 

If it doesn't work, check the output of route on the guest and the host.

* Test netcat 

Next test netcat.  Inside Qemu do:

  sh-2.05b# netcat -l -p 44444

and on the host:

  netcat 192.168.255.2 44444

Everything you type on the host should be echoed on the guest.  When you
abort with C-c the netcat inside Qemu should also abort.

* Test Function

Now we are almost ready to run real tests. Create a file
connect-eintr.el containing the following function.

(defun testit (vmpid)
  (switch-to-buffer "*Messages*")
  (signal-process vmpid 'SIGSTOP)
  (shell-command 
   (concat (format "(sleep 0.4; kill -SIGSTOP %d; " (emacs-pid))
           (format " sleep 0.1; kill -SIGCONT %d; " vmpid)
           (format " sleep 3; kill -SIGCONT %d;)&" (emacs-pid))))
  (let ((sock
         (make-network-process :name "test" 
                               :service 44444 
                               :host "192.168.255.2"
                               :sentinel (lambda (x y)
                                           (error "sentinel: %s %s" x y)))))
    (process-send-string sock "foo")
    (process-send-string sock "bar")
    (process-send-string sock "baz\n")
    (message "ok")))

The function does the following steps:
1) stop Qemu
2) connect()
3) stop Emacs
4) resume Qemu
5) resume Emacs
6) write some output to the socket

From step 2 to step 4 Emacs will be inside connect(), and we have
plenty of time to press a key to generate an interrupt.

* Run the test

Before running the function create a listening socket inside Qemu as
above:

  sh-2.05b# netcat -l -p 44444

For the next step we need the process id of Qemu; let's call that QPID.
Use QPID in the following command line:

  emacs -Q -load connect-eintr.el -eval  '(testit QPID)' -f kill-emacs  

This starts Emacs and runs the test.  If you're using X11 and don't
press any key, Emacs will terminate after a few seconds and foobarbaz
will appear in Qemu.

Restart netcat as above and re-run the test, but this time press a key
after Emacs' frame appears.  This time Emacs will not terminate, but
instead an error message will be visible in the *Messages* buffer.  Also
the netcat process in Qemu will be terminated but without producing any
output.

This latter behavior is wrong.  Emacs should handle interrupts
generated by pressing keys more gracefully.

The problem will also occur if you run Emacs without X11, but the
SIGSTOP will return the terminal to the shell and you will have to put
Emacs into the foreground again with the fg command.  Your terminal is
most likely messed up at that point, but the error message should still
be visible.

* Probable Cause of the problem

The cause of the problem is that Emacs closes the socket after being
interrupted in connect().  That approach works with servers which accept
many connections but fails for servers which serve one connection only
as the example with netcat above did.
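
To illustrate the failure mode, here is a minimal loopback sketch in C.
It is not taken from Emacs and involves no signals or Qemu; the function
name retry_against_one_shot_server is made up for this illustration.  A
client connects, gives up and closes its socket (as Emacs did after
EINTR), the single-connection server sees EOF and goes away, and the
retry on a fresh socket finds nobody listening.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo, not Emacs code.  Returns the errno seen by the
   retried connect(), 0 if the retry unexpectedly succeeded, or -1 if
   the setup itself failed.  */
int
retry_against_one_shot_server (void)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof addr;
  char buf[16];
  int result = -1;
  int accepted = -1;
  int listener = socket (AF_INET, SOCK_STREAM, 0);
  int first = socket (AF_INET, SOCK_STREAM, 0);
  int second = socket (AF_INET, SOCK_STREAM, 0);

  memset (&addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = 0;            /* let the kernel pick a free port */

  if (listener < 0 || first < 0 || second < 0
      || bind (listener, (struct sockaddr *) &addr, sizeof addr) < 0
      || listen (listener, 1) < 0
      || getsockname (listener, (struct sockaddr *) &addr, &len) < 0)
    goto done;

  /* First attempt: the kernel completes the handshake, then the
     client abandons the socket, like the old EINTR handling.  */
  if (connect (first, (struct sockaddr *) &addr, sizeof addr) < 0)
    goto done;
  close (first);
  first = -1;

  /* The one-shot server accepts that connection, reads EOF without
     receiving any data, and shuts down (like netcat -l).  */
  accepted = accept (listener, NULL, NULL);
  if (accepted < 0 || read (accepted, buf, sizeof buf) != 0)
    goto done;
  close (accepted);
  accepted = -1;
  close (listener);
  listener = -1;

  /* Retry with a fresh socket: the server is gone by now.  */
  result = connect (second, (struct sockaddr *) &addr, sizeof addr) < 0
    ? errno : 0;

 done:
  if (accepted >= 0) close (accepted);
  if (listener >= 0) close (listener);
  if (first >= 0) close (first);
  if (second >= 0) close (second);
  return result;
}
```

On a typical Linux host the retried connect() should fail with
ECONNREFUSED, which matches the symptom above: netcat exits without
output and Emacs reports an error.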

* Proposed fix

As described here:
http://www.madore.org/~david/computers/connect-intr.html 
the recommended way to handle interrupts during connect() is to use
select() on the socket.  The socket will become writable when the
connection is established or when an error occurs.  The error can be
obtained with getsockopt.

The patch below implements just that.
The only other addition is the introduction of two macros,
EWOULDBLOCK_P and EINPROGRESS_P, whose only purpose is
to reduce #ifdef/#ifndef clutter.

Helmut

--- process.c.~1.607.~  2009-12-04 08:01:43.000000000 +0100
+++ process.c   2009-12-09 23:37:19.000000000 +0100
@@ -234,6 +234,18 @@
 #endif /* NON_BLOCKING_CONNECT */
 #endif /* BROKEN_NON_BLOCKING_CONNECT */
 
+#ifdef EWOULDBLOCK
+# define EWOULDBLOCK_P(x) (x == EWOULDBLOCK)
+#else
+# define EWOULDBLOCK_P(x) (0)
+#endif 
+
+#ifdef EINPROGRESS
+# define EINPROGRESS_P(x) (x == EINPROGRESS)
+#else
+# define EINPROGRESS_P(x) (0)
+#endif 
+
 /* Define DATAGRAM_SOCKETS if datagrams can be used safely on
    this system.  We need to read full packets, so we need a
    "non-destructive" select.  So we require either native select,
@@ -3338,9 +3350,8 @@
     {
 #ifndef NON_BLOCKING_CONNECT
       error ("Non-blocking connect not supported");
-#else
-      is_non_blocking_client = 1;
 #endif
+      is_non_blocking_client = 1;
     }
 
   name = Fplist_get (contact, QCname);
@@ -3566,10 +3577,8 @@
          continue;
        }
 
-#ifdef DATAGRAM_SOCKETS
       if (!is_server && socktype == SOCK_DGRAM)
        break;
-#endif /* DATAGRAM_SOCKETS */
 
 #ifdef NON_BLOCKING_CONNECT
       if (is_non_blocking_client)
@@ -3655,26 +3664,44 @@
       ret = connect (s, lres->ai_addr, lres->ai_addrlen);
       xerrno = errno;
 
-      turn_on_atimers (1);
+      turn_on_atimers (1); 
 
-      if (ret == 0 || xerrno == EISCONN)
-       {
+      if (ret == 0
+         || (EWOULDBLOCK_P (xerrno) && is_non_blocking_client)
+         || (EINPROGRESS_P (xerrno) && is_non_blocking_client))
          /* The unwind-protect will be discarded afterwards.
             Likewise for immediate_quit.  */
          break;
-       }
 
-#ifdef NON_BLOCKING_CONNECT
-#ifdef EINPROGRESS
-      if (is_non_blocking_client && xerrno == EINPROGRESS)
-       break;
-#else
-#ifdef EWOULDBLOCK
-      if (is_non_blocking_client && xerrno == EWOULDBLOCK)
-       break;
-#endif
-#endif
-#endif
+      if (xerrno == EINTR) 
+       {
+         /* Unlike most other syscalls connect() cannot be called
+            again.  (That would return EALREADY.)  The proper way to
+            wait for completion is select(). */
+         int sc;
+         fd_set fdset;
+       retry_select:
+         FD_ZERO (&fdset);
+         FD_SET (s, &fdset);
+         QUIT;
+         sc = select (s + 1, 0, &fdset, 0, 0);
+         if (sc == -1)
+           if (errno == EINTR) 
+             goto retry_select;
+           else 
+             report_file_error ("select failed", Qnil);
+         eassert (sc > 0);
+         {
+           int len = sizeof xerrno;
+           eassert (FD_ISSET (s, &fdset));
+           if (getsockopt (s, SOL_SOCKET, SO_ERROR, &xerrno, &len) == -1)
+             report_file_error ("getsockopt failed", Qnil);
+           if (xerrno != 0)
+             errno = xerrno, report_file_error ("error during connect", Qnil);
+           else
+             break;
+         }
+       }
 
       immediate_quit = 0;
 
@@ -3682,9 +3709,6 @@
       specpdl_ptr = specpdl + count1;
       emacs_close (s);
       s = -1;
-
-      if (xerrno == EINTR)
-       goto retry_connect;
     }
 
   if (s >= 0)

--- End Message ---
--- Begin Message ---
Date: Thu, 25 Mar 2010 17:59:58 +0900
User-agent: Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka) FLIM/1.14.8 (Shijō) APEL/10.6 Emacs/22.3 (sparc-sun-solaris2.8) MULE/5.0 (SAKAKI)

Closed with this change:

revno: 99750
author: Helmut Eller <address@hidden>
committer: YAMAMOTO Mitsuharu <address@hidden>
branch nick: trunk
timestamp: Thu 2010-03-25 17:48:52 +0900
message:
  Call `select' for interrupted `connect' rather than creating new socket (Bug#5173).


--- End Message ---
