bug-gnu-emacs

bug#49449: 28: TLS connection never gets to "open" stage


From: Mattias Engdegård
Subject: bug#49449: 28: TLS connection never gets to "open" stage
Date: Thu, 8 Jul 2021 09:59:26 +0200

On 7 July 2021 at 21:57, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Yes, it's grown somewhat organically.  :-/

Let me first say that the state of the code is not your fault! It's a product, 
as you say, of organic growth, and it does need a rewrite.

> I'm not able to reproduce this on Debian/bullseye, but on Macos I get
> 
> callback: status = (:error (error connection-failed "connect" :host
> "elpa.gnu.org" :service 443))

Yes, that is my observation too. Obviously the busy-wait part is essential: 
removing it makes the problem go away.
Essentially, the busy-wait postpones the call to wait_reading_process_output so 
that when it is eventually called, gnutls_handshake succeeds on the first try 
instead of first returning GNUTLS_E_AGAIN, which brings us onto a different 
code path.
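
To make the timing effect concrete, here is a toy model, not the Emacs code. It assumes the peer's handshake data becomes available at some tick `ready_at`, and that the first attempt happens at tick `first_try`. All names (`toy_handshake`, `toy_again_count`, `TOY_OK`, `TOY_AGAIN`) are hypothetical stand-ins; `TOY_AGAIN` plays the role of GNUTLS_E_AGAIN.

```c
#include <assert.h>

/* Hypothetical result codes; TOY_AGAIN stands in for GNUTLS_E_AGAIN.  */
enum { TOY_OK = 0, TOY_AGAIN = -1 };

/* Toy handshake: succeeds only once the peer's data has arrived,
   which this model assumes happens at tick READY_AT.  */
static int
toy_handshake (int now, int ready_at)
{
  return now >= ready_at ? TOY_OK : TOY_AGAIN;
}

/* Count how many TOY_AGAIN results are seen before success when the
   first attempt is made at tick FIRST_TRY and retried every tick.  */
int
toy_again_count (int first_try, int ready_at)
{
  assert (first_try >= 0 && ready_at >= 0);
  int agains = 0;
  for (int now = first_try; toy_handshake (now, ready_at) == TOY_AGAIN; now++)
    agains++;
  return agains;
}
```

With an early first attempt, `toy_again_count (0, 3)` sees the "again" result three times before succeeding. With a first attempt delayed past the ready time, as the busy-wait does, `toy_again_count (5, 3)` sees none and succeeds on the first try.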

> There's been several reports in the last week of TLS not
> working on Macos.  Has Apple pushed something new, or...  did something
> else happen lately in this area on Macos?

No, I've been harassed by this bug for quite some time but only now decided to 
dig deeper. Most likely it's just a matter of different timing that the 
process/TLS system doesn't cope with.

First, when the `url-http` call returns we have a Lisp_Process with

 gnutls_p = true
 gnutls_boot_parameters = non-nil
 gnutls_initstage = GNUTLS_STAGE_HANDSHAKE_TRIED (8)

and its file descriptor has a corresponding fd_callback_data with
 flags = FOR_WRITE | NON_BLOCKING_CONNECT_FD

because the asynchronous connect call has not yet been completed.

In the GOOD case (without busy-wait), `wait_reading_process_output` gets called 
right away (because Emacs has nothing else to do). gnutls_try_handshake 
initially fails with GNUTLS_E_AGAIN, but p->outfd becomes writable, so 
`delete_write_fd` is called to zero the fd_callback_data flags. When the 
handshake eventually succeeds, the sentinel is called with the "open\n" event.

In the BAD case (with busy-wait), the TLS handshake succeeds right away while 
the descriptor flags still have NON_BLOCKING_CONNECT_FD set, so the sentinel 
isn't called.
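
The two orderings can be sketched with a small simulation. This is a simplified model under stated assumptions, not the actual process.c logic: the flag names follow the ones above but their numeric values are illustrative, the `toy_*` names are hypothetical, and the model assumes the "open" event is delivered only at the moment the handshake completes, and only if NON_BLOCKING_CONNECT_FD is already clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names as in the message; the numeric values are illustrative.  */
enum { FOR_WRITE = 2, NON_BLOCKING_CONNECT_FD = 16 };

struct toy_process
{
  int fd_flags;          /* stands in for the fd_callback_data flags */
  int again_left;        /* remaining "again" results from the handshake */
  bool handshake_done;
  bool sentinel_called;  /* true once the "open" event was delivered */
};

/* Hypothetical stand-in for gnutls_try_handshake.  */
static bool
toy_try_handshake (struct toy_process *p)
{
  if (p->again_left > 0)
    {
      p->again_left--;
      return false;
    }
  p->handshake_done = true;
  return true;
}

/* One pass of a toy wait loop: try the handshake first, then handle
   the socket becoming writable.  */
static void
toy_wait_once (struct toy_process *p)
{
  assert (p != NULL);
  if (!p->handshake_done && toy_try_handshake (p))
    {
      /* Assumed one-shot check: the event is delivered only if the
         connect is no longer marked as pending at this moment.  */
      if ((p->fd_flags & NON_BLOCKING_CONNECT_FD) == 0)
        p->sentinel_called = true;
    }
  /* The socket is writable: clear the flags, as delete_write_fd does.  */
  p->fd_flags = 0;
}

/* Run PASSES passes for a handshake that reports "again" AGAIN_COUNT
   times; return whether the sentinel was ever called.  */
bool
toy_run (int again_count, int passes)
{
  struct toy_process p
    = { FOR_WRITE | NON_BLOCKING_CONNECT_FD, again_count, false, false };
  for (int i = 0; i < passes; i++)
    toy_wait_once (&p);
  return p.sentinel_called;
}
```

In this model `toy_run (1, 3)` corresponds to the GOOD case: the first pass gets "again" and clears the flags, and the second pass completes the handshake and calls the sentinel. `toy_run (0, 3)` corresponds to the BAD case: the handshake completes on the first pass while the flag is still set, and no later pass revisits it, so the sentinel is never called.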

Does this jog any memories?





