`scm_c_read ()' and `swap_buffer' trick harmful

From: Ludovic Courtès
Subject: `scm_c_read ()' and `swap_buffer' trick harmful
Date: Sat, 15 Nov 2008 21:04:32 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (gnu/linux)


I just discovered undesirable side effects of commit
b5cb4464ca4e23d077a9777bbc17835feb0f4374 "Make multi-byte reads on
unbuffered ports more efficient."

An example of an application that breaks in the presence of this patch
is "custom binary input ports" (a.k.a. CBIPs [0]) in Guile-R6RS-Libs [1].
The CBIP implementation [2] works as follows:

  1. make_cbip ()
       /* Create a bytevector for use as the CBIP's internal buffer.  */
       SCM bv = scm_r6rs_c_make_bytevector (c_len);
       c_bv = (char *) SCM_R6RS_BYTEVECTOR_CONTENTS (bv);
       c_port->read_pos = c_port->read_buf = (unsigned char *) c_bv;
       c_port->read_end = (unsigned char *) c_bv;

       /* Store BV for later reuse.  */
       SCM_SETSTREAM (port, SCM_UNPACK (bv, and other things));

  2. cbip_fill_input (port)
       if (c_port->read_pos >= c_port->read_end)
           /* Invoke the user's `read!' procedure.  */
           bv = SCM_R6RS_CBIP_BYTEVECTOR (port);

           octets = scm_call_3 (read_proc, bv, SCM_INUM0,
                                SCM_I_MAKINUM (CBIP_BUFFER_SIZE));
           c_octets = scm_to_uint (octets);

           c_port->read_pos = (unsigned char *) SCM_R6RS_BYTEVECTOR_CONTENTS 
           c_port->read_end = (unsigned char *) c_port->read_pos + 

IOW, the CBIP `fill_input' method does *not* directly pass
`c_port->read_buf' to the user's `read!' procedure but instead passes it
its bytevector, which it assumes wraps its internal buffer.  Thus, if
`c_port->read_buf' happens to point to something other than BV's
contents (as it does while `scm_c_read ()' has swapped in the caller's
buffer), that other buffer is just left untouched.

Worse, `cbip_fill_input ()' updates `read_pos' and `read_end' but does
not touch `read_buf', leading to an inconsistent state that will confuse
later `scm_fill_input ()' calls on that port (e.g., in the loop for
`scm_c_read ()'), and possibly to heap corruption.
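
To make the failure mode concrete, here is a minimal stand-alone sketch
that does not use libguile at all.  Everything in it (`struct toy_port',
`toy_fill_input ()', `toy_swap_and_fill ()', etc.) is hypothetical and
only mimics the three read pointers discussed above; it is not the code
from the commit nor from Guile-R6RS-Libs.  It assumes a fill method that,
like `cbip_fill_input ()', always fills its own buffer and resets
`read_pos' and `read_end' without looking at `read_buf':

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUF_SIZE 4

/* Hypothetical, simplified port: only the read pointers discussed above.  */
struct toy_port
{
  unsigned char *read_buf;   /* start of the buffer the port reads from */
  unsigned char *read_pos;   /* next octet to consume */
  unsigned char *read_end;   /* one past the last valid octet */
  size_t read_buf_size;      /* size of the area starting at read_buf */
  unsigned char own_buf[TOY_BUF_SIZE]; /* stands in for the CBIP bytevector */
  const char *source;        /* data handed out by the `read!' stand-in */
};

/* Creation-time setup, as in make_cbip (): everything refers to own_buf.  */
static void
toy_port_init (struct toy_port *p, const char *source)
{
  p->read_buf = p->read_pos = p->read_end = p->own_buf;
  p->read_buf_size = TOY_BUF_SIZE;
  p->source = source;
}

/* Like cbip_fill_input (): always fill own_buf, then reset read_pos and
   read_end; read_buf is never consulted nor updated.  Returns the number
   of octets made available (0 at end of input).  */
static size_t
toy_fill_input (struct toy_port *p)
{
  size_t n = strlen (p->source);
  if (n > TOY_BUF_SIZE)
    n = TOY_BUF_SIZE;
  memcpy (p->own_buf, p->source, n);
  p->source += n;
  p->read_pos = p->own_buf;
  p->read_end = p->own_buf + n;
  return n;
}

/* Invariant generic port code relies on:
   read_buf <= read_pos <= read_end <= read_buf + read_buf_size.  */
static int
toy_port_consistent (const struct toy_port *p)
{
  uintptr_t buf = (uintptr_t) p->read_buf;
  uintptr_t pos = (uintptr_t) p->read_pos;
  uintptr_t end = (uintptr_t) p->read_end;
  return buf <= pos && pos <= end && end <= buf + p->read_buf_size;
}

/* The swap idea: point the port at the caller's buffer DST and let the
   fill method write there.  Returns how many octets actually landed in
   DST, i.e., 0 if the fill method ignored read_buf.  */
static size_t
toy_swap_and_fill (struct toy_port *p, unsigned char *dst, size_t len)
{
  assert (dst != NULL && len > 0);
  p->read_buf = p->read_pos = p->read_end = dst;
  p->read_buf_size = len;
  toy_fill_input (p);
  if (p->read_pos == dst)
    return (size_t) (p->read_end - p->read_pos);
  return 0;
}
```

With a port initialized on "abcdefgh", a plain `toy_fill_input ()' leaves
the port consistent, whereas `toy_swap_and_fill ()' with a 16-octet
caller buffer returns 0: the caller's buffer is never written, and
`toy_port_consistent ()' then fails because `read_pos' and `read_end'
point into `own_buf' while `read_buf' still points to the caller's
buffer.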

So where to go from here?  I think this example shows that the
`swap_buffer' trick is too risky, unfortunately.  Thus, we may need to
revert it, at least in 1.8.  Second, I think that a `read' method as a
replacement for `fill_input', as I proposed back then [3], would be
safer; maybe 1.9 would be a nice place to add it.  Neil: what do you






