bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports

bug-guile

From:	Mark H Weaver
Subject:	bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
Date:	Thu, 11 Jan 2018 16:55:38 -0500
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

address@hidden (Ludovic Courtès) writes:

> Mark H Weaver <address@hidden> writes:
>
>> address@hidden (Ludovic Courtès) writes:
>
> [...]
>
>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>> +    {
>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>> +      size_t read;
>>> +
>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>> +                            avail, cur, avail);
>>> +
>>> +      read = scm_i_read_bytes (port, bv, avail,
>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>
>> Here's the R6RS specification for 'get-bytevector-some':
>>
>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>    available from BINARY-INPUT-PORT or until an end of file is reached.
>>    If bytes become available, 'get-bytevector-some' returns a freshly
>>    allocated bytevector containing the initial available bytes (at least
>>    one), and it updates BINARY-INPUT-PORT to point just past these
>>    bytes.  If no input bytes are seen before an end of file is reached,
>>    the end-of-file object is returned."
>>
>> By my reading of this, we should block only if necessary to ensure that
>> we return at least one byte (or EOF).  In other words, if we can return
>> at least one byte (or EOF), then we must not block, which means that we
>> must not initiate another 'read'.
>
> Indeed.  So perhaps the condition above should be changed to:
>
>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>
> ?

That won't work, because the earlier call to 'scm_fill_input' will have
already initiated a 'read' if the buffer was empty.  The read buffer
size will determine the maximum number of bytes read, which will be 1 in
the case of an unbuffered port.  So, at the point of this condition,
'avail == 0' will occur only if EOF was encountered, in which case you
must return EOF without attempting another 'read'.
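To make the point above concrete, here is a toy model in plain C. It is only a sketch: 'struct toy_port', 'toy_read' and 'toy_fill_input' are hypothetical names invented for illustration and are not Guile's real structures or functions. The model assumes, as described above, that filling an empty buffer issues a single read of at most the buffer size, so a buffer of size 1 yields exactly one byte, and 'avail == 0' afterwards can only mean EOF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy port; NOT Guile's real port structure. */
struct toy_port {
  const unsigned char *src;   /* bytes the underlying "device" can deliver */
  size_t src_len, src_pos;
  unsigned char buf[64];      /* internal read buffer */
  size_t buf_size;            /* usable size; 1 models an unbuffered port */
  size_t cur, end;            /* unread bytes are buf[cur..end) */
  int has_eof;                /* set when a read returned 0 bytes */
  int read_calls;             /* number of reads issued so far */
};

/* Stand-in for a possibly blocking read: delivers at most MAX bytes. */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t max)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = left < max ? left : max;

  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->read_calls++;
  return n;
}

/* Models the behavior described above: if the buffer is empty, issue one
   read of at most BUF_SIZE bytes.  Returns the number of bytes available. */
static size_t
toy_fill_input (struct toy_port *p)
{
  assert (p->buf_size > 0 && p->buf_size <= sizeof p->buf);

  if (p->cur == p->end && !p->has_eof)
    {
      size_t n = toy_read (p, p->buf, p->buf_size);

      p->cur = 0;
      p->end = n;
      if (n == 0)
        p->has_eof = 1;
    }
  return p->end - p->cur;
}
```

In this model, with 'buf_size' set to 1 and five bytes waiting, 'toy_fill_input' reports one available byte after one read; a second read at that point could block even though a byte is already in hand.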

In order to avoid unnecessary blocking, there must be only one 'read'
call, and it must be initiated only if the buffer was already empty.

So, in order to accomplish your goal here, I don't see how you can use
'scm_fill_input', unless you temporarily increase the size of the read
buffer beforehand.

Instead, I think you need to first check if the read buffer contains any
bytes.  If so, empty the buffer and return them.  If the buffer is
empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
set, then clear that flag and return EOF.

Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
false, then you must do what 'scm_fill_input' would have done, except
using your larger buffer instead of the port's internal read buffer.  In
particular, you must first switch the port to "reading" mode, flushing
the write buffer if 'rw_random' is set.
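The three cases above can be sketched in plain C as follows. This is a simplified model under stated assumptions, not a patch: 'struct toy_port', 'toy_read', 'toy_get_some' and 'TOY_EOF' are hypothetical names, the "flush" is reduced to a counter, and none of this is Guile's real API. What it tries to show is that at most one read is issued per call, and only when the buffer is empty and no EOF is pending.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_EOF = -1 };

/* Hypothetical toy port; NOT Guile's real port structure. */
struct toy_port {
  const unsigned char *src;   /* bytes the underlying "device" can deliver */
  size_t src_len, src_pos;
  unsigned char buf[16];      /* internal read buffer */
  size_t cur, end;            /* unread bytes are buf[cur..end) */
  int has_eof;                /* an EOF was seen but not yet reported */
  int rw_random;              /* port needs a flush before reading */
  int flushes;                /* number of (modelled) write-buffer flushes */
  int read_calls;             /* number of reads issued so far */
};

/* Stand-in for a possibly blocking read: delivers at most MAX bytes. */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t max)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = left < max ? left : max;

  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->read_calls++;
  return n;
}

/* Store between 1 and MAX bytes in OUT and return the count, or return
   TOY_EOF.  Issues at most one read, and only if nothing is buffered.  */
static long
toy_get_some (struct toy_port *p, unsigned char *out, size_t max)
{
  size_t avail = p->end - p->cur;
  size_t n;

  assert (max > 0);

  /* Case 1: bytes are already buffered; hand them over without reading. */
  if (avail > 0)
    {
      n = avail < max ? avail : max;
      memcpy (out, p->buf + p->cur, n);
      p->cur += n;
      return (long) n;
    }

  /* Case 2: buffer empty and EOF pending; clear the flag, report EOF. */
  if (p->has_eof)
    {
      p->has_eof = 0;
      return TOY_EOF;
    }

  /* Case 3: switch to "reading" mode (modelled as a flush counter), then
     do a single read straight into the caller's larger buffer. */
  if (p->rw_random)
    p->flushes++;

  n = toy_read (p, out, max);
  return n > 0 ? (long) n : TOY_EOF;
}
```

Reading directly into the caller's buffer in the third case is what lets the model return more than one byte without touching the size of the internal read buffer.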

Also, I'd prefer to move this code to ports.c in order to avoid adding
more internal declarations to ports.h and changing more functions from
'static' to global functions.

>> Out of curiosity, is there a reason why you're using an unbuffered port
>> in your use case?
>
> It’s to implement redirect à la socat:
>
>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447

Why is an unbuffered port being used here?  Can we change it to a
buffered port?

      Mark
