bug-guile

bug#10627: char-ready? is broken for multibyte encodings


From: Mark H Weaver
Subject: bug#10627: char-ready? is broken for multibyte encodings
Date: Sun, 24 Feb 2013 15:14:05 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Hi Andy,

Andy Wingo <address@hidden> writes:

> On Sat 28 Jan 2012 11:21, Mark H Weaver <address@hidden> writes:
>
>> The R5RS specifies that if 'char-ready?' returns #t, then the next
>> 'read-char' operation is guaranteed not to hang.  This is not currently
>> the case for ports using a multibyte encoding.
>>
>> 'char-ready?' currently returns #t whenever at least one _byte_ is
>> available.  This is not correct in general.  It should return #t only if
>> there is a complete _character_ available.
>
> This procedure is omitted in the R6RS because it is not a good
> interface.  Besides its semantic difficulties, can you think of a sane
> implementation for multibyte characters?

Maybe I'm missing something, but I don't see any semantic problem here,
and it seems straightforward to implement.  'char-ready?' should simply
read bytes until either a complete character is available, or no more
bytes are ready.  In either case, it should then 'unget' all of those
bytes before returning.  What's the problem?
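
To make the idea above concrete, here is a minimal sketch in C for the
UTF-8 case only.  It is not Guile's code: 'struct mock_port',
'utf8_sequence_length', and 'utf8_char_ready' are hypothetical names
invented for illustration, and the sketch assumes the bytes that are
ready without blocking already sit in a buffer that can be inspected.
Because it only peeks at that buffer, the "unget" step is implicit;
nothing is consumed.  Malformed input is reported as ready, on the
assumption that a decoder can reject or replace it without waiting for
more bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a port's input buffer: the first `ready`
   bytes of `buf` can be examined without blocking.  This is NOT
   Guile's actual port structure. */
struct mock_port {
  const unsigned char *buf;
  size_t ready;
};

/* Return the total length of the UTF-8 sequence introduced by `lead`,
   or 0 if `lead` cannot start a valid sequence. */
static size_t utf8_sequence_length(unsigned char lead)
{
  if (lead < 0x80)
    return 1;                   /* ASCII */
  if (lead >= 0xC2 && lead <= 0xDF)
    return 2;
  if (lead >= 0xE0 && lead <= 0xEF)
    return 3;
  if (lead >= 0xF0 && lead <= 0xF4)
    return 4;
  return 0;                     /* continuation byte or invalid lead */
}

/* Return true only if a read of one character could complete using
   the bytes that are already available.  Nothing is consumed. */
static bool utf8_char_ready(const struct mock_port *port)
{
  if (port->ready == 0)
    return false;               /* no bytes at all */

  size_t need = utf8_sequence_length(port->buf[0]);
  if (need == 0)
    return true;                /* malformed lead: decodable as an error now */

  for (size_t i = 1; i < need; i++) {
    if (i >= port->ready)
      return false;             /* sequence is incomplete so far */
    if ((port->buf[i] & 0xC0) != 0x80)
      return true;              /* malformed: no further input needed */
  }
  return true;                  /* a complete character is available */
}
```

With only the first byte of a two-byte sequence available, this
returns false, whereas a check that merely asks "is at least one byte
ready?" would return true.  Other encodings (the iconv path) would
need a different completeness test.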

The only reason I haven't fixed this yet is that it will require some
refactoring in ports.c.  I guess the most straightforward approach is to
generalize 'get_codepoint', 'get_utf8_codepoint', and
'get_iconv_codepoint' to support a non-blocking mode of operation.

What do you think?

  Regards,
    Mark
