Re: The “binary-friendly” Latin-1


From: Ludovic Courtès
Subject: Re: The “binary-friendly” Latin-1
Date: Tue, 25 Jan 2011 14:21:50 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.2 (gnu/linux)

Hello!

>>   1. The notion of a “binary-friendly” ISO-8859-1 encoding?  It’s
>>     actually mostly gone with the iconv change, since every textual
>>     access goes through iconv.  For binary accesses, the right API is
>>     (rnrs io ports) or similar.
>
> An equivalent question is if you care about backward compatibility of
> legacy ports.  Legacy ports returned strings and were once the only option.

You mean if there’s legacy code using a port of unspecified encoding to
read binary data, right?

The iconv change doesn’t break it on GNU/Linux:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (define p (open-bytevector-input-port #vu8(0 1 2 3 255 128)))
scheme@(guile-user)> (set-port-encoding! p "ISO-8859-1")
scheme@(guile-user)> (read-char p)
$14 = #\nul
scheme@(guile-user)> (read-char p)
$15 = #\soh
scheme@(guile-user)> (read-char p)
$16 = #\stx
scheme@(guile-user)> (read-char p)
$17 = #\etx
scheme@(guile-user)> (read-char p)
$18 = #\ÿ
scheme@(guile-user)> (read-char p)
$19 = #\200
scheme@(guile-user)> (read-char p)
$20 = #<eof>
--8<---------------cut here---------------end--------------->8---

However, an iconv implementation may be free to choke on anything that’s
not strictly Latin-1 per
<https://secure.wikimedia.org/wikipedia/en/wiki/ISO-8859-1#Codepage_layout>,
e.g., everything but “ÿ” in the example above, but that seems highly
unlikely.
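
To make that concrete, here is a rough sketch that checks which bytes of
the example fall within the ranges to which ISO/IEC 8859-1 assigns
graphic characters (0x20 to 0x7E and 0xA0 to 0xFF).  The name
‘strict-latin-1-byte?’ is made up for illustration; it is not part of
Guile.

```scheme
(use-modules (srfi srfi-1))   ; for 'filter'

;; Hypothetical predicate, not a Guile API: true if BYTE lies in one of
;; the two ranges to which ISO/IEC 8859-1 assigns graphic characters.
(define (strict-latin-1-byte? byte)
  (or (<= #x20 byte #x7e)
      (<= #xa0 byte #xff)))

;; The bytes from the example above.
(define example-bytes '(0 1 2 3 255 128))

;; The bytes a strictly conforming converter would have to accept.
(define strict-bytes (filter strict-latin-1-byte? example-bytes))
```

Only 255 (“ÿ”) passes; 0 to 3 and 128 fall in the control ranges that
the standard itself leaves unassigned.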

Anyway, as soon as you use a non-Latin-1 locale, ports get opened with
that locale’s encoding, which makes it practically impossible to do
binary I/O on those ports through the textual procedures.
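
For comparison, a minimal sketch of the same read done with the binary
procedures from (rnrs io ports), which return bytes rather than
characters and so do not depend on the port encoding.  The helper
‘read-all-bytes’ is a hypothetical name used only for this example.

```scheme
(use-modules (rnrs io ports))

;; Hypothetical helper: read every remaining byte from PORT and return
;; them as a list of integers.  'get-u8' is the binary counterpart of
;; 'read-char'; no character decoding is involved.
(define (read-all-bytes port)
  (let loop ((acc '()))
    (let ((b (get-u8 port)))
      (if (eof-object? b)
          (reverse acc)
          (loop (cons b acc))))))

(define p (open-bytevector-input-port #vu8(0 1 2 3 255 128)))
(define bytes (read-all-bytes p))
```

Here ‘bytes’ ends up as (0 1 2 3 255 128), whatever the locale.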

>>   2. The #f <=> "ISO-8859-1" equivalence for ‘port-encoding’ and
>>     ‘set-port-encoding!’.  Likewise, commit
>>     d9544bf012b6e343c80b76bd5761b1583cc106a3 makes ‘port-encoding’
>>     always return a string and pt->encoding always be non-NULL.
>
> Is the cost of doing the various string comparisons of port-encoding
> strings negligible?  It was put in as a (premature) optimization.

The new code keeps open iconv conversion descriptors for each port and
re-uses them; the only use of pt->encoding is when opening those CDs.

Thanks,
Ludo’.


