bug-apl
Re: [Bug-apl] How do I convert a byte sequence to Unicode?


From: Elias Mårtenson
Subject: Re: [Bug-apl] How do I convert a byte sequence to Unicode?
Date: Mon, 28 Apr 2014 13:00:35 +0800

Converting byte values to code points means decoding them according to a specific character encoding, and that's kind of messy.

(I believe the rest of GNU APL more or less assumes that UTF-8 is the standard encoding, which does make things simpler.)
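To make the "messy" part concrete, here is a minimal sketch of UTF-8 decoding written in Python rather than APL. It only illustrates the algorithm and is not how GNU APL does it. The function name utf8_to_code_points is hypothetical, and the sketch deliberately skips some checks a strict decoder needs (overlong forms, surrogates, the U+10FFFF upper bound).

```python
def utf8_to_code_points(byte_values):
    """Decode a sequence of UTF-8 byte values (ints 0..255) into code points.

    Illustrative only: overlong encodings, surrogates and values above
    U+10FFFF are not rejected here.
    """
    code_points = []
    i = 0
    n = len(byte_values)
    while i < n:
        lead = byte_values[i]
        if lead < 0x80:              # 0xxxxxxx: single byte (ASCII)
            cp, extra = lead, 0
        elif 0xC0 <= lead < 0xE0:    # 110xxxxx: one continuation byte
            cp, extra = lead & 0x1F, 1
        elif 0xE0 <= lead < 0xF0:    # 1110xxxx: two continuation bytes
            cp, extra = lead & 0x0F, 2
        elif 0xF0 <= lead < 0xF8:    # 11110xxx: three continuation bytes
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError(f"invalid lead byte {lead} at index {i}")
        if i + extra >= n:
            raise ValueError(f"truncated sequence at index {i}")
        for b in byte_values[i + 1 : i + 1 + extra]:
            if b & 0xC0 != 0x80:     # continuation bytes look like 10xxxxxx
                raise ValueError(f"invalid continuation byte {b}")
            cp = (cp << 6) | (b & 0x3F)
        code_points.append(cp)
        i += 1 + extra
    return code_points


# 'foo⍉bar' encoded as UTF-8: ⍉ (U+2349) becomes the three bytes 226 141 137.
utf8_bytes = [102, 111, 111, 226, 141, 137, 98, 97, 114]
print(utf8_to_code_points(utf8_bytes))
```

The result is the same code point vector that monadic ⎕UCS gives for 'foo⍉bar' in the quoted message below.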

I have a suggestion: make ⎕UCS support a dyadic form in which the left argument specifies the encoding to use. For example:

'UTF-8' ⎕UCS 99 100 101 102

Handling multiple encodings is easily done through the libiconv library. I worked on it when I made some improvements to its Common Lisp integration. It's quite simple to use.
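As a rough illustration of what such an encoding-parameterized conversion could look like, here is a small Python sketch. It uses Python's built-in codecs as a stand-in for libiconv, and the function name decode_bytes is hypothetical. It is not the proposed GNU APL interface, only the same idea: encoding name on the left, byte values on the right.

```python
def decode_bytes(encoding, byte_values):
    """Decode byte values (ints 0..255) to a string using the named encoding.

    Mirrors the shape of the suggested dyadic call: encoding first,
    byte vector second. Python's codec machinery does the actual work.
    """
    return bytes(byte_values).decode(encoding)


# The byte vector from the suggestion above, read as UTF-8.
print(decode_bytes('UTF-8', [99, 100, 101, 102]))

# The same character arrives as different bytes under different encodings.
print(decode_bytes('UTF-8', [195, 169]) == decode_bytes('ISO-8859-1', [233]))
```

Passing the encoding explicitly keeps the byte-reading code free of any assumption about what the file contains.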

Regards,
Elias


On 28 April 2014 12:49, David B. Lamkins <address@hidden> wrote:
That's close, but libfileio[8] returns a sequence of byte values, not
code points.

On Mon, 2014-04-28 at 12:19 +0800, Elias Mårtenson wrote:
> Use the quad function ⎕UCS:
>
>
>       ⎕UCS 'foo⍉bar'
> 102 111 111 9033 98 97 114
>       ⎕UCS 102 111 111 9033 98 97 114
> foo⍉bar
>
>
> Regards,
> Elias
>
>
> On 28 April 2014 12:17, David B. Lamkins <address@hidden> wrote:
>         I can use lib_file_io to read a sequence of byte values from a
>         file
>         containing Unicode text.
>
>         How do I convert that sequence back to a Unicode string in GNU
>         APL?
>
>
>
>
>



