Re: [bug-libunistring] Changing the appearance of escapes


From: Bruno Haible
Subject: Re: [bug-libunistring] Changing the appearance of escapes
Date: Thu, 16 Sep 2010 22:39:32 +0200
User-agent: KMail/1.9.9

Hi Ludo,

> Now to actually design and implement something along these lines...

The way I recommend to do it is:
  - For ports with an input direction, store in the port an iconv_t descriptor
    from the given encoding to UTF-8. Similarly, for ports with an output
    direction, store in it an iconv_t descriptor from UTF-8 to the encoding.
    (Why UTF-8 and not UTF-32 = UCS-4? Because on all platforms you can convert
    between UTF-8 and any other encoding, but on some platforms, Solaris for
    example, you cannot convert between UTF-32 and every other encoding.)
  - In the input direction you'll also need a small buffer (up to 6 bytes or so)
    for bytes that have already been read from the stream but not yet converted
    to characters. Alongside it, you'll also need a character or flag bit that
    is used to implement the CRLF -> LF conversion.
  - The trickiest part is handling all possible errors and return values
    from iconv() correctly.
  - In the output direction, an iconv_t can produce a few bytes at the end
    which you need to output before closing the stream. This is needed for
    stateful encodings such as CP1258, UTF-7, or UTF-16 (with BOM), but only
    if you want to support stateful encodings at all. All encodings used by
    locales are stateless.
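
The points above could be sketched roughly as follows. This is only an illustration, assuming POSIX iconv(); struct in_port, in_port_feed and out_port_flush are hypothetical names, not Guile or libunistring API, and the buffer sizes are arbitrary. The CRLF -> LF flag is left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical input-port state (illustrative names only). */
struct in_port {
    iconv_t cd;          /* converts from the port encoding to UTF-8 */
    char pending[16];    /* bytes already read but not yet converted */
    size_t npending;     /* number of valid bytes in pending[] */
};

/* Convert n freshly read bytes to UTF-8, written to out.
   Returns the number of UTF-8 bytes produced, or (size_t)-1 if the input
   is invalid (EILSEQ) or out is too small (E2BIG).  A real port would
   recover from E2BIG by draining out and retrying; this sketch gives up. */
size_t in_port_feed(struct in_port *p, const char *src, size_t n,
                    char *out, size_t outsize)
{
    char *outp = out;
    size_t outleft = outsize;

    while (n > 0) {
        char work[64];
        size_t take;
        char *inp;
        size_t inleft;

        assert(p->npending < sizeof work);

        /* Prepend the leftover bytes from the previous call. */
        take = sizeof work - p->npending;
        if (take > n)
            take = n;
        memcpy(work, p->pending, p->npending);
        memcpy(work + p->npending, src, take);
        src += take;
        n -= take;

        inp = work;
        inleft = p->npending + take;
        p->npending = 0;

        if (iconv(p->cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
            if (errno != EINVAL)
                return (size_t)-1;      /* EILSEQ or E2BIG */
            /* EINVAL: the input ends inside a multibyte sequence.
               Keep the tail for the next call. */
            if (inleft > sizeof p->pending)
                return (size_t)-1;
            memcpy(p->pending, inp, inleft);
            p->npending = inleft;
        }
    }
    return outsize - outleft;
}

/* Output direction: before closing the stream, ask iconv for the bytes
   that bring a stateful encoding back to its initial state.
   Returns the number of bytes written to out, or (size_t)-1 on error. */
size_t out_port_flush(iconv_t cd, char *out, size_t outsize)
{
    char *outp = out;
    size_t outleft = outsize;

    if (iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1)
        return (size_t)-1;
    return outsize - outleft;
}
```

With a stateless encoding, out_port_flush() simply produces zero bytes, so calling it unconditionally before close should be harmless.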

Bruno


