Re: [Lynx-dev] Unicode-marking, &c

From: Thorsten Glaser
Subject: Re: [Lynx-dev] Unicode-marking, &c
Date: Thu, 26 Feb 2009 18:49:02 +0000 (UTC)

Thomas Dickey dixit:

>> Here under Windows there are constant references to the character that
>> begins a 16-bit-wide-character file (FF FE) or UTF-8 file (EF BB BF).

Note that this is not specific to Windows®, though: the Byte Order Mark
(Unicode U+FEFF; UCS-2BE 0xFE 0xFF, UCS-2LE 0xFF 0xFE, UTF-8 0xEF 0xBB 0xBF)
is a standardised thing.
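
For illustration, here is a minimal Python sketch of recognising the
three byte sequences listed above at the start of a file. The function
name sniff_bom is hypothetical, and Python's UTF-16 codec names stand in
for UCS-2 here (an assumption: Python has no separate UCS-2 codec). This
is not how Lynx does it, just a way to see the mapping.

```python
import codecs

# Byte sequences that U+FEFF produces in each encoding.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

A UTF-32LE mark (FF FE 00 00) also begins with FF FE, so a fuller
detector would probably need to check for that first.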

> Lynx handles _some_ cases - but a url would help, so we can see.


Lynx handles all three attached files poorly: it does not strip the UTF-8
BOM, and it renders the UCS-2 files with an ampersand at the end instead
of the … (ellipsis).
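
As a rough sketch of the expected behaviour (not of Lynx's code), the
Python below strips a leading BOM and decodes the rest, so that a
trailing ellipsis (U+2026) survives in all three encodings. The name
decode_stripping_bom, the UTF-8 fallback, and the sample string are
assumptions made for illustration; the sample is not the content of the
attached files.

```python
def decode_stripping_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode data, dropping a leading BOM and using it to pick the codec."""
    if data.startswith(b"\xef\xbb\xbf"):      # UTF-8 BOM
        return data[3:].decode("utf-8")
    if data.startswith(b"\xfe\xff"):          # big-endian 16-bit BOM
        return data[2:].decode("utf-16-be")
    if data.startswith(b"\xff\xfe"):          # little-endian 16-bit BOM
        return data[2:].decode("utf-16-le")
    return data.decode(fallback)              # no BOM: assumed fallback


# Hypothetical sample text ending in an ellipsis.
sample = "end\u2026"
for bom, codec in [(b"\xef\xbb\xbf", "utf-8"),
                   (b"\xfe\xff", "utf-16-be"),
                   (b"\xff\xfe", "utf-16-le")]:
    decoded = decode_stripping_bom(bom + sample.encode(codec))
    print(codec, decoded == sample, not decoded.startswith("\ufeff"))
```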

“It is inappropriate to require that a time represented as
 seconds since the Epoch precisely represent the number of
 seconds between the referenced time and the Epoch.”
        -- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2

Attachment: utf8.htm
Description: Text document

Attachment: ucs2le.htm
Description: Binary data

Attachment: ucs2be.htm
Description: Binary data
