lynx-dev

Re: lynx-dev Lynx and Euro symbol support


From: Leonid Pauzner
Subject: Re: lynx-dev Lynx and Euro symbol support
Date: Sat, 20 Feb 1999 12:09:11 +0300 (MSK)

20-Feb-99 00:14 Jacob Poon wrote:
> On Fri, 19 Feb 1999, Leonid Pauzner wrote:

>> I believe the LYCharSets.c tables should be removed one day,
>> but they are still used for certain internal needs...
>> (as I understand it, this is closely connected to the HTPassEightBit* flags,
>> so chars from the 160-255 region _may_ be converted to ISO Latin1 &name; entities
>> and then translated back to 8bit chars at a later stage).

> But this should also eventually be replaced by using functions to convert
> character entity references into numeric entity references, which will
> allow CERs (as defined in the DTD) mapped to NERs bigger than 255 (e.g. €)
> to be processed properly, complete with fallback schemes.  Since all HTML
> 4.0 CERs are mapped to corresponding NERs, it is safe (in terms of data
> integrity) to do only a one-way conversion.

> As far as HTPassEightBit* flags are concerned, 8+ bit characters should be
> converted directly into NERs, not ISO Latin-1 CERs.
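
The one-way CER to NER conversion proposed above could look roughly like
the sketch below. It is only an illustration, not code from Lynx: the
struct entity_map, the table contents, and the functions find_entity(),
cer_to_ner() and entity_fallback() are all hypothetical names, and the
ASCII fallback strings are assumptions. The code points are the ones
HTML 4.0 assigns to these entity names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: entity name, Unicode code point,
 * and an assumed ASCII fallback for displays lacking the glyph. */
struct entity_map {
    const char *name;
    unsigned long code;
    const char *fallback;
};

/* A tiny excerpt; a real table would cover every HTML 4.0 entity. */
static const struct entity_map entity_table[] = {
    { "nbsp",   160UL,  " "   },
    { "copy",   169UL,  "(c)" },
    { "eacute", 233UL,  "e"   },
    { "euro",   8364UL, "EUR" },
};

/* Linear lookup by entity name (without '&' and ';'). */
const struct entity_map *find_entity(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(entity_table) / sizeof(entity_table[0]); i++) {
        if (strcmp(entity_table[i].name, name) == 0)
            return &entity_table[i];
    }
    return NULL;
}

/* Write "&#NNNN;" for a named entity into out.
 * Returns 0 on success, -1 if the name is unknown or out is too small. */
int cer_to_ner(const char *name, char *out, size_t outlen)
{
    const struct entity_map *e = find_entity(name);
    int n;

    if (e == NULL)
        return -1;
    n = snprintf(out, outlen, "&#%lu;", e->code);
    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}

/* Fallback text for a named entity, or NULL if the name is unknown. */
const char *entity_fallback(const char *name)
{
    const struct entity_map *e = find_entity(name);

    return (e != NULL) ? e->fallback : NULL;
}
```

With this table, cer_to_ner("euro", buf, sizeof buf) fills buf with
"&#8364;", and entity_fallback("euro") gives "EUR" for a display charset
that has no Euro sign. The conversion goes in one direction only, so no
reverse table is needed.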

Well, it would probably be nicer to keep lynx internal data in Unicode
(2 bytes per character). This would eventually require rewriting all string
operations, but there is a deeper problem: some Unicode characters may be
mapped to multi-character combinations for your display charset, and lynx
must know the resulting number of letters to place "newline" properly. That
is why lynx keeps its internal data as a stream of bytes; this also covers
CJK multibyte and UTF-8 multibyte sequences.



