LYNX-DEV ANNOUNCE: lynx2.6 chartrans
Thu, 28 Nov 1996 20:45:51 -0600 (CST)
To (hopefully) increase the usability of Lynx in environments with
multiple charsets and code pages, I have extended the character
translation mechanism of Lynx. Below is part of a short README.
Please test, and let me know whether it is useful. Currently this
is not a full system with all necessary translation tables, but it
should now be easier to add new charsets etc.
The code is available from
To compile, you need to get lynx-patch-2.6ct-0.1.pch.gz, *and* either
lynx-newfiles-2.6ct-0.1.zip or lynx-newfiles-2.6ct-0.1.tar.gz.
See the README.chartrans, which is included in the lynx-newfiles-* files
and also readable at the above address.
Note that this does not deal with CJK character sets (but rather only
good old 8-bit charsets and Unicode/UCS2). I tried to leave the previous
processing for CJK charsets intact (but have no way to test whether
I succeeded with that).
The patches were made relative to Lynx2.6 + Composite Patches from
Hiram (last CHANGES date 11-24-96). I normally use Linux+Slang, so
that may be where it works best; but I verified that the stuff also
compiles for a sun4 target.
Output of raw UTF8 (needs of course a terminal which understands it)
seems to work better, but not perfectly, with Slang. This is a problem
beyond Lynx: a curses replacement which understands multibyte characters
properly would be needed to avoid putting characters in the wrong screen
position. (Does anyone know of such a beast?)
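
To illustrate the positioning problem, here is a minimal sketch (not
code from the patch; the function names are made up for this example).
It assumes the screen library advances the cursor by one column per
byte, and compares that with counting only UTF8 lead bytes, which is
roughly what a multibyte-aware library would have to do. Wide and
combining characters are ignored here.

```python
def columns_if_bytes_counted(raw: bytes) -> int:
    """Columns used if every byte is assumed to fill one screen cell."""
    return len(raw)


def columns_if_chars_counted(raw: bytes) -> int:
    """Columns used if only UTF8 lead bytes advance the cursor.

    Continuation bytes have the bit pattern 10xxxxxx, so they are skipped.
    """
    return sum(1 for b in raw if (b & 0xC0) != 0x80)


sample = "na\u00efve".encode("utf-8")  # U+00EF takes two bytes in UTF8
print(columns_if_bytes_counted(sample), columns_if_chars_counted(sample))
# prints: 6 5
```

Under that assumption, everything after the first non-ASCII character
on a line ends up one or more columns off.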
The code is currently a bit ugly, and I am sure there are many
glitches. You can help me find them.
Excerpt from README.chartrans:
- Can (attempt to) translate from any document charset to any display
character set, *IF* the document charset is known by a translation
table (compiled in at installation).
- Old method for specifying translations of Latin1 characters and
SGML entities still supported. (IBMPC-charsets.announce is still
- New method to define character sets: used for input charset as well
as display character set, translation tables compiled in from
separate files (one per charset).
- Unicode (UTF8) support: can (attempt to) decode and translate UTF8 to
display character set, or pass UTF8 through to display (if terminal
or console understands UTF8). [only tested with Slang so far, does
not always position everything correctly on screen]
- Support for CHARSET attribute on A tag [but not yet on LINK], as in
HTML i18n draft. A link can suggest the target's charset in this way.
- EXPERIMENTAL, currently enabled only for Linux console:
can (attempt to) automatically switch terminal mode and load new
code pages on change of display character set.
- some minor changes: sometimes invalid characters are displayed in a hex
notation Uxxxx (helps debugging, but I also regard it as at least not
worse than showing the wrong char without warning). KOI8 -> other
charsets will just strip the high bit from Cyrillic chars (gives
somewhat readable ASCII, KOI was constructed that way...)
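
The two fallbacks in the last item can be sketched as follows. This is
an illustration only, not the Lynx code: the function names are
hypothetical, Python's koi8_r codec stands in for "KOI8", the exact
digit case of the Uxxxx notation is a guess, and a real implementation
would consult the compiled-in translation tables instead of Python
codecs.

```python
def strip_high_bit(koi8_bytes: bytes) -> str:
    """Clear bit 7 of every byte; KOI8 Cyrillic letters land on Latin ones."""
    return bytes(b & 0x7F for b in koi8_bytes).decode("ascii")


def hex_fallback(ch: str) -> str:
    """Render one character in a Uxxxx hex notation (format assumed)."""
    return "U%04x" % ord(ch)


def to_display(text: str, display_charset: str) -> str:
    """Keep characters the display charset can show, hex-escape the rest."""
    out = []
    for ch in text:
        try:
            ch.encode(display_charset)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(hex_fallback(ch))
    return "".join(out)


# Cyrillic "privet" encoded as KOI8-R
koi = "\u043f\u0440\u0438\u0432\u0435\u0442".encode("koi8_r")
print(strip_high_bit(koi))                # prints: PRIWET
print(to_display("a\u0436", "latin-1"))   # prints: aU0436
```

Lowercase Cyrillic comes out as uppercase Latin because of how the KOI8
code points are laid out.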
Additions/changes to user interface:
- many new Display Character Sets are available on O)ptions screen.
(also can now use arrow keys, HOME, END for cycling through the list).
- new command line flags:
    -assume_charset=...
        assume this as charset for documents that don't
        specify a charset parameter in HTTP headers
    -assume_unknown_charset=...
        in case a charset parameter is not recognized
    -assume_local_charset=...
        assume this as charset of local file: docs
- The "Raw" toggle (from -raw flag, '@' key, or Options screen)
o should work as before for CJK charsets,
o otherwise toggles the assumption "Default remote charset is same
as Display Character Set" on or off.
(Try the "Transparent" Display Character Set for more "rawness".)
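
One possible reading of how the three -assume_* flags combine is
sketched below. The function and parameter names are hypothetical, the
order of the checks is an assumption drawn only from the flag
descriptions above, and the effect of the "Raw" toggle is left out.

```python
from typing import Optional, Set


def effective_charset(header_charset: Optional[str],
                      is_local_file: bool,
                      known_charsets: Set[str],
                      assume_charset: str,
                      assume_unknown_charset: str,
                      assume_local_charset: str) -> str:
    """Pick the charset a document is treated as (assumed precedence)."""
    if header_charset is not None:
        name = header_charset.lower()
        if name in known_charsets:
            return name                  # a translation table exists
        return assume_unknown_charset    # charset given but not recognized
    if is_local_file:
        return assume_local_charset      # local file: docs
    return assume_charset                # no charset parameter in headers


known = {"iso-8859-1", "koi8-r"}
flags = dict(assume_charset="iso-8859-1",
             assume_unknown_charset="iso-8859-1",
             assume_local_charset="koi8-r")
print(effective_charset("KOI8-R", False, known, **flags))  # prints: koi8-r
print(effective_charset(None, True, known, **flags))       # prints: koi8-r
```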
[Some notes about compiling etc. snipped, see the URL.]
- LYNX-DEV ANNOUNCE: lynx2.6 chartrans,
Klaus Weide <=