lynx-dev

Re: [Lynx-dev] wide-char support with different builds


From: Steve White
Subject: Re: [Lynx-dev] wide-char support with different builds
Date: Tue, 21 Jan 2025 13:37:39 +0100

Hi, Thomas,

I guess I failed to convey the nature of the problem adequately.

By default, on Arch Linux, when Lynx is built out of the box, it does not
properly display text in Chinese, Arabic, or Indic languages, and many others.
This renders it unusable to 80% of the population of the planet.

It makes Lynx look bad.  And Arch is surely not an isolated case.

One option is to leave things as they are, and blame the distro packagers for
not testing their build, identifying the problem, and taking corrective steps.
But I promise you, most packagers will not do this.  They are volunteers,
very busy with many things.

Another option is to make Lynx work properly out-of-the-box, so as to require
no extraordinary attentiveness of packagers.

I do not claim to know just what measures are appropriate.
The obvious, quick remedy would be to set the default configuration to:
        CHARACTER_SET:utf-8

Unicode is the default used by all other modern browsers.

This ISO-8859-1 default was never a good idea, and is now a thing of the past.
It was in the HTTP specs, but is not part of HTML.

See
        https://www.w3.org/TR/html401/charset.html#spec-char-encoding
especially:
"""
The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as a default 
character encoding when the "charset" parameter is absent from the 
"Content-Type" header field. In practice, this recommendation has proved 
useless because some servers don't allow a "charset" parameter to be sent, and 
others may not be configured to send the parameter.
"""

(It makes some sense to leave the old encoding in place, to provide support
for ancient systems that cannot support Unicode.)
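
A minimal sketch of what "detect UTF-8 support" could look like, assuming
detection simply means reading the POSIX locale variables (LC_ALL, then
LC_CTYPE, then LANG). This is not Lynx code: pick_character_set is a
hypothetical helper, and the iso-8859-1 fallback only mirrors the current
default mentioned in this thread.

```python
import codecs


def pick_character_set(env, fallback="iso-8859-1"):
    """Return "utf-8" if the locale in env names a UTF-8 codeset, else fallback.

    env is a mapping such as os.environ. Hypothetical helper, not part of Lynx.
    """
    # POSIX precedence: the first non-empty of LC_ALL, LC_CTYPE, LANG wins.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            break
    else:
        # No locale variable is set at all.
        return fallback

    # Locale names look like "en_US.UTF-8" or "de_DE.ISO-8859-1@euro".
    if "." not in value:
        return fallback  # e.g. "C" or "POSIX": no codeset given
    codeset = value.split(".", 1)[1].split("@", 1)[0]

    # Normalise spellings such as "UTF-8", "utf8", "UTF8".
    try:
        normalised = codecs.lookup(codeset).name
    except LookupError:
        return fallback
    return "utf-8" if normalised == "utf-8" else fallback
```

Calling pick_character_set(os.environ) at configure or startup time would
leave non-UTF-8 and unset locales on the old default, which keeps the
ancient-system case above working.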


On 18.01.25, Thomas Dickey wrote:
> On Sat, Jan 18, 2025 at 01:49:24PM +0100, Steve White wrote:
> > 
> > I was curious enough to build it myself on the Manjaro system.
> 
> https://lynx.invisible-island.net/lynx_help/body.html#LOCALE_CHARSET
> 
> > I now see what the problem is.  I have a recommendation.
> > 
> > The default configuration has
> >     CHARACTER_SET:iso-8859-1
> > simply changing the value in /etc/lynx.cfg to
> >     CHARACTER_SET:utf-8
> > resolves the issue.
> > 
> > The thing is, in this day and age, the *right* encoding is UTF-8.
> 
> the setting's been there more than 20 years (since January 2004).
> 
> https://lynx.invisible-island.net/current/CHANGES.html#v2.8.5pre.3
> 
> > But the builder of distro software generally has little time to tweak each
> > package they build.  The default configuration should be UTF-8, if the
> > system can handle it (which all modern systems do.)
> 
> packagers mostly set up their tweaks long ago.
>  
> > It would be much better for your configuration to detect UTF-8 support
> > (however it needs to) and to set the CHARACTER_SET accordingly.
> > 
> > Cheers!
> > 
> > 
> 
> -- 
> Thomas E. Dickey <dickey@invisible-island.net>
> https://invisible-island.net




