Re: lynx-dev Re: lynx should respect LANG


From: Klaus Weide
Subject: Re: lynx-dev Re: lynx should respect LANG
Date: Wed, 31 May 2000 13:17:30 -0500 (CDT)

On Wed, 31 May 2000, Henry Nelson wrote:

> problems, however.  Off topic for lynx, but you might be interested that
> a certain Windows' FTP program cannot make whole-directory transfers from
> a Solaris [version 2.6, later versions may be fixed] server with Japanese
> NLS setup system-wide.  The reason is that the WSFTP does a "dir" command
> which translates to "/bin/ls -al."  In this case, "ls" has some stats it
> displays using Japanese multibyte characters, which WSFTP confuses to be
> a file.  Since there is no such file, WSFTP gets an error, and then aborts.)

To bring it on topic, how does lynx fare with it?


> > > I can set DCS to either EUC or SJIS, and Lynx will
> > > [just] work, *IF* my terminal emulator is set to receive/send SJIS.
> > 
> > It really shouldn't, if DCS says EUC but the terminal emulator is
> > set to receive SJIS...  I'd guess that the "terminal emulator" (taking
> > this to include everything 'in front of' lynx, including fonts) isn't
> > set up to receive SJIS then, after all.  (Maybe it can magically detect
> 
> Hopefully someone more knowledgeable will shed better light on this.  My
> _guess_ is that SJIS includes some individual bytes that translate to a
> control character/escape sequence in EUC, but that individual bytes in
> EUC do not contain such "dangerous" characters/sequences.

Byte values 0x80 - 0x9F don't occur at all in EUC encoded printable
characters, so that's why terminal emulators (and maybe comm
programs?) can interpret those bytes as control functions.  Well,
that must be *why* they were left unused in EUC.  EUC is compatible
with ISO 8859 in that respect.
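
A rough way to see the difference is to encode the same text both ways and
look for bytes in the 0x80 - 0x9F range.  The sketch below uses Python's
built-in "euc_jp" and "shift_jis" codecs; the sample string and the helper
name c1_bytes are only illustrative assumptions.  One caveat: EUC-JP does
use 0x8E and 0x8F as single-shift prefixes (for half-width katakana and
JIS X 0212), so the check is only meant for plain JIS X 0208 text like the
sample here.

```python
# Bytes 0x80-0x9F: the C1 range that many terminals treat as control functions.
C1_RANGE = range(0x80, 0xA0)


def c1_bytes(text: str, encoding: str) -> list[int]:
    """Return the bytes of `text`, encoded as `encoding`, that fall in 0x80-0x9F."""
    return [b for b in text.encode(encoding) if b in C1_RANGE]


# Arbitrary JIS X 0208 sample text (an assumption, not taken from the thread).
SAMPLE = "日本語"

if __name__ == "__main__":
    for enc in ("euc_jp", "shift_jis"):
        hits = c1_bytes(SAMPLE, enc)
        print(enc, [hex(b) for b in hits])
```

For this sample the EUC-JP encoding contains no such bytes, while the
Shift_JIS lead bytes all land in that range.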

But that explains why sending SJIS when the terminal expects EUC may
be dangerous, not why sending EUC when the terminal expects SJIS "works"
(i.e., shows the right glyphs).

> > YM, you still have to refresh the screen, while you shouldn't have
> > to if slang was really compiled with KANJI support?
> 
> Yes.  He seems to be getting very good output from Lynx, while I'm still
> seeing glitches.  One difference between our systems is that he never sets
> the KANJI support flag when compiling slang, whereas I always have.

I thought he said he tried both, and got the expected result (works better
*with* SLANG_HAS_KANJI_SUPPORT defined).

> >     LANG=ja_JP.ujis lynx -dump http://www.debian.org > debian.euc.html
> >     LANG=ja_JP.sjis lynx -dump http://www.debian.org > debian.sjis.html
> > 
> >   - Get two copies of a page, in different character encodings.
> > 
> > 
> >     LANG=ja_JP.ujis lynx -dump ~/saved-page.txt \
> >          -assume_local_charset=shift_jis > translated-page.txt
> > 
> >   - Convert a local file (w/ lynx instead of recode, iconv, or similar)
>                                         [Probably nkf or qkc are better.]

My nkf apparently cannot deal with JIS X 0212 characters (which can be
validly encoded in EUC-JP and in ISO-2022-JP-2), while my iconv can.
I guess those characters aren't much used.  (Current lynx doesn't
treat them right, either.  I have made some changes in my copy to
add support.)
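
For comparison, the kind of conversion that recode, iconv, or nkf perform
can be sketched in a few lines of Python: decode from the source charset,
then encode to the target.  This is only a minimal illustration using
Python's built-in codecs, not what lynx or any of those tools actually do
internally; the function name transcode is hypothetical.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode `data` from charset `src` to charset `dst`.

    Raises UnicodeDecodeError or UnicodeEncodeError if a byte sequence or
    character cannot be represented, rather than silently dropping it.
    """
    return data.decode(src).encode(dst)


if __name__ == "__main__":
    sjis = "日本語".encode("shift_jis")   # sample input, an assumption
    euc = transcode(sjis, "shift_jis", "euc_jp")
    print(euc.hex())
```

Failing loudly on unconvertible characters is a deliberate choice here,
since a converter that quietly loses characters is hard to notice.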

I should mention that there already is a way in lynx to specify
the display character set from the command line, but you have to
compile with -DMISC_EXP, and it's undocumented.  Consider this
a plug for (helping test) MISC_EXP.
     lynx -dump -convert_to="text/plain;charset=euc-jp" ...

   Klaus

