Re: LYNX-DEV Chartrans patches impressions..
Tue, 4 Mar 1997 20:25:54 -0600 (CST)
On Tue, 4 Mar 1997, Hynek Med wrote:
> On Sun, 2 Mar 1997, Klaus Weide wrote:
> > The following should currently work (using -assume_charset and '@' in
> > combination): Your display (C)haracter set is set to "ISO Latin 2".
> > Start lynx with lynx -assume_charset=windows-1250 ..., then when you use
> > '@' it should toggle between "assume unlabelled is iso-8859-1" and
> > "assume unlabelled is windows-1250".
> It works fine with -assume_charset=windows-1250.
> I tried -assume_charset=ISO-8859-2, which didn't work, ISO-8859-1 was
> assumed, so I wrote that -assume_charset doesn't work.. After some
> experiments I found out that the -assume_charset flag works only when the
> Display and Assumed charsets differ. If they don't, ISO-8859-1 is assumed.
> Was this meant to be so?
[ See below ]
> > I don't think putting them in lynx.cfg for system administrators would be
> > a good idea. The system administrator should have no business
> > setting charset defaults (and thereby, indirectly, language defaults)
> > for what his/her users browse on *remote* systems.
> I don't agree with you here. If the sysadmins don't set anything
> reasonable there, the users won't be able to read the documents with the
> right accents (because the documents aren't marked etc.). They aren't able
> to read cyrillic/chinese/hebrew/whatever right now anyway (we don't have
> the fonts etc), so it wouldn't limit them any more than they are limited
> now - it would just help them to see the documents in Eastern European
> (read: local) encodings..
I am thinking about different situations than you apparently are.
For example: a sysadmin provides dialup access to Lynx, let's say in the
US, and doesn't know anything about charsets (because he/she usually
doesn't need to); some of his/her clients like to read Web pages in a
foreign (maybe their own) language, and they have the necessary fonts (if
such are needed for that language).
> > [ self-editing snips ]
> > What I haven't documented anywhere is the following:
> > o IF a document's charset is unlabelled, and the charset to assume
> > for unlabelled documents (via -assume_.. flag) is already the
> > same as the selected display (C)haracter set (so that toggling "Raw"
> > as described above wouldn't make any difference),
> > THEN toggling "Raw" means switch between
> > - assume unlabelled docs are what -assume_... and display (C).s. says
> > - assume unlabelled docs are the default of defaults, i.e. iso-8859-1.
> > The result is that
> > - if you have -assume_charset=iso-8859-2 AND display (C).s. = ISO Latin 2,
> > you should also have -raw for unlabelled iso-8859-2 docs (or use '@').
> > - the behaviour w.r.t. "Raw" on/off is then the same as it was without
> > chartrans code.
> > Whether this is a good idea is up for discussion...
> Well.. I don't think so. I use the -assume_charset flag to override the
> assumption that the document is in ISO-8859-1, because of the many
> unmarked documents that are in ISO-8859-2 or Windows-1250. Why shouldn't
> this work when the assumed charset is the same as my display Charset?
But it does work - as long as you have also 'raw' enabled...
I agree that that is not very intuitive.
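To make concrete why the assumption for unlabelled documents matters, here is a small sketch that decodes the same bytes under the three charsets discussed in this thread. The sample word and the variable names are purely illustrative; the codec names are the ones Python's standard library uses for ISO-8859-1, ISO-8859-2 and Windows-1250.

```python
# Illustrative only: one unlabelled byte string, three possible assumptions.
word = "čeština"                      # sample Czech word (hypothetical test data)
raw_bytes = word.encode("iso8859_2")  # what an unlabelled ISO-8859-2 page sends

as_latin1 = raw_bytes.decode("iso8859_1")  # the default assumption
as_cp1250 = raw_bytes.decode("cp1250")     # wrong Eastern European guess
as_latin2 = raw_bytes.decode("iso8859_2")  # the correct assumption

print(as_latin1)  # èe¹tina
print(as_cp1250)  # čeątina
print(as_latin2)  # čeština
```

Only the last decoding gives readable text, which is the situation -assume_charset is meant to address.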
OTOH, we have that precious '@' key, which we are already conditioned to
use to "get the character set right". Should it do nothing in this case?
Maybe I should reverse the sense of "raw" here, only that would be even
less intuitive (then "raw" enabled would mean "DO translate"). Probably not
good, although for ISO Latin 1 and CJK display Character sets, `-raw'
is _already_ used to turn raw mode OFF rather than ON (see the
comments in lynx.cfg).
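The two toggling rules quoted above can be written down as a small sketch. This is not the Lynx source: the function name and arguments are hypothetical, both charsets are given as MIME-style names for simplicity, and an explicit -assume_charset value is required. The case where the assumed and display charsets coincide follows the description in this message; for the case where they differ, the message only says that '@' toggles between the two assumptions, so which Raw state selects which one is an assumption here. The sketch does not try to reproduce the test table further down.

```python
DEFAULT_CHARSET = "iso-8859-1"  # the "default of defaults" named in the message


def assumed_for_unlabelled(assume_charset, display_charset, raw):
    """Return the charset assumed for an unlabelled document (sketch only).

    assume_charset  -- value given via -assume_charset (MIME-style name)
    display_charset -- display character set, as a MIME-style name
    raw             -- current state of the Raw toggle ('@' or -raw)
    """
    assume = assume_charset.lower()
    display = display_charset.lower()
    if assume == display:
        # Described in the message: Raw on keeps the shared charset,
        # Raw off falls back to the default of defaults.
        return assume if raw else DEFAULT_CHARSET
    # ASSUMPTION: with differing charsets, Raw off honours -assume_charset
    # and Raw on falls back to the default. The message only states that
    # '@' toggles between these two assumptions.
    return DEFAULT_CHARSET if raw else assume


print(assumed_for_unlabelled("iso-8859-2", "iso-8859-2", raw=False))
print(assumed_for_unlabelled("iso-8859-2", "iso-8859-2", raw=True))
```

Under these assumptions the first call yields iso-8859-1 and the second iso-8859-2, which is the "you should also have -raw" behaviour described above.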
> > Yes, translation of all those strings where it is needed is not in place.
> > But some testing reveals:
> >   flags                                      TITLE OK   '=',history,'V' etc. OK
> >   (nothing)                                  NO         NO
> >   -raw                                       YES        YES
> >   -assume_charset=iso-8859-2                 NO         NO
> >   -assume_charset=iso-8859-2 -raw            YES        YES
> >   -assume_charset=windows-1250               YES        NO
> >   -assume_charset=windows-1250 -raw          YES        YES
> >   -assume_local_charset=windows-1250         YES        NO
> >   -assume_local_charset=windows-1250 -raw    YES        YES
> >   -assume_local_charset=iso-8859-2           NO         YES
> >   -assume_local_charset=iso-8859-2 -raw      YES        YES
> > Confused now? Well I am..
> Who wouldn't be. :-)
> > The -assume_local_charset comes into play because lynx creates
> > temporary files for '=', history list, 'V' etc. screens and then reads
> > them in.
> OK, it works fine with -assume_local_charset.
Note that it also seems to work fine (according to the table above) in all
cases as long as `-raw' is among the options.
Of course whether titles are displayed correctly for remote documents (in
history lists etc.) should not depend on -assume_local_charset at all, so
that is something to fix.
> BTW, why not to set -assume_local_charset to the one we got by
That is how I had it at first, I think. But distinguishing the two seems
very useful (and obvious) to me. Typically most text in local files on a
machine will be in one specific charset [[unwarranted assumption?]],
depending on installation / locale. That choice is not logically
connected to what assumption to make about remote documents that
Webmasters haven't bothered to label correctly.
The `-assume_local_charset' is kind of like labelling files in the local
filesystem, which is otherwise not possible in general (no HTTP headers;
HTML files are an exception because they allow a META tag).
Of course if your local files use iso-8859-2 character encoding, your Lynx
display Character set is set to ISO Latin 2, and you only browse sites
that use the iso-8859-2 charset, then everything coincides for you, and
you may wonder why there is all this mess of different options.
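The intended separation between the two options could be sketched roughly as follows. This is an illustration, not the Lynx implementation: the function and parameter names are hypothetical, an explicit label (HTTP header or META tag) is assumed to take precedence over either option, and falling back to iso-8859-1 when an option is not given is also an assumption.

```python
DEFAULT_CHARSET = "iso-8859-1"


def pick_charset(label, is_local, assume_charset=None, assume_local_charset=None):
    """Choose the charset for a document (sketch under stated assumptions).

    label    -- charset from an HTTP header or META tag, or None if unlabelled
    is_local -- True for files in the local filesystem
    """
    if label:
        # ASSUMPTION: an explicit label wins over any -assume_* option.
        return label.lower()
    # Unlabelled: local files and remote documents use separate assumptions.
    fallback = assume_local_charset if is_local else assume_charset
    # ASSUMPTION: iso-8859-1 is used when the relevant option is absent.
    return (fallback or DEFAULT_CHARSET).lower()


# Local files in iso-8859-2, unlabelled remote documents assumed windows-1250.
opts = dict(assume_charset="windows-1250", assume_local_charset="iso-8859-2")
print(pick_charset(None, is_local=True, **opts))         # iso-8859-2
print(pick_charset(None, is_local=False, **opts))        # windows-1250
print(pick_charset("KOI8-R", is_local=False, **opts))    # koi8-r
```

Keeping the two fallbacks separate means that a change to the guess for remote documents does not affect how local files are read.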
> Do we even need a local charset in this stage, when
> without correct local_charset we don't have right "global" documents?
Well it's a bug (or an incomplete implementation..), and I need to fix it.