Re: [Groff] groff.1: enhance documentation for device utf8.


From: Ingo Schwarze
Subject: Re: [Groff] groff.1: enhance documentation for device utf8.
Date: Sat, 2 Aug 2014 18:23:27 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Bernd,

Bernd Warken wrote on Sat, Aug 02, 2014 at 03:25:24PM +0000:

> commit 5be9ea891e4e8a7225ca1da4f423fbe38aeaae4c
> Author: Bernd Warken <address@hidden>
> Date:   Sat Aug 2 17:25:18 2014 +0200
[...]
> diff --git a/src/roff/groff/groff.man b/src/roff/groff/groff.man
> index fdaaaa7..e46bfb9 100644
> --- a/src/roff/groff/groff.man
> +++ b/src/roff/groff/groff.man
> @@ -488,6 +488,9 @@ ISO \%8859-1.
>  .TP
>  utf8
>  Unicode character set in \%UTF-8 encoding.
> +.
> +This mode has the most useful fonts for TTY mode, so it is the best
> +mode for TTY output.

Hum, that doesn't make much sense to me.

ISO-latin vs. UTF-8 is *not* at all a question of which one is
absolutely better.

If your terminal or output device only supports ISO-latin or is
configured for ISO-latin or some other narrow character or non-UTF
locale, shoving UTF-8 down its throat will not make you happy but
result in gibberish.  Actually, there is too much software already
wrongly assuming that "everything can handle UTF-8".

To handle UTF-8 output, your terminal needs to be specifically
configured for UTF-8.  That may not be possible for all terminals
and in all situations, and certainly many users don't do it.


Regarding defaults, switching groff from ISO-latin by default to
UTF-8 by default is certainly something that shouldn't be attempted
in a casual commit without a discussion.  I'm not sure you are
heading in that direction, but your recent groffer commit might
indicate that you are, or it might not; I'm not sure.

If a specific output mode is requested from a program (for example
with options like -Tascii or -Tutf8), that mode must be used.
If no specific mode is requested and the program has to decide
which mode to use as a fallback, inspecting the user's locale(1)
environment variables, in particular LC_CTYPE, is a good way to
proceed, while blindly falling back to UTF-8 is a bad idea.
Actually, strictly speaking, if LC_CTYPE, LC_ALL, and LANG are
unset, software ought to fall back to the C/POSIX locale, which
guarantees even fewer characters than ISO-latin.  In practice, falling
back to ISO-latin is often convenient (for people in Western European
countries and in the English-speaking parts of the world), so some
software has been
doing that in the past.  Strictly speaking, it's not quite the
right thing to do, but it's widespread enough to rarely cause
outrage...
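
To make the decision order above concrete, here is a minimal sketch
in Python.  It is not what groff or groffer actually does; the
function names and the mapping from locale codesets to the device
names ascii, latin1, and utf8 are my own assumptions for illustration.
It follows the usual POSIX precedence (LC_ALL over LC_CTYPE over LANG)
and falls back to the C locale, and hence to plain ASCII output, when
none of them is set.

```python
import os


def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    An unset or empty variable is skipped; if all three are skipped,
    the C locale applies.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def choose_device(requested=None, env=None):
    """Pick an output device name (hypothetical helper, not a groff API).

    An explicitly requested device always wins.  Otherwise the codeset
    part of the effective locale name decides, and anything that is
    not recognized falls back to "ascii" rather than to UTF-8.
    """
    if requested:
        return requested
    if env is None:
        env = os.environ
    name = effective_ctype(env)
    # Drop a modifier such as "@euro", then take the part after the dot.
    name = name.split("@", 1)[0]
    codeset = name.partition(".")[2].upper().replace("-", "").replace("_", "")
    if codeset == "UTF8":
        return "utf8"
    if codeset == "ISO88591":
        return "latin1"
    # C, POSIX, names without a codeset, and unknown codesets end up here.
    return "ascii"
```

Calling choose_device() without arguments inspects the real
environment; passing a dictionary as env makes it easy to try out
other settings.  Whether a locale name without an explicit codeset
(for example "de_DE") should map to latin1 instead of ascii is
exactly the convenience-versus-correctness trade-off described above;
the sketch takes the strict route.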

Yours,
  Ingo


