[Groff] Re: preconv supported encodings

From: Werner LEMBERG
Subject: [Groff] Re: preconv supported encodings
Date: Wed, 04 Jan 2006 18:07:14 +0100 (CET)

> When I look at the emacs_to_mime conversion table, it already looks
> like it contains too many entries.  Nobody in his right mind will ever
> write a manpage in CP851 or MAC-ROMAN encoding.


> Think about the long-term cost of supporting an encoding.  Now is the
> moment when we have complete freedom to decide about the supported
> encodings.  Later, we can no longer restrict the set of supported
> encodings, due to backward compatibility requirements.

I fully agree.

> If you choose a large set like now, you will not have many requests
> for adding a new encoding.  But maintenance will always have to
> support all of them.  You see already how much it costs to support
> CP1047.

Well, cp1047 is a special case since EBCDIC support goes deeper.

> If I were you, I would start with the following set; comment out the
> other entries of emacs_to_mime; and comment them in on demand only.
>   ISO-8859-1       (for English, Spanish, Norwegian etc.)
>   ISO-8859-2       (for Hungarian etc.)
>   ISO-8859-5       (for Serbian etc.)
>   ISO-8859-7       (for Greek)
>   ISO-8859-9       (for Turkish)
>   ISO-8859-13      (for Latvian etc.)
>   ISO-8859-15      (for French, German, etc.)
>   KOI8-R           (for Russian)
>   EUC-JP           (for Japanese)
>   GB18030          (for simplified Chinese)
>   UTF-8            (for all others)

Well, Big5 should probably be used too, together with EUC-KR.  I'll
also retain the code for UTF-16 and UTF-32.
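
To illustrate the "comment in on demand" idea, here is a rough sketch
of such a restricted table.  It is not the actual emacs_to_mime table
from preconv: the Emacs-side names on the left, the Python dict, and
the lookup helper are all assumptions made for the example.  Only the
MIME names on the right are taken from the list above.

```python
# Hypothetical sketch of a restricted Emacs-name -> MIME-name table.
# Not the real preconv table; the keys are illustrative assumptions.
EMACS_TO_MIME = {
    "latin-1": "ISO-8859-1",
    "latin-2": "ISO-8859-2",
    "cyrillic-iso-8bit": "ISO-8859-5",
    "greek-iso-8bit": "ISO-8859-7",
    "latin-5": "ISO-8859-9",
    "latin-7": "ISO-8859-13",
    "latin-9": "ISO-8859-15",
    "koi8-r": "KOI8-R",
    "euc-jp": "EUC-JP",
    "euc-kr": "EUC-KR",
    "big5": "Big5",
    "gb18030": "GB18030",
    "utf-8": "UTF-8",
    "utf-16": "UTF-16",
    "utf-32": "UTF-32",
    # Disabled until somebody asks for them (hypothetical key names):
    # "cp851": "CP851",
    # "mac-roman": "MAC-ROMAN",
}


def emacs_to_mime(name):
    """Return the MIME name for an Emacs coding tag, or None if unsupported."""
    return EMACS_TO_MIME.get(name.strip().lower())
```

With a table like this, an unsupported tag simply yields no match, and
the caller has to fall back to some default encoding.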

> This list contains no CPxxx encodings, in particular no WINDOWS-xxxx
> encodings.  Microsoft continues to extend these encodings over and
> over again, with the result that, say, a text written today in CP950
> on a Windows-XP machine is not readable as CP950 on an earlier
> version of the same OS.  For this reason, the use of these encodings
> for manpages would be suboptimal.



