bug-gnu-emacs

bug#34862: 27.0.50; Trying to update pinyin.map


From: Eric Abrahamsen
Subject: bug#34862: 27.0.50; Trying to update pinyin.map
Date: Thu, 14 Mar 2019 22:58:14 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

On 03/15/19 07:03 AM, Eli Zaretskii wrote:
>> From: Eric Abrahamsen <eric@ericabrahamsen.net>
>> Date: Thu, 14 Mar 2019 14:49:51 -0700
>> 
>> 
>> As discussed in bug#34215, I'm trying to update the
>> romanization-to-Chinese-character mapping in the
>> file ./leim/MISC-DIC/pinyin.map to use the more complete mapping
>> provided by the Google pinyin input method, licensed under Apache 2.0.
>> This expands the number of characters recognized by Emacs from around
>> 7,000 to around 17,000. (And increases the size of the mapping file from
>> 18K to 53K.)
>> 
>> I'm running into encoding problems when adding the new characters --
>> Emacs says some of the characters can't be written using the existing
>> coding system. The original file has an encoding cookie reading coding:
>> cn-gb-2312, and describing the coding system gives me:
>> 
>> chinese-iso-8bit-dos (alias: cn-gb-2312-dos euc-china-dos euc-cn-dos
>>   cn-gb-dos gb2312-dos)
>> 
>> The characters *can* be encoded using gb18030, and of course utf8. The
>> Wikipedia page for gb18030 describes gb2312 as "legacy"[1], and says
>> gb18030 is a superset of gb2312.
>> 
>> Is there any reason not to go straight to utf8 for this file? If that's
>> not okay, would gb18030 be acceptable?
>
> I'm not sure I understand the encoding of which file would you like to
> change?  Could you please clarify?

Sorry, I should have been clearer: I'm trying to add more characters to
./leim/MISC-DIC/pinyin.map. That file is encoded as chinese-iso-8bit-dos,
and the new characters can't be represented in that encoding. That's the
file whose encoding I'd like to change.
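
To illustrate the problem outside Emacs, here is a minimal sketch in
Python (not part of the Emacs build, and not how pinyin.map is actually
processed). It assumes Python's "gb2312" codec is a fair stand-in for
Emacs's chinese-iso-8bit; the helper name unencodable_chars and the
sample characters are made up for the example. It reports which
characters a given codec cannot represent.

```python
def unencodable_chars(text, codec):
    """Return the characters of `text` that `codec` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # The codec has no code point for this character.
            missing.append(ch)
    return missing


# Hypothetical sample: one common character, one from CJK Extension A,
# and one from CJK Extension B (outside the Basic Multilingual Plane).
sample = "\u4e2d\u3400\U00020000"

for codec in ("gb2312", "gb18030", "utf-8"):
    missing = unencodable_chars(sample, codec)
    print(codec, [f"U+{ord(c):04X}" for c in missing])
```

Running a check like this over the new mapping would show which entries
force the switch away from the old coding system; both gb18030 and utf-8
should report nothing missing.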

Thanks,
Eric




