Re: [PATCH] Add an option to not reduce vocabulary of the Japanese

From: Eli Zaretskii
Subject: Re: [PATCH] Add an option to not reduce vocabulary of the Japanese
Date: Fri, 03 Jun 2022 09:12:34 +0300

> From: Taiju HIGASHI <higashi@taiju.info>
> CC: higashi@taiju.info
> Date: Fri, 03 Jun 2022 12:16:06 +0900
> The Japanese dictionary bundled with Emacs has a small vocabulary.
> For example, to convert "なごや" to "名古屋" (Nagoya) in Kanji, I have
> to enter "なご" and convert it to "名古", then enter "や" and convert it
> to "屋", because the Japanese dictionary bundled with Emacs does not
> have "名古屋".
> The skkdic-convert function in the ja-dic-cnv package generates the
> Japanese dictionary, but the logic includes the dictionary vocabulary
> reduction process.
> So I have created a patch to add an option to skip this reduction
> process. I would be happy to receive your review and feedback.

Thank you for working on this, and for your interest in Emacs.

We don't have a lot of people on board who speak Japanese, so I CC
Kenichi Handa in the hope that he could have some comments on your
patch.
Meanwhile, would you like to start the legal paperwork of assigning to
the FSF the copyright for your changes?  Your changes are small, but
they are still borderline larger than we can accept without the
copyright assignment.  If you agree, I will send you the form to fill
and instructions to go with the form.

> * configure.ac: Add "with-ja-dic-reduction" configure argument.

In addition to a configure-time option, I think it would be a good
idea to have a special Makefile rule to regenerate the Japanese
dictionary while skipping or not skipping the vocabulary reduction.
Is such an option available with your changes?  I think it is, but I'm
not certain.  So if needed, could you please add such an option to

> +  Does Emacs reduce the Japanese dictionary?              ${with_ja_dic_reduction}

I guess this wording is better:

 Should Emacs reduce Japanese dictionary vocabulary?

> By the way, if I may be honest, I would like to remove this reduction
> process.
> "名古屋" (Nagoya) [0] is the name of one of Japan's major cities and is a
> proper noun.
> I don't think most people, myself included, recognize that the word is a
> composite of "名古" and "屋".
> I am Japanese, so my sense may be different, but I recognize "New York"
> as one word and "Spider-man" as one word.
> In other words, instead of converting "名古" and "屋" respectively, we
> want to convert "名古屋" as it is. It is stressful to have to separate
> the words I imagine in my head from the words I use in Kanji
> conversion. I would like to reduce that frequency at least a little.
> Although the skkdic-reduced-candidates function mechanically eliminates
> words that can be entered by combining them with other words, it does
> not judge the importance of words, so even frequently used words like
> "名古屋" are eliminated. That is very inconvenient.
> My concern is that Emacs' standard Kanji conversion engine will be
> regarded as useless.
> Despite being based on a dictionary with a sufficient vocabulary
> (SKK-JISYO.L), it generates an inconvenient dictionary by the reduction
> process.
> Most of the people who rated Emacs' standard kanji conversion engine as
> useless are probably unaware of this fact.
> I also rated the standard Emacs kanji conversion engine as
> useless, because I did not know that fact.
> However, when I learned the facts, I realized that this was a
> misunderstanding and that I had disrespectful feelings toward Emacs.
> This is simply a bad reputation caused by a misunderstanding.
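
For readers who do not use Japanese input, the reduction described in the
quoted text can be illustrated with a minimal Python sketch. This is not
the Emacs Lisp code from ja-dic-cnv, and the real
skkdic-reduced-candidates function may apply additional rules that are not
modeled here. The names reduce_dictionary and composable and the toy
dictionary are assumptions made up for this illustration. The sketch drops
an entry when its reading can be split into shorter readings whose
conversions concatenate to the same word.

```python
def reduce_dictionary(dictionary):
    """Drop entries that can be typed as a sequence of shorter conversions.

    `dictionary` maps a kana reading to a set of candidate words.
    Hypothetical helper for illustration only, not the Emacs implementation.
    """

    def composable(reading, word, allow_whole):
        # The remainder of a split may be converted in one step.
        if allow_whole and word in dictionary.get(reading, ()):
            return True
        # Try every split point in both the reading and the word.
        for i in range(1, len(reading)):
            head_candidates = dictionary.get(reading[:i], ())
            for j in range(1, len(word)):
                if word[:j] in head_candidates and composable(
                    reading[i:], word[j:], True
                ):
                    return True
        return False

    reduced = {}
    for reading, words in dictionary.items():
        # Keep only words that cannot be built from shorter entries.
        kept = {w for w in words if not composable(reading, w, False)}
        if kept:
            reduced[reading] = kept
    return reduced


# Toy data (assumed for the example, not taken from SKK-JISYO.L).
toy = {
    "なごや": {"名古屋"},
    "なご": {"名古"},
    "や": {"屋", "矢"},
    "とうきょう": {"東京"},
}
reduced = reduce_dictionary(toy)
print("なごや" in reduced)  # False: "名古屋" is treated as "名古" + "屋"
```

The sketch has no notion of word frequency or importance, which is the
point made in the quoted text: a common proper noun is removed as readily
as an obscure compound.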

This is something which would need an expert to respond to.  I admit
that I don't even understand the issues you are describing, as I don't
read Kanji and don't speak Japanese.  I hope Handa-san will comment on
this.
> The reduction of dictionaries would reduce the file size by less than
> half. While significant, how important is this in today's computing
> environment?

It isn't too important, IMO.  The reduction in Emacs's memory
footprint, if that is significant, is probably more important.

> My English is not very good, so I apologize if I did not convey my
> intentions.

There's absolutely nothing wrong with your English, so no need to
apologize.