RE: single-key-description no good for Japanese and Chinese chars

From: Drew Adams
Subject: RE: single-key-description no good for Japanese and Chinese chars
Date: Fri, 22 Sep 2006 07:45:25 -0700

    > Question: are 20864 and 20992 the _same_ group of characters
    > (same generic character)?

    No.  split-char tells you the difference.

That's what I thought.

    > If a "key" here does correspond to multiple characters, then
    > I understand that its description won't represent a single
    > character (though I'm not sure what that does for
    > `read-kbd-macro' - does it necessarily break that
    > functionality?).

    As those are not valid character events, they should not be
    given to read-kbd-macro.

I guessed that. That's what I meant by "break that functionality" - in this
case, the inverse relation is lost. Too bad, but I guess there might be no

    > Still, each such _group_ of characters (each generic char)
    > should have its own unique name, it seems to me.

    I agree that a unique name is more informative, but I don't
    understand how it works in actual code.

I'm not sure what you mean. Do you mean that you don't know how to provide a
unique name in the code, or are you asking how my code uses this?

    > When I was debugging this a bit, I think I saw a unique event
    > number for each key in the keymap (I assumed naively that
    > separate such events/keys represented separate, individual
    > characters).

    I don't understand this part.  How did you see such a
    number?  Contents of a keymap is usually a char-table, and
    for efficiency, if all characters in a group are mapped to
    the same thing, there's only a single table entry for them.

IIRC, I just debugged (debug-on-entry) map-keymap (or perhaps a function of
mine that called it), and I stepped through it in detail, to see what was

    > I cited two such event numbers: 20864 and 20992. I don't think
    > I ever saw the same number used more than once. That, to me,
    > shows a unique identifier for each such key in the keymap -
    > whether that key represents a single character or a group of
    > characters (a generic char). So that makes me think that each
    > such key could have a unique name, couldn't it?

    > IOW, the items in the keymap are "keys", and it is they that
    > need unique identifiers. Are you also saying that it is
    > impossible for each such key to have a unique identifier?
    > Regardless of the mapping between keys and characters (or
    > groups of chars), shouldn't we be able to identify (describe)
    > each key uniquely?

    Each valid key should be identified uniquely by name.  But,
    as generic characters are not valid keys, how is it useful
    for them to be unique by name?

For one thing, it can help identify them to avoid using them for things like
`read-kbd-macro'. If any name is useful at all, a unique name is more
useful. Again, we're talking keys here, not characters - I understand now
that these keys (valid or not - I'm not sure what that means) each represent
multiple characters.

    > If we did that, and if some such keys corresponded to groups
    > of characters, then perhaps the `read-kbd-macro' inverse
    > functionality would be broken for such keys. That would be
    > regrettable, but if there is no alternative, so be it.

    > However, wouldn't the binding of such keys to
    > `self-insert-command' also produce broken behavior? That is,
    > if such a key corresponds to a group of characters, and it is
    > bound to `self-insert-command', how does that work? How does
    > `self-insert-command' know which character of the group to
    > insert? I didn't see anything in the manual about this
    > relation.

    As I wrote before, it doesn't work, it signals an error.

My question is this: Why do these keys have as their binding
`self-insert-command'? That is what screws up my code, for instance. I hear
you saying that they are not valid keys, and you can't use them with
`self-insert-command', and you can't use them (their descriptions) with
`read-kbd-macro' - so why bind them to `self-insert-command'?

Could we perhaps bind them to something else, e.g. `invalid-key-command' or
`multi-char-key' (or whatever), and still make them work somehow as they do
now behind the scenes? That would be preferable for me.

My code treats all keys bound to `self-insert-command', and, for now, I need
to match "Character set " against the `single-key-description', and then
remove them from consideration. If they were bound to some other command,
then I wouldn't have that problem, and no one would have the problems of
using keys with `self-insert-command' and their descriptions with
`read-kbd-macro' - these "invalid" keys would simply be out of consideration
because they would not be bound to `self-insert-command'.

    I agree that it is confusing that (lookup-key global-map
    [20864]) returns self-insert-command.  It may be better that
    it returns nil or signals an error.

That would be better for me, at least. But it's not just `lookup-key' -
things like `map-keymap' also (unless they all call `lookup-key').

    >     How did you get the key 20864?

    > By running through the debugger as it executed `map-keymap'.

    Ah, I see.  The docstring of map-keymap should be improved,
    perhaps something like this.

    Call FUNCTION once for each event binding in KEYMAP.
    FUNCTION is called with two arguments: the event that is bound, and
    the definition it is bound to.  If the event is an integer,
    it may be a generic character, and that means that all
    actual character events belonging to that generic character
    are bound to the definition.

Maybe for the doc (manual), but I don't think it helps in the doc string.
There is no explanation of what a generic character is, and people will need
lots more info to be able to understand this, IMO.

    The reason why map-keymap doesn't call function for all
    actual characters is that such code would be far too inefficient
    (the function would then be called more than 100000 times).

    By the way, I've just fixed bindings.el to make the
    char-table in the keymap tighter.

Cool, and thanks for the explanation.

Now, this is important: I said something VERY STUPID in my last email -
please don't do what I asked. I suggested that descriptions such as
"Character set JISX0208.1978 (Japanese)..." could be shortened to something
like japanese-jisx0208-1978-20864, perhaps, or just jp-jisx0208-1978-20864.
PLEASE DON'T DO THAT. My code counts on being able to recognize these
invalid keys and so ignore them. Currently, I match "^Character set "
against them. If you change the names at all, please keep something (some
prefix or suffix string) that lets people identify them easily by their key

I'm a bit disappointed that I won't be able to use these keys with my code,
BTW. I was hoping that this would be a simple bug, and these keys (which I
thought corresponded to individual chars) could just be given unique names.

That would have enabled my code to provide an alternative, poor-man's input
method for all languages (as a side effect). My code lets you complete
against key description-plus-command combinations, and for keys bound to
`self-insert-command', this lets you input the characters by picking them
through completion (mouse-2, or cycle through them and hit RET), whether or
not your keyboard can create them. You can either type part or all of the
key or command name (substring/regexp matching), or you can use `M-q' to
"quote" keys you hit on the keyboard, to get their descriptions in the
minibuffer to match the completion candidates. (The completion candidates
look like this: "w  =  self-insert-command".) This works fine for all but
the "invalid" keys that each correspond to multiple chars - those whose
descriptions begin "Character set ...".
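
To make the candidate format concrete, here is a minimal sketch of how such
strings could be built from a keymap. It is an illustration only, not my
actual code: the `my-' names are hypothetical, and it simply pairs each
candidate string with its event so a completion command could look up the
event afterward.

```elisp
;; Sketch only: the `my-' names are hypothetical.
(defun my-key-completion-candidate (event binding)
  "Return a candidate string for EVENT and its command BINDING.
For example, \"w  =  self-insert-command\"."
  (format "%s  =  %s" (single-key-description event) binding))

(defun my-key-completion-alist (keymap)
  "Return an alist of (CANDIDATE . EVENT) for command bindings in KEYMAP."
  (let ((alist '()))
    (map-keymap
     (lambda (event binding)
       ;; Keep only bindings that are command symbols; prefix keymaps
       ;; and other kinds of definition are ignored in this sketch.
       (when (and (symbolp binding) (commandp binding))
         (push (cons (my-key-completion-candidate event binding) event)
               alist)))
     keymap)
    (nreverse alist)))
```

An alist like this can be passed as the collection to `completing-read', and
the chosen candidate looked up with `assoc' to recover the event.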

I don't know if something along these lines might be adapted to use with
such languages - perhaps do what I do to first get a "generic char", and
then somehow complete against the chars in that character group? Any ideas
on that or any interest in it? My approach couldn't replace input methods,
obviously, but it might be handy when you don't want to spend the time to
learn an input method for a few characters, and it could perhaps be helpful
to people who are handicapped (e.g. find it difficult to deal with whole
keyboards, but can use a mouse). You can see all the possible matching
characters, and just click the ones you want. Pick first a character group
(whatever that corresponds to), and then pick a character within the group -
all via completion. Anyway, if you do happen to be interested:

Thanks - Drew
