bug#32599: 25.2; Feature request: input PUA characters by name

From: Eli Zaretskii
Subject: bug#32599: 25.2; Feature request: input PUA characters by name
Date: Sun, 26 May 2019 21:52:35 +0300

> From: address@hidden (Janusz S. Bień)
> Cc: address@hidden
> Date: Sun, 26 May 2019 19:33:20 +0200
> > The problems I alluded to start from the fact that we exclude the PUA
> > codepoints from the character property database.
> I understand this but I expected, perhaps incorrectly, that the code for
> it is rather simple.

Simple: yes.  But it is also tedious, as there are many characters.
See characters.el.

With PUA codepoints there's one more complication: their attributes
are not standardized, so there should be a way of defining them
dynamically, rather than statically, as characters.el does for all the
other characters.
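
To illustrate the point about non-standardized attributes, here is a
minimal sketch in Python (it is not Emacs code, and not a proposal for
how Emacs should do it).  It shows that the standard Unicode database
gives a PUA codepoint only the generic category "Co" and no name, so
anything more specific has to come from data supplied at run time.
The table format, the sample entries, and the helpers load_pua_table,
char_by_name and category are all hypothetical, invented for this
example; only the unicodedata calls and the PUA ranges are standard.

```python
import unicodedata

# The three Private Use Areas defined by the Unicode Standard.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))

# Hypothetical external data, one entry per line:
#   codepoint;name;general category;uppercase mapping (may be empty)
SAMPLE_DATA = """\
# made-up entries, for illustration only
E000;MY SMALL LETTER EXAMPLE;Ll;E001
E001;MY CAPITAL LETTER EXAMPLE;Lu;
"""


def is_pua(codepoint):
    """Return True if CODEPOINT lies in one of the Private Use Areas."""
    return any(lo <= codepoint <= hi for lo, hi in PUA_RANGES)


def load_pua_table(text):
    """Parse the hypothetical format above into {codepoint: properties}."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cp_hex, name, cat, upper_hex = line.split(";")
        codepoint = int(cp_hex, 16)
        if not is_pua(codepoint):
            raise ValueError(f"U+{codepoint:04X} is not a PUA codepoint")
        table[codepoint] = {
            "name": name,
            "category": cat,
            "upper": int(upper_hex, 16) if upper_hex else None,
        }
    return table


def char_by_name(table, name):
    """Look up a PUA character by its locally defined name."""
    for codepoint, props in table.items():
        if props["name"] == name:
            return chr(codepoint)
    raise KeyError(name)


def category(table, ch):
    """Prefer the dynamic table; fall back to the static Unicode data."""
    props = table.get(ord(ch))
    return props["category"] if props else unicodedata.category(ch)


if __name__ == "__main__":
    table = load_pua_table(SAMPLE_DATA)
    ch = char_by_name(table, "MY SMALL LETTER EXAMPLE")
    # Static database vs. the locally defined property.
    print(unicodedata.category(ch), category(table, ch))
```

The design choice worth noting is the fallback in category: the
dynamic table is consulted first, and the static data is used only
for codepoints the table does not cover.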

> > This means you cannot change their case and access
> > their syntax category, for example.  Functions that select a suitable
> > font also ignore PUA codepoints, IIRC.
> So the problem is that the relevant code occurs in several places.

No.  Case-fiddling depends on data set up by characters.el, so once
that data is set, all the rest should "just work".  And similarly for
other attributes.

> > Etc. etc. -- Someone™ should
> > go over all the places where we specify character properties and use
> > them, and make sure PUA codepoints aren't disregarded.
> I understand one has to look for the occurrences of a function or
> constant. Definitely there are tools to make this easy/easier (many
> years ago I was experimenting with tags file), now I would expect them
> to be quite sophisticated.

I'm not sure such tools will help here.  I think a thorough audit of
the related data and code is needed.

> How would you approach this problem?

I have no idea, sorry.  I mentioned some of the issues in the hope
that this will help interested individuals find the relevant places
more easily.  How to modify the code and data in order to allow use of
PUA codepoints, and on top of that to have the properties of each PUA
codepoint determined from external data which isn't known in advance,
is part of the design problem that whoever wants to work on this
will have to solve.  Personally, I'm surprised people use PUA for
these purposes, and even more surprised they expect Emacs to support
this.  But that's me.

