help-gnu-emacs

Re: Display of decomposed characters


From: Eli Zaretskii
Subject: Re: Display of decomposed characters
Date: Sun, 28 Feb 2021 20:42:39 +0200

> From: Philipp <p.stephani2@gmail.com>
> Date: Sun, 28 Feb 2021 19:10:57 +0100
> Cc: help-gnu-emacs@gnu.org
> 
> >> The font will always support the composite variant (because it's part
> >> of Latin-1).
> > 
> > That is only relevant if Emacs decides to compose the characters.
> > Then, and only then, will it ask the text-shaping engine to produce
> > glyphs for the base character and the accent together, and then the
> > font could provide a single precomposed glyph for them.
> 
> So in this case the decision to not compose the characters is incorrect or 
> happens too early?

That's one way of looking at the issue.  But it will lead you to the
conclusion that Emacs should send all the text it displays through the
shaping engine, which with the current design of how this stuff works
in Emacs will be much slower than what we have.  IOW, doing something
like that requires redesign of how we display text.

> >> I guess fonts assume that applications will first try to normalize
> >> strings to avoid issues like this?
> > 
> > Normalizing strings before you know whether the font has the
> > precomposed glyphs makes no sense.
> 
> Why? If the font doesn’t support a precomposed character, wouldn’t
> the rendering engine automatically fall back to a decomposed
> representation?

No.  How can it?

The fallback is in the composition code, not in the renderer.  The
latter just lays out the glyphs that it gets from the composition
code.  (Assuming that when you say "rendering engine" you mean the
part in the Emacs display code which handles layout.)

IOW, there's no "font doesn't support" in Emacs.  It works like this:

  . we check whether the current character should compose with the
    following and/or preceding ones
    . if it should compose, then:
      . pass the chunk of text that should compose to the shaping
        engine (e.g., HarfBuzz)
      . if the shaping engine succeeds, render the glyphs it returns
    . otherwise render the original character "normally", i.e. without
      consulting the shaping engine

(The above omits some secondary details in the interests of clarity.)
The "otherwise" part is the fallback you alluded to.  As you see, we
never ask the font, we only talk to the shaping engine.
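
To make the flow above concrete, here is a minimal sketch in Python.  It
is not the Emacs implementation: `should_compose`, `display`, and
`toy_shaper` are hypothetical names, the "compose when followed by a
combining mark" rule is a simplifying assumption, and the shaper is a
stand-in for a real engine such as HarfBuzz.

```python
import unicodedata

def should_compose(text, i):
    # Assumed rule for this sketch: compose when the next character
    # is a combining mark.  The real rules are more involved.
    return i + 1 < len(text) and unicodedata.combining(text[i + 1]) != 0

def display(text, shape):
    """Return a list of 'glyphs' following the flow described above.

    `shape` stands in for the shaping engine: it takes a chunk of text
    and returns a list of glyphs, or None if it fails.
    """
    out = []
    i = 0
    while i < len(text):
        if should_compose(text, i):
            # Collect the base character plus all following combining marks.
            j = i + 1
            while j < len(text) and unicodedata.combining(text[j]) != 0:
                j += 1
            glyphs = shape(text[i:j])
            if glyphs is not None:
                out.extend(glyphs)
                i = j
                continue
        # Fallback: render the character "normally", without the shaper.
        out.append("plain:" + text[i])
        i += 1
    return out

def toy_shaper(chunk):
    # Hypothetical shaper that only knows how to shape e + COMBINING ACUTE.
    return ["shaped:" + chunk] if chunk == "e\u0301" else None
```

Note that the font is never queried in this sketch either: the only
question asked is whether the shaper succeeded on the chunk it was given.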

> IOW, would normalizing strings to NFC before sending them to the rendering 
> engine ever break anything?

Yes, it might.  Shaping engines don't usually decompose characters if
they get codepoints of precomposed ones.

Moreover, some precomposed glyphs don't even have codepoints, so you
cannot even ask the shaper to produce them by passing it a precomposed
character in that case -- such a character doesn't exist.
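
A small Python illustration of that last point, using the standard
`unicodedata` module (the helper name `nfc_codepoints` is made up for
this sketch).  NFC can only produce a precomposed character where
Unicode defines one.  Whether a given font has a single glyph for the
second sequence is a separate matter that this snippet does not test.

```python
import unicodedata

def nfc_codepoints(s):
    """Return the codepoints of S after NFC normalization."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)]

# e + COMBINING ACUTE ACCENT has a precomposed form, U+00E9.
print(nfc_codepoints("e\u0301"))   # ['U+00E9']

# q + COMBINING TILDE has no precomposed codepoint, so NFC leaves
# the sequence as two characters.
print(nfc_codepoints("q\u0303"))   # ['U+0071', 'U+0303']
```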

> > What the text-shaping folks tell us is that we should pass _all_ the
> > text through the text shaper, then the shaper will DTRT in every
> > case.  But this would mean a thorough redesign and reimplementation of
> > how we do that in Emacs, and that is not easy if we want to keep the
> > current flexibility and customizability (which is why the character
> > composition code calls out to Lisp, and that makes sending all the
> > text that way too expensive to be practical).
> 
> Would it be possible to implement a more minimal change to fix the problem at 
> hand?

Like what?  (And why are we discussing such an issue on the help
list?)

> >> Does it ever make sense to pick different fonts for a base character
> >> and its combining characters?
> > 
> > If the default font doesn't support the combining accent, what else
> > can you do?  Most fonts don't have precomposed glyphs for every
> > arbitrary sequence of base character followed by several combining
> > accents.  So sometimes you will have to compose the accents "by hand",
> > and that is not really possible if they come from different fonts.
> 
> Which is why they shouldn’t come from different fonts. What if Emacs ignored 
> font lookup for combining characters and always picked the font of the 
> previous base character?

What would that produce if the font of the previous character didn't
have a glyph for the accent?  The accent will disappear, or maybe will
be displayed as "tofu", right?  Does that sound like a good strategy?

> >> Wouldn't that fundamentally prevent using combining characters? IIUC
> >> text rendering engines should be able to pick the right glyph if
> >> that didn't happen (assuming they can perform Unicode
> >> normalization).
> > 
> > Unicode normalization is only tangentially relevant here.
> 
> Sure, but in this case it would fix the problem AFAICS.

Sorry, I no longer understand what this was about (what does "that"
allude to here?).  That's bound to happen when a response comes more
than a month after the original exchange.


