Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywh

From: Eli Zaretskii
Subject: Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY))
Date: Tue, 26 May 2020 22:46:08 +0300

> From: Pip Cet <address@hidden>
> Date: Tue, 26 May 2020 18:13:55 +0000
> Cc: address@hidden, address@hidden, address@hidden
> > Assuming that the alternative for selecting the "context" is found,
> > and composite.c is augmented to apply it instead of the regexps, why
> > not use the rest of the automatic composition code to produce the
> > glyphs and display them?
> I chose not to do that for a patch which I have stated repeatedly was
> not in any way a finalized design, and I don't see any good reason to
> do it for a real patch, either, so far.

Why not?  How about trying to do that before giving up?

> (I'll be honest: I strongly suspect that the code is too slow, we know
> it to be buggy, and it's simply too different from what I actually
> want to benefit from sharing the code).
> > The code which does that exists and works,
> (I suspect: slowly)

Any measurements to back that up?  E.g., is scrolling through
etc/HELLO especially slow once all the fonts have been loaded (i.e.,
each character in the file has been displayed at least once)?

> > and is tested by years of use.
> It's unusable for me in Emacs 26.3.

How so?  What doesn't work?  (And why are you using Emacs 26 and not
Emacs 27, where we support HarfBuzz and made several improvements and
bugfixes in the character composition area?)

> > It already solves the problems of look-ahead,
> If it does so efficiently, I'll certainly try reusing that code. But I
> strongly suspect it doesn't.

Why suspect?  Why not try and see what does and doesn't work, what is
and isn't efficient?

> > of wrapping long lines,
> Very poorly, for my purposes.

How so?  What doesn't wrap correctly, and why?

> > and others, including (but not limited to) the dreaded bidi thing.
> Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME.

That's because you are looking in the wrong place.  Once again, try
looking at etc/HELLO: there are portions of it that need both bidi and
compositions.  I can explain how it works (the code is spread over
several files), but please believe me that it does: it passed the
scrutiny of the HarfBuzz developers, most of whom are native Arabic
and Farsi speakers and wouldn't allow us to display Arabic script
incorrectly.

The whole point of using the existing code is that you don't _need_ to
understand how exactly we handle the bidi reordering when character
compositions are required.  It just works, for all you care.  It did
take several iterations to get right at the time; why would you want
to repeat all that, when the code is there to use and extend?

> > Why reinvent that wheel when we already have it, and it works well?
> First, because it doesn't work that well for my purposes;

What doesn't work?  Please be specific.

> second, precisely because it works well for the purposes of others,
> and I'd like to have as little impact as possible on existing use
> cases. They should just continue working, and so far they do.

You are thinking of breaking those other cases by your changes?  But
we haven't yet established that changes are needed, let alone which
changes.  How do you know you will break anything at all?

> > > Ligatures and kerning (right now, for LTR text). Is that a small
> > > problem because of the lack of RTL support?
> >
> > Yes, of course.
> Why?

Because the features you are talking about should "just work" in
Emacs.  Not only for some use cases and some scripts -- that is not
how we develop features.  Features that work only for some cases are
broken and will draw bug reports.  They make Emacs look unclean and

And there's no need to add such half-broken features, because code that
supports a much broader class of use cases already exists; you just
need to use it and maybe extend and augment it a bit.

> The code shouldn't break horribly for RTL text (it doesn't).

It _will_ break for RTL text; you just haven't seen it yet because you
only tested it in simple use cases.  UAX#9 defines a lot of optional
features, including multi-level directional overrides and embeddings;
it isn't just right-to-left vs left-to-right.
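
To make the nesting concrete, here is a minimal Python sketch of how
explicit embedding levels stack up under UAX#9 (rules X1-X8, heavily
simplified).  It is an illustration only, not the code Emacs uses: the
function name explicit_levels is made up for this example, and
isolates, overflow counting, the effect of overrides on character
types, and all the later resolution and reordering steps are left out.

```python
# Simplified sketch of UAX#9 explicit embedding levels (rules X1-X8).
# Assumptions: no isolates (LRI/RLI/FSI/PDI), no overflow counters, and
# overrides only change the level (their effect on types is not modeled).
LRE, RLE, PDF, LRO, RLO = "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"
MAX_DEPTH = 125  # maximum embedding depth defined by UAX#9


def explicit_levels(text, base_level=0):
    """Return (char, level) pairs for the non-control characters in text."""
    stack = []          # levels to restore when a PDF is seen
    level = base_level
    out = []
    for ch in text:
        if ch in (RLE, RLO):
            new = (level + 1) | 1       # least odd level above the current one
            if new <= MAX_DEPTH:
                stack.append(level)
                level = new
        elif ch in (LRE, LRO):
            new = (level + 2) & ~1      # least even level above the current one
            if new <= MAX_DEPTH:
                stack.append(level)
                level = new
        elif ch == PDF:
            if stack:                   # unmatched PDFs are ignored
                level = stack.pop()
        else:
            out.append((ch, level))
    return out


sample = "a" + RLE + "b" + LRE + "c" + PDF + "d" + PDF + "e"
levels = [lvl for _, lvl in explicit_levels(sample)]
print(levels)
```

With base level 0, the five letters in the sample get levels 0, 1, 2,
1, 0: three distinct levels, which a plain LTR-vs-RTL switch cannot
represent.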

Again, there's no need for you to reinvent this wheel, we already have
it figured out.

> > What's more, we already have the code which implements all
> > that, so I don't understand why you want to bypass it.
> We have something that superficially results in a similar screen
> layout to what I want, but that actually represents display elements
> in a way that makes them unusable for my purposes.

Then please describe what doesn't fit your purpose, and let's focus on
extending the existing code to do what's missing.  Throwing everything
away and starting anew is not the right way; it's a huge waste of
energy and time to implement something that we already have.  It is
also a maintenance burden in the long run.

Please note: I'm not talking about the regexp part -- you will need to
decide how to extend or augment that part anyway.  I'm telling you
right here and now that blindly taking a fixed amount of surrounding
text will not be acceptable.  You can either come up with some smarter
regexp (and you are wrong: the regexps in composition-function-table
do NOT have to match only fixed strings, you can see that they don't
in the part of the table we set up for the Arabic script); or you can
decide on something more complex, like a function.  Either way, the
amount of text that this will pick up and pass to the shaper should be
reasonable and should be determined by some understandable rules.  And
those rules must be controllable from Lisp.
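
As a sketch of what "smarter than a fixed window" could mean, the
Python below picks the run to hand to the shaper by matching a
variable-length pattern selected by the character at point.  It is
only loosely modeled on the idea of a table keyed by character: the
RULES table, its patterns, and the function shaping_run are all
hypothetical and are not what composition-function-table contains.

```python
import re

# Hypothetical rule table: first character -> pattern describing the run
# that should be passed to the shaper as one unit.  Illustration only.
RULES = {
    "-": re.compile(r"-+>?"),      # "->", "-->", "---->", ...
    "=": re.compile(r"=+>?"),      # "==", "=>", "===>", ...
    "<": re.compile(r"<[-=]+>?"),  # "<-", "<==", "<=>", ...
}


def shaping_run(text, pos):
    """Return (start, end) of the run to shape starting at pos, or None."""
    if pos >= len(text):
        return None
    rule = RULES.get(text[pos])
    if rule is None:
        return None                # no rule for this character
    m = rule.match(text, pos)      # anchored at pos, length decided by the rule
    return (m.start(), m.end()) if m else None


print(shaping_run("a --> b", 2))
print(shaping_run("x ----> y", 2))
```

The two calls select "-->" and "---->" respectively, so the amount of
text picked up follows from the rule rather than from a fixed count,
and the rules live in a table that can be changed without touching the
matching code.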

But that is a separate part of the problem that you will need to
solve, and you will need to solve it whether or not you use character
compositions.  What I _am_ saying is that the rest of the machinery
that implements automatic compositions does exactly what you need: it
calls the shaper, handling LTR and RTL text as needed, then lays out
the glyphs the shaper returns in a way that handles all the usual
stuff our users expect, such as line wrapping and truncation.  It is
silly to disregard that code, so please don't.
