[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywh
From: |
Pip Cet |
Subject: |
Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)) |
Date: |
Wed, 27 May 2020 09:36:52 +0000 |
On Tue, May 26, 2020 at 7:46 PM Eli Zaretskii <address@hidden> wrote:
> > From: Pip Cet <address@hidden>
> > Date: Tue, 26 May 2020 18:13:55 +0000
> > Cc: address@hidden, address@hidden, address@hidden
> >
> > > Assuming that the alternative for selecting the "context" is found,
> > > and composite.c is augmented to apply it instead of the regexps, why
> > > not use the rest of the automatic composition code to produce the
> > > glyphs and display them?
> >
> > I chose not to do that for a patch which I have stated repeatedly was
> > not in any way a finalized design, and I don't see any good reason to
> > do it for a real patch, either, so far.
>
> Why not?
Which part are you asking about? I don't see any good reason because
I've read the composite.c code (I'm not ignoring it), with an eye to
reusing what's reusable, and come up empty.
But you've convinced me I need to do a careful rereading.
> > > The code which does that exists and works,
> >
> > (I suspect: slowly)
>
> Any measurements to back that up?
Yes. With a regexp of "....", the composite.c code takes 175 billion
cycles to display every line of composite.c. My code takes 144 billion
cycles, with a lookahead/lookbehind each set to 128 but limiting it as
described.
> E.g., is scrolling through
> etc/HELLO especially slow, once all the fonts were loaded (i.e. each
> character in the file was displayed at least once)?
> (And why are you using Emacs 26 and not
> Emacs 27, where we support HarfBuzz and made several improvements and
> bugfixes in the character composition area?)
Because I was trying to test your implication that all this was usable
years ago. It wasn't. I'm not using Emacs 26 :-)
> > > It already solves the problems of look-ahead,
> >
> > If it does so efficiently, I'll certainly try reusing that code. But I
> > strongly suspect it doesn't.
>
> Why suspect? why not try and see what does and doesn't work, what is
> and isn't efficient?
I have, now, coming up with the above measurement which confirms my suspicion.
> > > and others, including (but not limited to) the dreaded bidi thing.
> >
> > Looking for "bidi" in composite.c, the only relevant thing I see is a FIXME.
>
> That's because you look in the wrong place.
What's the right place? I'm using all the code in bidi.c, of course,
so as far as I can tell what I'm not doing is using composite.c...
> Once again, try looking
> at etc/HELLO, there are portions of it that need both bidi and
> compositions. I can explain how it works (the code is spread over
> several files), but please believe me that it does, it passed the
> HarfBuzz developers' eyes most of whom are native Arabic and Farsi
> speakers, and wouldn't allow us to display Arabic script incorrectly.
>
> The whole point of using the existing code is that you don't _need_ to
> understand how exactly we handle the bidi reordering when character
> compositions are required.
But that's true without using the existing code!
> It just works, for all you care. It did
> take several iterations to get right at the time; why would you want
> to repeat all that, when the code is there to use and extend?
> > second, precisely because it works well for the purposes of others,
> > and I'd like to have as little impact as possible on existing use
> > cases. They should just continue working, and so far they do.
>
> You are thinking of breaking those other cases by your changes?
No! If I break them, that's a severe bug in my code!
> But
> we haven't yet established that changes are needed,
"Enter"ing ligature glyphs is definitely something we need to do
before any user can reasonably use variable-pitch fonts with ligatures
for displaying English text. Kerning is another such thing. Both don't
work with the current code.
> Because the features you are talking about should "just work" in
> Emacs.
> Not only for some use cases and some scripts -- that is not
> how we develop features. Features that work only for some cases are
> broken and will draw bug reports. They make Emacs look unclean and
> unprofessional.
Not as much as the current lack of support does.
> And there's no need to add such half-broken features because code that
> supports much broader class of use cases already exists, you just need
> to use it and maybe extend and augment it a bit.
I don't think I agree with the "a bit".
> > The code shouldn't break horribly for RTL text (it doesn't).
>
> It _will_ break for RTL text, you just didn't yet see it because you
> only tested it in simple use cases. UAX#9 defines a lot of optional
> features, including multi-level directional overrides and embeddings,
> it isn't just right-to-left vs left-to-right.
I assume bidi.c handles that, as it does for composite.c?
> > > What's more, we already have the code which implements all
> > > that, so I don't understand why you want to bypass it.
> >
> > We have something that superficially results in a similar screen
> > layout to what I want, but that actually represents display elements
> > in a way that makes them unusable for my purposes.
>
> Then please describe what doesn't fit your purpose, and let's focus on
> extending the existing code to do what's missing.
The three main things are:
- "entering" glyphs, instead of treating them as atomic
- providing context automatically rather than by providing specific
regexps for it in advance
- kerning, which requires context for every character
Secondary concerns:
- ligatures that come partly from a display property and partly from
the buffer (composite.c doesn't allow for those, as far as I can tell)
> Please note: I'm not talking about the regexp part -- that part you
> anyway will need to decide how to extend or augment. I'm telling you
> right here and now that blindly taking a fixed amount of surrounding
> text will not be acceptable. You can either come up with some smarter
> regexp (and you are wrong: the regexps in composition-function-table
> do NOT have to match only fixed strings, you can see that they don't
> in the part of the table we set up for the Arabic script);
Again, I think the limits are fixed: 4 characters of history and 500
characters of look-ahead. What am I missing?
> or you can
> decide on something more complex, like a function. Either way, the
> amount of text that this will pick up and pass to the shaper should be
> reasonable and should be determined by some understandable rules. And
> those rules must be controllable from Lisp.
That last part isn't true for the composite.c code, which imposes a
limit of 4 characters of history and 500 characters of look-ahead, as
far as I can tell. But, sure, if that's a requirement, I'll keep it in
mind.
> But that is a separate part of the problem that you will need to
> solve, and you will need to solve it whether or not you use character
> compositions. What I _am_ saying is that the rest of the machinery
> that implements automatic compositions does exactly what you need: it
> calls the shaper, handling LTR and RTL text as needed, then lays out
> the glyphs the shaper returns in a way that handles all the usual
> stuff our users expect, such as line wrapping and truncation.
> It is silly to disregard that code, so please don't.
You've convinced me that it's worth reading it again, more carefully,
but I'm not optimistic I'll come to a different conclusion this time
around.
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), (continued)
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/24
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/26
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/26
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)),
Pip Cet <=
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/27
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/27
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/27
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Pip Cet, 2020/05/23
- Re: Ligatures (was: Unify the Platforms: Cairo+FreeType+Harfbuzz Everywhere (except TTY)), Eli Zaretskii, 2020/05/24
- Re: Ligatures, Stefan Monnier, 2020/05/23
- Re: Ligatures, Eli Zaretskii, 2020/05/23
- Re: Ligatures, Stefan Monnier, 2020/05/23
- Re: Ligatures, Eli Zaretskii, 2020/05/23