[bug #57594] sync hyphenation pattern files with TeX versions
From: G. Branden Robinson
Subject: [bug #57594] sync hyphenation pattern files with TeX versions
Date: Sat, 10 Jul 2021 11:08:17 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Update of bug #57594 (project groff):
Severity: 5 - Blocker => 3 - Normal
Status: None => Postponed
Summary: sync groff hyphenation-pattern files with upstream TeX versions => sync hyphenation pattern files with TeX versions
_______________________________________________________
Follow-up Comment #4:
I've carried this as far as I can for the moment.

To do more is going to require adding support for UTF-8 input in the
hyphenation pattern files, as that's what the TeX hyph-utf8 project uses for
Czech, French, German, and Swedish.

We got lucky with English and Italian, which use ASCII.

On the bright side, finding or writing a UTF-8 input parser we can use for
reading hyphenation pattern files would, unless we're very unlucky, go a long
way toward equipping us to handle UTF-8 generally. If we stick the routines
in libgroff, several pieces of the GNU roff system can use them. The big lift
for troff itself is going to be moving the stuff in src/roff/troff/input.h to
Unicode-safe code points, something in the Private Use Area, I reckon.

Dropping the severity because we've done what we can for now, I think.
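
To give a rough idea of the size of the UTF-8 parsing job mentioned above, here is a minimal sketch of a strict decoder. It is not groff code: the name `decode_utf8` is hypothetical, and it is written in Python only for brevity (a libgroff routine would presumably be C++). It assumes the whole input is available as a byte string and rejects malformed sequences instead of trying to recover from them.

```python
def decode_utf8(data):
    """Decode a bytes object holding UTF-8 into a list of code points.

    Illustrative sketch only. Raises ValueError on malformed input:
    stray continuation bytes, truncated or overlong sequences,
    surrogates, and values above U+10FFFF.
    """
    code_points = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # The lead byte fixes the sequence length and the smallest
        # code point that length may legitimately encode.
        if lead < 0x80:
            cp, extra, minimum = lead, 0, 0x00
        elif 0xC2 <= lead <= 0xDF:
            cp, extra, minimum = lead & 0x1F, 1, 0x80
        elif 0xE0 <= lead <= 0xEF:
            cp, extra, minimum = lead & 0x0F, 2, 0x800
        elif 0xF0 <= lead <= 0xF4:
            cp, extra, minimum = lead & 0x07, 3, 0x10000
        else:
            raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, i))
        if i + extra > n - 1:
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, extra + 1):
            byte = data[i + k]
            if byte & 0xC0 != 0x80:
                raise ValueError("bad continuation byte at offset %d" % (i + k))
            cp = (cp << 6) | (byte & 0x3F)
        if cp < minimum or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
            raise ValueError("invalid code point at offset %d" % i)
        code_points.append(cp)
        i += extra + 1
    return code_points


print(decode_utf8("č".encode("utf-8")))  # [269], i.e. U+010D
```

A reader for pattern files would then work on code points rather than bytes, so a letter such as "č" counts as one pattern character instead of two.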
commit b2284ab01d2d87507f3bcbd7de2a081efb6528a6
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date: Sun Jul 11 00:50:27 2021 +1000
Update English hyphenation patterns.
* NEWS: Add item.
* tmac/hyphen.en: Update file using `hyph-en-us.tex` patterns file from
the TeX hyph-utf8 project.
* tmac/hyphenex.en: Remove explicit hyphenations for words that no
longer require them when using the new patterns. Add one item scraped
from an erratum comment in hyphen.en ("dem-o-crat").
The new patterns likely _will_ change the automatic hyphenation break
points of your English documents. Here is a sample of affected words
found within groff's own documentary corpus.
OLD                  NEW
===                  ===
ar‐range‐ment        arrange‐ment
col‐umns             columns
con‐struc‐ted        con‐structed
cus‐tom‐ized         cus‐tomized
def‐i‐ni‐tions       de‐f‐i‐n‐i‐tions
der‐i‐va‐tions       de‐riva‐tions
hy‐phen‐a‐tion       hy‐phen‐ation
ma‐te‐rial           ma‐te‐r‐ial
Mi‐cro‐soft          Mi‐crosoft
pipe‐lines           pipelines
post‐pro‐ces‐sors    post‐proces‐sors
pro‐cessed           processed
pro‐cesses           processes
spa‐ces              spaces
Wer‐ner              Werner
Partially addresses <https://savannah.gnu.org/bugs/?57594>.
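
For readers wondering why swapping the pattern file moves break points like this, here is a toy sketch of Liang-style pattern matching, the scheme TeX pattern files are written for. It is not groff's implementation. The function names are hypothetical, and the four patterns are invented for this example rather than taken from `hyph-en-us.tex`. Digits in a pattern score the gap they sit in, the highest score for each gap wins, and an odd score permits a break. The `left_min` and `right_min` defaults of 2 and 3 are an assumption borrowed from TeX's usual English settings.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and gap weights."""
    letters = []
    weights = [0]          # weights[k] scores the gap before letters[k]
    for ch in pattern:
        if ch.isdigit():
            weights[-1] = int(ch)
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), weights


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return the pieces of `word` between permitted hyphenation points."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."      # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)       # scores[j]: gap before padded[j]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights:
                for offset, weight in enumerate(weights):
                    j = start + offset
                    scores[j] = max(scores[j], weight)
    pieces = []
    last = 0
    # A break before word[pos] is the gap before padded[pos + 1].
    for pos in range(left_min, len(word) - right_min + 1):
        if scores[pos + 1] % 2 == 1:
            pieces.append(word[last:pos])
            last = pos
    pieces.append(word[last:])
    return pieces


# Invented toy patterns: "1na" alone would allow "hyphe-nation",
# but the even weight in "e2n" outranks it and forbids that break.
TOY_PATTERNS = ["hy3ph", "hen5at", "1na", "e2n"]

print(hyphenate("hyphenation", TOY_PATTERNS))  # ['hy', 'phen', 'ation']
```

Removing or adding a single pattern changes which gaps end up odd, which is why a refreshed pattern set shifts break points across a whole document.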
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?57594>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/