bug-groff
[bug #66051] [troff] permit special characters to have bespoke hyphenati


From: G. Branden Robinson
Subject: [bug #66051] [troff] permit special characters to have bespoke hyphenation codes
Date: Wed, 31 Jul 2024 17:08:40 -0400 (EDT)

URL:
  <https://savannah.gnu.org/bugs/?66051>

                 Summary: [troff] permit special characters to have bespoke hyphenation codes
                   Group: GNU roff
               Submitter: gbranden
               Submitted: Wed 31 Jul 2024 09:08:38 PM UTC
                Category: Core
                Severity: 1 - Wish
              Item Group: Feature change
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Wed 31 Jul 2024 09:08:38 PM UTC By: G. Branden Robinson <gbranden>
This idea is a descendant of bug #42870, which asked for something a little
more modest.

The concept is this.

If we can do this...


.hcode ß ß


...why can't we do this?


.hcode \[ss] \[ss]


This has long produced a diagnostic.


$ printf '.hcode \\[ss] \\[ss]\n' | ~/groff-stable/bin/groff
troff:<standard input>:1: error: hyphenation code must be ordinary character


I suggested an answer in bug #66040, comment 9.

> Because the formatter doesn't know what [hyphenation code] value to give
> [the special character].  Under the hood, [a hyphenation code] is just a
> character code--in other words, on an ISO 8859 system, the hyphenation codes
> for 'a' through 'z' are 97 through 122--but our documentation stands on its
> head to avoid saying that.  The trouble is that there is a potentially larger
> space of _sui generis_ special characters, by which I mean ones that don't
> belong to an equivalence class of a Basic Latin letter.  [... The] German
> Eszett [for example] is not.  If we had an Icelandic locale, thorn and eth
> would similarly have to have hyphenation codes above 127 decimal.
>
> The real fun comes when you add letters from multiple ISO 8859 character
> sets.  Before long you're going to have collisions.
>
> So it's good that our documentation does the headstand.  We should not
> disclose what the hyphenation code values _are_, we need only to ensure that
> they sort into the correct equivalence classes, so that they then interoperate
> as desired with the hyphenation patterns.
>
> When we get support for UTF-8-encoded hyphenation pattern files, things will
> become straightforward again.
>
> In the meantime, what I think I will do is use a `static int` to mint a
> sequence number (starting at 256) for hyphenation codes any time a special
> character needs one _sui generis_.
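
The last quoted paragraph can be sketched roughly as follows, in C++ since that is the language the formatter is written in.  This is only an illustration under assumptions, not groff's actual code: the names `mint_hyphenation_code`, `hcode_for_special`, `set_hcode`, and the `special_hcodes` table are all hypothetical, and the special character names used below (`ss`, `Tp`) are just sample keys.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not a groff function): hand out a fresh
// hyphenation code for a special character that belongs to no
// equivalence class of an ordinary 8-bit character.
int mint_hyphenation_code()
{
  // Start at 256 so minted codes cannot collide with any
  // 8-bit character code (0 through 255).
  static int next_code = 256;
  return next_code++;
}

// Hypothetical table mapping special character names to codes.
static std::map<std::string, int> special_hcodes;

// Return the code already assigned to `name`, minting one on
// first use.
int hcode_for_special(const std::string &name)
{
  auto it = special_hcodes.find(name);
  if (it != special_hcodes.end())
    return it->second;
  int code = mint_hyphenation_code();
  special_hcodes[name] = code;
  return code;
}

// Rough analogue of `.hcode dst src` for special characters:
// put `dst` in the same equivalence class as `src`.
void set_hcode(const std::string &dst, const std::string &src)
{
  special_hcodes[dst] = hcode_for_special(src);
}
```

Under this sketch the numeric values stay an internal detail, as the quoted comment recommends; callers can only observe whether two characters share a code, which is all the hyphenation patterns need.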

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66051>

