groff 1.23.0.rc2 scope reduction and a `for` request


From: G. Branden Robinson
Subject: groff 1.23.0.rc2 scope reduction and a `for` request
Date: Thu, 22 Dec 2022 20:21:27 -0600

Hi Alex,

At 2022-12-23T01:03:28+0100, Alejandro Colomar wrote:
> On 12/22/22 22:57, G. Branden Robinson wrote:
> > I've already found a typo.  😅
> 
> Fixed.  :D

Hah, you even knew which one??  [pulls]  Aha, no you didn't.  :D  [1]

> BTW, maybe it's me that I don't care too much about having a perfect
> groff_mdoc(7)...  But if the remaining bugs you mentioned earlier this
> week are not regressions, and don't seem grave either, I'd consider
> declaring them non-RC, and releasing already.  I'm fearing the freeze
> is too close already.

Fair.

https://release.debian.org/testing/freeze_policy.html

groff 1.23.0 isn't going to kick any other packages over but I'd feel
better about making, or only narrowly missing, that first milestone,
"Transition and Toolchain Freeze", on 13 January.

No, the remaining bugs I mentioned are not regressions, they're just
aspects of parity I wanted between groff_man(7) and groff_mdoc(7) in
various ways, but even as-is, groff-man-pages.pdf doesn't look bad or
even noticeably different when you shift from man(7) to mdoc(7)
documents and back.  The only loose end from a gross UI and formatting
perspective is the `HF` string, and that requires only a port of the
logic from an.tmac.

As much as I would like to pursue various other mdoc(7) issues, it is
probably a can of vermiculi I should not further open at this time
(apart from the usual man page revisions).  Ingo has gone quiet and he
is certainly going to have opinions.  And if we have hyperlinks for
man(7) documents but not mdoc(7), that's a shame, but all the true
mdoc(7) believers are already using mandoc(1) anyway.  So while I
philosophically don't want mdoc(7) to endure second class support, that
is kind of where things are in groff and always have been, because all
but one of our man pages aren't written in it.  Parity is better now
than at any previous time in the past 20 years--maybe I should regard
that as a victory condition.

Also I am starting to think that really getting generalized PDF
bookmark/hyperlink support is going to require frying a much bigger fish
than we have time for, and there is no express demand for or against
further entrenchment of the current hacks--people care only about
results.

That bigger fish arises from the observation that seemingly everywhere
we prepare groff strings/macros/diversions for handoff to a device
control escape sequence, we end up with some new variation on bespoke
logic to tediously walk a string (more or less).  These sequences of
code are long and surely not easy for the novice to understand.

So one of the things I want to look into for groff 1.24 is giving the
language an actual string iterator--a `for` request.  And a couple of
new conditional expression operators to perform tests on the items
returned by that iterator.

Some background for this is in <https://savannah.gnu.org/bugs/?62264>
("string iteration handles escape sequences inconsistently").

Here's the idea.

.di div
Here's my \f[CI]crazy\f[] diversion!
.di
.
.ds div*scrubbed \" empty
.
.for ch \*[div] \{\
.  if !N \*[ch]  .as div*scrubbed \*[ch]
.  if '\*[ch]'@' .break
.  if e          .continue
.  \" some crazy stuff you do only on odd pages
.\}

The above is really contrived, but the idea is to communicate as much of
the semantics as I think we could want.

1. No messing with `length` or `substring` operations.
2. Address #62264.  Document that string iteration can hand you back any
   of (A) a Basic Latin character; (B) a special character; or (C) a
   "node" (like a type face or size changing operation, but the details
   aren't important as it's "formatty" stuff, not "plain text" stuff).
3. A new 'N' conditional expression operator tests string contents for
   node identity.  I don't know whether this should test just the first
   element of the string or scan the whole thing.  In the example above,
   it doesn't matter--`for` guarantees that the `ch` string is a
   singleton.

   Giving the *roff programmer a way to cope with this is the correct
   way to solve this old chestnut.

     can't transparently output node at top level

4. Need to decide whether the string `ch` is left defined after the
   `for` loop exits.
5. You should be able to `break` or `continue` a `for` loop just as you
   can a `while` loop.
6. A lingering issue is our other old friend.

     can't translate character code 233 to special character ''e' in
     transparent throughput

   This is <https://savannah.gnu.org/bugs/?63074>.

Here's another use case, way less hypothetical.  It's some Deri magic.

.\" Remove '\%' from string used as bookmark destination
.de an*cln
.  ds \\$1
.  als an*cln:res \\$1
.  shift
.  ds an*cln:res \\$*\"
.  ds an*cln:char \\*[an*cln:res]
.  substring an*cln:char 0 0
.  if '\\*[an*cln:char]'\%' .substring an*cln:res 1
.  rm an*cln:char
..

Here's how you'd do it with `for`.

.ds output \" empty
.
.for ch \*[input] \{\
.  if !'\*[ch]'\%' .as output \*[ch]
.\}

As syntactic sugar goes, I'd say that enables considerable slimming.
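
For comparison, here is a rough sketch of the same filter using only
requests groff already has (`length`, `substring`, `while`).  The names
`input`, `output`, `ch`, `i`, and `n` are placeholders chosen for
illustration, and the sample input is assumed to contain no apostrophes,
since those would clash with the delimiters of the string comparison.

```groff
.\" Hypothetical sample input; any string without apostrophes will do.
.ds input foo\%bar
.ds output \" empty
.
.\" Count the items in the string, then visit each index in turn.
.length n \*[input]
.nr i 0
.while (\n[i] < \n[n]) \{\
.  \" Copy the whole string, then cut the copy down to item i.
.  ds ch \*[input]
.  substring ch \n[i] \n[i]
.  \" Keep everything except the hyphenation control escape.
.  if !'\*[ch]'\%' .as output "\*[ch]
.  nr i +1
.\}
.rm ch
```

The counter register, the length computation, and the copy-then-trim
step on every pass are the bookkeeping a `for` request would absorb.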

This would probably also compel us to clear up our documentation (and
our thinking) a lot with respect to what's really a "character" in
groff.  \- is.  \% is.  \f isn't (if the remainder is well-formed, it
becomes a node).  What about the "leader character" (Ctrl+A)?  Or the
uninterpreted leader character \a?  Many of these things have the word
"character" in their names but, for example, you can't test them with
".if c".

Consider:

.ds string \&\%
.ds char \*[string]
.substring char 0 0
.if c \*[char] .tm it is a character
troff:<standard input>:6: error: expected ordinary or special character, got an escaped '&'
.ds char \*[string]
.substring char 1 1
.if c \*[char] .tm it is a character
troff:<standard input>:9: error: expected ordinary or special character, got an escaped '%'

Keith Marshall wrote an entire macro file to deal with this sort of
thing.[2]

Maybe consideration of these issues is affecting my priorities and
making me want to get over the release hump so I can work on them.

> I'm discussing (with some Clang developers / WG14 members) on doing
> (2) for a future C standard, and trying to come up with a way that
> backwards compatibility doesn't get in the way.

Good luck, and I hope your efforts reach fruition.

> Uhhh, I don't remember having read an entire book about C or Unix (I
> still have only read <10% of mtk's TLPI; two years after he gave it to
> me as a present :p I'll finish it some day, hopefully).  It's unlikely
> that I'll find the courage to read that about Ada.

Well, it's the rationale document, not the language reference manual, so
it is written less like standardese and more like one of the opinionated
emails we send to each other.  If that helps.  :P

> Which reminds me that I didn't yet fully read groff_man{,_style}(7)...
> At this point, I'll assume it's good enough, and maybe come back to it
> after 1.23, to not nerd-snipe you :)

I snipe myself constantly, which is why it's so hard for me to call
something "done", or even "ready"...

> I'm a bit surprised that _Generic(3) is little known, compared to C++
> templates.  Probably C programmers learned from C++, and avoid it,
> even if only for keeping compilation times low.  But _Generic(3) is
> very powerful too.  I implemented some C++ casts in C using it.

The leading underscore doesn't help, I'm sure.  But much of the
reluctance I am sure comes from bias against C++ or against OO languages
in general; from a poor understanding of what type-generic interfaces
can do for you; from the mistaken belief that type-generic interfaces
_are_ inherently OO; from the mistaken(?) belief that a
close-to-the-metal programmer doesn't need to be concerned with such
techniques; and reluctance to learn new things.

> >      "In programming, everything we do is a special case of
> >      something more general -- and often we know it too quickly." --
> >      Alan J. Perlis
> 
> guilty :p

So are we all.

Regards,
Branden

[1]

commit c309b6d8d93343de78f2fa9e1ad2a44860b136c7 (HEAD -> master)
Author:     G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Thu Dec 22 14:02:08 2022 -0600
Commit:     G. Branden Robinson <g.branden.robinson@gmail.com>
CommitDate: Thu Dec 22 20:17:21 2022 -0600

    intro.3: tfix

diff --git a/man3/intro.3 b/man3/intro.3
index b08eca5ac..e85c0677a 100644
--- a/man3/intro.3
+++ b/man3/intro.3
@@ -89,7 +89,7 @@ Together,
 these are termed an API or
 .IR "application program interface" .
 Types and constants to be shared among multiple APIs
-shopuld be placed in header files that declare no functions.
+should be placed in header files that declare no functions.
 This organization permits a C library module
 to be documented concisely with one header file per manual page.
 Such an approach

[2] 
https://git.savannah.gnu.org/cgit/groff.git/tree/contrib/pdfmark/sanitize.tmac
