From: G. Branden Robinson
Subject: Re: CHECKSTYLE suggestions: unnecessary quotations and unnecessary \f escape
Date: Mon, 21 Mar 2022 01:07:11 +1100
User-agent: NeoMutt/20180716

Hi, Alex!

At 2022-03-19T17:07:09+0100, Alejandro Colomar (man-pages) wrote:
> While fixing style issues in the man-pages project,
> I'm finding a few recurrent issues that I think you could warn about:
> Unnecessary quotations:
> [
> .I "foo bar"
> .IR foo "bar"
> ]

That is going to be hard to detect from within a macro package.  As
noted in our recent discussion of quotation marks in macro calls, by the
time these arguments get to the `I` and `IR` macros, those macros have
no way of knowing if they were excessively quoted in the calling

I don't have a solution for this problem.  To solve it would require
modifying GNU troff's input parser to track some kind of "extraneous
quote" state.  Since, as we saw in our earlier discussion, a sequence of
up to four double quotes can be perfectly valid, my intuition is that
this problem is worse than regex-hard, and the cost might rapidly
outweigh the benefit.

If you need this, it's probably better to just write a regex-based tool
that scans the man page source.  You can then enforce a stricter
discipline, permitting false positives on valid but unusual constructs
that would be better recast.
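
For illustration only, here is a minimal Python sketch of what such a
regex-based check might look like.  It is an assumption-laden example,
not a tool from this thread: the function name `find_unnecessary_quotes`,
the macro lists, and the rules themselves are hypothetical.  It flags
any non-blank quoted argument to `B` or `I`, and any quoted argument
without whitespace to the alternating-font macros.  It deliberately
accepts false positives, for example a quoted argument whose multiple
adjacent spaces are meant to be kept, or arguments using doubled quotes.

```python
import re

# Hypothetical macro lists: single-font macros join their arguments with
# spaces; alternating-font macros treat each argument separately.
SINGLE_FONT = {"B", "I"}
ALTERNATING = {"BI", "BR", "IB", "IR", "RB", "RI"}

# A control line: control character, optional space, a one- or
# two-letter macro name, then its arguments.
CONTROL_LINE = re.compile(r"^[.']\s*([A-Z]{1,2})\s+(.*)$")
# Crude: a quoted argument is anything between two double quotes.
QUOTED_ARG = re.compile(r'"([^"]*)"')


def find_unnecessary_quotes(lines):
    """Return (line_number, line) pairs whose quoting looks unnecessary."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = CONTROL_LINE.match(line)
        if not match:
            continue
        macro, args = match.groups()
        quoted = QUOTED_ARG.findall(args)
        if not quoted:
            continue
        if macro in SINGLE_FONT:
            # Strict rule: any non-blank quoted argument is flagged.
            suspicious = any(q.strip() for q in quoted)
        elif macro in ALTERNATING:
            # A quoted argument with no whitespace needs no quotes.
            suspicious = any(q and not re.search(r"\s", q) for q in quoted)
        else:
            suspicious = False
        if suspicious:
            hits.append((number, line))
    return hits
```

Run against the two examples quoted above, this sketch flags both
`.I "foo bar"` and `.IR foo "bar"`, while leaving `.IR foo "bar baz"`
alone because there the quotes group two words into one argument.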

> Unnecessary escape \f:
> [
> foo \fIbar\fP baz
> ]
> The last one is more difficult to decide when it's unnecessary, but
> you could maybe start with non-formatted lines.

This is also a big challenge, and on my first reflection, even worse, as
you suspect.  The problem is that what you quote is an ordinary text
line, and *roffs don't generally look very far ahead when parsing.
There aren't many ways in the language to peek ahead in the input

The only way I can think of would be to set up the macro package such
that all text lines get captured into a macro or diversion.  You might
then be able to iterate through the stored content somehow--though I
don't know off the top of my head a way to do this line by line.  I also
don't know how to do something like save some kind of pending input line
into a string for processing with the few simple requests we have for
that.  There's also the problem of interpreting that input well enough
to recognize undesirable constructs--do you want to write a troff in

Again I would attack this with a less perfect but much more tractable
regex-based input scanner.  I would filter out tbl(1) regions and then
flag _any_ font selection escape sequence that isn't on a control line,
meaning a line starting with '.' (that's an over-crudification[1]), but I
predict that it will work well for most pages.  I'm attaching a shell
script I've come up with to do this.  For groff's own pages, it mostly
turns up use of non-man(7)-standard fonts (not roman, bold, or italic)
and some pages I haven't yet done a thorough revision on.
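
The attached script is not reproduced here.  As a separate, hedged
illustration of the approach just described, the following Python sketch
skips tbl(1) regions (assumed to be delimited by `.TS` and `.TE` lines)
and reports every `\f` font selection escape sequence found on a line
that does not begin with a control character.  The function name
`find_font_escapes` is hypothetical, and the sketch is equally crude: it
treats both '.' and "'" as control characters and makes no attempt to
handle line continuation.

```python
import re

# Matches the three spellings of the font escape:
# \fX, \f(XX, and \f[name].
FONT_ESCAPE = re.compile(r"\\f(?:\[[^\]]*\]|\(..|.)")


def find_font_escapes(lines):
    """Return (line_number, escape) pairs for \\f escapes on text lines."""
    hits = []
    in_table = False
    for number, line in enumerate(lines, start=1):
        # Skip everything between .TS and .TE, assumed to be tbl(1) input.
        if line.startswith(".TS"):
            in_table = True
            continue
        if line.startswith(".TE"):
            in_table = False
            continue
        # Skip table content and control lines.
        if in_table or line.startswith((".", "'")):
            continue
        for match in FONT_ESCAPE.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

Given the text line `foo \fIbar\fP baz` from the example above, the
sketch reports both `\fI` and `\fP`; the same escapes on a control line
or inside a table region are ignored.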


[1] no-break control character, line continuation, yadda yadda yadda

Description: Bourne shell script
