Re: TAB character in groff output

From: Ingo Schwarze
Subject: Re: TAB character in groff output
Date: Tue, 2 Aug 2022 18:58:38 +0200

Hi Branden,

G. Branden Robinson wrote on Tue, Aug 02, 2022 at 10:42:45AM -0500:
> At 2022-08-02T15:44:21+0200, Ingo Schwarze wrote:

>> In groff, this works for me:
>>
>>  $ printf "a\\\\N'9'b" | groff -T ascii | hexdump -C | head -n 1
>>  00000000  61 09 62 0a 0a 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  |a.b.............|
>>
>> Mandoc behaves differently and treats \N'9' exactly like a literal HT:
>>
>>  $ printf "a\\\\N'9'b" | mandoc | hexdump -C | grep 61
>>  00000050  61 20 20 20 20 62 0a 0a  20 20 20 20 20 20 20 20  |a    b..        |
>>
>> In general, mandoc lets fewer control characters sneak through into
>> output than groff because i worry that control characters in output
>> might occasionally cause reliability or security issues.

> I don't predict high reliability from this technique

Heh.  :-)

I'm used to this statement in the mandoc_char(7) manual page:

     For backward compatibility with existing manuals, mandoc(1)
     also supports the

           \N'<number>' and \[char<number>]

     escape sequences, inserting the character <number> from the
     current character set into the output.  Of course, this is
     inherently non-portable and is already marked as deprecated
     in the Heirloom roff manual; on top of that, the second form is
     a GNU extension.  For example, do not use \N'34' or \[char34],
     use \(dq, or even the plain ‘"’ character where possible.

So i assumed it is well-known to not be the pinnacle of portability.
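
To illustrate the recommendation quoted above, here is a minimal sketch
that feeds the named special character \(dq to groff instead of a
numeric \N'34'.  The variable name and the sample text are made up for
illustration, and the groff call is skipped when groff is not installed.

```shell
# Hypothetical sample input: the named escape \(dq instead of \N'34'.
# Single quotes keep the backslashes away from the shell.
portable='say \(dqhi\(dq'

# Format it for the ascii device if groff is available; head(1) drops
# the blank lines that pad the rest of the page.
if command -v groff >/dev/null 2>&1; then
    printf '%s\n' "$portable" | groff -T ascii | head -n 1
fi
```

Because \(dq names a glyph rather than a slot number in some font, the
same input should remain meaningful on other output devices.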

But since

documents \N without explicitly calling out its inherent portability
problems, maybe i should have mentioned the trap.

Then again, the wording

  "Typeset the glyph with code n in the current font ..."

does provide an *implicit* hint that this can hardly be expected to
be device-independent.

> when attempting it on a platform that uses IBM code page 1047
> as its input encoding. ;-)

I would have expected the *output* font numbering to cause even
more serious trouble than the *input* encoding.  Besides, not being
a masochist to the degree you appear to assume, i prefer this:

  printf "a\\\\N'9'b" | groff -T pdf > tmp.pdf

After that, the file tmp.pdf displays three characters:

  The letter "a", the Euro-sign (oops!?), and the letter "b".

I expect we are soon going to dissuade Alejandro from his plan.  :)

