From: G. Branden Robinson
Subject: [bug #64484] [troff] .device should encode special characters in `\[uXXXX]` form
Date: Sun, 8 Sep 2024 20:55:57 -0400 (EDT)

Follow-up Comment #25, bug #64484 (group groff):

Hi Deri,

At 2024-09-08T19:22:57-0400, Deri James wrote:
> Follow-up Comment #24, bug #64484 (group groff):
> 
> Just a couple of small problems:-
> 
> A bookmark like "Minus-hyphens are replaced by \[rs][u2010] in
> bookmark text" will look one way in document text but different in a
> bookmark (same is true using \E).

It sounds like it would be helpful to me to write a unit test
illustrating this very point.  So I'll do that.

> We need a way to differentiate when \E | \(rs are being used to
> prevent me from doing the conversion to a UTF-16 character. You can
> choose.

Yeah, once I see the problem concretely in front of me I'll see what I
can think up.

> The special character \0 is blocked as a horizontal motion, which is
> true as are all space characters in groff, but \~ and "\ " are passed
> as a single space, so should \0 be treated the same, if used in \X.

Maybe "\ " shouldn't be...

Horizontal motions' semantics differ slightly from spaces in the _roff_
language itself.  Whether they should collapse together in device
extension command arguments...is a good question.  I realize you're
concerned only with `\0` here, but it makes me wonder why we shouldn't
treat it (and "\ ") the same as `\h`.

I don't think we're ever going to purge all cases of needing to rewrite
a string that is used both as document text and metadata.  That is why I
was so gung-ho to implement a `for` request for iterating strings.

For example, if someone does this:

.ds bm this is my foo\h'1i'bar heading
.sp
.ps +2
.pdfbookmark 1 "\*[bm]"
.ft B
\*[bm]
.sp
.ft
.ps

They're just not going to get exact parity between the section heading
appearing in the document and that in the navigation pane.
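
One way around that, sketched here on the assumption that the author is
willing to maintain two strings by hand, is to keep the typeset heading
and the bookmark text separate.  The string names `bm-text` and
`bm-meta` are made up for illustration; nothing in pdf.tmac defines or
expects them.

```roff
.\" Hypothetical workaround: one string for the typeset heading,
.\" another (free of horizontal motions) for the bookmark metadata.
.ds bm-text this is my foo\h'1i'bar heading
.ds bm-meta this is my foo bar heading
.sp
.ps +2
.pdfbookmark 1 "\*[bm-meta]"
.ft B
\*[bm-text]
.sp
.ft
.ps
```

That gets a sensible navigation pane entry, at the cost of duplicating
the heading text, which is exactly the sort of manual rewriting a string
iterator could automate.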

At some point we have to document the limitations and reduce our
exposure to potential bug reports complaining that we aren't DWIM.
_roff_ isn't a DWIM language.

> Also, we need to coordinate when you make .device, .output and \! work
> same way as \X,

Definitely.  However, things are looking to me, right this second, less
dramatic than I feared (planned?).

I still think `output` and `\!` should validate their inputs and not
pass through garbage bytes (this prohibition is already in place [mostly
or totally], and productive of the infamously inscrutable diagnostics),
but I finally perceive a documentable use case for ".output x X" and
"\!x X"; they should lack any of the "help a brother out with his
Unicode" facilities I've been putting into `\X` (and plan for `device`).
`output` and `\!` really should just put ASCII characters into
device-independent output with as little transformation as possible.

A concrete, if whimsical, example might be:


\!x X pdf: exec [/Dest /bm1 /Title (30\[en]50 feral hogs) /Level 1 /OUT pdfmark


...where I would expect '\[en]' to show up as-is.  No translation.

It's expert mode for experts.

But maybe a little less demanding of extreme expertise once use cases
are documented.

> since I will need to supply a new gropdf and pdf.tmac at the same
> time. If you can either start a new branch on Savannah, which you can
> cherry pick or rebase from when everything is working, or you can live
> with regressions on master for a month or so.

The first two of those (branch+cherry-pick or rebase) sound okay to me.
While you're hammering on gropdf and pdf.tmac, maybe I can get some of
the other 1.24 release goals sorted out.  For the moment I am concerning
myself with the grout state-flushing problem, which to my relief is
looking like it will require minimal changes outside of the formatter:
one additional request to the definition of the `pdfbackground` macro.
And maybe a cautionary note in the gropdf(1) man page about robotically
going to `\X` for that case; the user will then have to accept
responsibility for flushing the fill color.

I may have to come up with a way to kick the node emitter extra hard,
though, to convince it to write out a 'D F' command when it doesn't
think it applies to anything.  I'll see.

You can watch me thinking out loud at

https://savannah.gnu.org/bugs/?66187

if you like.

Possibly we can get away with, for groff 1.24, _only_ a mechanism for
flushing the fill color.  (I'm not happy with my to-do-for-release list
getting longer while I'm trying to shorten it.)  This stuff is so
esoteric that if anyone we don't already hear from on the _groff_ list
is using it, I'd kind of _like_ to provoke them into complaining to us
so we can work out what the use cases and consequent API should be, so
that we can _document and test_ these things.  What a concept, I know.

Regards,
Branden



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64484>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
