[bug #55107] PDFPIC: .psbb: support extraction of MediaBox from pdf file

From: G. Branden Robinson
Subject: [bug #55107] PDFPIC: .psbb: support extraction of MediaBox from pdf files
Date: Sat, 30 Oct 2021 09:46:30 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Follow-up Comment #5, bug #55107 (project groff):

I'll approximate the conclusion of Deri's mail of 13 October.

[Keith Marshall wrote:]
>> some further (non-trivial) development effort will be required, to support
>> concealment of trailer dictionaries and cross reference tables within /XRefStm

[Deri James wrote:]
> There are several options which would address this problem, i.e.
> non-portability of grep and desirability of avoiding groff unsafe mode.

> A) Replace grep with sed/awk (still requires unsafe mode).

> B) Use psbb (requires "non-trivial development").

> C) Use pdfbb (requires hook in input.cpp to call pdfbb and return results).

> D) Convert pdfbb to be a pre-gropdf (i.e. a preprocessor like pre-grohtml)
> which would look for .PDFPIC and replace with the appropriate calls to
> \X'pdf: pdfpic' and add vertical space with .sp.

> (A) is obviously the easiest and quickest, (C) and (D) are not too much
> work, since the parser required is already in use.

Okay, it's Branden again.  My inclination is (A) to get a short-run fix in
place to get the splinter out of users' paws no matter when groff 1.23.0 gets
released, and (D) for the longer run.

I would emphasize, lest the point be overlooked, that a preprocessor's
interface to the rest of a troff system is a file stream.  Therefore we can
write one in Perl, the shell, or yet another language if necessary.  Perl
seems the most likely alternative since we already have a Perl v5.6.1
dependency, and none on any other popular scripting language except for the
shell, which is a pretty gross language to write a parser in (though I admit
I've done it).
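
To make the "file stream" point concrete, here is a rough sketch of what
option (D) might look like as a line filter.  It is written in Python purely
for illustration (Perl would be the likelier choice for groff, as noted
above).  Everything in it is an assumption: the function name preprocess, the
lookup_box callback, the handling of only the bare ".PDFPIC file" form, and
above all the arguments emitted after "pdf: pdfpic", which are a guess and not
the actual gropdf syntax.

```python
import re

# Assumption: only the bare form ".PDFPIC file" is handled; alignment
# flags and explicit width/height arguments are ignored in this sketch.
PDFPIC = re.compile(r"^\.\s*PDFPIC\s+(\S+)")


def preprocess(lines, lookup_box):
    """Copy lines through, expanding each .PDFPIC request.

    lookup_box(path) is a caller-supplied (hypothetical) function that
    returns (llx, lly, urx, ury) in points, or None if no box is known.
    """
    for line in lines:
        m = PDFPIC.match(line)
        box = lookup_box(m.group(1)) if m else None
        if box is None:
            yield line  # not a .PDFPIC request, or no box: pass through
            continue
        llx, lly, urx, ury = box
        width, height = urx - llx, ury - lly
        # Illustrative output only; the real device-control arguments
        # expected by gropdf may differ.
        yield "\\X'pdf: pdfpic %s %gp %gp'\n" % (m.group(1), width, height)
        yield ".sp %gp\n" % height


# In-memory demonstration with a stub lookup that reports US Letter.
sample = [".PDFPIC fig.pdf\n", "Some text.\n"]
result = list(preprocess(sample, lambda path: (0, 0, 612, 792)))
```

Because the function only consumes and produces lines, hooking it up to
standard input and output (or to whatever pipeline groff constructs) is a
one-liner, and the same shape carries over to any other language.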

Am I correct in guessing that a bounding box/MediaBox extractor would have a
lot of shared logic for PS and PDF?  If so, one preprocessor could perform
both tasks.  I guess we could call it "grobb" or something, and assign a -B
flag to it in groff(1).  In fact, if we have a preprocessor and claim an
option letter, it's probably best if it's as general-purpose as is reasonable.
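
As a rough illustration of how much logic the two cases could share, the
sketch below scans raw file contents for either a PostScript "%%BoundingBox:"
comment or a PDF "/MediaBox" array using one number pattern.  The names
find_box, PS_BOX, and PDF_BOX are made up for this example, and Python is used
only for illustration.  It is deliberately naive: it returns the first match
it sees, and a plain scan like this cannot find a MediaBox stored inside a
compressed stream, so it is no substitute for a real parser.

```python
import re

# One numeric token: optional sign, then digits with an optional fraction.
NUM = rb"([-+]?(?:\d+\.?\d*|\.\d+))"
FOUR_NUMS = rb"\s+".join([NUM] * 4)

# PostScript DSC comment, e.g. "%%BoundingBox: 0 0 612 792".
PS_BOX = re.compile(rb"^%%BoundingBox:\s*" + FOUR_NUMS, re.M)
# PDF page attribute, e.g. "/MediaBox [0 0 612 792]".
PDF_BOX = re.compile(rb"/MediaBox\s*\[\s*" + FOUR_NUMS + rb"\s*\]")


def find_box(data):
    """Return (llx, lly, urx, ury) as floats, or None if nothing matched.

    data is the file content as bytes.  A "%%BoundingBox: (atend)"
    comment does not match and is therefore treated as "not found".
    """
    m = PS_BOX.search(data) or PDF_BOX.search(data)
    if m is None:
        return None
    return tuple(float(g.decode("ascii")) for g in m.groups())


ps_box = find_box(b"%!PS-Adobe-3.0\n%%BoundingBox: 10 20 110 220\n")
pdf_box = find_box(b"<< /Type /Page /MediaBox [0 0 612 792] >>")
```

Only the two patterns differ between the formats; the number parsing and the
result type are common, which is the kind of overlap a combined tool would
exploit.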


