[bug #55107] PDFPIC: .psbb: support extraction of MediaBox from pdf file

From: Keith Marshall
Subject: [bug #55107] PDFPIC: .psbb: support extraction of MediaBox from pdf files
Date: Sat, 30 Oct 2021 10:36:22 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0

Follow-up Comment #8, bug #55107 (project groff):

[comment #5:]
> Am I correct in guessing that a bounding box/MediaBox extractor would have a
> lot of shared logic for PS and PDF?
No.  I wrote my proposed extractor as a lex/yacc parser; from its initial
state, it diverges into two entirely distinct branches of execution, on the
basis of whether the first few bytes of the image file are '%!PS-Adobe-' or
'%PDF-', (and aborts, if anything else); the two branches converge only at the
bitter end, when the yacc terminal rule (ultimately) assigns the troff
bounding box registers.
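
To illustrate the initial dispatch only, here is a minimal sketch in plain C
rather than lex; the names classify_image, IMAGE_PS, IMAGE_PDF, and
IMAGE_UNKNOWN are hypothetical, chosen for this example, and are not taken
from the proposed extractor.  It assumes the caller has already read the
first few bytes of the image file into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum image_kind { IMAGE_UNKNOWN, IMAGE_PS, IMAGE_PDF };

/* Classify an image file by its leading bytes: '%!PS-Adobe-' selects the
 * PostScript branch, '%PDF-' selects the PDF branch, and anything else is
 * reported as unknown, so that the caller can abort. */
enum image_kind classify_image(const char *buf, size_t len)
{
    static const char ps_magic[] = "%!PS-Adobe-";
    static const char pdf_magic[] = "%PDF-";
    const size_t ps_len = sizeof ps_magic - 1;   /* exclude the NUL */
    const size_t pdf_len = sizeof pdf_magic - 1;

    if (len >= ps_len && memcmp(buf, ps_magic, ps_len) == 0)
        return IMAGE_PS;
    if (len >= pdf_len && memcmp(buf, pdf_magic, pdf_len) == 0)
        return IMAGE_PDF;
    return IMAGE_UNKNOWN;
}
```

After this point, the two branches would share nothing until the final
assignment of the bounding box registers.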

FWIW, my EPS parsing code is feature complete.  OTOH, the PDF parsing code
works only for PDF-1.4 conformant files, (it lacks rules for interpretation of
XRefStm objects, Object streams, and deflated content).  The "significant
development", to which I referred, is to extend the existing lex pattern set,
and yacc grammar, to support those additional features, (if required).

Personally, I don't see a justification for implementing psbb as a
preprocessor.  I am willing to pursue an extended lex/yacc implementation,
(subject to Deri actually answering the question I've now asked twice, without
a response: should CropBoxes, or any of PDF's other bounding box attributes,
have precedence over the MediaBox attributes?), but if you insist on pursuing
a solution in Perl ... a disgusting language, IMO ... then I'm out.

