Re: Ghostscript/GhostPDL 9.22 Release Candidate 1

lilypond-devel

From:	David Kastrup
Subject:	Re: Ghostscript/GhostPDL 9.22 Release Candidate 1
Date:	Tue, 19 Sep 2017 17:35:12 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

Ken Sharp <address@hidden> writes:

> At 15:44 19/09/2017 +0200, David Kastrup wrote:
>
>
>>Are there any example documents with thousands of pages and tens of
>>thousands of PDF inclusions one could look at?
>
> I would suggest that the fact you want to 'include' tens of thousands
> of PDF files is the problem, really.

I prefer the term "challenge" myself since there is nothing inherently
problematic apart from the scale of the document.

> I appreciate you are trying to deal with an existing problem, but
> using Ghostscript to do something it wasn't intended for isn't really
> the best idea for solving the problem.
>
> As I've said elsewhere there is a genuine bug which can be exposed
> doing what you want with Ghostscript and it would not surprise me if
> in the long run it causes you another problem.

Neither would it surprise me.  But as I said, we are actually navigating
a compromise between various solutions and tools and the various
unexpected problems they cause.  If you can present a path that will not
cause any problem at all while still producing good documents of the
required size and type, nobody will be happier than myself.

But that does not appear to be a feasible option within reach any time
soon, so the possibility of a future problem does not keep me from
trying to deal with a current problem.

> It would be possible to write a tool which could reliably detect
> identical fonts in a PDF file, remove the duplicates and alter the
> references so that the PDF continued to work. In all honesty, if the
> problem is as important as you say, this is probably a better
> solution. A tailored program, specifically designed to solve a
> specific problem is much more likely to work reliably than trying to
> use a general purpose program, designed for a different problem.
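
The idea in the quoted paragraph (detect identical fonts, drop the
duplicates, repoint the references) can be illustrated with a minimal
sketch.  It is not a PDF tool: it assumes the font programs have already
been extracted into a hypothetical mapping from object number to raw
bytes, and it treats "identical" as byte-for-byte equality, which is a
simplification since embedded fonts in real files may differ in name or
subset while describing the same glyphs.  The names dedupe_streams and
rewrite_references are invented for this illustration.

```python
import hashlib


def dedupe_streams(streams):
    """Map each object id to the lowest id that holds identical bytes.

    `streams` is a hypothetical {object_id: bytes} mapping standing in
    for font programs already extracted from a PDF file.
    """
    canonical = {}  # content digest -> first object id with that content
    remap = {}      # object id -> object id to reference instead
    for obj_id in sorted(streams):
        digest = hashlib.sha256(streams[obj_id]).hexdigest()
        remap[obj_id] = canonical.setdefault(digest, obj_id)
    return remap


def rewrite_references(refs, remap):
    """Repoint a list of object references at the surviving copies."""
    return [remap[r] for r in refs]


# Toy data: objects 1, 3 and 4 carry the same bytes.
fonts = {1: b"font-A", 2: b"font-B", 3: b"font-A", 4: b"font-A"}
remap = dedupe_streams(fonts)
survivors = sorted(set(remap.values()))
```

Here objects 3 and 4 would be dropped and every reference to them
redirected to object 1; a real tool would additionally have to parse and
rewrite the PDF object structure itself.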

TeX is designed for the problem of creating documents and all current
TeX engines offer ways of including externally created inclusions in a
graphic format.  And Ghostscript, far from being a general purpose
program, is designed for executing PostScript code and producing
printable renditions, even if its PDF writer was created considerably
later than its original PostScript interpreting core.

So we are not really using anything at cross-purposes just because we
are employing it at large scale.  To make this a bit less theoretical,
the various versions of the Notation Reference can be found at
<http://lilypond.org/doc/v2.19/Documentation/web/notation>.  For getting
an impression of the content, you may look at the "split HTML" version,
and the PDF is there as well.

> This is extracted from an email I decided earlier not to send:
> -----------------------------------------------------------------------------

[...]

> Assuming that you are using TeX throughout for your documentation,
> then it seems to me that you should be creating your final document by
> appending the various TeX documents together and then producing a
> final PDF, instead of appending multiple PDF files.

This is a misconception of our document creation process.  There is only
a single TeX document (actually a Texinfo document), but interspersed
with the main text it includes example output from several thousands of
individual LilyPond runs.  LilyPond's current native output format for
this purpose is PostScript, which is converted to PDF using ps2pdf
(namely, Ghostscript).  Those PDF files are inserted into the final PDF
file while it is being generated by a TeX engine from the Texinfo input.

Producing at first a DVI file and turning that into a single PostScript
file then converted into PDF rather than using a PDF-producing (and
including) TeX engine sounds like a workable idea until you realize that
the DVI/PostScript path is badly equipped to work with Unicode-range
fonts: PostScript is only part of "legacy" TeX workflows centered around
8-bit encodings.

> Presumably you want to show some parts of Lilypond as well,

Not "as well": this is actually the principal problem.  All the rest is
a single document, consequently not having a lot of font overhead.

> so I would create EPS figures for those. It will of course increase
> the number of font inclusions again, but in the case of Lilypond I
> don't think that you can be merging the fonts anyway, because Lilypond
> always uses glyphshow, and pdfwrite will create a uniquely named font
> for each usage.

I am not into the details here (Masamichi-san?), but this font merging
of the included files is _exactly_ responsible for the reported space
savings.

> So you aren't gaining any benefit from exploiting the Ghostscript bug
> with the Lilypond output.

But we are.  I hope that Masamichi-san (or Kurt?) can provide the
details here in order to give you a better picture.

> So by maintaining the text and layout in TeX, inserting EPS figures as
> required, and only producing PDF as the last step in the process you
> would create a file which (as I understand it) would only contain a
> single instance of each font.
>
> In short I'm not really suggesting that you change anything except
> your working practices, and maintain your files as TeX files rather
> than as PDF.

Those files maintained as TeX files are not "included" as PDF but are
_produced_ as the final PDF output from a single TeX run that includes
hosts of PDF files resulting from LilyPond runs (which LilyPond first
writes out in PostScript and has then converted to PDF using ps2pdf).
The _TeX_ files are never converted into "intermediate PDF".  There may
be something like a dozen exceptions or so, but the vast bulk of
"intermediate PDF" is generated from LilyPond's PostScript output via
ps2pdf.
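
To make the shape of that intermediate step concrete, here is a rough
sketch that only plans the per-snippet conversions, one ps2pdf call per
PostScript file.  It is an illustration under assumptions, not the
actual LilyPond build: the file names and the helper names
ps2pdf_command and plan_conversions are made up, and nothing is
executed.

```python
from pathlib import Path


def ps2pdf_command(ps_path):
    """Return the argv for converting one PostScript file to PDF.

    Uses the plain `ps2pdf input.ps output.pdf` form; the output name
    is simply the input name with a .pdf suffix (an assumption).
    """
    ps = Path(ps_path)
    return ["ps2pdf", str(ps), str(ps.with_suffix(".pdf"))]


def plan_conversions(ps_paths):
    """One independent conversion per snippet; nothing is run here."""
    return [ps2pdf_command(p) for p in ps_paths]


# Hypothetical snippet files standing in for thousands of LilyPond runs.
commands = plan_conversions(["snippets/a.ps", "snippets/b.ps"])
```

Each command yields a separate PDF carrying its own embedded fonts, which
is why the font overhead scales with the number of inclusions once the
TeX engine pulls all of them into the single final document.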

> Because I don't have any knowledge of your workflow (or TeX) I cannot
> say if this is reasonable, it may well not be.

It's assuming a different problem than the one we are dealing with.  So
obviously my attempts at explanation have been assuming too much prior
knowledge, putting us on different pages and talking about different
problems.  I apologize for wasting your time in that manner: we may well
disagree about how to best solve LilyPond's problems, but we should be
actually talking about the same problems for this to mean anything.

-- 
David Kastrup
