Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1

From: Knut Petersen
Subject: Re: [gs-devel] Ghostscript/GhostPDL 9.22 Release Candidate 1
Date: Tue, 19 Sep 2017 07:48:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

On 19.09.2017 at 02:27, Perry Hutchison wrote:
There is a tool for using this method of removing duplicate fonts.
As I see it, the availability of a separate tool to do the same thing
is a reason to _not_ provide a duplicate capability in Ghostscript.
Those who want that processing (despite the risks that Ken mentioned)
can use extractpdfmark.

Masamichi described extractpdfmark in a misleading way. extractpdfmark is a helper program that generates a PostScript file from a PDF document. Both the PDF document and the PostScript file generated by extractpdfmark are then processed by Ghostscript to generate a final PDF. extractpdfmark is only needed if the source PDF contains links / hyperrefs that would otherwise be broken during the final Ghostscript pass.

A typical way to produce a musicological document, a collection of songs, a 
LilyPond manual, etc. is described below:

1. Write your music.
2. Use "lilypond --bigpdf" to generate PDFs. Internally, LilyPond generates 
PostScript files and then runs Ghostscript to generate the PDFs.
3. Write the pdf(la)tex/lua(la)tex/xe(la)tex document that uses the PDFs 
generated in step 2.
4. Use pdf(la)tex/lua(la)tex/xe(la)tex to generate a PDF.
5. If necessary, use extractpdfmark to extract pdfmarks from the PDF generated 
in step 4 (extractpdfmark generates a PostScript file).
6. Use Ghostscript to generate the final PDF from the PDF generated in step 4 
and the PostScript file generated in step 5.
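
As a rough sketch, steps 2, 4, 5 and 6 could be written down as the command lines below. The Python helper only builds the argument lists and runs nothing. The file names, the helper name build_pipeline, and the choice of lualatex for step 4 are assumptions made for illustration; the Ghostscript switches are the usual pdfwrite ones plus the option discussed in this thread, and -dPrinted=false is assumed here as the way to keep link annotations.

```python
from typing import List


def build_pipeline(score: str, tex: str, final_pdf: str) -> List[List[str]]:
    """Return the commands for steps 2, 4, 5 and 6 as argv lists (nothing is run).

    All file names are hypothetical; adjust them to your own project.
    """
    stem = tex.rsplit(".", 1)[0]
    tex_pdf = stem + ".pdf"          # PDF produced by the TeX run (step 4)
    marks_ps = stem + ".pdfmark.ps"  # PostScript produced by extractpdfmark (step 5)
    return [
        # Step 2: big PDFs with fonts left in a form that can be merged later.
        ["lilypond", "--bigpdf", score],
        # Step 4: any of the TeX engines named above; lualatex is just an example.
        ["lualatex", tex],
        # Step 5: extract pdfmarks (links, destinations) into a PostScript file.
        ["extractpdfmark", "-o", marks_ps, tex_pdf],
        # Step 6: final pass that merges the duplicated fonts.
        ["gs", "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
         "-dPDFDontUseFontObjectNum", "-dPrinted=false",
         "-sOutputFile=" + final_pdf, tex_pdf, marks_ps],
    ]


if __name__ == "__main__":
    for argv in build_pipeline("tune.ly", "book.tex", "book-final.pdf"):
        print(" ".join(argv))
```

Whether PDFDontUseFontObjectNum is accepted depends on the Ghostscript version in use, which is exactly the question raised in this thread.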

This sounds a bit complicated, but the reduction in file size is significant. 
This was discussed on the Ghostscript Bugzilla in 2014.

Without LilyPond's "--bigpdf" option, our notation manual had a size 
of 26 MB after step 4.
With the "--bigpdf" option and steps 5 and 6, the file size after step 4 
increased to 116 MB, but the PDF generated in step 6 was only 
5.9 MB. That means we were able to eliminate more than 20 MB of duplicated fonts.

Another example is gotlandstoner, a collection of folk tunes from Sweden. If 
you remove the PDFDontUseFontObjectNum option, book 3 has a file size of 
13,706,324 bytes. With a Ghostscript that has the PDFDontUseFontObjectNum 
option enabled, that boils down to 2,447,232 bytes.
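
The relative savings behind these figures can be checked with a few lines of Python. This is only an arithmetic sketch: the helper name reduction_percent is made up for illustration, and the sizes for the notation manual are the rounded MB values quoted above, so that percentage is approximate.

```python
def reduction_percent(before: float, after: float) -> float:
    """Percentage by which a file shrinks going from `before` to `after`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # gotlandstoner, book 3: exact byte counts quoted above.
    print(round(reduction_percent(13706324, 2447232), 1))
    # Notation manual: 26 MB (no --bigpdf) versus 5.9 MB (after step 6),
    # both rounded values taken from the text.
    print(round(reduction_percent(26.0, 5.9), 1))
```

With the numbers quoted above this gives roughly 82% for gotlandstoner book 3 and roughly 77% for the notation manual.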

In an earlier message in this thread Ken Sharp wrote: "Risking incorrect output for the minimal benefit of a slightly smaller file seems unwise to me." Yes, the default should be not to enable PDFDontUseFontObjectNum. But as I pointed out above, the benefit of the PDFDontUseFontObjectNum option is not just a "slightly smaller file": it is a very significant reduction in file size, often exceeding 80%.

I understand why the default behavior of Ghostscript changed. But could anyone 
who advocates removing PDFDontUseFontObjectNum be so kind as to give a clear 
explanation of why keeping it would be a bad idea?

