From: Deri James
Subject: [bug #66323] [gropdf] rendering differences between PostScript and PDF output?
Date: Fri, 18 Oct 2024 17:39:22 -0400 (EDT)

Follow-up Comment #33, bug #66323 (group groff):

[comment #25:]
> Thanks. By comparing what your dump does and what my attempts didn't, I
> managed to get an equivalent setup locally for -Tpdf as I have for -Tps (not
> that it didn't fight me every step of the way). As a bonus, ps2pdf(?) no
> longer spends minutes unpicking the chinese font.
> 
> The test dataset was obtained by appending
[...] 
> As I have it configured, -Tps and -Tpdf use the exact same metrics and the
> exact same font files.

Not quite. As I have pointed out before, in your new* pdfs grops and gropdf
now use the same metadata but different embedded fonts. gropdf embeds the
font pointed to in the download file. grops does not embed the base35 fonts
in the postscript it produces, so ghostscript embeds its own copies from
/usr/share/ghostscript/*/Resource/Font, which are different versions of the
URW fonts. However, the difference in the glyphs between the two versions is
probably minimal; it is the metadata which makes the major difference, and
you have sorted that out fine.

In the two attached pdfs I have got gropdf and ghostscript using the same
fonts.

> The trends I see are thus:
> 
> newpdf6ps6.pdf has everything, but neither -Tpdf outputs have STIX or
> TW-Sung-98_1 characters.

Bug in gropdf (Ingo was correct, it bit me). Change line 3393 of current
gropdf to:-

    $foundry=$1 if $fontnm=~m/^(.)-/;

Fixed in HEAD, git pull.
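
To illustrate what the pattern in that one-line fix matches, here is a minimal
Python sketch of the same regular expression. It is not gropdf's code (gropdf
is Perl); the function name and the sample font names are hypothetical and
only show the matching behaviour: a single leading character followed by a
hyphen is captured as the foundry, and anything else leaves the default alone.

```python
import re

# Same pattern as the Perl fix: one character, then a hyphen, at the start.
FOUNDRY_RE = re.compile(r"^(.)-")


def foundry_of(fontnm, default=""):
    """Return the one-character foundry prefix of fontnm, or default.

    Mirrors `$foundry=$1 if $fontnm=~m/^(.)-/;`: the value only changes
    when the pattern matches.
    """
    match = FOUNDRY_RE.match(fontnm)
    return match.group(1) if match else default


if __name__ == "__main__":
    # Hypothetical names, chosen only to exercise both branches.
    for name in ("U-TR", "TR"):
        print(name, "->", repr(foundry_of(name)))
```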

> On -Tpdf, the на pair is overkerned. (And I'd say it's /over/kerned not
> just "more kerned", it definitely looks weird.)

Wrong font. Not embedded, see above.

> On -Tpdf everything's, like, inconsistently shifted around by like 1pt. This
> is blameless but still weird.

Wrong font. You can see this by running pdffonts:-

[derij@pip nabi (master)]$ pdffonts newpdf6ps6.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
JNMFLI+NimbusRoman-Italic            Type 1C           Custom           yes yes yes     31  0
EMWDZW+NimbusMonoPS-Regular          Type 1C           Custom           yes yes no      33  0
JNUELA+NimbusMonoPS-Bold             Type 1C           Custom           yes yes no      35  0
LTFQDS+NimbusRoman-Bold              Type 1C           WinAnsi          yes yes no      37  0
JJSGTG+NimbusRoman-Regular           Type 1C           Custom           yes yes no      39  0
GDRORT+NimbusRoman-Italic            Type 1C           WinAnsi          yes yes no      47  0
GZVNUO+NimbusRoman-Regular           Type 1C           Custom           yes yes yes     45  0
ELKFXB+STIX-Regular                  Type 1C           Custom           yes yes yes     70  0
WYEFFC+TW-Sung-98_1                  Type 1C           Custom           yes yes yes     72  0
TDCAQH+TW-Sung-98_1                  Type 1C           Custom           yes yes yes     74  0
XRBVSF+Symbola                       Type 1C           Custom           yes yes no      66  0
BDRMJB+topaz                         TrueType          WinAnsi          yes yes no      76  0
SDXHZD+DejaVuSans                    Type 1C           Custom           yes yes yes     68  0

Ghostscript has embedded all fonts (emb=yes). Notice the groff T and C fonts
have Nimbus* names.

[derij@pip nabi (master)]$ pdffonts newpdf95c.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
NFOCCZ+Symbola                       Type 1            Custom           yes yes no      70  0
BUYXVJ+DejaVuSans                    Type 1            Custom           yes yes no      74  0
STIX-Regular                         Type 1            Custom           no  no  no      77  0
TW-Sung-98_1                         Type 1            Custom           no  no  no      80  0
UNSAWY+topaz                         Type 1            Custom           yes yes no      84  0
Times-Bold                           Type 1            Custom           no  no  no      87  0
Times-Italic                         Type 1            Custom           no  no  no      90  0
Times-Roman                          Type 1            Custom           no  no  no      93  0
Courier                              Type 1            Custom           no  no  no      96  0
Courier-Bold                         Type 1            Custom           no  no  no      99  0

Not all fonts are embedded: T and C because -P-e was not given, and STIX and
TW-Sung because of the gropdf bug fixed above.
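
The check done by eye on the pdffonts listings (which fonts have emb=no) can
be sketched in Python. This is only an illustration working on pdffonts text
output that has already been captured; the function name is hypothetical, and
it assumes font names contain no spaces and that each row ends with the five
fields emb, sub, uni, object number and generation, as in the listings here.

```python
def unembedded_fonts(pdffonts_output):
    """Return the names of fonts whose 'emb' column is 'no'."""
    names = []
    for line in pdffonts_output.splitlines():
        tokens = line.split()
        # Skip blank lines, the header row and the ruler of dashes.
        if len(tokens) < 7 or tokens[0] == "name":
            continue
        if set(line.strip()) <= set("- "):
            continue
        # Counting from the right: generation, object number, uni, sub, emb.
        if tokens[-5] == "no":
            names.append(tokens[0])
    return names


# A few rows copied from the newpdf95c.pdf listing.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
NFOCCZ+Symbola                       Type 1            Custom           yes yes no      70  0
STIX-Regular                         Type 1            Custom           no  no  no      77  0
Times-Roman                          Type 1            Custom           no  no  no      93  0
"""

if __name__ == "__main__":
    print(unembedded_fonts(SAMPLE))
```

Counting fields from the right rather than by column position keeps the sketch
working for type names that contain a space, such as "Type 1C".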

> groff HEAD justifies (fills? whatever the nomenclature is) the "all comments
> were applied" line differently so it breaks it mid-word earlier. Conversely
> with the polyomino link line.

The difference in the "all comments" line is a difference between 1.22.4 and
1.23.0: if I run pdfmom on HEAD both with -Tps and with the default, the word
breaks in the same place. So it is not a difference between grops and gropdf,
but a difference between groff versions.

The polyomino difference has been fixed in HEAD.

> groff HEAD -mom spaces the 2.2.1. heading. Good!
> 
> I can't attach site-font/ because it blows the max file size many times
> over, so it's at
> https://lfs.nabijaczleweli.xyz/0025-groff-Tpdf-fonts/site-font.tar.zst. This
> is for groff HEAD on sid. This should ease testing significantly.
> 
> (file #56544, file #56545, file #56546)

I have generated a grops and a gropdf version using your site-font (thanks,
it helps with debugging), HEAD and the fixes outlined above, called
dj-newpdf6ps6 and dj-newpdf95c. The gropdf version has the following
advantages: bookmark page numbers match the printed page numbers, and
selecting text from the pdf results in the correct unicode characters being
used:-

o another paper on advice of the sponsor).
 😩😖🙄
 ⋈⋈▷
 ⟕⟖⟗
 不用烤箱教你在家做面包简单一蒸蓬松宣软
 ␤

Whereas the ghostscript version shows:-

t to another paper on advice of the sponsor).
 😩😖🙄
 ⋈⋈▷
 ⟕⟖⟗
 不用烤箱教你 在家做面包 简 单 一 蒸蓬松宣软

The ghostscript version has introduced spaces in the Sung font text and
ignored Topaz. Unicode characters are allowed in pdf metadata (Author/Title)
and bookmarks (HEAD only). The Contents page is automatically positioned
after the title page. On a Raspberry Pi 4B (Debian 11) -Tpdf takes just over
12 seconds and -Tps takes almost 3 1/2 minutes. The -Tps version is smaller,
but if you run the -Tpdf version through ps2pdfwr as a final step (0.6
seconds) it produces the smallest version.

Differences.

The contents page starts at different positions on the page (maybe an issue
with mom).
Diffpdf shows pixel differences which change at different zoom levels.

As an example, take "va_arg" on physical page 4 (page number 2) of
dj-newpdf95c.pdf and its equivalent on physical page 3 of
dj-newpdf95c-ps.pdf, at the top of the page, after the line "Replace".
Diffpdf does not like this word, so let's see how it is produced in both
pdfs.

First this is the source from nab3359.mom:-

.defcolor darkgreen rgb #006400  /# sourced from ps.tmac (0x64=100 * 257=25700)

.XCOLOR    darkgreen green

.ds va_arg   \*[green]\f(CBva_arg\fP\*[black]

.QUOTELINE 2 red
.in +1cm
.nf
\*[va_arg], \*[va_end], and \*[va_copy].  If access to the varying arguments
is desired, the called function

These are the generated grout commands:-

f8
V103603         /# Vertical position from top of page.
H85038          /# Horizontal position in millipoints.
mr 0 25700 0    /# RGB colour
tva_arg
f5
mr 0 0 0
t,

Grops produces this postscript:-

 0 0.392 0 Cr/F7 10.5/NimbusMonL-Bold@0 SF(va_arg)85.038 103.603 Q 0 0 0 Cr
F6(,)

 (0.392=25700/65535) rounded to 3dp.
 (85.038 103.603 = x,y in points)
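
The colour arithmetic in the notes above can be checked with a short Python
sketch. It follows only the numbers given in this message (an 8-bit component
times 257 gives the grout value, and dividing by 65535 and rounding to three
decimal places gives the postscript value); the function names are
hypothetical and this is not grops code.

```python
def hex8_to_grout16(component8):
    """Scale an 8-bit colour component (0..255) to the 0..65535 range.

    255 * 257 == 65535, so multiplying by 257 maps full scale to full scale.
    """
    return component8 * 257


def grout16_to_ps(component16):
    """Format a 0..65535 component as a fraction rounded to 3 decimal places."""
    return f"{component16 / 65535:.3f}"


if __name__ == "__main__":
    green16 = hex8_to_grout16(0x64)   # the 0x64 from #006400
    print(green16, grout16_to_ps(green16))
```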

The origin groff uses for positioning 0,0 is top left, but pdf origin is
bottom-left, so the conversion is to subtract groff's y coordinate from the
current media length (a4 document = 842.000-103.603 = 738.397).
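
The conversion described above can be written as a small Python sketch. It
assumes the grout H and V values are in millipoints and the page is a4 with a
media length of 842 points, as in this example; the function name is
hypothetical and the code is not taken from gropdf.

```python
from decimal import Decimal

A4_HEIGHT_PT = Decimal("842")  # media length used in this example


def grout_to_pdf(h_millipoints, v_millipoints, media_height_pt=A4_HEIGHT_PT):
    """Convert grout H/V (origin top left) to pdf x,y (origin bottom left)."""
    x = Decimal(h_millipoints) / 1000
    # groff measures down from the top edge, pdf measures up from the bottom.
    y = media_height_pt - Decimal(v_millipoints) / 1000
    return x, y


if __name__ == "__main__":
    x, y = grout_to_pdf(85038, 103603)   # H85038 and V103603 from the grout
    print(f"1 0 0 1 {x} {y} Tm")
```

Decimal is used instead of float so that the millipoint values divide out
exactly.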

Ghostscript reads the postscript and produces this in the
dj-newpdf95c-ps.pdf.

0 0.39209 0 rg
q
10 0 0 10 0 0 cm BT
/R11 10.5 Tf
1 0 0 1 85.0379 738.287 Tm
(va_arg)Tj
ET

Neither the RGB colour nor the x,y position is exactly the same as requested
by the postscript (0.392 v. 0.39209 and 85.038,738.397 v. 85.0379,738.287).

Gropdf reading the same grout above produces this in the dj-newpdf95c.pdf:-

1 0 0 1 85.038 738.397 Tm
0 Tc
0.000 0.392 0.000 rg
/F8 10.5 Tf
0.000 Tw [ (va_arg)] TJ

As you can see from the above, the postscript requested green .392 and gs is
using colour .39209.
Postscript requests x,y for "va_arg" at 85.038,738.397 and gs is using
85.0379,738.287.
gropdf, on the other hand, sets the values exactly; ghostscript does a very
good job of approximating the postscript. I believe it is these differences
which diffpdf is highlighting, but it would be incorrect to assume it was
gropdf which was inaccurate in glyph placement/colour.
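
To put numbers on those differences, a minimal Python sketch can subtract the
requested values from the ones found in each pdf. The figures are copied from
the content streams quoted in this message; the dictionary and function names
are hypothetical.

```python
from decimal import Decimal

# Values requested via grout/postscript, after conversion to pdf coordinates.
REQUESTED = {"green": Decimal("0.392"),
             "x": Decimal("85.038"),
             "y": Decimal("738.397")}

# Values found in the two pdfs.
GHOSTSCRIPT = {"green": Decimal("0.39209"),
               "x": Decimal("85.0379"),
               "y": Decimal("738.287")}
GROPDF = {"green": Decimal("0.392"),
          "x": Decimal("85.038"),
          "y": Decimal("738.397")}


def deltas(requested, actual):
    """Return actual minus requested for every key in requested."""
    return {key: actual[key] - requested[key] for key in requested}


if __name__ == "__main__":
    print("ghostscript:", deltas(REQUESTED, GHOSTSCRIPT))
    print("gropdf:     ", deltas(REQUESTED, GROPDF))
```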


(file #56553, file #56554)

    _______________________________________________________

Additional Item Attachment:

File name: dj-newpdf95c-ps.pdf            Size: 84KiB
    <https://file.savannah.gnu.org/file/dj-newpdf95c-ps.pdf?file_id=56553>

File name: dj-newpdf95c.pdf               Size: 128KiB
    <https://file.savannah.gnu.org/file/dj-newpdf95c.pdf?file_id=56554>


    AGPL NOTICE

These attachments are served by Savane. You can download the corresponding
source code of Savane at
https://git.savannah.nongnu.org/cgit/administration/savane.git/snapshot/savane-a0d195b6c3392c5f36ab8952df55e848831b569e.tar.gz


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66323>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
