groff

Re: [Heirloom] Using the Symbola font in Heirloom troff


From: T. Kurt Bond
Subject: Re: [Heirloom] Using the Symbola font in Heirloom troff
Date: Wed, 5 Aug 2020 09:14:48 -0400

Interesting.

On Wed, Aug 5, 2020 at 8:55 AM Richard Morse <pukku@mac.com> wrote:

> Hi! The issue arises before it even gets to the PostScript.
>
> If you run the following commands:
>
>         .do xflag 3
>         .lc_ctype UTF-8
>         .fp 5 Symbola Symbola ttf
>         .ft Symbola
>         ❊ works
>         .sp
>         🂡 char
>         .sp
>         \U'1F0A1' uesc
>         .sp
>         \[u1F0A1] name
>         .sp
>
>
> Through Heirloom as `troff test.roff | less`, you can see that the output
> is (in part, once the heading is all set up):
>
>         H72000
>         V12000
>         CPSspoked8teardroppropellerstar
>         wh11510cw
>         h7670co
>         h5140cr
>         h4010ck
>         h5560cs
>         n12000 0
>         H72000
>         V36000
>         h6660cc
>         h4490ch
>         h5760ca
>         h5220cr
>         n12000 0
>         H72000
>         V60000
>         h6660cu
>         h5760ce
>         h4550cs
>         h3920cc
>         n12000 0
>         H72000
>         V84000
>         CPSu1F0A1
>         wh11270cn
>         h5760ca
>         h5220cm
>         h8660ce
>         n12000 0
>
> You’ll notice that the star character, which works in the PDF, and the
> named character (remember that, inside the font file, u1F0A1 is the
> character name) both show up in ‘CPS’ statements. But the two other places
> you would expect to see something (from the actual character and the \U
> escape), it is entirely missing. You have the ‘H72000’ command, the ‘V’
> command (with the vertical offset), and then it goes immediately into the
> latin text (seemingly without even including the space that should exist?).
>
> So for whatever reason, it isn’t seeing the character as something that
> should be output.
>
> Ricky
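
For anyone who wants to repeat the comparison above without reading the raw output by eye, here is a minimal Python sketch. It assumes only the handful of commands visible in the excerpt (`H`, `V`, `h`, `w`, `c<char>`, `CPS<name>`, `n<a> <b>`); the `summarize` function and the `SAMPLE` string are hypothetical names made up for this illustration, not part of Heirloom troff, and the real intermediate format has more commands than this handles.

```python
import re

# Excerpt of the troff output quoted in the message above.
SAMPLE = """\
H72000
V12000
CPSspoked8teardroppropellerstar
wh11510cw
h7670co
h5140cr
h4010ck
h5560cs
n12000 0
H72000
V36000
h6660cc
h4490ch
h5760ca
h5220cr
n12000 0
H72000
V60000
h6660cu
h5760ce
h4550cs
h3920cc
n12000 0
H72000
V84000
CPSu1F0A1
wh11270cn
h5760ca
h5220cm
h8660ce
n12000 0
"""

# Only the commands that appear in the excerpt are recognized (an assumption).
TOKEN = re.compile(
    r"CPS(?P<named>\S+)"       # named glyph, e.g. CPSu1F0A1
    r"|c(?P<char>.)"           # single character, e.g. cw
    r"|[HVh]-?\d+"             # absolute or relative motion
    r"|w"                      # word marker
    r"|(?P<eol>n\d+ \d+)"      # end of an output line
)


def summarize(device_output):
    """Return one (named_glyphs, plain_text) pair per output line."""
    lines = []
    named, text = [], []
    for raw in device_output.splitlines():
        raw = raw.strip()
        pos = 0
        while pos < len(raw):
            m = TOKEN.match(raw, pos)
            if m is None:
                break  # unknown command: ignore the rest of this input line
            if m.group("named"):
                named.append(m.group("named"))
            elif m.group("char"):
                text.append(m.group("char"))
            elif m.group("eol"):
                lines.append((named, "".join(text)))
                named, text = [], []
            pos = m.end()
    if named or text:
        lines.append((named, "".join(text)))
    return lines


if __name__ == "__main__":
    for named, text in summarize(SAMPLE):
        print(named, text)
```

An output line whose list of named glyphs is empty is one where the special character was dropped before the output stage, which is the pattern described above for the literal character and the \U escape.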
>
> > On Aug 5, 2020, at 1:30 AM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> >
> > Looking at the postscript output there is a "/uni1F0A1 9429 def" and a
> > "/uni1F10A" in a "/Encoding-@15@36 [...] def"; is that part of the font
> > machinery?  (I'm sadly ignorant of PostScript, alas.)
> >
> > Looking at troff/troff.d/otf.c I see that there is a struct WGL that
> > contains female and male entries.  At the beginning of the struct is a
> > comment that consists of "/* WGL4 */".  Googling that led to Windows Glyph
> > List 4.  Taking a leap, I added the unicode characters FEMALE SIGN and MALE
> > SIGN to my test document.  Those show up fine in the final PDF output.
> > Maybe this is connected?  At this point I suspect without much evidence
> > that characters that are not in the StandardStrings array, the
> > MacintoshStrings array, or the WGL array don't get output.  Maybe.  I'll
> > have to investigate some more.
> >
> > On Tue, Aug 4, 2020 at 11:10 PM Richard Morse <pukku@mac.com> wrote:
> > Hm. Just for my edification, I tried a few things.
> >
> > I’m on a Mac, and I don’t know when I compiled Heirloom troff, but it
> > was a year or two ago, so some things may be different.
> >
> > I downloaded the Symbola font from fontlibrary.org. The version I got
> > was .ttf, not .otf.
> >
> > The various things that you tried did not work for me either. \[u1F0A1]
> > did work, but that’s because (according to fret, at least), that’s the
> > font’s internal name for the symbol, which is not guaranteed to be true
> > across all fonts, so you can’t really use that for a “fallback” system.
> >
> > Looking at the output of troff without going through dpost, it looks
> > like it is completely ignoring the character. I tried explicitly setting
> > LC_CTYPE to ‘en_US.UTF-8’ and ‘UTF-8’ (both in the terminal, and using the
> > .lc_ctype command), but that had no effect.
> >
> > I wonder if troff has a compiled in list of unicode characters that it
> > understands, and if you try to use one it deems invalid it just ignores it?
> > (This may be borne out by
> > https://github.com/n-t-roff/heirloom-doctools/blob/master/troff/troff.d/unimap.c,
> > but I don’t really know enough about the code to be certain.)
> >
> > Ricky
> >
> > > On Aug 4, 2020, at 10:14 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > >
> > > In Emacs M-x describe-coding-system tells me the coding system for
> > > saving the buffer is utf-8-unix.  I don't have any LC_* environment
> > > variables set, but LANG=en_US.UTF-8.
> > >
> > > I'm not very knowledgeable about the insides of Unicode fonts,
> > > unfortunately.
> > >
> > > On Tue, Aug 4, 2020 at 4:27 PM Richard Morse <pukku@mac.com> wrote:
> > > Huh. I’m afraid I’m out of my depth then; you might check and see if
> > > your LC_* environment variables are set to something incompatible with
> > > UTF-8 (or, maybe, check and make sure the file is in UTF-8, not UTF-16 or
> > > something if you’re on Windows), but hopefully someone with more experience
> > > and knowledge will speak up…
> > >
> > > Ricky
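
The "is the file really UTF-8" check suggested above can be done with a few lines of Python. This is only a sketch: `non_ascii_report` is a hypothetical helper name, and it does nothing beyond a strict UTF-8 decode followed by a listing of each non-ASCII code point with its encoded length. It tells you what is in the file, not how troff will treat it.

```python
import sys


def non_ascii_report(data):
    """Decode bytes strictly as UTF-8 and list the non-ASCII characters.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8
    (for example, if the file was saved as UTF-16).
    Returns a list of (code point label, UTF-8 byte length) pairs.
    """
    text = data.decode("utf-8")  # strict by default
    report = []
    for ch in text:
        if ord(ch) > 0x7F:
            report.append((f"U+{ord(ch):04X}", len(ch.encode("utf-8"))))
    return report


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Usage: python check_utf8.py test.roff
        with open(sys.argv[1], "rb") as fh:
            data = fh.read()
    else:
        # Fall back to the line from the test document in this thread.
        data = "🂡 And some normal text. ❊\n".encode("utf-8")
    for label, nbytes in non_ascii_report(data):
        print(label, nbytes)
```

If the decode raises an error, the encoding is the first thing to fix; if it succeeds, the input file is at least well-formed and the problem lies elsewhere.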
> > >
> > > > On Aug 4, 2020, at 3:59 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > > >
> > > > And if I add "and explicit unicode character reference \U'1F0A1'" to
> > > > the file, that character doesn't show up either.
> > > >
> > > > On Tue, Aug 4, 2020 at 2:47 PM Richard Morse <pukku@mac.com> wrote:
> > > >
> > > > >> According to the Heirloom Troff manual, I think that you cannot just
> > > > >> insert Unicode characters (although maybe if your LC* environment
> > > > >> variables are set correctly, you can?). It says:
> > > >>
> > > > >>> Both nroff and troff allow references to specific Unicode characters
> > > > >>> with the \U'X' escape sequence; it causes the character at position
> > > > >>> U+X to be printed (X is a hexadecimal number). For troff, it is
> > > > >>> required that this character is available in one of the fonts mounted
> > > > >>> at this point. As an example, \U'20AC' prints the Euro character €.
> > > > >>> When register .g is set to 1 Unicode characters can also be accessed
> > > > >>> with \[uXXXX] where XXXX is a four digit hexadecimal number.
> > > >>
> > > > >> So I think you would need to use `\U'1F0A1'` for the character to show up?
> > > >>
> > > >> Ricky
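
As a convenience for building test documents, the two spellings described in the manual excerpt above can be generated from a literal character. This is a rough sketch with a made-up helper name (`troff_escapes`); it only formats strings following the `\U'X'` and `\[uXXXX]` patterns quoted from the manual and says nothing about whether troff will actually emit the glyph.

```python
def troff_escapes(ch):
    """Return the \\U'X' and \\[uXXXX] spellings for a single character.

    The hexadecimal value is upper case and padded to at least four digits,
    matching the examples quoted from the manual.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    hexval = f"{ord(ch):04X}"
    return {
        "U": f"\\U'{hexval}'",       # e.g. \U'20AC'
        "named": f"\\[u{hexval}]",   # e.g. \[u20AC]
    }


if __name__ == "__main__":
    for ch in "€❊🂡":
        forms = troff_escapes(ch)
        print(forms["U"], forms["named"])
```

Putting each spelling on its own line of a test file, next to the literal character, makes it easy to see which forms survive into the troff output.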
> > > >>
> > > >>
> > > > >>> On Aug 4, 2020, at 12:28 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > > >>>
> > > > >>> (The heirloom-doctools README.md
> > > > >>> <https://github.com/n-t-roff/heirloom-doctools/blob/master/README.md>
> > > > >>> says to ask Heirloom doctools questions on this list.)
> > > >>>
> > > > >>> I'd like to use the Symbola font in Heirloom troff.   I tried the
> > > > >>> following:
> > > >>>
> > > >>> .do xflag 3
> > > >>> .\" fp 5 Optima Optima-Regular ttf
> > > >>> .fp 5 Symbola Symbola otf
> > > >>> .LP
> > > >>> Here is some normal text.
> > > >>> .\" PLAYING CARD ACE OF SPACES is Unicode 0x1F0A1
> > > >>> .ft Symbola
> > > >>> 🂡 And some normal text. ❊
> > > >>> .ft P
> > > >>> More normal text.
> > > >>>
> > > > >>> That's a literal PLAYING CARD ACE OF SPADES Unicode character at the
> > > > >>> start of the line between the two .ft requests.  That character does
> > > > >>> not show up in the troff output, even though the EIGHT TEARDROP-SPOKED
> > > > >>> PROPELLER ASTERISK Unicode character at the end of the line *does*
> > > > >>> show up, as CPSuni274A where the CPS<name> outputs the character of
> > > > >>> that name.  The Symbola font is embedded in the PDF output (created
> > > > >>> from the PostScript output), and the text "And some normal text" and
> > > > >>> the EIGHT TEARDROP-SPOKED PROPELLER ASTERISK Unicode character are in
> > > > >>> the Symbola font in the troff output.
> > > >>>
> > > > >>> However, if I manually add a CPSuni1F0A1 to the troff output, *that*
> > > > >>> character *does* show up.
> > > >>>
> > > > >>> Any ideas as to why the literal PLAYING CARD ACE OF SPADES Unicode
> > > > >>> character in the document source is being ignored and not written to
> > > > >>> the troff output?
> > > >>>
> > > > >>> I actually have a document that needs to use the PLAYING CARD ACE OF
> > > > >>> SPADES Unicode character.  The ultimate goal is to have the Symbola
> > > > >>> font used as a fallback font, which should happen automatically in
> > > > >>> Heirloom troff, since it searches all the fonts when a font is missing
> > > > >>> a character, but I made the example use the Symbola font directly
> > > > >>> because that shows the problem directly.
> > > >>>
> > > >>> --
> > > >>> T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> > > >>
> > > >>
> > > >
> > > > --
> > > > T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> > >
> > >
> > >
> > > --
> > > T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> >
> >
> >
> > --
> > T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
>
>

-- 
T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io

