Re: [Groff] ubuntu, groff and utf-8
From: Werner LEMBERG
Subject: Re: [Groff] ubuntu, groff and utf-8
Date: Wed, 23 Feb 2005 07:18:06 +0100 (CET)
> Reading info groff I found how to do utf-8 output, but nothing about
> input :(
This is correct, unfortunately. groff doesn't yet support UTF-8 input.
You have to convert your file first to something groff can understand.
Below is a small Perl script which does that. Note that it doesn't
`fake' glyphs, that is, it doesn't construct, say, `Amacron' from an
`A' and a `macron' glyph. Any volunteers for this?
Werner
======================================================================
#! /usr/bin/perl -w
#
# uni2groff.pl
#
# Convert UTF-8 encoded input to something groff 1.19 or greater
# can understand. It simply converts all Unicode characters >= U+0080
# to the form \[uXXXX].
#
# Usage:
#
# perl uni2groff.pl < infile > outfile
#
# You need perl 5.6 or greater.
use strict;
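# Decode standard input as UTF-8 so that ord() sees code points, not bytes.
binmode(STDIN, ":utf8");
while (<>) {
    # Replace every character outside Basic Latin (U+0000..U+007F)
    # with groff's \[uXXXX] escape.
    s/(\P{InBasicLatin})/sprintf("\\[u%04X]", ord($1))/eg;
    print;
}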
# EOF
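
As a possible starting point for the glyph construction asked about
above, here is a minimal sketch, not a tested solution. It assumes
that groff accepts composite glyph names of the form \[uXXXX_YYYY]
(base character followed by its combining marks), and it relies on
the Unicode::Normalize module, which ships with perl 5.8 and later
(so it needs a newer perl than the script above). The function name
decompose_for_groff is made up for this example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper, not part of uni2groff.pl: takes a string of
# already decoded characters and returns groff input in which every
# non-ASCII character is written in decomposed form.
sub decompose_for_groff {
    my ($text) = @_;

    # Canonical decomposition: U+0100 becomes U+0041 U+0304, etc.
    my $nfd = NFD($text);

    # Match a base character plus its combining marks, or a run of
    # marks that has no base character in front of it.
    $nfd =~ s{(\PM\pM*|\pM+)}{
        my @chars = split //, $1;
        (@chars == 1 && ord($chars[0]) < 0x80)
            ? $chars[0]    # plain ASCII passes through unchanged
            : '\\[u' . join('_', map { sprintf('%04X', ord($_)) } @chars) . ']';
    }ge;

    return $nfd;
}

# `Amacron' is emitted as `A' plus the combining macron.
print decompose_for_groff("\x{0100}bc\n");    # prints \[u0041_0304]bc
```

In the loop of uni2groff.pl, `print decompose_for_groff($_);` could
take the place of the substitution and the bare `print'. Whether the
output device can actually draw the composite still depends on its
fonts, and NFD also splits Hangul syllables into jamo, which may not
be wanted.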