Re: @Eq, column-width variable, CJK/unicode support?


From: David Kuehling
Subject: Re: @Eq, column-width variable, CJK/unicode support?
Date: 23 Sep 2004 12:37:34 +0200

>>>>> "jp8swe" == jp8swe  <address@hidden> writes:

> Hi!  Just switched from LaTeX last week (got tired of the poor i18n
> support of TeX).

I also made the switch 6 months ago, after getting more and more upset
with TeX.  So far I haven't regretted it too much... although Lout has
some problems of its own.

> Another thing I'd like to do is a document where I can switch from one
> to two columns with a minimum of hassle (ie. only switching the
> ColumnNumber property). I have a fair number of plots in the file that
> I would like to be scaled to the column width (so that it scales down
> when using the two column mode). Is there a variable containing the
> column width that I can use with @Scale?

Hmm, something like 

   @DP
   {} @Scale @IncludeGraphics { ... }
   @DP

should always scale to fill the currently available width (user guide
p171).  Never try this within a @Display, though.

> OK, now the biggest problem. What about CJK support?? Pretty please
> with sugar on top! I'd need all three of them! Just left-to-right CJK
> support would be quite enough. With CJK support I guess unicode would
> come in handy as well...

I asked these questions in my first mail to this list (actually only
about the "J" in "CJK"); it seems that nothing like this is currently
available.  I also need it, but not too urgently (the project I need it
for won't have anything to print before 2006, I guess; until then
everything is XML and the jTeX output module works most of the time).
So I'm currently considering implementing this myself.

First I thought about an implementation via Lout's filters, which would
generate a Lout symbol for every CJK character, which would in turn
generate PostScript code, character by character.  Probably with some
scaling like `1.0f @Wide @Scale' -- per character.  *Very* inefficient
though, and it might not work well in all situations (e.g. getting this
into the databases that are used for translating words like "Figure",
"Appendix" etc.).  In particular I'm not sure whether the default CJK
PostScript fonts (Ryumin-Light and GothicBBB-Medium) should be typeset
non-proportionally, with all characters the same width.  I frequently
read Japanese texts that seem to be typeset with proportional fonts.
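
Roughly, I imagine the filter doing something like the following
(an untested sketch in Python; `@JChar' is a made-up symbol that would
have to be defined on the Lout side to select the CJK font and do the
`1.0f @Wide @Scale' trick per character):

   #!/usr/bin/env python
   # Hypothetical Lout filter sketch: turn each CJK character of the
   # input into one Lout object.  "@JChar" is a made-up symbol that
   # would select the CJK PostScript font and apply the scaling.
   import sys

   def is_cjk(ch):
       # crude check: kana, CJK ideographs, fullwidth forms
       return any(lo <= ord(ch) <= hi for lo, hi in
                  ((0x3000, 0x30FF), (0x4E00, 0x9FFF), (0xFF00, 0xFFEF)))

   def emit(text, out):
       for ch in text:
           if is_cjk(ch):
               out.write('@JChar { "%04X" }\n' % ord(ch))
           else:
               out.write(ch)   # real code would also quote Lout specials

   if __name__ == '__main__':
       emit(sys.stdin.read(), sys.stdout)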

Proportional typesetting would make the filter script much more
difficult, and getting this to work nicely with the current Lout font
size etc. is even more of a hurdle.

I also considered doing this the CJK-TeX-ish way: splitting Japanese
fonts into subfonts, each with some 96 characters (e.g. following the
JIS code plane).  Referencing a character (with the right width from
the font metrics) could then be done via some font-switching code.
Decoding the Japanese input coding would still be difficult.  Two
methods seem possible: (1) again use some filter script; (2) define
each Japanese character, in its (say) EUC-JP coding, via `def'.  Lout's
`def' allows names consisting of multiple non-alphanumeric characters,
which should do the job.  The problem is that Lout may classify some
character codes > 128 as letters (Latin-1 accented characters etc.,
expert guide, p. 13), which would interfere with the Japanese charcode
definitions; I don't know whether that can be disabled.  This is also
quite a hack and will interfere with Lout's font handling code.  It
will also, again, lead to problems with getting Japanese characters
into those standard language-dependent strings.  Another problem is
Lout's limit on the total number of fonts (256, I think).  Chinese,
Japanese and Korean won't fit into 256 subfonts -- at least not when
using multiple font styles (Mincho vs. Gothic etc.).
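
Just to get a feeling for the numbers, a quick back-of-the-envelope
calculation of the EUC-JP -> subfont mapping and of how many subfonts
one 94x94 JIS plane already needs (using the 96-characters-per-subfont
split mentioned above):

   # EUC-JP encodes a JIS X 0208 character as two bytes, each in
   # 0xA1..0xFE, i.e. a 94x94 code plane.  Split that plane into
   # subfonts of 96 characters each:
   PLANE_CELLS = 94 * 94          # 8836 code positions in one plane
   SUBFONT_SIZE = 96

   def subfont_position(b1, b2):
       """Map an EUC-JP byte pair to (subfont number, index in subfont)."""
       assert 0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE
       linear = (b1 - 0xA1) * 94 + (b2 - 0xA1)
       return divmod(linear, SUBFONT_SIZE)

   subfonts_per_style = -(-PLANE_CELLS // SUBFONT_SIZE)   # ceiling division
   print(subfonts_per_style)      # 93 subfonts for one style of one plane
   # Two styles (Mincho + Gothic) already need ~186 subfonts; adding
   # Chinese and Korean clearly blows past a 256-font limit.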

Another problem is Japanese line breaking.  Some simple algorithm seems
to be applicable here (with modern, proportional Japanese typesetting;
the older style with full stops hanging outside the right margin etc.
might be harder to achieve).  Just define a list of characters that are
not allowed to remain as the last character of a line, and a list of
characters that must not start a line.  This seems to be sufficient, at
least for Japanese.

That algorithm can be implemented both with a filter script and even
with the `def'-style decoding: just define all characters that mustn't
start a line as operators that bind with the previous character into
one unbreakable compound.
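
In a filter script that rule would boil down to something like this
(the character lists below are only tiny examples, not complete sets):

   # Glue characters that must not start a line onto the preceding
   # chunk, and characters that must not end a line onto the following
   # one; a line break is then only allowed between chunks.
   NO_LINE_START = set("。、」）・ー")    # must not begin a line
   NO_LINE_END   = set("「（")            # must not end a line

   def breakable_chunks(text):
       chunks = []
       for ch in text:
           if chunks and (ch in NO_LINE_START
                          or chunks[-1][-1] in NO_LINE_END):
               chunks[-1] += ch       # forbid a break before this character
           else:
               chunks.append(ch)      # a break is allowed before this chunk
       return chunks

   # A filter would wrap each chunk into one unbreakable Lout object.
   print(breakable_chunks("「こんにちは」と言った。"))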

After all those considerations I'm almost at the point where I want to
just hack the Lout source code directly: make everything Unicode
(32 bits per character), allow UTF-8 as the only input coding system,
add Unicode->whatever transcoding tables for fonts, and maybe add some
method for defining fontsets consisting of multiple PostScript fonts
(so that e.g. one can typeset Latin, Japanese, Chinese and Korean with
the default Roman font).  Also, the hyphenation engine would need to be
hacked to support those primitive Japanese line-breaking rules.
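
Internally I imagine the fontset idea boiling down to a table like
this (font names and ranges are only placeholders, of course):

   # A "fontset": Unicode ranges mapped to the PostScript font that
   # should render them.  Names and ranges below are placeholders only.
   FONTSET_ROMAN = [
       ((0x0000, 0x024F), "Times-Roman"),         # Latin, incl. extended
       ((0x3000, 0x30FF), "Ryumin-Light"),        # kana and punctuation
       ((0x4E00, 0x9FFF), "Ryumin-Light"),        # CJK ideographs
       ((0xAC00, 0xD7A3), "HYSMyeongJo-Medium"),  # Hangul syllables
   ]

   def font_for(codepoint, fontset=FONTSET_ROMAN):
       """Pick the font responsible for a Unicode codepoint."""
       for (lo, hi), font in fontset:
           if lo <= codepoint <= hi:
               return font
       return None   # this fontset does not cover the character

   print(font_for(ord("あ")))   # -> Ryumin-Light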

I'm not sure whether vertical typesetting could be implemented easily.
Well, one simple method would be rotating the font and rotating the
page in the opposite direction.  Heck, that's simple :).  Hacking
Lout's galley flushing algorithm is definitely one of the things I do
*not* want to do :).

Sorry for that lengthy vapourware description.  It might help motivate
me to know that at least one other person requires CJK in Lout, and to
know whether my implementation ideas seem sensible or like nonsense to
others.  If you have some time, I would definitely need help on the "C"
and "K" sides of CJK (typesetting rules, PostScript font encodings
etc.).

Any input appreciated.

regards,

David
-- 
GnuPG public key: http://user.cs.tu-berlin.de/~dvdkhlng/dk.gpg
Fingerprint: B17A DC95 D293 657B 4205  D016 7DEF 5323 C174 7D40

