Re: [Texmacs-dev] DRDs, converters, LaTeX

From:	Nix N. Nix
Subject:	Re: [Texmacs-dev] DRDs, converters, LaTeX
Date:	07 May 2003 09:28:45 -0600

On Wed, 2003-05-07 at 04:46, Joris van der Hoeven wrote:
[...]
> the LaTeX output converter. If Nix has time, then he could

I'll give it a whirl.

> try to complete that and move part of the tables in
> tmtex-preamble.scm to the DRD. I have not had time yet to deal
> with the question of initial environments, but the conversion
> routines (now in Data/Convert) have been reorganized in such
> a way that this may soon be added. The soon being not that soon,
> because I have to do some non-TeXmacs stuff this month.
> I also did not yet study the gluing bug in $a$$b$ and
> a^{b}^{c}-like constructs.

Take a look at #1453.  I believe the problem lies in the fact that
tmtex.scm performs not only conversion of TeXmacs structures to LaTeX
structures, but also takes care of the spacing for the LaTeX
structures.  This is bad.  I believe that tmtex should be responsible
only for tokenizing strings and translating TeXmacs constructs.  The
spacing of the LaTeX file should be solely up to texout.  So I removed
the tmtex code that adds spaces to certain strings, and restricted the
string functions of tmtex to tokenizing and fishing out symbols.  That
is, tmtex no longer does

"15+16b" -> " 15 + 16 b ", but

"15+16b" -> (!concat "15" "+" "16" "b")

With this change, texout alone decides whether there should be a space
between "16" and "b", around the "+", etc., and all the spacing rules
live in one function, (texout-want-space ...).  How does this fix our
problem?

Well, it doesn't fix the math gluing problem, but it does fix the rprime
rsup problem.  Here's why:

When you perform the following steps (* is the insertion point)

*
$*$
$a^\dag*$
$a^\dagn blah blah blah*$

in TeXmacs, due to the spacing that tmtex was adding, the resulting tree
sent to texout was 

(!concat "a" (!sup (!concat (dag) " n")) ...)

where the space in front of the "n" was added by tmtex.  In contrast,
tmtex-ing

(concat a (rprime "<dag>") (rsup "n"))

does not pass through the spacing function, because "<dag>" and "n" are
in two separate nodes rather than in one string.  So, when the tree goes
to texout, it looks like this:

(!concat "a" (!sup (!concat (dag) "n")) ... )

where there is no space in front of the "n", so texout blindly produces
\dagn.

Now, if texout always received a tree where the strings do not have
spaces tacked on by tmtex, then it can be made to decide whether to add
spaces between certain tokens.  The decision whether to add a space
between any two tokens x1 and x2 can be made by the following function
(helpers not shown, but they are in the patch):

(define (texout-want-space x1 x2) ;; spacing rules
  ;; Never add a space if any of the following holds ...
  (and (not (or (equal? x2 ",")
                (equal? x1 " ")
                (equal? x2 " ")
                (func? x2 '!nextline)
                (equal? x2 "'")
                (func? x2 '!sub)
                (func? x2 '!sup)
                (func? x1 '&)
                (func? x2 '&)
                (and (func? x1 '!math) (func? x2 '!math))
                (and (texout-env? x1) (list? x2))
                (and (list? x1) (texout-env? x2))
                (and (equal? x1 "'") (not (list? x2)))))
       ;; ... otherwise add one in any of these cases.
       (or (func? x1 'tmop) (func? x2 'tmop)
           (and (not (list? x1)) (tex-symbol? x2))
           (and (not (list? x2)) (tex-symbol? x1))
           (and (list? x1) (list? x2))
           (and (not (list? x1)) (not (list? x2))))))

These rules reflect pretty closely how TeXmacs currently spaces LaTeX
files.  However, if desired, these rules can be changed. They can be
changed easily, because they are all located in one place.  The only
other place that /might/ affect spacing is where the tokenization rules
for strings are located (in (tmtex-string-produce ...)).
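To make the effect of these rules concrete, here is a minimal sketch in Python rather than Scheme. It is a simplified model and not the code from the patch. It assumes plain tokens are Python strings and LaTeX symbols are 1-tuples such as ("dag",). TEX_SYMBOLS, is_symbol, want_space and render are hypothetical names, and only a subset of the rules above is modeled (no environments, !sub/!sup, & or tmop).

```python
# Simplified, hypothetical model of the spacing decision; not the patch.
# Plain tokens are str, LaTeX symbols are 1-tuples such as ("dag",).
TEX_SYMBOLS = {"dag", "alpha", "beta"}  # assumed stand-in for tex-symbol?


def is_symbol(tok):
    """True for a 1-tuple naming a known symbol, e.g. ("dag",)."""
    return isinstance(tok, tuple) and len(tok) == 1 and tok[0] in TEX_SYMBOLS


def want_space(x1, x2):
    """Decide whether a space goes between adjacent tokens x1 and x2."""
    # Exclusions: no space before a comma or prime, or next to a space.
    if x2 in (",", "'") or x1 == " " or x2 == " ":
        return False
    # A plain string next to a symbol needs a separator (\dag n, not \dagn).
    if isinstance(x1, str) and is_symbol(x2):
        return True
    if isinstance(x2, str) and is_symbol(x1):
        return True
    # Two commands, or two plain strings, are separated as well.
    return isinstance(x1, tuple) == isinstance(x2, tuple)


def render(tokens):
    """Join tokens into LaTeX text, inserting spaces where want_space says so."""
    out = []
    for i, tok in enumerate(tokens):
        if i > 0 and want_space(tokens[i - 1], tok):
            out.append(" ")
        out.append("\\" + tok[0] if isinstance(tok, tuple) else tok)
    return "".join(out)
```

In this model, render([("dag",), "n"]) gives \dag n instead of \dagn, and render(["15", "+", "16", "b"]) gives 15 + 16 b, without the tokenizer having added any spaces itself.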

(tmtex-string-produce ...) contained a simple rule for tokenizing:

(not (char? x))

That is, a string ends when one encounters a non-character (mostly a
list like (dag) or (beta), etc.).  I have refined this rule to tokenize
in the following fashion:

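(define (tmtex-string-break? x start)
  ;; A non-character (e.g. a symbol list) always ends the string.
  (or (not (char? x))
      ;; In math mode, also break at operators and delimiters,
      ;; and at boundaries between letters and digits.
      (and (tmtex-math-mode?)
           (or (in? (char->string x) '("+" " " "-" ":" "=" "," "?" ";" "(" ")"
                                       "[" "]" "{" "}" "<" ">" "/"))
               (and (char-alphabetic? x) (char-numeric? start))
               (and (char-alphabetic? start) (char-numeric? x))))))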

This results in:

57b -> ("57" "b")
1+1 -> ("1" "+" "1")
<alpha>+<beta> -> ((alpha) "+" (beta))
<dag>n -> ((dag) "n")

This way, we can pass better tokens to texout, and let texout decide how to 
space them.
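The tokenization above can be modeled with a short Python sketch. This is an illustration and not the tmtex code. It assumes the input is a sequence of single characters mixed with symbols written as 1-tuples such as ("dag",), that `start` is the first character of the token being built, and that every break character becomes a token of its own. tokenize_math and BREAK_CHARS are hypothetical names, and math mode is taken for granted.

```python
# Hypothetical model of math-mode tokenization; not the tmtex code.
# The characters listed in the break rule, each emitted as its own token.
BREAK_CHARS = set("+ -:=,?;()[]{}<>/")


def tokenize_math(items):
    """Split items (single chars, or 1-tuples for symbols) into tokens."""
    tokens = []
    current = ""  # the string token being accumulated
    for x in items:
        if not isinstance(x, str):
            # A non-character (a symbol) ends the current string.
            if current:
                tokens.append(current)
                current = ""
            tokens.append(x)
        elif x in BREAK_CHARS:
            # Operators and delimiters end the string and stand alone.
            if current:
                tokens.append(current)
                current = ""
            tokens.append(x)
        elif current and (
            (x.isalpha() and current[0].isdigit())
            or (current[0].isalpha() and x.isdigit())
        ):
            # Letter/digit boundary: start a new token.
            tokens.append(current)
            current = x
        else:
            current += x
    if current:
        tokens.append(current)
    return tokens
```

Under these assumptions, tokenize_math("57b") returns ["57", "b"], tokenize_math("1+1") returns ["1", "+", "1"], and tokenize_math([("dag",), "n"]) returns [("dag",), "n"].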

As for the gluing problem, I believe a fix is simple:

...
        ((and (output-test-end? "$") (not (output-test-end? "\\$")))
         (output-remove 1)
+        (output-text " ")
         (texout x)
         (output-text "$"))
...

That is, put a space when gluing to math environments.
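The effect of the added line can be sketched with a small Python model of the output buffer. This is an illustration only. The Output class and emit_math are hypothetical stand-ins for output-test-end?, output-remove, output-text and the math case of texout, and the else branch (what happens when the output does not end in "$") is an assumption, since that part of the code is not shown above.

```python
# Hypothetical model of the output buffer and the math case; not texout.
class Output:
    """Text buffer with helpers named after the Scheme ones."""

    def __init__(self):
        self.buf = ""

    def test_end(self, s):
        return self.buf.endswith(s)

    def remove(self, n):
        self.buf = self.buf[: len(self.buf) - n]

    def text(self, s):
        self.buf += s


def emit_math(out, body):
    """Append inline math, gluing onto a preceding math environment."""
    if out.test_end("$") and not out.test_end("\\$"):
        out.remove(1)   # drop the closing "$" of the previous environment
        out.text(" ")   # the proposed fix: separate the two bodies
        out.text(body)
        out.text("$")
    else:
        # Assumed default case: open and close a fresh environment.
        out.text("$")
        out.text(body)
        out.text("$")
```

Calling emit_math for "a" and then for "b" on an empty buffer leaves "$a b$" in the buffer. Without the added space the two bodies would run together as "$ab$". An escaped dollar such as "\$" is not treated as the end of a math environment.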

> Best wishes, Joris

Likewise to everybody.
