[Axiom-developer] Design Thoughts on Semantic Latex (SELATEX)

Fateman [0] raised a set of issues with the OpenMath approach.
We are not trying to be cross-platform in this effort. Axiom
does provide an algebraic scaffold so it is possible that the
selatex markup might be useful elsewhere but that is not a
design criterion.

Fateman [1] also raises some difficult cross-platform issues
that are not part of this design.

Fateman [2] shows that parsing tex with only syntactic markup
succeeded on only 43% of 10740 inputs. It ought to be possible
to increase this percentage given proper semantic markup.
(Perhaps there should be a competition similar to the deep
learning groups? PhDs have been awarded on incremental
improvements of the percentage.)

This is a design-by-crawl approach to the semantic markup
idea. The hope is to get something running this week that
'works' while giving due consideration to global and long-term
issues. A first glance at CRC/NIST raises more questions
than answers, as is usual with any research.

It IS a design goal to support a Computer Algebra Test Suite
(http://axiom-developer.org/axiom-website/CATS). It is very
tedious to hand construct test suites. It will be even more
tedious to construct them "second-level" by doing semantic
markup and then trying to use them as input, but the hope is
that the CRC/NIST/G&R, etc. will eventually be published with
semantics so computational mathematics can stop working from
syntax.

===========
Consideration 4: I/O transparency

Assume for the moment that we take a latex file containing
only formulas. We would like to be able to read this file so
it has computational mathematics (CM) semantics.

It is clear that there need to be semantic tags that carry the
information but these tags have to be carefully designed NOT
to change the syntactic display. They may, as noted before,
require multiple semantic versions for a single syntax.

It is also clear that we would like to be able to output formulas
with CM semantics where currently we only output syntax.
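
A minimal sketch of such a tag, assuming a hypothetical
one-argument macro \INT (the eventual selatex.sty may well define
it differently). The macro carries the domain name in the source
but expands to its argument, so the typeset result should match
the untagged formula.

```latex
% Hypothetical semantic tag (an assumption, not the real selatex.sty):
% marks its argument as an Integer in the source, typesets it unchanged.
\newcommand{\INT}[1]{#1}

% With this definition the two formulas below expand to the same
% token list and therefore typeset the same way:
%   $\INT{3} + \INT{4}$
%   $3 + 4$
```

The semantic information lives only in the source text, which is
where a reader or pre-processor would look for it.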

===========

Consideration 5: I/O isomorphism

An important property of selatex is an isomorphism between
input and output. Axiom allows output forms to be defined for a
variety of targets so this does not seem to be a problem. For
input, however, this means that the reader has to know how
to expand \INT{3} into the correct domain. This could be done
with a stand-alone pre-processor from selatex->inputform.

It should be possible to read-then-write an selatex formula,
or write-then-read an selatex formula, with identical semantics.
That might not mean that the I/O is textually identical, though,
due to things like variable ordering.

===========

Consideration 6: Latex semantic macros

Semantic markup would be greatly simplified if selatex provided
a mechanism similar to Axiom's ability to define types "on the fly"
using either assignment

TYP:=FRAC(POLY(INT))

or macro form

TYP ==> FRAC(POLY(INT))

Latex is capable of doing this and selatex should probably include
a set of pre-defined common markups, such as

\FRINT ==> \FRAC\INT
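
In Latex terms this could be sketched as follows. The definitions
are assumptions for illustration, not the actual selatex.sty: each
tag is a one-argument macro that expands to its argument, so in
\FRAC\INT{x} the macro \FRAC takes the single token \INT as its
argument, and \INT then takes {x}.

```latex
% Hypothetical type tags; each one typesets its argument unchanged.
\newcommand{\INT}[1]{#1}
\newcommand{\FRAC}[1]{#1}
\newcommand{\POLY}[1]{#1}

% Latex analogue of the Axiom macro  TYP ==> FRAC(POLY(INT))
\newcommand{\TYP}{\FRAC\POLY\INT}

% A pre-defined common markup:  \FRINT ==> \FRAC\INT
\newcommand{\FRINT}{\FRAC\INT}

% Expansion of $\FRINT{3/4}$:
%   \FRINT{3/4} -> \FRAC\INT{3/4} -> \INT{3/4} -> 3/4
```

A pre-processor that needs the domain would have to expand \FRINT
and \TYP itself, or read their definitions from the style file.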

===========

Consideration 7: selatex \begin{semantics} environment?

Currently Axiom provides a 'chunk' environment which surrounds
source code. The chunks are named so they can be extracted
individually or in groups

\begin{chunk}{a name for the chunk}
anything
\end{chunk}

We could provide a similar environment for semantics such as

\begin{semantics}{a name for the block}
\end{semantics}

which would provide a way to encapsulate markup and also allow
a particular block to be extracted in literate programming style.
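
One possible shape for such an environment, as a sketch only. It
assumes the body is typeset as ordinary Latex and that the name is
used solely by an external extraction tool; the \INT tag is a
hypothetical placeholder.

```latex
% Hypothetical tag, typesets its argument unchanged.
\newcommand{\INT}[1]{#1}

% Hypothetical environment: the name argument (#1) is not typeset.
% It exists so that a tool can find and extract the block by name.
\newenvironment{semantics}[1]{\ignorespaces}{\ignorespacesafterend}

% Usage inside the document body:
\begin{semantics}{integer addition}
$\INT{3} + \INT{4}$
\end{semantics}
```

Because the name is never typeset, adding or renaming a block
should not change the printed document.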

===========

Consideration 8: Latex-time processing

Axiom currently uses \write at latex time to create
intermediate files (e.g. for tables). This technique can be used
to enhance latex-time debugging (where did it fail?).

It can be used to create Axiom files which pre-construct domains
needed when the input file with semantic markup is read.

This would help a stand-alone selatex->inputform preprocessor.
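
A sketch of the \write technique applied to a tag macro. The
stream name \selatexout, the .sel file extension, and this variant
of \INT are all assumptions made for illustration. Each use of the
tag records the domain, the source line number, and the argument
in a side file, and then typesets the argument unchanged.

```latex
% Hypothetical side file \jobname.sel, written while latex runs.
\newwrite\selatexout
\immediate\openout\selatexout=\jobname.sel
\AtEndDocument{\immediate\closeout\selatexout}

% Variant tag macro: log the use, then typeset the argument as-is.
% \detokenize keeps the argument from being expanded in the log.
\newcommand{\INT}[1]{%
  \immediate\write\selatexout{INT line \the\inputlineno: \detokenize{#1}}%
  #1}
```

The last line written before a failed run gives a rough answer to
"where did it fail?", and the same file could feed a pre-processor.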

===========

Consideration 9: Design sketches

It is all well-and-good to hand-wave at this idea but a large
amount of this machinery already exists.

It would seem useful to develop an incremental test suite that
starts with "primitive" domains (e.g. INT), creating selatex I/O.

Once these are in place we could work on "type tower" markup
such as \FRAC\INT or \POLY\COMPLEX\FLOAT.

Following that might be pre-existing latex functions like \int,
\sum, \cos, etc.

To validate these ideas Axiom will include an selatex.sty file and
some unit test files on primitive domain markup. That should be
enough to start the bikeshed discussions.
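
As a starting point, a unit test file for a primitive domain might
look like the sketch below. The macro names \INT and \seltest and
the two box names are hypothetical. The test typesets a tagged
formula and its untagged counterpart into boxes and compares their
widths, which is only a weak proxy for "the display is unchanged".

```latex
\documentclass{article}
\newcommand{\INT}[1]{#1}      % hypothetical tag under test
\newsavebox{\seltagged}
\newsavebox{\selplain}
% \seltest{tagged formula}{plain formula}
% Reports PASS or FAIL on the terminal and in the log file.
\newcommand{\seltest}[2]{%
  \sbox{\seltagged}{$#1$}%
  \sbox{\selplain}{$#2$}%
  \ifdim\wd\seltagged=\wd\selplain
    \typeout{PASS: \detokenize{#1}}%
  \else
    \typeout{FAIL: \detokenize{#1}}%
  \fi}
\begin{document}
\seltest{\INT{3}}{3}
\seltest{\INT{3}+\INT{4}}{3+4}
\end{document}
```

The same harness could later be extended to "type tower" tags by
adding their definitions and more \seltest lines.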

Ideas? Considerations? Suggestions?

Tim

[0] Fateman, Richard J.
"A Critique of OpenMath and Thoughts on Encoding Mathematics",
January, 2001
https://people.eecs.berkeley.edu/~fateman/papers/openmathcrit.pdf

[1] Fateman, Richard J.
"Verbs, Nouns, and Computer Algebra, or What's Grammar Got to
do with Math?", December 18, 2008
https://people.eecs.berkeley.edu/~fateman/papers/nounverbmac.pdf

[2] Fateman, Richard J.
"Parsing TeX into Mathematics"
https://people.eecs.berkeley.edu/~fateman/papers/parsing_tex.pdf

From:	Tim Daly
Subject:	[Axiom-developer] Design Thoughts on Semantic Latex (SELATEX)
Date:	Thu, 18 Aug 2016 14:45:10 -0400