Fateman [0] raised a set of issues with the OpenMath approach. We
are not trying to be cross-platform in this effort. Axiom does
provide an algebraic scaffold, so it is possible that the selatex
markup might be useful elsewhere, but that is not a design criterion.
Fateman [1] also raises some difficult cross-platform issues
that are not part of this design.
Fateman [2] shows that parsing tex with only syntactic markup
succeeded on only 43% of 10740 inputs. It ought to be possible
to increase this percentage given proper semantic markup.
(Perhaps there should be a competition similar to those of the
deep learning groups? PhDs have been awarded on incremental
improvements of the percentage.)
This is a design-by-crawl approach to the semantic markup idea.
The hope is to get something running this week that 'works' while
giving due consideration to global and long-term issues. A first
glance at CRC/NIST raises more questions than answers, as is usual
with any research.
It IS a design goal to support a Computer Algebra Test Suite.
It is tedious to hand construct test suites. It will be even more
tedious to construct them "second-level" by doing semantic markup
and then trying to use them as input, but the hope is that the
CRC/NIST/G&R, etc. will eventually be published with semantics so
computational mathematics can stop working from syntax.
===========
Consideration 4:
I/O transparency
Assume for the moment that we take a latex file containing
only formulas. We would like to be able to read this file so
it has computational mathematics (CM) semantics.
It is clear that there need to be semantic tags that carry the
information, but these tags have to be carefully designed NOT
to change the syntactic display. They may, as noted before,
require multiple semantic versions for a single syntax.
It is also clear that we would
like to be able to output
formulas
with CM semantics where
currently we only output syntax.
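A minimal sketch of such display-neutral tags, assuming each tag is
a one-argument LaTeX macro that typesets its argument unchanged.
\INT is the tag used elsewhere in these notes; \NNI (for
NonNegativeInteger) is a hypothetical second tag, included only to
show two semantic versions of one syntax.

```latex
\documentclass{article}
% Assumption: a semantic tag is a macro that prints its argument as-is,
% so adding the tag leaves the typeset formula unchanged.
\newcommand{\INT}[1]{#1}  % tag: Integer
\newcommand{\NNI}[1]{#1}  % hypothetical tag: NonNegativeInteger
\begin{document}
% All three formulas print "3"; only the source carries the semantics.
$3$ \quad $\INT{3}$ \quad $\NNI{3}$
\end{document}
```

A reader of the source can tell the two tagged forms apart even
though the printed page cannot.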
===========
Consideration 5: I/O
isomorphism
An important property of selatex
is an isomorphism with
input/output. Axiom allows output
forms to be defined for a
variety of targets so this does not
seem to be a problem. For
input, however, this means that the
reader has to know how
to expand \INT{3} into the correct domain.
This could be done
with a stand-alone pre-processor from
selatex->inputform.
It should be possible to read-then-write
an selatex formula,
or write-then-read an selatex formula
with identical semantics.
That does not necessarily mean the I/O text is identical, though,
because of things like variable ordering.
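One way to picture the input direction, sketched under the
assumption that 3::INT is an acceptable input form for the tagged
integer: the same markup gets a display definition and, inside a
group, a translation definition. This is a LaTeX-time illustration
only, not the stand-alone pre-processor itself.

```latex
\documentclass{article}
% Display definition: the tag typesets its argument unchanged.
\newcommand{\INT}[1]{#1}
\begin{document}
$\INT{3}$
% Translation definition, local to this group. The target text
% "3::INT" is an assumed input form, not a settled selatex design.
\begingroup
\renewcommand{\INT}[1]{#1::INT}
\typeout{\INT{3}}% writes the expanded text to the terminal and log
\endgroup
\end{document}
```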
===========
Consideration 6: Latex semantic macros
Semantic markup would be greatly
simplified if selatex provided
a mechanism similar to Axiom's ability to
define types "on the fly"
using either assignment
TYP:=FRAC(POLY(INT))
or macro form
TYP ==> FRAC(POLY(INT))
Latex is capable of doing this and
selatex should probably include
a set of pre-defined common markups, such
as
\FRINT ==> \FRAC\INT
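In LaTeX terms this could look like the following sketch. It assumes
that \FRAC takes a domain tag followed by a formula and that \INT
takes a formula only; the notes do not fix either calling
convention, so both are illustrative.

```latex
\documentclass{article}
\newcommand{\INT}[1]{#1}     % Integer: typesets its formula unchanged
% Assumption: \FRAC takes a domain tag (#1) and a formula (#2);
% the tag is dropped at display time, so only the formula is typeset.
\newcommand{\FRAC}[2]{#2}
% Pre-defined common markup, mirroring  \FRINT ==> \FRAC\INT
\newcommand{\FRINT}{\FRAC\INT}
\begin{document}
% Both forms typeset the same ordinary fraction.
$\FRINT{\frac{3}{4}}$ \quad $\FRAC\INT{\frac{3}{4}}$
\end{document}
```

Because \FRINT expands to \FRAC\INT before any argument is read, the
short and long forms are interchangeable in the source.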
===========
Consideration 7: selatex \begin{semantics} environment?
Currently Axiom provides a 'chunk'
environment which surrounds
source code. The chunks are named so they can be extracted
individually or in groups:
\begin{chunk}{a name for the chunk}
anything
\end{chunk}
We could provide a similar environment
for semantics such as
\begin{semantics}{a name for the block}
\end{semantics}
which would provide a way to encapsulate
markup and also allow
a particular block to be extracted in
literate programming style.
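A minimal sketch of such an environment, assuming the block name is
consumed but not typeset and that extraction is left to an external
tool that scans for the \begin{semantics}{...} line. The definition
and the display-neutral \INT tag are both hypothetical.

```latex
\documentclass{article}
\newcommand{\INT}[1]{#1}  % assumed display-neutral tag
% Hypothetical definition: the name (#1) is accepted but not printed.
\newenvironment{semantics}[1]{\par\noindent\ignorespaces}{\par}
\begin{document}
\begin{semantics}{integer example}
$\INT{3} + \INT{4}$
\end{semantics}
\end{document}
```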
===========
Consideration 8: Latex-time processing
Axiom currently uses \write to create
intermediate files (e.g. for tables).
This technique can be used
to enhance latex-time debugging (where
did it fail?).
It can be used to create Axiom files which
pre-construct domains
needed when the input file with semantic
markup is read.
This would help a stand-alone
selatex->inputform preprocessor.
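A sketch of the \write technique applied to a semantic tag. The
side-file name (\jobname.sem) and the record format are assumptions
made for illustration; \unexpanded requires an e-TeX based engine,
which current LaTeX distributions provide.

```latex
\documentclass{article}
% Hypothetical side file; the name and record format are assumptions.
\newwrite\semfile
\immediate\openout\semfile=\jobname.sem
\AtEndDocument{\immediate\closeout\semfile}
% The tag typesets its argument and logs one record per use.
\newcommand{\INT}[1]{%
  \immediate\write\semfile{INT \unexpanded{#1}}%
  #1}
\begin{document}
$\INT{3} + \INT{4}$
\end{document}
```

Because the writes are immediate, records land in the file in source
order, so the last record shows how far latex got before a failure.
A preprocessor could read the same file to learn which domains the
document needs.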
===========
Consideration 9: Design sketches
It is all well-and-good to hand-wave at
this idea but a large
amount of this machinery already exists.
It would seem useful to develop an incremental
test suite that
starts with "primitive" domains (e.g. INT),
creating selatex I/O.