axiom-developer
[Top][All Lists]

## Re: [Axiom-developer] Design of Semantic Latex

 From: Tim Daly Subject: Re: [Axiom-developer] Design of Semantic Latex Date: Wed, 5 Oct 2016 05:06:44 -0400

Axiom calls COQ (for Spad code) and ACL2 (for Lisp code) at build time
in order to run proofs. It is hard enough trying to construct proofs by hand
despite the notion that Spad is "nearly mathematical". Implementation
details matter a lot. We do, however, have the types already available.
Even with types the Spad-to-COQ gap is a long jump at best, a PhD
at worst.

I'm not sure how a dictionary between automated theorem provers and
latex would be useful. Fateman has already shown that raw latex lacks
enough information for an unambiguous parse (see prior references).

I'm suggesting that the latex be "augmented" by a semantic tags package
containing tags that do not change the print representation but contain
additional semantic information. E.g., for Axiom, \withtype ...

$\withtype{ \int{sin(x)} }{x}{EXPR(INT)}$

prints as 'sin(x)' but becomes

integrate(sin(x),x)

for Axiom.

That way the latex prints "as is" but can be post-processed by machine
to feed CAS systems. Of course this would be a trial-and-error process
as the CAS will inevitably fail on missing semantics, requiring yet-
another-tag somewhere. There is no such thing as a simple job.
I have a trivial implementation working but there is much to do.

As for trying to feed a Deep Neural Network proofs... I have spent a
fair amount of time on DNNs (part of the TIRES project, see prior refs).
I am clueless how they could possibly be applied to proofs.

Tim

On Wed, Oct 5, 2016 at 3:08 AM, James Davenport wrote:

Sorry – I’ve not been much involved: other projects. But I just saw this – haven’t looked in any detail yet.

DeepAlgebra - an outline of a program

Authors: Przemyslaw Chojecki

Categories: cs.AI math.AG

\\

We outline a program in the area of formalization of mathematics to automate theorem proving in algebra and algebraic geometry. We propose a construction of a dictionary between automated theorem provers and (La)TeX exploiting syntactic parsers. We describe its application to a repository of human-written facts and definitions in algebraic geometry (The Stacks Project). We use deep learning techniques.

Sent: 01 September 2016 13:25
To: Richard Fateman
Cc: Tim Daly; axiom-dev; Ralf Hemmecke; James Davenport; Mike Dewar; address@hidden; D Zwillinger; address@hidden
Subject: Re: Design of Semantic Latex

The weaver program can now process a latex document.

The end result is a tree structure of the same document.

There is still more to do, of course. Much more.

It is clear that the semantics of the markup tags are all in

the weaver program. This is obvious in hindsight since the

markup needs to be transparent to the print representation.

The parser needs to know the 'arity' of tags since \tag{a}{b}
would parse one way, \tag{a}, for a 1-arity tag and another

way \tag{a}{b} for a 2-arity tag. The code needs to be generalized

to parse given the arity.

The weaver program is structured so that the tree-parse output

is independent of Axiom. The Axiom rewrite will take the tree

as input and produce valid Axiom inputforms. This should make

it possible to target any CAS.

Onward and upward, as they say....

Tim

On Sat, Aug 27, 2016 at 1:28 PM, Tim Daly <address@hidden> wrote:

On Sat, Aug 27, 2016 at 12:14 PM, Richard Fateman <address@hidden> wrote:

Take up a book on complex analysis and see what problems you have
as you try to encode the statements, or especially the homework
problems. I tried this decades ago with the text I used,
https://www.amazon.com/Functions-Complex-Variable-Technique-Mathematics/dp/0898715954
but probably any other text would do.

My last project at CMU (Tires) involved work on machine learning
using natural language (and Good-Old-Fashioned-AI (GOFAI)).

I'm not smart enough to make progress in natural language.

I think the emphasis on handbook or reference book representation
is natural, and I have certainly pursued this direction myself.  However
what you/we want to be able to encode is mathematical discourse. This
goes beyond "has the algorithm reproduced the reference value for an
integration."   Can you encode in semantic latex a description of the geometry
of the (perhaps infinitely layered) contour of a complex function?  You
might wonder if this is important, but then note that questions of this sort
appear in the problem section for chapter 1.

Like any research project, there has to be bounds on the ambition.

At this point, the goal is to modify the markup to disambiguate a latex

formula so the machine can import it. Axiom needs to import it to create
a test suite measuring progress against existing knowledge.

What you're describing seems to be a way to encode topological issues

dealing with the structure of the space underlying the formulas. I have no

idea how to encode the Bloch sphere or a torus or any other space except

by referencing an Axiom domain, which implicitly encodes it.

If the formula deals with quantum mechanics then the algorithms have an

implicit, mechanistic way of dealing with the Bloch sphere. So markup that

uses these function calls use this implicit grounding. Simllarly, markup that

uses a branch cut implicitly uses the implementation semantics.

Axiom and Mathematics have one set of branch cuts, Maple and Maxima

have another (at far as I can tell). So the markup decisions have to be

carefully chosen.

Here's the challenge then.  Take a mathematics book and "encode"
it so that a program (hypothetically) could answer the problems at
the end of each chapter.

That's a much deeper can of worms than it appears. I spent a lot of

time in the question-answering literature. I have no idea how to make

progress in that area. The Tires project involved self-modifying lisp

based on natural language interaction with a human in the limited

domain of changing a car tire. See

(The grant ended before the projected ended. Sigh)

Tim

P.S. Tires is self-modifying lisp code. It "learns" by changing itself.

The initial code (the seed code) becomes "something else". One

interesting insight is that two versions of the seed code will diverge

based on "experience". That implies that you can't "teach by copy",

that is, you can't teach one system and then "just copy" it to another

existing system since their experiences (and the code structure)

will differ. Any system that "learns" will fail "teach by copy", I believe.

That means that AI will not have the exponential growth that everyone

seems to believe.