## [Axiom-developer] noweb, pamphlets, and TeXmacs

From: root
Subject: [Axiom-developer] noweb, pamphlets, and TeXmacs
Date: Sat, 23 Nov 2002 12:17:52 -0500

All,

I've written some tutorial notes on the pamphlet idea to try to get
everyone at the same level of discussion. Essentially what Bill
has been pursuing is a way to integrate noweb and TeXmacs so that
we can support pamphlet file documents. As there is some confusion
about what each part is I've decided to write it out in full.
Feel free to complain about mistakes.

Bill's ideas are essentially correct. His note is attached.

=================
RE: NOWEB
=================

Knuth and Dijkstra advocated literate programming to try to solve
the problem of marrying the knowledge behind a program with the
text of the program itself. Knuth wrote Web which was designed to
work with Pascal thus:

.web formatted document
|   |
|   ------> tangle ----> pascal code ----> compile ---> execute
----------> weave  ----> tex format  ----> latex   ---> read

As this was Pascal-specific various other language-specific versions
were generated, e.g. CWeb for C.

Norman Ramsey's innovation is that we don't need to be language specific.
With just a few additional tags above TeX we gain great power.

Since Axiom uses many forms of code (Makefiles, C, lisp, boot, spad, etc)
this is a key idea. We need to be able to embed many things transparently.
If we remove the language-specific options and simplify things we
can reduce the problem to this:

.noweb formatted document
|   |
|   ------> notangle ----> any code    ----> compile ---> execute
----------> noweave  ----> tex format  ----> latex   ---> read

Norman's implementation is called noweb. In essence, a noweb document
alternates between code chunks and text blocks. A code
chunk is marked by:

<<(some string)>>=

code

@

Code chunks continue until encountering an @ in column 1 or another
chunk marker (the <<(some string)>>= tag).

The trailing equal sign marks this as a "definition" of the (some
string) block. Lack of a trailing equal sign marks this as a "use"
of the (some string) block. Uses are expanded by notangle.

Another important idea is that multiple occurrences of the definition
string are concatenated into one definition, thus:

<<a>>=
code 1
@
....
<<a>>=
code 2
@
....
<<a>>  ==> expands into:
code 1
code 2

We use this idea extensively in the documentation of code.
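
To make the definition, use, and concatenation rules concrete, here is a
minimal sketch of a tangle step in Python. It is not noweb's
implementation. The names parse_chunks and tangle are made up for this
note, it assumes a use sits on a line of its own (real noweb also allows
uses in the middle of a line), and it does no checking for undefined or
circular references.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # definition line: <<name>>=
CHUNK_USE = re.compile(r"^(\s*)<<(.+)>>\s*$")  # a use on a line of its own


def parse_chunks(source):
    """Collect code chunks by name; repeated definitions are concatenated."""
    chunks = {}
    current = None  # name of the chunk being read, or None inside a text block
    for line in source.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])  # keep lines from earlier definitions
        elif line == "@" or line.startswith("@ "):
            current = None  # an @ in column 1 ends the chunk
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, root):
    """Expand the chunk named root, replacing each use recursively."""
    out = []
    for line in chunks[root]:
        use = CHUNK_USE.match(line)
        if use:
            indent, name = use.groups()
            out.extend(indent + expanded for expanded in tangle(chunks, name))
        else:
            out.append(line)
    return out


doc = """\
Some text.
<<a>>=
code 1
@
More text.
<<a>>=
code 2
@
<<*>>=
begin
  <<a>>
end
@
"""

expanded = tangle(parse_chunks(doc), "*")
print("\n".join(expanded))
```

Here the two definitions of <<a>> are joined in document order, and the
use inside <<*>> is replaced by both lines, keeping the indentation of
the use.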

Everything that is not a code chunk is a text block. Text blocks are
TeX-formatted document text.

=================
RE: TEXMACS
=================

TeXmacs is neither emacs nor TeX but is an interesting cross-product
of the two ideas. Joris van der Hoeven set out to make a useful
front-end to a computer algebra system and ended up with a generally
useful tool.
It communicates with many computer algebra systems and is able to
properly format the math output in TeX style yet retain it as a
live object that can be handed back to the underlying system. In
addition, TeXmacs is able to properly format a large subset of
TeX and LaTeX documents.

TeXmacs, as Bill has been pointing out, is an excellent target for
an Axiom front-end. It already can talk directly to Axiom's interpreter
and embed the output into the TeXmacs buffer. It can already display
the .tex output from noweb.

Support for native noweb format would be most useful. The subtle
distinction that Bill was mentioning is that currently we can take
the "tex format" output and display it in TeXmacs. However, we would
like to fully support noweb as a standard format. This implies a couple
of changes.

As mentioned above noweb does:

.noweb formatted document
|   |
|   ------> notangle ----> any code    ----> compile ---> execute
----------> noweave  ----> tex format  ----> latex   ---> read

If TeXmacs understood the noweb format fully it would need to have
the following features:

0) The ability to recognize and format a code chunk.
1) The ability to recognize the <<defn>>=, concatenation, and <<use>>
features of the code chunks.
2) The ability to create a "notangled" buffer from the current buffer
that would contain the formatted code.

Ideally you could make changes in the formatted code and have the
changes reflected back into the original buffer. Some of these
changes could be problematic.

3) The ability to create a "noweave" buffer from the current buffer
that would contain the formatted document.

The same comment as above applies. It would take some careful
design to properly "untangle" some changes.

4) Bill has suggested that the folding mechanism know about the code
chunks and be able to fold and unfold them. Perhaps the way to
make the "untangle" work would be to ignore the separate buffer
idea above and just use folding. I have no opinion about either
path yet.

It is very important that NO changes occur in the code chunks.
If TeXmacs or noweb or any other tool does not understand the
format it must maintain "transparency". That is, it must NOT
try to format things in the code chunks. Other tools have special
needs (e.g. Makefiles care about tabs) and you can't change
the code chunks because they will be output to other tools.

5) There are other ideas, not yet exposed, that it would be nice to
have supported. I guess I need to talk more about the pamphlet
idea in depth.
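
The transparency requirement in the list above can be sketched as a
round-trip test. The Python below is only an illustration, not anything
TeXmacs or noweb provides, and split_segments is a made-up name. It
splits a document into text and code segments without stripping or
re-indenting anything, so a front-end could format the text segments and
pass the code segments through byte for byte.

```python
def split_segments(source):
    """Split a noweb-style document into ("text" | "code", raw_string) pairs.

    Nothing is stripped or re-indented, so joining the raw strings back
    together reproduces the input exactly.
    """
    segments = []
    kind, buf = "text", []
    for line in source.splitlines(keepends=True):
        stripped = line.rstrip("\r\n")
        starts_code = stripped.startswith("<<") and stripped.endswith(">>=")
        ends_code = kind == "code" and (stripped == "@" or stripped.startswith("@ "))
        if starts_code or ends_code:
            if buf:
                segments.append((kind, "".join(buf)))
            # The marker line itself opens the new segment.
            kind, buf = ("code" if starts_code else "text"), []
        buf.append(line)
    if buf:
        segments.append((kind, "".join(buf)))
    return segments


# A Makefile chunk: the tab before "cc" must survive untouched.
doc = "Build rule.\n<<Makefile>>=\nall:\n\tcc -o foo foo.c\n@\nDone.\n"
segments = split_segments(doc)
roundtrip = "".join(raw for _, raw in segments)
```

If roundtrip ever differs from doc, the tool has changed a code chunk
(or the text around it) and is no longer transparent.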

=================
RE: PAMPHLET FILES AND THE PRESENT
=================

Pamphlet files are now the native format for code and documentation.
There are no longer any Makefile, C, lisp, boot or spad files in the
system. All file formats have been subsumed into pamphlet files.

Currently .pamphlet documents, except for a recent patch, are
normal .noweb formatted documents. They have very little structure
at the moment.

Here is the way things currently interact:

.pamphlet formatted document
|   |
|   ------> notangle ----> any code    ----> compile ---> execute
----------> noweave  ----> tex format  ----> latex   ---> read

Pamphlet files are currently being used to document the internals
of Axiom. A file written originally in Boot is now written as a
pamphlet file. The pamphlet file is expanded and the rest of the
compile process takes place thus:

Originally:

foo.boot -> (translate) -> foo.lisp -> (compile) -> foo.o (load) .....

Now:

foo.pamphlet
|   |
|   -> notangle -> foo.boot -> (translate) ....
-----> noweave  -> foo.tex  -> latex -> read
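
As a rough sketch of the "Now" picture, the two commands could be driven
from Python as shown here. Several things are assumptions for
illustration only: the helper names, the default root chunk <<*>>, the
.boot target extension, and the use of noweave's -delay option (which
assumes the pamphlet carries its own LaTeX preamble). The build may well
invoke the tools differently.

```python
import subprocess
from pathlib import Path


def pamphlet_commands(pamphlet, root_chunk="*", code_ext="boot"):
    """Return the notangle and noweave command lines for one pamphlet.

    Both tools write to standard output, so each command is keyed by the
    file its output is meant to be redirected into.
    """
    stem = Path(pamphlet).stem
    return {
        f"{stem}.{code_ext}": ["notangle", f"-R{root_chunk}", pamphlet],
        f"{stem}.tex": ["noweave", "-delay", pamphlet],
    }


def run_pipeline(pamphlet, root_chunk="*", code_ext="boot"):
    """Run both commands, redirecting standard output into each target file."""
    commands = pamphlet_commands(pamphlet, root_chunk, code_ext)
    for target, argv in commands.items():
        with open(target, "w") as out:
            subprocess.run(argv, stdout=out, check=True)


commands = pamphlet_commands("foo.pamphlet")
```

Calling run_pipeline("foo.pamphlet") would then leave foo.boot and
foo.tex next to the pamphlet, ready for the translate and latex steps.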

=================
RE: PAMPHLET FILES AND THE FUTURE
=================

However, pamphlet files have a larger purpose besides documenting
internals of the system. Axiom has a large amount of algebra code
written in SPAD, a high level language. Much of the research behind
this code is hidden away in libraries. I'm hoping to use literate
programming to join these two threads, the theory and the implementation,
into a unified whole and then expand it beyond a simple join.

The end vision of using literate programming in Axiom is that you can
receive a "Booklet" which gives the theory and implementation of some
area of math, say linear algebra.  The "Booklet" is composed of
"pamphlets" (not the same concept as a chapter but that's close
enough).

Suppose you have an Axiom system. If you receive a Booklet you can
"drag and drop" the Booklet onto the system. It decomposes the Booklet
into Pamphlets, follows the references to pick up required pamphlets,
compiles the code, expands the user documentation into the proper
format, sets up example files for use, runs test cases to ensure that
functions work, adds the documentation to the theory tree, and washes
the dirty dishes.

Booklet format or Pamphlet format would be the standard format
for submission to an "Axiom Journal". This journal would allow
people to test code that was submitted with the theory. After all,
we expect Physics and Chemistry experiments to be reproduced and
validated; why not Computational Mathematics?

Booklets can be composed from a running system in (at least) two
directions.

First, you compose a set of Pamphlet files "across the system" so that
you could document, say, all of the matrix facilities currently
available.

Second, you compose a set of Pamphlet files "thru the system" so that
you could document, say, the integration mechanism from the top level
function all the way to the implementation details.

Thus you can insert and extract Booklets with Axiom making it easier
to share knowledge.

Future:

Linear Algebra Booklet
|
|-> NullSpace.pamphlet
|   |   |   |
|   |   |   -> notangle -> nullspace.spad ->
|   |   -----> noweave  -> nullspace.tex  -> latex -> read
|   |--------> userdocs    -> update Axiom's user documentation
|   |--------> testcase    -> run test cases
|   |--------> examples    -> input files
|   |--------> textbook    -> update Axiom's current textbook
|   |--------> proofs      -> ACL2, MetaPRL files
|-> Pivots.pamphlet
......

Huge dream, I realize, but except for the dishes, I see no technical
reason why it can't be done.

This implies, of course, that Pamphlets can be decomposed into a
finer level of detail which is still under development.

=================
RE: PAMPHLET FILES AND THE NEAR TERM
=================

All of which implies a huge amount of work. It would be great
to have a front-end that supported both the current and future
directions.

RE: NOWEB CHANGES

Currently noweb needs to extend the chunk definition syntax
to handle some more general scheme such as a URL. We need to
be able to extract code chunks from other pamphlets so that
you can have the following situation:

pamphlet A:  (the definition document)
...
<<foo>>=
...

pamphlet B:  (the using document)
...
<<pamphlet:/path/A#foo>>
...
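
A sketch of how a tool might take such a reference apart, in Python.
Nothing here exists in noweb today. The "pamphlet:" prefix and the "#"
separator are simply read off the example above, and ChunkRef and
parse_chunk_ref are hypothetical names.

```python
from typing import NamedTuple, Optional


class ChunkRef(NamedTuple):
    pamphlet: Optional[str]  # path of the defining pamphlet, None if local
    chunk: str               # chunk name inside that pamphlet


PREFIX = "pamphlet:"  # assumed scheme name, taken from the example above


def parse_chunk_ref(name):
    """Split the text between << and >> into (pamphlet path, chunk name)."""
    if not name.startswith(PREFIX):
        return ChunkRef(None, name)  # ordinary use of a chunk in this file
    path, sep, chunk = name[len(PREFIX):].rpartition("#")
    if not sep or not path or not chunk:
        raise ValueError(f"malformed pamphlet reference: {name!r}")
    return ChunkRef(path, chunk)


remote = parse_chunk_ref("pamphlet:/path/A#foo")
local = parse_chunk_ref("foo")
```

A tangle step could then look up remote.chunk in the pamphlet found at
remote.pamphlet and fall back to the current file when the pamphlet
field is None.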

It would be useful if this could happen for text blocks also
so that generally useful descriptions could be inserted into
multiple pamphlets. Since the text blocks currently have
no label this becomes problematic. We need to develop text
labels so we can follow a uniform scheme. Multiple text blocks
containing essentially the same information already exist in
the system. This needs to be fixed.

For larger references (e.g. whole pamphlets) I'm currently
using the bibliography environment. However, I plan to have
a new LaTeX tag, say PAMPHLETREFS, that has a bibtex-like
reference set. Tags in this environment point to other
pamphlet files. Perhaps the "URL syntax" proposed above
could use a \pcite{} tag instead:

pamphlet A:  (the definition document)
...
<<foo>>=
...

pamphlet B:  (the using document)
...
<<\pcite{3}{foo}>>
...

Anybody who understands bibtex and would like
to take a shot at this is welcome.

RE: TEXMACS CHANGES

Currently TeXmacs could take the following steps, probably as
a joint effort, to support Axiom:

1) Recognize noweb format
2) Integrate commands to notangle and noweave
3) Possibly either support
   a) folding out code
   b) notangle, noweave to "dependent" buffers
   c) backport changes to "dependent" buffers to the original document
   d) possibly all of the above
4) Integrate noweb.sty
   Eventually this will evolve into Axiom.sty as we need to add
   more LaTeX macros, like \begin{theorem}, \begin{userdoc},
   \begin{pamphletrefs}, etc.

Perhaps we can lay out a more detailed plan that includes various
steps we can all work on.

I'm willing to help with any steps taken in this direction.
Feedback is welcome.

Tim

------------------------- forwarded note ---------------------------
On Saturday, November 23, 2002 3:19 AM Joris van der Hoeven
> ...
> Well, as I understand it, the pamphlet format is
> a LaTeX with special escape sequences for dealing
> with code or other special markup. Therefore,
> I think that the best way of importing such files
> is to first convert it to standard LaTeX
> (with possible pamphlet-specific commands),
> with a language like Perl, and next convert
> the result to TeXmacs using the standard input filter.

Yes and no. Tim, please correct me if I make a mistake
here...

The pamphlet format is really noweb input format. As
Norman Ramsey defines it, the input to noweb is quite
language independent and very simple. noweb is a
simplified version of Knuth's web ("no" for Norman,
I guess). All we have are named "code" chunks e.g.

<<name>>= ... <<othername>> ... @

which may reference other code chunks, e.g.
<<othername>> above, embedded in a text stream. There
are two primary operations to be done on this file. One is
"weave" which extracts just the text stream (no code)
and the other is "tangle" which expands a given code
chunk (by default starting with the root chunk <<*>>=)
by including all of the other code chunks referenced
in that chunk, recursively. It is possible to
generate different results from the same input file
by specifying a different root for tangle.

It is true that the text stream is usually LaTeX
code but I don't think that is a requirement of
noweb. The code chunks can also be in any language.

I believe Tim Daly defined the term "pamphlet" to
refer to the noweb input files that he is using in
the open source axiom project. These will (I presume)
always have a LaTeX text stream part plus code
chunks in several different languages: makefile
script, C, lisp, SPAD (axiom specific), etc. I think
Tim has in mind also using such pamphlet files to
exchange axiom code between users.

And of course we also plan to use TeXmacs as
a front-end to axiom itself as a high level user
interface capable of entering and displaying
mathematics in a rich graphics format.

So when importing a pamphlet file into TeXmacs,
it is desirable to interpret the text stream part
of the input file as LaTeX and convert it
appropriately, but it is also important to retain
the code chunks in their place in the original
file. What I was suggesting below was that it
seemed natural to me to treat these chunks as
"folded" into the TeXmacs document. That way,
when the folds are collapsed (closed), the
document would have the appearance of LaTeX applied
to the weave output and would print that way. But
one could open a folded code chunk and edit it.
The only new thing would be expanding code chunks
during a "tangle" export. This could be done
easily just by extracting all code chunks and
then calling notangle.

>
> > Perhaps it would be nicer if TeXmacs was able to
> > expand and collapse folds on demand. It is not
> > really clear to me how folding is intended to
> > work in TeXmacs. I wasn't able to find any
> > documentation about it and my experiments with
> > it so far have not produced a clear picture.
> > Perhaps it is still largely in the planning stage?
>
> Yes, this will be dealt with sometime next year.
>

Would you be interested in having someone (me) help
to accelerate that schedule? Are there other people
interested in the "fold" concept?

> > ...
> > Perhaps it would help to be able to look at some
> > existing styles that do something similar to what
> > we want. What would you recommend?
>
> I think that we first need to know what you already
> have.

There are LaTeX "styles" and TeXmacs "styles". These
are different, right? So far I think Tim has made
use of only relatively standard LaTeX style
files.

The reason I mentioned TeXmacs styles is because
that is the only thing that I could find at
this time that interacts with how folded text is
displayed. Perhaps that is not the way you intend
to go with folds?

> Also: how much documentation does already exist
> in the pamphlet format?
>

We are only at the beginning of the project. Did
you have in mind some other format?

Regards,
Bill Page.