[Axiom-developer] rhxtangle
From: Ralf Hemmecke
Subject: [Axiom-developer] rhxtangle
Date: Sun, 06 Aug 2006 02:04:59 +0200
User-agent: Thunderbird 1.5.0.5 (X11/20060719)
Hi Gaby,
it turned out that it is not entirely easy to mimic the behaviour of
notangle, in particular if multiline chunks have to be indented correctly.
Below you find rhxtangle.pl.pamphlet. I hope you find it useful for dropping
the initial noweb dependency in the configuration phase of Axiom.
Ralf
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% rhxtangle.pl
% Copyright (C) 06-Aug-2006 Ralf Hemmecke <address@hidden>
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% noweave -delay rhxtangle.pl.pamphlet > rhxtangle.tex
% latex rhxtangle.tex
% makeindex rhxtangle
% latex rhxtangle.tex
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass{article}
\usepackage{axiom}
\usepackage{noweb}
\usepackage{makeidx}
\makeindex
\usepackage{hyperref}
\newcommand{\file}[1]{\textsf{#1}}
\newcommand{\email}[1]{\url{#1}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Definition taken from allprose.sty
\makeatletter
\def\xnamedefstyle{\textsc}
address@hidden
address@hidden
address@hidden@\xnamedefstyle{#1}]}}
address@hidden
\expandafter\def\csname x#1\endcsname{\@@xnamedef{#1}{#2}{#3}}}
\def\@@xnamedef#1#2#3{%
address@hidden
\defineterm{#2}%
\footnote{\href{#3}{\useterm{#2}: \url{#3}}}%
\expandafter\gdef\csname !x#1\endcsname{}%
}{\useterm{#2}}}
\def\rhxterm{%
address@hidden
\def\useterm##1{##1}%
\def\defineterm##1{##1}%
}
\makeatother
\IfFileExists{rhxterm.sty}{\usepackage{rhxterm}}{\rhxterm}
\xnamedef{Automake}{http://www.gnu.org/}
\xnamedef{Axiom}{http://www.axiom-developer.org/}
\xnamedef{Noweb}{http://www.eecs.harvard.edu/~nr/noweb}
\xnamedef{Perl}{http://www.perl.org}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\title{A Poor Man's NoTangle}
\author{Ralf Hemmecke}
\begin{document}
\maketitle
\begin{abstract}
In order to drop the dependency of the configuration step of
\xAxiom{} on \xNoweb{} we present a \xPerl{} program that basically
behaves like a simple version of the \texttt{notangle} program from
\xNoweb{}.
\end{abstract}
\tableofcontents
\section{Introduction}
\xAxiom{} is written in a literate programming style using \xNoweb{}.
That means that (nearly) all source files of \xAxiom{} are actually
\LaTeX{} files with additional code chunks of the form
\begin{verbatim}
@<<code chunk name@>>=
some code comes here
@@
\end{verbatim}
These special files are known to \xAxiom{} developers as
\defineterm{pamphlet} files and come with the extension
\texttt{.pamphlet}. The source file of this text is one example of a
\useterm{pamphlet} file.
Since the configuration files for \xAxiom{} should also be written as
\useterm{pamphlet} files, it would be nice to depend on as few
prerequisites as possible. \xNoweb{} is not installed by default on
every computer. We replace the dependency on \xNoweb{} by a dependency
on \xPerl{}, since \xAutomake{} from the GNU Autotools depends on
\xPerl{}, too, and writing the script is relatively easy.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{What do we want?}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The script should be called via
\begin{verbatim}
perl rhxtangle.pl file.ext.pamphlet > file.ext
\end{verbatim}
in order to tangle the code in the same way \texttt{notangle} does.
The output of \texttt{rhxtangle} and \texttt{notangle} should be
identical modulo the translation of tabs to spaces done by
\texttt{notangle} and modulo spaces at the end of a line.
Currently \texttt{rhxtangle} does not translate tabulators to spaces,
but it does remove trailing spaces and tabulators.
In order to keep our script simple, we impose some restrictions on the
file format.
\begin{enumerate}
\item We only accept one file on the command line and no options.
\item Code chunks \emph{must} be ended by an \verb'@' sign in the
first column.
\item Inside code chunks an \verb'@' sign is forbidden in the first
column.
\item Double square brackets may not appear inside double angle
brackets.
\item A code chunk name may not contain \verb'@<<' or \verb'@>>', not
even if they are escaped by an \verb'@' sign.
\item Double angle brackets, i.e. \verb'@<<', that are not meant to
start a code chunk use \emph{must} be escaped by \verb'@'.
\item No translation of initial whitespace takes place, so it is
important for a Makefile to already contain tabulator characters at
the corresponding places.
\end{enumerate}
The last item is a major difference between our implementation and
\texttt{notangle}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{How to extract code chunks?}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The input file can basically be seen as a collection of code chunks.
Our script works in two passes.
<<*>>=
<<global variables>>
<<read the code chunks from stdin>>
<<write code chunks to stdout>>
@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Reading the code chunks from standard input}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We are going to store every chunk into a hashtable that is indexed by
the chunk name.
<<global variables>>=
%chunks = ();
$chunkname = "";
@
%$
Our script skips everything that is not between an initial code chunk
definition of the form \verb'@<<chunk definition@>>=' and the following
\verb'@' in the first column.
<<read the code chunks from stdin>>=
while (<>) {
chomp; # strip off trailing newline character
if (/^@<<(.+)@>>=\s*$/) { #chunk definition
$chunkname = $1;
if (! defined($chunks{$chunkname})) {$chunks{$chunkname} = []}
next;
}
if (/^@/) {$chunkname = ""; next} #chunk end
if ($chunkname eq "") {next} #skip non-code-chunk lines
push @{$chunks{$chunkname}}, $_; #append $_ at the end of the list
}
@
%$
After executing the above code, the hash variable \verb'%chunks' %$
contains all the code chunk lines, where lines that came from
chunks with identical names have already been joined.
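The joining of equally named chunks can be tried out in isolation.
The following is only a minimal sketch under some assumptions: the
input comes from an in-memory list instead of standard input, the
chunk names \texttt{a} and \texttt{b} are invented, and the chunk
delimiters are built from single characters so that the listing itself
contains no double angle brackets.

```perl
use strict;
use warnings;

# Delimiters are built from single characters so that this listing itself
# contains no double angle brackets.
my ($open, $close) = ('<' x 2, '>' x 2);

# Hypothetical input lines: chunk "a" is defined twice, chunk "b" once.
my @input = (
    'some documentation',
    $open . 'a' . $close . '=',
    'a1',
    '@',
    'more documentation',
    $open . 'b' . $close . '=',
    'b1',
    '@',
    $open . 'a' . $close . '=',
    'a2',
    '@',
);

my %chunks;
my $chunkname = '';
for my $line (@input) {
    if ($line =~ /^\Q$open\E(.+)\Q$close\E=\s*$/) {    # chunk definition
        $chunkname = $1;
        $chunks{$chunkname} = [] unless defined $chunks{$chunkname};
        next;
    }
    if ($line =~ /^\@/) { $chunkname = ''; next }       # chunk end
    next if $chunkname eq '';                           # documentation line
    push @{ $chunks{$chunkname} }, $line;               # collect code line
}

for my $name (sort keys %chunks) {
    print $name, ': ', join(' | ', @{ $chunks{$name} }), "\n";
}
```

The sketch prints \texttt{a: a1 | a2} followed by \texttt{b: b1}, i.e.
both definitions of chunk \texttt{a} end up in one list, in the order
in which they appear in the input.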
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Writing the code chunks to standard output}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Writing out the code is done recursively starting with the code chunk
\verb'@<<*@>>'. Since [[printCodeChunk]] does not add a final newline,
we do it afterwards explicitly and thus trigger a flushing of an
internal buffer.
<<write code chunks to stdout>>=
&printCodeChunk('*', '');
&printout("\n");
<<printCodeChunk>>
@
The function \verb'printCodeChunk' takes the name of the chunk and the
amount of whitespace that must be printed after each newline in a
multiline code chunk.
If \texttt{notangle} sees the use of a code chunk in a line, it
replaces that use with the text from that code chunk, where a trailing
newline in the code chunk text is removed.
For single line chunk text the replacement is simple. For multiline
text, \texttt{notangle} adds spaces after a newline so that the chunk
is basically shifted as a whole.
If the first \verb'<' of the code chunk use starts in column $n$ and
the corresponding code chunk represents multiple lines, then after
each newline $n$ spaces are added.
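The column rule can be illustrated by a small recursive expander. This
is a sketch of the rule as stated above, not the code of
\texttt{notangle} or \texttt{rhxtangle}: the function name
\texttt{expand}, the chunk table and the chunk name \texttt{body} are
invented for illustration, tabulators are assumed not to occur, and
the escape mechanism is ignored.

```perl
use strict;
use warnings;

# Delimiters are built from single characters so that this listing itself
# contains no double angle brackets.
my ($open, $close) = ('<' x 2, '>' x 2);

# Hypothetical chunk table: the root chunk uses "body" in column 2.
my %chunks = (
    '*'  => ['begin', '  ' . $open . 'body' . $close . ';', 'end'],
    body => ['one', 'two'],
);

# Expand chunk $name whose use starts in column $col.
# No newline is added after the last line of the chunk.
sub expand {
    my ($name, $col) = @_;
    my @out;
    for my $orig (@{ $chunks{$name} }) {
        my $line = $orig;              # copy, the chunk table stays intact
        my $text = '';
        while ($line =~ /^(.*?)\Q$open\E(.+?)\Q$close\E(.*)$/) {
            my ($pre, $use, $rest) = ($1, $2, $3);
            $text .= $pre;
            # Column of this use, measured on the current output line.
            my $n = $text =~ /\n([^\n]*)\z/
                  ? length($1)
                  : $col + length($text);
            $text .= expand($use, $n);
            $line = $rest;
        }
        push @out, $text . $line;
    }
    # Every line after the first one is shifted by $col spaces.
    return join("\n" . ' ' x $col, @out);
}

print expand('*', 0), "\n";
```

The use of \texttt{body} starts in column 2, so the second line of
that chunk is shifted by two spaces: the sketch prints the four lines
\texttt{begin}, \texttt{one} and \texttt{two;} (both indented by two
spaces) and \texttt{end}.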
Let us take the following file, which we name \file{example.nw}.
<<example.nw>>=
@<<*@>>=
Text1
@<<C1@>>@<<C1@>>Text2
Text3
@@
@<<C1@>>=
TextC11
@<<C2@>>@<<C2@>>
TextC12
@@
@<<C2@>>=
TextC21
TextC22
@@
@
Running
\begin{verbatim}
notangle example.nw > ex.1
\end{verbatim}
yields the following output.
\begin{verbatim*}
Text1
TextC11
TextC21
TextC22TextC21
TextC22
TextC12TextC11
TextC21
TextC22TextC21
TextC22
TextC12Text2
Text3
\end{verbatim*}
For a Makefile one usually adds the command line switch \verb'-t8'.
The command
\begin{verbatim}
notangle -t8 example.nw > ex.2
\end{verbatim}
yields the following text, where we have replaced tabulator characters
by arrows in order to show them here. (Note that \emph{no} tabulator
character appears in the third line.)
\begin{verbatim*}
Text1
TextC11
TextC21
|------> TextC22TextC21
|------> TextC22
TextC12TextC11
|------> TextC21
|------> TextC22TextC21
|------>|------> TextC22
|------> TextC12Text2
Text3
\end{verbatim*}
Our program behaves differently. Instead of adding a fixed number of
spaces, \texttt{rhxtangle} concatenates the initial whitespace that
appears in the input lines and only adds spaces to account for the
position of the second use of \verb'@<<C1@>>' in the file \file{example.nw}.
Executing
\begin{verbatim}
perl rhxtangle.pl example.nw > ex.3
\end{verbatim}
results in a file that is identical to \file{ex.1}.
In the function [[printCodeChunk]] we use an auxiliary function
[[printout]] which delays the actual output and removes the escape
character \verb'@' and trailing spaces.
<<printCodeChunk>>=
<<printout>>
@
The function [[printCodeChunk]] is called recursively for each use of
a code chunk inside a line.
<<printCodeChunk>>=
sub printCodeChunk {
my($chunkname, $indentation) = @_;
my($nextIndentation, $chars, $rest) = ($indentation, "", "");
my($chunkNameUse) = ("");
my(@lines, $line);
if (! defined($chunks{$chunkname})) {
print STDOUT "\nrhxtangle: Undefined chunkname @<<$chunkname@>>\n";
die "rhxtangle: Undefined chunkname @<<$chunkname@>>\n";
}
@lines = @{$chunks{$chunkname}};
if (scalar(@lines) == 0) {return}
<<handle first of the @lines array and prepare for next>>
while (scalar(@lines) > 0) { #at least one more line left
&printout("$line\n$indentation"); #print leftover from last round
<<handle first of the @lines array and prepare for next>>
}
&printout($line); #print leftover from last round
}
@ %def printCodeChunk
%$
Note that there is no newline printed at the end of [[printCodeChunk]].
The following code chunk treats exactly one input line.
It scans for code chunk uses and replaces them by the corresponding
text by recursively calling the function [[printCodeChunk]].
<<handle first of the @lines array and prepare for next>>=
$line = shift @lines;
$nextIndentation = $indentation;
while ($line =~ /^(.*?)(.)?@<<(address@hidden)@>>(.*)/) {
$chars = "$1$2";
$chunkNameUse = $3;
$rest = $4;
&printout($chars);
<<replace non-tabulator characters in 'chars' by spaces>>
$nextIndentation .= $chars;
if ($2 eq "@") { # the @<< is escaped --> no chunk use
&printout("@<<");
$nextIndentation .= " "; # for @<<
$line = "$chunkNameUse@>>$rest";
next;
}
$chars = "@<<$chunkNameUse@>>";
<<replace non-tabulator characters in 'chars' by spaces>>
&printCodeChunk($chunkNameUse, $nextIndentation);
$nextIndentation .= $chars;
$line = $rest;
}
@
%$
<<replace non-tabulator characters in 'chars' by spaces>>=
$chars =~ s/[^\t]/ /g;
@
%$
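The effect of this substitution on an indentation prefix is easy to
check by hand. The following sketch uses an invented prefix; keeping
the tabulator while blanking the other characters means that the
continuation lines keep a leading tabulator, which matters for a
Makefile.

```perl
use strict;
use warnings;

# Hypothetical prefix of a line in front of a chunk use: one tabulator
# followed by six other characters.
my $chars = "\tfoo = ";

# Copy the prefix, then blank out everything except tabulators.
(my $indent = $chars) =~ s/[^\t]/ /g;

# Show the result with tabulators made visible.
(my $visible = $indent) =~ s/\t/\\t/g;
print "[$visible]\n";
```

The resulting indentation consists of the original tabulator followed
by six spaces, so it has the same width as the prefix under any
tabulator setting.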
The [[printout]] function first buffers the strings that it gets as
input until a newline character is detected. The newline character
triggers the actual output.
Before the line is actually written to standard output, trailing
whitespace is removed and escape sequences are resolved.
<<global variables>>=
$printoutBuffer = '';
@
<<printout>>=
sub printout {
my($str) = @_;
while ($str =~ /(.*?)\n(.*)/) {
$printoutBuffer .= $1;
$str = $2;
<<flush printoutBuffer>>
print "\n";
}
$printoutBuffer .= $str;
}
@ %def printout
<<flush printoutBuffer>>=
$printoutBuffer =~ s/\s*$//;
$printoutBuffer =~ s/@(@<<|@>>)/$1/g;
print $printoutBuffer;
$printoutBuffer = '';
@
%$
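The two substitutions of the flushing step can be tried on a single
buffer. This is a sketch with an invented buffer content; the
delimiters are again built from single characters and the pattern uses
\texttt{<\{2\}} and \texttt{>\{2\}} so that the listing contains no double
angle brackets.

```perl
use strict;
use warnings;

# Delimiters are built from single characters so that this listing itself
# contains no double angle brackets.
my ($open, $close) = ('<' x 2, '>' x 2);

# Hypothetical buffer content: an escaped chunk use and trailing whitespace.
my $buffer = 'x = @' . $open . 'name@' . $close . ";  \t";

$buffer =~ s/\s*$//;                  # drop trailing spaces and tabulators
$buffer =~ s/\@(<{2}|>{2})/$1/g;      # remove the escaping at sign

print "[$buffer]\n";
```

Afterwards the buffer holds the text without the two escaping at signs
and without the trailing spaces and the tabulator.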
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Tests}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Testing \texttt{rhxtangle} basically means to compare its output with
the output of \texttt{notangle}. There should be only spacing
differences for relevant files.
So we produce below a short script that could be extracted via
\begin{verbatim}
notangle -Rrhxtangletest.pl rhxtangle.pl.pamphlet > rhxtangletest.pl
\end{verbatim}
and run via
\begin{verbatim}
perl rhxtangletest.pl
\end{verbatim}
That program lists the relevant files and compares the output with that
of \texttt{notangle}.
In fact, if the function [[tab2spc]] were applied in the function
[[printout]] before actually writing to stdout, the script
\texttt{rhxtangle} would be closer to \texttt{notangle} called without
the option \verb'-t8'.
We believe however that \texttt{rhxtangle} is good enough as a ``poor
man's replacement for \texttt{notangle}''.
<<rhxtangletest.pl>>=
<<tab2spc>>
@files=`find . -name '*.pamphlet'`;
for $f (@files) {
chomp $f;
$f =~ s/\.pamphlet$//;
if ($f =~ /\.bib$/) {next}
#print ":: $f\n";
if ($f =~ /Makefile/) {$opt="-t8"} else {$opt=''}
@no = `notangle $opt $f.pamphlet`;
@rh = `perl rhxtangle.pl $f.pamphlet`;
if (scalar(@no) != scalar(@rh)) {
print "Different number of lines. [$f]\n";
}
while (scalar(@no) > 0) {
$n = shift @no; $n =~ s/\s*$//;
$r = shift @rh; $r =~ s/\s*$//;
if ($n ne $r) {
$nn = &tab2spc($n); $n =~ s/[\t]/_/g;
$rr = &tab2spc($r); $r =~ s/[\t]/_/g;
if ($nn ne $rr) {
$nn =~ s/[\t]/_/g;
$rr =~ s/[\t]/_/g;
print "[[[$f]]]\n";
print "n $nn\n";
print "r $rr\n";
}
}
}
}
@
<<tab2spc>>=
sub tab2spc {
my($s) = @_;
my($p) = index($s, "\t");
while($p != -1) {
# $q = (8 - ($p % 8)); print "$q <-- $p\n";
$sp = ''; for (1 .. (8 - ($p % 8))) {$sp .= " "}
$s =~ s/\t/$sp/;
$p = index($s, "\t");
}
$s;
}
@ %def tab2spc
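The tab stop arithmetic can be checked on a short string. The
following sketch is a variant of the idea and not the function used in
the test script: the name \texttt{expand\_tabs} and the width
parameter are assumptions made for illustration, whereas
[[tab2spc]] uses a fixed width of 8.

```perl
use strict;
use warnings;

# Replace each tabulator by the number of spaces needed to reach the
# next multiple of $width (hypothetical helper, width is a parameter).
sub expand_tabs {
    my ($s, $width) = @_;
    while ((my $p = index($s, "\t")) != -1) {
        # A tabulator at position $p advances to the next tab stop.
        substr($s, $p, 1) = ' ' x ($width - $p % $width);
    }
    return $s;
}

my $expanded = expand_tabs("a\tbc\td", 8);
print length($expanded), "\n";
```

In the input the first tabulator sits at position 1 and becomes 7
spaces, so \texttt{bc} starts in column 8. The second tabulator then
sits at position 10 and becomes 6 spaces, so \texttt{d} lands in
column 16 and the result has length 17.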
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% We want the name of the section and the hypertarget command on the
% same page, but \printindex issues \clearpage. Thus we do it by hand
% here and redefine \clearpage.
{\let\rhxclearpage\clearpage%
\clearpage%
\renewcommand{\clearpage}{\def\clearpage{\rhxclearpage}}%
\hypertarget{sec:Index}{}%
\printindex
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}