Christmas wish: Literate Elisp

From: arthur miller
Subject: Christmas wish: Literate Elisp
Date: Thu, 12 Dec 2019 15:45:50 +0000


I have a question/proposal about Elisp and literate programming.
Well, more of a proposal than a question, but here it is:

From Wikipedia: "Literate programming is a programming paradigm
introduced by Donald Knuth in which a computer program is given
an explanation of its logic in a natural language, such as English,
interspersed with snippets of macros and traditional source code,
from which compilable source code can be generated."

Emacs already supports a form of literate programming in the form of
org mode and babel, where we can insert code for programming
languages between #+BEGIN_SRC and #+END_SRC markers,
which is a super nice and cool feature.

However, it occurred to me that Lisps (Lisp-like languages) have natural
code markers, since all code is enclosed in parentheses. Thus one
could see '(' and ')' as code markers in literate-programming style,
more as Knuth proposed. In other words, Lisp (or at least Elisp)
does not need special markers to denote the start and end of code. Unlike
Haskell, there is no need to use '\begin{code}' or '>' to differentiate
code from text (comments).

My proposal is to slightly change the Elisp parser so that a line whose first
non-whitespace character is any printable character other than '(' is treated
as a comment and simply ignored, just as text after ';' is ignored today. Code
blocks would still be parsed as they are now, and ';' would still start a
comment wherever it is encountered; it is just that anything that does not
belong to a code block (a list) is also a comment. For example, consider this
mail: if it were fed to the parser, all lines up to this point would simply be
ignored, since they are either blank or do not start with '('.

                (while (very-cool))

Then I could have Elisp code in between and continue to write this mail,
and later on just use it as source code. Wouldn't that be a step
toward a true and cooler literate programming language? Below is another
snippet of imaginary Elisp. If we could do this, then this email would be
working literate Elisp, where those two snippets are code and the text of
this mail is simply ignored.

                ; this-is-some-other-fun

What would this achieve

More than a highly increased coolness factor, it would be a small
quality-of-life improvement. For example, it would make it slightly
easier to use org-mode for Elisp programming. Those of us who use
org-mode to structure our Emacs init file could just load the org
file directly, instead of using babel to tangle it into an Elisp file first.
If every printable character but '(' starts a comment line, then everything
in the org file except the Elisp code would simply be ignored, and only the
Elisp executed.
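
To make the org case concrete, here is a minimal sketch of the rule, written in
Python purely for illustration (the real change would live in the Emacs reader,
not in Python). It rewrites a literate file into the plain Elisp it would be
equivalent to, by putting ';; ' in front of every line the rule would ignore.
The names paren_delta and to_plain_elisp are made up for this example, and it
assumes a simplified syntax: strings and ';' comments are respected, but
character literals such as ?\( are not handled.

```python
def paren_delta(line, in_string):
    """Return (net change in paren depth, whether a string is still open)."""
    delta = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 1              # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            break                   # rest of the line is an ordinary comment
        elif ch == "(":
            delta += 1
        elif ch == ")":
            delta -= 1
        i += 1
    return delta, in_string


def to_plain_elisp(org_text):
    """Prefix every line outside a list with ';; ' and keep code lines as-is."""
    out = []
    depth = 0
    in_string = False
    for line in org_text.splitlines():
        if depth == 0 and not line.lstrip().startswith("("):
            # Prose, org headings, keywords: ignored under the proposed rule.
            out.append(";; " + line if line.strip() else line)
        else:
            delta, in_string = paren_delta(line, in_string)
            depth = max(depth + delta, 0)
            if depth == 0:
                in_string = False
            out.append(line)
    return "\n".join(out)


org = "\n".join([
    "#+TITLE: My init file",
    "* Appearance",
    "Turn off the tool bar; I never use it.",
    "(tool-bar-mode -1)",
    "",
    "* Editing",
    "(setq-default indent-tabs-mode nil",
    "              fill-column 80)",
])
print(to_plain_elisp(org))
```

In this sketch the heading, keyword and prose lines come out commented, while
the two forms pass through unchanged, including the continuation line of the
second form. One caveat it makes visible: a prose line that happens to begin
with '(' would be taken as code.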

If we think the other way around, it would also let us use pure Elisp for literate
programming without org-mode at all, although it would be a cool feature to
use org headings and similar to structure the code. It might make code more
structured and thus more readable. When I think in terms of Elisp as a starting
point rather than in terms of org-mode as a starting point, that could result in
adding org-mode organizational features directly to Elisp. One could even mark,
say, not-yet-implemented functions as TODO items, use the calendar, etc.

We could also embed other languages within pure Elisp code without using org
mode at all, either within some code markers for extracting them to
separate files, or without code markers, just as documentation or whatever. I
don't have a better example of a use case at the moment.

I don't mean that tangling files is incredibly slow, but it would be
slightly more efficient to process Elisp embedded in an org file directly. I also
don't think it is hard to type ';' at the beginning of a line to start a comment line.
But it is a small convenience, and thus a quality-of-life improvement, that probably
does not need many changes to the parser but has quite a dramatic effect on how
source code looks to the human eye (at least mine, if you don't mind that I count
myself as part of the species :-)). It would let us use org-mode as a standard
Elisp source code format, which might be just a perceived convenience rather
than some really useful extra thing that does not exist yet.

Some thoughts about implementation

In terms of cost effectiveness, with implementation in mind, it
probably isn't that much work to implement this, but honestly I have no idea.
I believe it can't be much work, but I am not sure, so I should really put an
exclamation mark after the word "probably" in the paragraph above. Feel free to
educate me about the cost of making it work. I looked in the C source to see if I
could test this myself before posting here, but I couldn't find where the parser
is implemented. I am sorry, but I am that bad :-(.

Essentially, when parsing literate Elisp, if I may call it so, what the
parser has to do is simply not flag random printable characters, on lines
that do not belong to code blocks, as errors in the source code. Instead it should
just treat them as if it had seen a ';'.

It means there are just two classes of printable characters: '(', which opens
a code block, and everything else, which opens a comment block. Well, almost.
Parsing of code blocks would not need to change at all: ';' in code blocks
would still mean that the rest of the line is a comment, and all existing code
with comments would continue to work as it does now. It would only affect new
code that is written in this style. However, such new Elisp code wouldn't be
backward compatible with old versions of Emacs.
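
As a rough illustration of those two character classes, here is a minimal
sketch of a reader loop, written in Python only to make the idea concrete (it
is not how the Emacs reader is implemented, which I have not studied). The
names scan_form and extract_forms are hypothetical. It assumes that whatever
follows a complete top-level form on the same line is ignored, and it does not
handle character literals such as ?\(.

```python
def scan_form(text, start):
    """Return the index just past the list that opens at text[start]."""
    depth = 0
    in_string = False
    i, n = start, len(text)
    while i < n:
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 1                      # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            nl = text.find("\n", i)         # ordinary comment: skip to end of line
            i = n if nl == -1 else nl
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced form starting at offset %d" % start)


def extract_forms(text):
    """Collect top-level forms; every other line is treated as a comment."""
    forms = []
    i, n = 0, len(text)
    while i < n:
        line_end = text.find("\n", i)
        if line_end == -1:
            line_end = n
        j = i
        while j < line_end and text[j] in " \t":
            j += 1
        if j < line_end and text[j] == "(":
            end = scan_form(text, j)        # class 1: '(' opens a code block
            forms.append(text[j:end])
            nl = text.find("\n", end)       # assumption: drop the rest of that line
            i = n if nl == -1 else nl + 1
        else:
            i = line_end + 1                # class 2: anything else is a comment
    return forms


sample = "\n".join([
    "Prose line, ignored by the rule.",
    "    (while (very-cool)",
    '      (message "a ) inside a string")  ; a ( inside a comment',
    "      (step))",
    "More prose (parentheses in mid-line do not matter).",
    "(this-is-some-other-fun)",
])
for form in extract_forms(sample):
    print(form)
```

Only the first non-whitespace character of a line outside a list is inspected;
everything inside a list is scanned as usual, which is why existing code with
';' comments is unaffected in this sketch.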

As an extra, one could keep the current Elisp parser, make a new one and, as
in Haskell, use an extra 'l' in the suffix to denote a literate program: '.lel'.
Though it kind of looks fun, I don't think it would be needed; I don't think this
change needs a different parser to ensure backward compatibility. I don't
think that is very important, since if we want to write Elisp that needs to run on
older versions, we can simply not write it in literate form.


As far as I can think, it would maybe make spotting errors slightly harder; for
example, I could type a random character before an opening parenthesis and
comment out the entire line, but those kinds of errors are easily spotted on the
first run of the code. Another drawback would probably be syntax highlighting. It
could become much harder to detect comments in code, since there is no ';' to mark
a comment line. Maybe I am wrong about this one; it is just a quick thought.

Final thought

I have no idea if somebody else has already thought about this and found that it
can't work. It seems like a very straightforward and simple thing, so I am
probably not the first one to think of it, and there is probably some
reason why it has not been done that I am not aware of. In that case just ignore
this email. It was just a thought I had yesterday. There might be something I am
not aware of that makes this impossible; unfortunately I am not familiar enough
with the Emacs source to try to implement this myself. It is just an idea, a
thought for discussion, but it would be cool if it could work.

I am sorry for the very long email. I hope you have at least somewhat enjoyed my
ramblings and Christmas wishes, and please ask Santa to excuse my
English. I am not a native English speaker; it is my 3rd language, if that is
any defense of my horrible writing.
