Modern conventions for structuring Emacs Lisp libraries

emacs-devel

From:	Thorsten Jolitz
Subject:	Modern conventions for structuring Emacs Lisp libraries
Date:	Sat, 05 Oct 2013 15:31:37 +0200
User-agent:	Gnus/5.130002 (Ma Gnus v0.2) Emacs/24.3 (gnu/linux)
Hi List,

as a second take on this topic, I would like to make the attached
proposal for improving the conventions for structuring Emacs Lisp
source-code files:



               _________________________________________

                MODERN CONVENTIONS FOR EMACS LISP FILES

                            Thorsten Jolitz
               _________________________________________


                            <2013-10-05 Sa>


Table of Contents
_________________

1 Motivation
2 Critique of oldschool conventions
.. 2.1 General Considerations
.. 2.2 Real-world examples
..... 2.2.1 Last Line Pathology
..... 2.2.2 Wasted First Level Weakness
..... 2.2.3 Trailing Colon Weakness
..... 2.2.4 Comment Character as Headline-Level-Signals Weakness (Pathology?)
..... 2.2.5 Free Text Meta-Data Weakness
3 Summary





1 Motivation
============

  The motivation for this proposal is twofold:

  Org-mode everywhere: I use Org-mode all day long for all kinds of tasks
                       and really appreciate the Org way of structuring
                       files (with easy navigation and visibility
                       cycling). I want that for my (Emacs Lisp) source
                       code files too (maybe I'm not the only one).
  Current Pathologies: of old-school conventions. Looking at Emacs sources
                       and other Elisp libraries from the Org-mode
                       point-of-view, a few pathologies of the oldschool
                       conventions for Emacs Lisp are quite obvious, as
                       well as some misunderstandings that result from the
                       weaknesses of those oldschool conventions.

  The idea is to
  1. fix the pathologies of the oldschool conventions
  2. improve some conventions that give bad results in practice
  3. write a library (e.g. modern-conventions.el) that transforms modern
     (org-mode style) elisp files into files in the (fixed) oldschool
     style, so that both styles can be considered equivalent (and are
     accepted for MELPA and other package repos)


2 Critique of oldschool conventions
===================================

2.1 General Considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~

  I would guess that many people either don't really structure their
  elisp files, or they use page breaks in combination with
  outline-minor-mode (folding the function/variable definitions rather
  than folding explicitly defined headlines).

  However, there is this convention that

  ,----
  | ^;;; Text
  `----

  starts a first-level headline, and every additional ';' adds one
  level of depth (e.g. "^;;;;; Text" defines a third-level headline).
  Its use is more or less enforced by the loose conventions for
  writing the header part of an Elisp library.
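
  As a minimal sketch of this level rule (the function name is
  hypothetical and not taken from outline-minor-mode or outshine), the
  headline level can be computed from the number of leading semicolons:

```elisp
;; Sketch only: `my-oldschool-headline-level' is a hypothetical helper.
(defun my-oldschool-headline-level (line)
  "Return the old-school headline level of LINE, or nil if it is no headline.
Three semicolons followed by a space mean level 1; each extra
semicolon adds one level."
  (when (string-match "\\`\\(;;;+\\) " line)
    (- (length (match-string 1 line)) 2)))

(my-oldschool-headline-level ";;; Code:")            ; level 1
(my-oldschool-headline-level ";;;;; Text")           ; level 3
(my-oldschool-headline-level ";; ordinary comment")  ; nil
```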

  Therefore, when outline-minor-mode with [outshine.el] extensions is
  activated in the Emacs Lisp buffer, it can be viewed and treated (wrt
  navigation and visibility cycling) just like an Org-mode buffer that
  is structured with hierarchical headlines.

  I do this all the time, since I tend to use [navi-mode.el] (based on
  outshine) for buffer navigation. So the pathologies and weaknesses of
  the oldschool conventions stare me in the face every day.


  [outshine.el] https://github.com/tj64/outshine

  [navi-mode.el] https://github.com/tj64/navi


2.2 Real-world examples
~~~~~~~~~~~~~~~~~~~~~~~

  [DISCLAIMER: I have no intention to blame the authors of the cited
  libraries for using "bad style" or so, I just need some real-world
  examples for my critique so I simply use arbitrary Emacs source files]


2.2.1 Last Line Pathology
-------------------------

  Elisp files should (and do) end with a line like this:

  ,----
  | ;;; dired.el ends here
  `----

  Considering that "^;;; Text" defines a first level headline, this is a
  clear pathology: this line is *not* a headline, it is just a comment
  signalling 'end-of-file'.
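
  The clash can be sketched in a few lines. The helper names below are
  hypothetical; the point is only that any tool treating "^;;; " as a
  headline has to special-case the footer line:

```elisp
;; Hypothetical helper names; a sketch, not code from any existing library.
(defun my-headline-candidate-p (line)
  "Return non-nil if LINE matches the first-level pattern \"^;;; \"."
  (string-match-p "\\`;;; " line))

(defun my-footer-line-p (line)
  "Return non-nil if LINE looks like the \";;; FILE ends here\" marker."
  (string-match-p "\\`;;; .+ ends here[ \t]*\\'" line))

(defun my-real-headline-p (line)
  "Return non-nil if LINE is a first-level headline but not the footer."
  (and (my-headline-candidate-p line)
       (not (my-footer-line-p line))))

(my-real-headline-p ";;; Code:")               ; non-nil
(my-real-headline-p ";;; dired.el ends here")  ; nil
```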


2.2.2 Wasted First Level Weakness
---------------------------------

  It is customary to start the code section of a library, which begins
  after the header comment section, with this first level headline:

  ,----
  | ;;; Code:
  `----

  This leads to very unbalanced weights of the first level headlines,
  i.e. to bad file structuring.

  ,----
  | 7 matches for "^;;; " in buffer: help.el.gz
  |       1:;;; help.el --- help commands for Emacs
  |      25:;;; Commentary:
  |      30:;;; Code:
  |     280:;;; `User' help functions
  |     967:;;; Automatic resizing of temporary buffers.
  |    1041:;;; Help windows.
  |    1193:;;; help.el ends here
  `----

  The above example, produced by typing '1' in the associated
  *Navi-buffer* of help.el (-> show headlines up-to first-level), shows
  this weakness quite clearly:

  1. in a file with some 1200 lines of code, there are 7 1st-level
     headlines
  2. at most 3 (if not only 2) of 7 headlines should really be 1st-level
     headlines:

  ,----
  | ;;; help.el
  | [;;; Commentary:]
  | ;;; Code:
  `----

  3. Correct structuring would result in a file with two 1st-level
     headlines, the first one containing some 30 lines, the second one
     containing some 1170 lines of code.
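
  A listing like the one for help.el can also be approximated without
  navi-mode. The following is a small sketch (the function name is made
  up for illustration) that collects every line starting with ";;; "
  together with its line number:

```elisp
;; Sketch only: `my-first-level-headlines' is a hypothetical helper.
(defun my-first-level-headlines (text)
  "Return a list of (LINE-NUMBER . HEADLINE) for first-level lines in TEXT.
A first-level line is any line starting with \";;; \"."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let (result)
      (while (re-search-forward "^;;; .*$" nil t)
        (push (cons (line-number-at-pos (match-beginning 0))
                    (match-string 0))
              result))
      (nreverse result))))

(my-first-level-headlines
 ";;; foo.el --- demo\n;; text\n;;; Code:\n(defun foo ())\n;;; foo.el ends here\n")
;; => ((1 . ";;; foo.el --- demo") (3 . ";;; Code:") (5 . ";;; foo.el ends here"))
```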

  Starting the code section with a ";;; Code:" headline completely
  wastes the first level for further file structuring. Therefore my
  proposal:

  Let the first line in the file be a 1st-level-headline:

  ,----
  | ;;; help.el --- help commands for Emacs
  `----

  all other headlines in the header-comment section are subheadlines of
  this one, i.e. level 2 or higher. The next 1st-level-headline then
  implicitly starts the code-section, but *must not* be named ";;;
  Code:" because it is clear from the context that the code section
  starts here:

  ,----
  | 8 matches for "^;;; " in buffer: navi-mode.el
  |       1:;;; navi-mode.el --- major-mode for easy buffer-navigation
  |     246:;;; Requires
  |     251:;;; Mode Definitions
  |     282:;;; Variables
  |     862:;;; Defuns
  |    1672:;;; Menus and Keys
  |    1963:;;; Run Hooks and Provide
  |    1970:;;; navi-mode.el ends here
  `----

  [NOTE: I was forced to reconvert the navi-mode.el file to oldschool
  headlines and to include the pathological last line to be able to add
  it to MELPA -> there should only be 7 headlines in this listing
  really!]

  It does not matter if you structure your file by code criteria
  (Variables, Defuns, Keys ...) or by content criteria, or a mix of
  both, like in this example:

  ,----
  | 41 matches for "^;;; " in buffer: org.el
  |       1:;;; org.el --- Outline-based notes management and organizer
  |      26:;;; Commentary:
  |      63:;;; Code:
  |     278:;;; Version
  |     316:;;; Compatibility constants
  |     318:;;; The custom variables
  |    4121:;;; Miscellaneous options
  |    4152:;;; Functions and variables from their packages
  |    4210:;;; Autoload and prepare some org modules
  |    4578:;;; Variables for pre-computed regular expressions, all buffer local
  |    5216:;;; Some variables used in various places
  |    6486:;;; Cycling
  |    7087:;;; Saving and restoring visibility
  |    7123:;;; Folding of blocks
  |    7213:;;; Org-goto
  |    7416:;;; Indirect buffer display of subtrees
  |    7507:;;; Inserting headlines
  |    7825:;;; Promotion and Demotion ...
  `----

  the important thing is to have a balanced structure with a sufficient
  number of real 1st-level-headlines whose headline-text carries real
  information about the content. You can see above that org.el sticks to
  the

  ,----
  | ;;; Code:
  `----

  convention, but then ignores it immediately: if it were taken
  seriously, all subsequent headlines would have to be 2nd-level.


2.2.3 Trailing Colon Weakness
-----------------------------

  ,----
  | ;;; help.el
  | [;;; Commentary:]
  | ;;; Code:
  `----

  In oldschool Elisp files, headlines often end in colons. In Org-mode,
  this is not usual, and the Org-mode conventions are better in this
  case. Taking into account that with [outorg.el] (based on outshine)
  every Elisp file can be converted to an Org-mode file with a
  keystroke, and subsequently be exported from Org-mode to HTML, LaTeX,
  and many other backends, it becomes clear why this is a bad
  convention: the colon is not needed to signal a headline, and almost
  always looks bad in the exported output formats.


  [outorg.el] https://github.com/tj64/outorg


2.2.4 Comment Character as Headline-Level-Signals Weakness (Pathology?)
-----------------------------------------------------------------------

  It's a bad idea to use the Elisp comment char ';' to signal that:

  - this is a headline
  - this headline has level X

  The otherwise fantastic library dired.el illustrates why:

  ,----
  | 40 matches for "^;;;;?;?;?;?;?;?;? " in buffer: dired.el.gz
  |       1:;;; dired.el --- directory-browsing commands -*- lexical-binding: t -*-
  |      26:;;; Commentary:
  |      35:;;; Code:
  |      37:;;; Customizable variables
  |     193:;;; Hook variables
  |     433:;;;   ;;
  |     434:;;;   ;; Files that are group or world writable.
  |     435:;;;   (list (concat dired-re-maybe-mark dired-re-inode-size
  |     493:;;; Macros must be defined before they are used, for the byte compiler.
  |    1282:;;;  Might as well not override the user if the user changed this.
  |    1283:;;;  (setq buffer-read-only t)
  |    2101:;;; Functions for extracting and manipulating file names in Dired buffers.
  |    2233:;;; Functions for finding the file name in a dired buffer line.
  |    2328:;;; COPY NAMES OF MARKED FILES INTO KILL-RING.
  |    2449:;;; utility functions
  |    3199:;;; Commands to mark or flag files based on their characteristics or names.
  |    3477:;;; Sorting
  |    3631:;;;;  Drag and drop support
  |    3734:;;;;  Desktop support
  |    3779:;;; Start of automatically extracted autoloads.
  |    3782:;;;;;;  dired-do-search dired-do-isearch-regexp dired-do-isearch
  |    3783:;;;;;;  dired-isearch-filenames-regexp dired-isearch-filenames dired-isearch-filenames-setup
  |    3784:;;;;;;  dired-hide-all dired-hide-subdir dired-tree-down dired-tree-up
  |    3785:;;;;;;  dired-kill-subdir dired-mark-subdir-files dired-goto-subdir
  |    3786:;;;;;;  dired-prev-subdir dired-insert-subdir dired-maybe-insert-subdir
  |    3787:;;;;;;  dired-downcase dired-upcase dired-do-symlink-regexp dired-do-hardlink-regexp
  |    3788:;;;;;;  dired-do-copy-regexp dired-do-rename-regexp dired-do-rename
  |    3789:;;;;;;  dired-do-hardlink dired-do-symlink dired-do-copy dired-create-directory
  |    3790:;;;;;;  dired-rename-file dired-copy-file dired-relist-file dired-remove-file
  |    3791:;;;;;;  dired-add-file dired-do-redisplay dired-do-load dired-do-byte-compile
  |    3792:;;;;;;  dired-do-compress dired-query dired-compress-file dired-do-kill-lines
  |    3793:;;;;;;  dired-run-shell-command dired-do-shell-command dired-do-async-shell-command
  |    3794:;;;;;;  dired-clean-directory dired-do-print dired-do-touch dired-do-chown
  |    3795:;;;;;;  dired-do-chgrp dired-do-chmod dired-compare-directories dired-backup-diff
  |    3796:;;;;;;  dired-diff) "dired-aux" "dired-aux.el" "066bb17769887a7fbc0490003f59e4b3")
  |    3797:;;; Generated autoloads from dired-aux.el
  |    4299:;;;;;;  "dired-x" "dired-x.el" "ce753ade80ea9f4e64ab3569e3a5421e")
  |    4300:;;; Generated autoloads from dired-x.el
  |    4336:;;; End of automatically extracted autoloads.
  |    4342:;;; dired.el ends here
  `----

  Typing '8' in the associated *Navi-buffer* (show headlines up to level
  8) shows a wild mix of real headlines and comments. This is because it
  is only natural for people to get creative with comment characters
  [1], because they feel free to use them as they please - it's just
  about comments in the end.

  But with those oldschool conventions, comments and headline syntax
  clash, and people might forget about it or not be aware of it.

  Therefore I propose Org-style (= Outshine) headlines as a modern
  alternative:

  ,----
  | Outshine headlines are commented-out Org-mode headlines
  `----

  for example:

  ,----
  | 18 matches for "^;; \*\*? " in buffer: iorg-scrape.el
  |       1:;; * iorg-scrape.el --- elisp glue code for `picoLisp/lib/scrape.l'
  |       2:;; ** MetaData
  |      17:;; ** Commentary
  |      22:;; ** ChangeLog
  |      24:;; * Requires
  |      31:;; * Mode and Exporter definitions
  |      33:;; ** Mode definitions
  |      84:;; * Variables
  |      85:;; ** Hooks
  |      86:;; ** Vars
  |     111:;; ** Customs
  |     114:;; * Functions
  |     115:;; ** Non-interactive Functions
  |     201:;; ** Commands
  |     855:;; * Menus and Keys
  |     856:;; ** Menus
  |     857:;; ** Keys
  |     946:;; * Run hooks and provide
  `----

  They are more readable (it's much easier to spot the headline level),
  the comment-chars do only their core job - comment out text in code
  buffers -, all the headline related info is contained in the '*'
  chars, and Org-mode users feel right at home in their source-code
  buffers.

  A big plus: this is /major-mode agnostic/. All 3 libraries (outshine,
  outorg and navi-mode) adapt to the specific comment-syntax defined for
  a major-mode, thus the 'outshine-way' of structuring source-code files
  is not restricted to Emacs Lisp files. As an example, here is a
  PicoLisp file (comment-start character # instead of ;):

  ,----
  | 14 matches for "^## \*\*? " in buffer: geometry.l
  |       1:## * geometry.l --- OpenGis Simple Features
  |      15:## ** Commentary
  |      20:## * Spatial Reference System
  |      30:## * Geometry (abstract root class)
  |     113:## * Point
  |     133:## * Curve
  |     184:## ** LineString
  |     230:## * Surface
  |     264:## ** Polygon
  |     310:## * GeometryCollection
  |     336:## ** MultiPoint
  |     356:## ** MultiCurve
  |     402:## ** MultiSurface
  |     472:## * File Local Variables
  `----
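
  To illustrate why this scheme is major-mode agnostic, here is a
  minimal sketch that builds the headline regexp from a comment
  character. It is not outshine's actual implementation; the function
  names are hypothetical, and it simply assumes a doubled
  single-character comment prefix as in the two listings above:

```elisp
;; Sketch only: hypothetical helpers, not outshine's real code.
(defun my-outshine-style-regexp (comment-char)
  "Return a regexp matching headlines like \";; * Text\" for COMMENT-CHAR.
COMMENT-CHAR is a one-character string such as \";\" or \"#\"; it is
assumed to be doubled in front of the Org-style stars."
  (concat "^" (regexp-quote (concat comment-char comment-char))
          " \\(\\*+\\) "))

(defun my-outshine-style-level (line comment-char)
  "Return the headline level of LINE for COMMENT-CHAR, or nil."
  (when (string-match (my-outshine-style-regexp comment-char) line)
    (length (match-string 1 line))))

(my-outshine-style-level ";; ** MetaData" ";")  ; level 2
(my-outshine-style-level "## * Point" "#")      ; level 1
(my-outshine-style-level ";;; Code:" ";")       ; nil, not an Org-style headline
```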

  Conversion between oldschool and outshine headlines would be
  exceedingly easy - as long as the oldschool files don't mix comments
  and headlines, i.e. don't have "invalid" syntax (in a loose sense).
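
  A rough sketch of one direction of such a conversion, for a single
  line, might look as follows (the function name is hypothetical; the
  proposed conversion library does not exist yet):

```elisp
;; Sketch only: `my-oldschool-to-outshine' is a hypothetical helper.
(defun my-oldschool-to-outshine (line)
  "Convert an old-school headline LINE like \";;;; Text\" to \";; ** Text\".
Lines that are not old-school headlines are returned unchanged."
  (if (string-match "\\`\\(;;;+\\) +\\(.*\\)\\'" line)
      (concat ";; "
              ;; Three semicolons become one star, four become two, etc.
              (make-string (- (length (match-string 1 line)) 2) ?*)
              " "
              (match-string 2 line))
    line))

(my-oldschool-to-outshine ";;; Customizable variables")
;; => ";; * Customizable variables"
(my-oldschool-to-outshine ";;;;  Drag and drop support")
;; => ";; ** Drag and drop support"
(my-oldschool-to-outshine ";; plain comment")
;; => ";; plain comment"
```

  Note that a commented-out code line such as
  ";;;  (setq buffer-read-only t)" from dired.el would be turned into a
  headline as well, which is exactly the kind of mixed syntax that makes
  automatic conversion unreliable.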


2.2.5 Free Text Meta-Data Weakness
----------------------------------

  Elisp source files start with a comment-header that gives some
  half-standardized meta-info about the file as well as usage
  instructions for the library. While the usage instructions are very
  library-specific and might even be missing, the meta-data is
  obligatory and should be as harmonized as possible. Package repos use
  special Elisp parser libraries to read out the meta-data from these
  sections.

  This is quite ok, but still much weaker than it could be. The problem
  is that the free text format of the meta-data sections makes it just
  too easy to introduce variations and deviations that are completely
  accidental or just a matter of taste and don't add anything to the
  contained information, but rather make it much more difficult to
  extract and set this information programmatically.

  Let's use help.el as an example for a free-text meta-data section
  (please note how old this file is, and that since these old days many
  styles for writing these sections have evolved):

  ,----
  | ;;; help.el --- help commands for Emacs
  |
  | ;; Copyright (C) 1985-1986, 1993-1994, 1998-2013 Free Software
  | ;; Foundation, Inc.
  |
  | ;; Maintainer: FSF
  | ;; Keywords: help, internal
  | ;; Package: emacs
  |
  | ;; This file is part of GNU Emacs.
  |
  | ;; GNU Emacs is free software: you can redistribute it and/or modify
  | ;; it under the terms of the GNU General Public License as published by
  | ;; the Free Software Foundation, either version 3 of the License, or
  | ;; (at your option) any later version.
  |
  | ;; GNU Emacs is distributed in the hope that it will be useful,
  | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
  | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  | ;; GNU General Public License for more details.
  |
  | ;; You should have received a copy of the GNU General Public License
  | ;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
  `----

  As mentioned before, with 'outorg.el' every well structured Elisp file
  is an Org-mode file too, because 'M-x outorg-edit-as-org' (or M-#
  M-#) presents the file converted from Elisp to Org in a temporary
  Org-mode edit buffer (while 'M-x outorg-copy-edits-and-exit' or M-# in
  the edit buffer converts the edited Org text back to Elisp and copies
  it to the original source-code buffer).

  Thus, Org-mode's well developed functionality for storing, reading and
  writing meta-data is easily available for the Elisp programmer too:
  *property-drawers* and their *Property API*.

  Therefore I propose to introduce a 2nd-level headline

  ,----
  | ;; ** MetaData
  `----

  directly below the first 1st-level-headline of the source file (i.e.
  its first line) and store the file's meta-data in a property-drawer
  attached to this headline:

  ,----
  | ;; * iorg-scrape.el --- elisp glue code for `picoLisp/lib/scrape.l'
  | ;; ** MetaData
  | ;;   :PROPERTIES:
  | ;;   :copyright: Thorsten_Jolitz
  | ;;   :copyright-since: 2013
  | ;;   :version:  0.9
  | ;;   :licence:  GPL3+
  | ;;   :licence-url: http://www.gnu.org/licenses/
  | ;;   :part-of-emacs: no
  | ;;   :git-repo: https://github.com/tj64/iorg
  | ;;   :git-clone: address@hidden:tj64/iorg.git
  | ;;   :authors: Thorsten_Jolitz
  | ;;   :contact: <address@hidden>
  | ;;   :keywords: emacs org-mode picolisp iorg scrape
  | ;;   :END:
  |
  | ;; ** Commentary ...
  `----


  The advantages would be:

  1. With point on the "** MetaData" headline, a single 'M-# M-#' would
     offer this headline for editing in Org-mode in the
     *outorg-edit-buffer*, and Org-mode's functionality for [editing
     properties] (maybe even in [column view]) would be available to the
     Elisp programmer.

     Since it is easy to define /allowed values/ for Org-mode
     properties, for some properties (like licence, licence-url,
     part-of-emacs, keywords) a fixed set of possible values could be
     given, helping to reduce accidental variation even further.

  2. Since a [Property API] for Org-mode's properties exists, reading
     and writing them from an Emacs Lisp program becomes almost trivial.

  3. Recently, ways to export these property drawers were added to the
     new Org-mode exporter, thus the meta-info can easily be exported to
     text-formatting backends like HTML or LaTeX for sharing with
     (presenting to) others. In the *outorg-edit-buffer* full Org-mode
     functionality is available, including Org-mode's export facilities.
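
  To give an idea of how little code is needed once the meta-data is
  structured like this, here is a self-contained sketch that reads such
  a commented property drawer with plain regexps. The function name is
  hypothetical, and in practice one would rather use the Org Property
  API on the converted text; this only shows that the format is
  trivially machine-readable:

```elisp
;; Sketch only: `my-parse-metadata-drawer' is a hypothetical helper.
(defun my-parse-metadata-drawer (text)
  "Return an alist of (KEY . VALUE) strings from a commented drawer in TEXT.
Only lines between \";; :PROPERTIES:\" and \";; :END:\" are considered."
  (let ((in-drawer nil)
        (result nil))
    (dolist (line (split-string text "\n"))
      (cond
       ((string-match-p "\\`;;[ \t]*:PROPERTIES:[ \t]*\\'" line)
        (setq in-drawer t))
       ((string-match-p "\\`;;[ \t]*:END:[ \t]*\\'" line)
        (setq in-drawer nil))
       ((and in-drawer
             (string-match
              "\\`;;[ \t]*:\\([^: \t]+\\):[ \t]*\\(.*?\\)[ \t]*\\'" line))
        (push (cons (match-string 1 line) (match-string 2 line))
              result))))
    (nreverse result)))

(my-parse-metadata-drawer
 (concat ";; ** MetaData\n"
         ";;   :PROPERTIES:\n"
         ";;   :version:  0.9\n"
         ";;   :licence:  GPL3+\n"
         ";;   :END:\n"))
;; => (("version" . "0.9") ("licence" . "GPL3+"))
```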

  With establishing a few rules for this MetaData section (about how to
  format author names and emails e.g., especially in the case of
  multiple authors), the relevant parts of the comment-header of Elisp
  files could really be converted into meta-DATA, human and
  machine-readable at the same time [2].


  [editing properties]
  http://orgmode.org/manual/Properties-and-Columns.html

  [column view] http://orgmode.org/manual/Column-view.html#Column-view

  [Property API]
  http://orgmode.org/manual/Using-the-property-API.html#Using-the-property-API


3 Summary
=========

  Being an Org-mode user and looking at Elisp libraries from the
  Org-mode user's perspective every day (via /outshine/ and /navi-mode/),
  I recognized quite a few weaknesses of the oldschool conventions for
  structuring Emacs Lisp files. Therefore I propose to partially fix
  these weaknesses for the oldschool libraries and, at the same time,
  introduce alternative modern conventions (the /outshine/ way) for
  structuring Elisp libraries. Once the fixes and modern conventions
  have been agreed upon, an Elisp library (modern-conventions.el) could
  be written that converts one style into the other, making them
  equivalent and thus both acceptable for official and unofficial
  package repos.



Footnotes
_________

[1] especially the very personal .emacs files show a great level of
sometimes almost artistic creativity with comment chars.

[2] in Org-mode there is the :ARCHIVE: tag that, among other things,
keeps headlines folded during global visibility cycling. It is planned
to implement this for outshine.el too, such that comment- and
meta-data sections can be tagged and stay completely out of the way of
the programmer's view on the source-buffer, unless they are explicitly
unfolded.


--
cheers,
Thorsten