GSoC project "Hyphenation"?
Tue, 27 Mar 2012 16:01:30 +0000
Time and time again I have searched for "Emacs" and
"hyphenation", and so few results came up that I looked up
"hyphenation" again to make sure that I hadn't misspelled it.
It seems that it is not a feature often asked for, as the
typical workflow of text processing in Emacs usually
involves TeX or something similar, but I often find myself
needing to hyphenate texts like mails or the output of
console programs. With Google Summer of Code around, I'd
like to propose the following idea, "Hyphenation in GNU
Emacs":
1. Research, define and qualify "use cases"
Where in the Emacs world could hyphenation be used, where
must it not be and how would it be used in a typical
workflow? For example, in TeX documents or program
sources, automatic hyphenation is probably only useful in
comments if at all. In text modes, paragraphs are writ-
ten, filled, edited, refilled, killed, yanked, etc. In
HTML and other languages, it might be useful to add soft
hyphens to individual or all words. In all modes, it
might be handy to show possible hyphenations for the word
at point.
These use cases can be ordered according to their (pos-
itive) effect on user productivity and difficulty of im-
plementation. At this stage the mentor would decide
which of these use cases would have to be implemented as
part of this project.
2. Research and define a high-level interface and syntax
Based on the use cases, how would the user specify the
hyphenation "locale" wanted? How does that relate to
other language-specific customizations? How would edit-
ing and filling functions query the hyphenation of a par-
ticular word? How would automatically hyphenated words
be marked up in buffers and on disk?
3. Implement a dummy backend and set up tests
Compile a list of hyphenated words from free sources and
implement a backend that uses them. Set up a test suite
that compares the results generated by other backends
against this word list.
4. Implement the frontend
This involves amending the editing and filling functions
so that the use cases identified in 1. can be fulfilled
with the limited word list of the dummy backend. This
would also serve as the mid-term evaluation point.
5. Identify possible backends, their (legal) compatibility
with GNU Emacs and implement them
5.1. One of the most often used algorithms is the one de-
veloped by Franklin Mark Liang and implemented for
TeX. While there are implementations even in GNU
Emacs Lisp, the licence of the accompanying pattern
files is often a topic of discussion, so that for
example Apache FOP outsourced them to a separate
package.
a) Work out with the FSF whether, and in which form,
pattern files can be included. As groff does this,
I am confident that this path can be followed.
Port/review and adjust an implementation of Liang's
algorithm and extend the Emacs build system with
targets that import the pattern files and convert
them to GNU Emacs Lisp.
b) If they cannot be included, define a user inter-
face with sensible defaults that point to their
location elsewhere. Candidates are installations
of (La)TeX and the aforementioned "FOP XML Hy-
phenation Patterns". Implement a reader.
5.2. There are other backends that implement other
algorithms or wrap Liang's in a different form.
Research whether they are popular (enough) and
optionally implement a connector. If 5.1. is legally
feasible, this would be an add-on.
6. Test the system and fix the bugs.
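
To make step 3 a bit more concrete, here is a minimal sketch of
a word-list backend and a comparison helper. It is written in
Python only for brevity; the project itself would be Emacs Lisp.
All names (WORD_LIST, DummyBackend, break_points,
compare_backends) and the three sample entries are assumptions
made up for illustration, not part of any existing package.

```python
# Hypothetical sample entries; a real list would be compiled
# from free sources as described in step 3.
WORD_LIST = ["hy-phen-ation", "al-go-rithm", "pack-age"]


def build_table(entries):
    """Map each lowercased word to the offsets where it may break."""
    table = {}
    for entry in entries:
        parts = entry.split("-")
        offsets, pos = [], 0
        for part in parts[:-1]:
            pos += len(part)
            offsets.append(pos)
        table["".join(parts).lower()] = offsets
    return table


class DummyBackend:
    """Looks words up in a fixed table; unknown words never break."""

    def __init__(self, entries):
        self._table = build_table(entries)

    def break_points(self, word):
        return list(self._table.get(word.lower(), []))


def compare_backends(reference, candidate, words):
    """Return the words on which candidate disagrees with reference."""
    return [w for w in words
            if candidate.break_points(w) != reference.break_points(w)]
```

Any later backend that offers the same break_points method could
be passed as the candidate, with the dummy backend as reference.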
Completion criteria would be that:
- at least the use cases selected by the mentor in
1. would be implemented with a non-dummy backend,
- the source is documented to a degree that a third
person who is familiar with hyphenation/the chosen
algorithm understands the code so that it can be
maintained, and
- no existing functionality has been broken :-).
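
For readers unfamiliar with the pattern-based approach mentioned
in 5.1, the following is a rough sketch of how Liang-style
patterns can be applied to a word. It is in Python for brevity
and is not taken from any existing implementation. The class and
function names are hypothetical, TOY_PATTERNS is a tiny
illustrative set in TeX's pattern notation that only covers one
word, and the margins of 2 and 3 letters are assumed defaults.

```python
# Illustrative toy set only; real pattern files contain thousands
# of entries and are what the licence question in 5.1 is about.
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at",
                "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split e.g. 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = "".join(ch for ch in pattern if not ch.isdigit())
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight sits before letter pos
        else:
            pos += 1
    return letters, weights


class LiangHyphenator:
    def __init__(self, patterns, left_min=2, right_min=3):
        self._patterns = dict(parse_pattern(p) for p in patterns)
        self._max_len = max((len(k) for k in self._patterns), default=0)
        self._left_min = left_min
        self._right_min = right_min

    def break_points(self, word):
        """Return offsets into word where a hyphen is allowed."""
        padded = "." + word.lower() + "."
        # scores[k] is the weight of the gap before padded[k].
        scores = [0] * (len(padded) + 1)
        for start in range(len(padded)):
            last = min(len(padded), start + self._max_len)
            for end in range(start + 1, last + 1):
                weights = self._patterns.get(padded[start:end])
                if weights:
                    for i, w in enumerate(weights):
                        scores[start + i] = max(scores[start + i], w)
        # Odd weights permit a break; the leading dot shifts by one.
        return [n for n in range(self._left_min,
                                 len(word) - self._right_min + 1)
                if scores[n + 1] % 2 == 1]


def insert_hyphens(word, points, hyphen="-"):
    """Join the pieces of word split at points with hyphen."""
    pieces, prev = [], 0
    for p in points:
        pieces.append(word[prev:p])
        prev = p
    pieces.append(word[prev:])
    return hyphen.join(pieces)
```

With the toy set, "hyphenation" comes out as "hy-phen-ation".
Passing "\u00ad" as the hyphen argument would give the soft-hyphen
variant mentioned for HTML in 1.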
As the project is aimed at users and Emacs developers
apparently didn't care enough about hyphenation to implement
it themselves :-), I'd plan to code the project in the early
stages as a separate package that would advise the relevant
core functions so that it could be tested by users running a
regular release, and only integrate it into the regular code
late in the game.
Comments or sentiments?
Tim Landscheidt