emacs-devel

GSoC project "Hyphenation"?


From: Tim Landscheidt
Subject: GSoC project "Hyphenation"?
Date: Tue, 27 Mar 2012 16:01:30 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

Hi,

time and time again I have searched for "Emacs" and "hyphen-
ation", and so few results came up that I looked up "hy-
phenation" again to make sure that I hadn't misspelled it.
This does not seem to be a frequently requested feature, as
the typical text-processing workflow in Emacs usually in-
volves TeX or something similar, but I often find myself
needing to hyphenate texts such as mails or the output of
console programs.  With Google Summer of Code around, I'd
like to propose the following idea, "Hyphenation in GNU
Emacs":

1. Research, define and qualify "use cases"

   Where in the Emacs world could hyphenation be used, where
   must it not be and how would it be used in a typical
   workflow?  For example, in TeX documents or program
   sources, automatic hyphenation is probably only useful in
   comments if at all.  In text modes, paragraphs are writ-
   ten, filled, edited, refilled, killed, yanked, etc.  In
   HTML and other languages, it might be useful to add soft
   hyphens to individual or all words.  In all modes, it
   might be handy to show possible hyphenations for the word
   at point.

     These use cases can be ordered according to their (pos-
   itive) effect on user productivity and difficulty of im-
   plementation.  At this stage the mentor would decide
   which of these use cases would have to be implemented as
   part of this project.
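
   Two of the use cases above (adding soft hyphens and showing
   the possible hyphenations of a word) come down to the same
   operation once the break points are known.  A minimal
   sketch, in Python purely for brevity; the function name and
   the representation of break points as character offsets are
   assumptions made for illustration, not part of the proposal:

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN

def mark_breaks(word, points, marker="-"):
    """Insert MARKER into WORD at every offset listed in POINTS.

    POINTS are offsets into the unhyphenated word (hypothetical
    representation); offset 2 means "a break is allowed before word[2]".
    """
    pieces = []
    previous = 0
    for p in sorted(points):
        pieces.append(word[previous:p])
        previous = p
    pieces.append(word[previous:])
    return marker.join(pieces)

# Display use case: show the possible hyphenations of a word.
shown = mark_breaks("hyphenation", [2, 6])

# HTML use case: the same break points, marked with soft hyphens.
html_ready = mark_breaks("hyphenation", [2, 6], SOFT_HYPHEN)
```

   The break points used here follow the way "hyphenation" is
   split at the top of this mail.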

2. Research and define a high-level interface and syntax

   Based on the use cases, how would the user specify the
   hyphenation "locale" wanted?  How does that relate to
   other language-specific customizations?  How would edit-
   ing and filling functions query the hyphenation of a par-
   ticular word?  How would automatically hyphenated words
   be marked up in buffers and on disk?

3. Implement a dummy backend and set up tests

   Compile a list of hyphenated words from free sources and
   implement a backend that uses them.  Set up a test suite
   that compares the results generated by other backends
   with this.
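
   The dummy backend and the comparison could look roughly
   like the following.  This is only a sketch under stated
   assumptions: it is written in Python for brevity, the three
   entries merely reuse splits that occur in this mail, and all
   names (WORD_LIST, dummy_backend, compare_backends) are
   hypothetical.

```python
# Hypothetical word list; a real one would be compiled from free sources.
WORD_LIST = {
    "hyphenation": "hy-phen-ation",
    "algorithm": "algo-rithm",
    "package": "pack-age",
}

def break_points(hyphenated):
    """Return the offsets in the unhyphenated word where a break is allowed."""
    points = []
    offset = 0
    for ch in hyphenated:
        if ch == "-":
            points.append(offset)
        else:
            offset += 1
    return points

def dummy_backend(word):
    """Look WORD up in the list; unknown words get no break points."""
    entry = WORD_LIST.get(word.lower())
    return break_points(entry) if entry else []

def compare_backends(reference, candidate, words):
    """Return (word, expected, actual) for every word where the backends differ."""
    mismatches = []
    for w in words:
        expected, actual = reference(w), candidate(w)
        if expected != actual:
            mismatches.append((w, expected, actual))
    return mismatches
```

   A test suite would then call compare_backends with the
   dummy backend as the reference and each real backend as the
   candidate, and report the returned mismatches.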

4. Implement the frontend

   This involves amending the editing and filling functions
   so that the use cases identified in 1. can be fulfilled
   with the limited word list of the dummy backend.  This
   would also serve as the mid-term evaluation point.
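
   To illustrate what "amending the filling functions" means,
   here is a toy greedy filler that asks a backend for break
   points when a word does not fit.  It is a sketch in Python,
   not a proposal for how the Emacs filling code should be
   changed; the function name, the callback signature and the
   one-word backend in the example are assumptions.

```python
def fill(text, width, breaks):
    """Greedily fill TEXT to WIDTH columns.

    BREAKS is a callback (hypothetical interface) that maps a word to
    the list of offsets where it may be split.
    """
    lines, current = [], ""
    for word in text.split():
        while True:
            sep = " " if current else ""
            if len(current) + len(sep) + len(word) <= width:
                current += sep + word
                break
            # Room left for the head of the word, reserving one column
            # for the hyphen itself.
            room = width - len(current) - len(sep) - 1
            fitting = [p for p in breaks(word) if p <= room]
            if fitting:
                p = max(fitting)
                lines.append(current + sep + word[:p] + "-")
                current, word = "", word[p:]
            elif current:
                # Nothing fits: close the line and retry on an empty one.
                lines.append(current)
                current = ""
            else:
                # Overlong word without a usable break: keep it whole.
                current = word
                break
    if current:
        lines.append(current)
    return lines

# Example with a one-word backend (assumed break points).
toy_breaks = lambda w: {"hyphenate": [2, 6]}.get(w, [])
filled = fill("need to hyphenate texts", 12, toy_breaks)
```

   Note that the sketch queries the backend again for the tail
   of a split word, which a word-list backend will not know; a
   real frontend would have to carry the remaining break points
   over to the next line.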

5. Identify possible backends, their (legal) compatibility
   with GNU Emacs and implement them

   5.1. One of the most often used algorithms is the one de-
        veloped by Franklin Mark Liang and implemented for
        TeX.  While there are implementations even in GNU
        Emacs Lisp, the licence of the accompanying pattern
        files is often a topic of discussion so that for ex-
        ample Apache FOP outsourced them to a separate pack-
        age.

        a) Work out with FSF whether and how pattern files
           can be included in which form.  As groff does
           this, I am confident that this path can be fol-
           lowed.  Port/review and adjust an implementation
           of Liang's algorithm and enhance the Emacs build
           system by targets that import the pattern files
           and convert them to GNU Emacs Lisp.

        b) If they cannot be included, define a user inter-
           face with sensible defaults that point to their
           location elsewhere.  Candidates are installations
           of (La)TeX and the aforementioned "FOP XML Hy-
           phenation Patterns".  Implement a reader.

   5.2. There are other backends that implement other algo-
        rithms or wrap Liang's in a different form.  Re-
        search whether they are popular (enough) and option-
        ally implement a connector.  If 5.1. is legally fea-
        sible, this would be an add-on.
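
   For readers unfamiliar with Liang's algorithm mentioned in
   5.1.: patterns are letter sequences with digits between the
   letters; all patterns matching a word are overlaid, the
   highest digit wins at each position, and odd values allow a
   break.  A compact sketch in Python follows.  The four
   patterns are made up for this example and are NOT taken from
   any real pattern file; the minimum of 2 letters before and 3
   after a break mirrors TeX's defaults.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'hy1p' into its letters and per-gap scores."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)  # score of the gap before letters[pos]
        else:
            pos += 1
    return letters, scores

def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return the offsets in WORD where Liang-style patterns allow a break."""
    padded = "." + word.lower() + "."  # '.' marks the word boundaries
    levels = [0] * (len(padded) + 1)   # levels[k]: gap before padded[k]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            scores = patterns.get(padded[start:end])
            if scores:
                for i, s in enumerate(scores):
                    levels[start + i] = max(levels[start + i], s)
    # word[j] is padded[j + 1]; an odd level permits a break before it.
    return [j for j in range(left_min, len(word) - right_min + 1)
            if levels[j + 1] % 2 == 1]

# Toy patterns, invented for illustration only.
TOY_PATTERNS = dict(parse_pattern(p) for p in ["hy1p", "n1a", "1ti", "a2t"])
points = hyphenate("hyphenation", TOY_PATTERNS)
```

   Here "1ti" alone would permit a break before the "t", but
   the higher, even score of "a2t" at the same position
   suppresses it, leaving the breaks hy-phen-ation.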

6. Test the system and fix the bugs.

   Completion criteria would be that:

   - at least the use cases selected by the mentor in
     1. would be implemented with a non-dummy backend,

   - the source is documented to a degree that a third per-
     son who is familiar with hyphenation/the chosen algo-
     rithm understands the code so that it can be main-
     tained, and

   - no existing functionality has been broken :-).

As the project is aimed at users, and Emacs developers appar-
ently haven't cared enough about hyphenation to implement it
themselves :-), I'd plan to code the project in the early
stages as a separate package that would advise the relevant
core functions so that it could be tested by users running a
regular release, and only integrate it into the regular code
late in the game.

  Comments or sentiments?

Tim



