idea for Google Summer of Code project: html-reading info

From: Per Bothner
Subject: idea for Google Summer of Code project: html-reading info
Date: Wed, 10 Feb 2016 11:42:15 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

I suggest the texinfo project sponsor the following proposed project
for Google Summer of Code, under the GNU umbrella.   Ideally we'd
want two mentors.  I can be one of them, but it would be good to
have someone familiar with the internals of the info program.

** Enhance GNU info documentation reader to read html files **

Background: GNU documentation is based on the texinfo source format
(similar to markdown, but better for documentation), and tools that can
convert it into various formats, including pdf, html, and info format.
The info format is used for reading the documentation in a terminal
emulator (using the info program) or in the emacs editor.

The info format is used as the primary "distribution format" because it is
readable on any terminal, using the info program, but it doesn't have the
structural information that was in the texinfo source: There is no styling,
and since lines are pre-wrapped, info can't adjust to different terminal widths.

We would like to deprecate the info format as a distribution format,
while still using the texinfo source format and tool-chain.  Instead,
using html as the primary distribution format makes sense; we already
have the tools to generate it.  What is needed is a program that can
read the generated html in a plain terminal, using the efficient
keyboard interface of the info program.  This would also allow use of
terminal features like color and styles, though that is not the
primary goal.

The task is to enhance the existing info program (which is part of the
texinfo package) so it can search for and read either html-format files
or info-format files.  If it finds an html-format file, it needs to
parse the html file and display it more-or-less the same way as if it
had found an info-format file, and respond to keystrokes in the same way.
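
To make the lookup step concrete, here is a minimal sketch in Python
(the real work would be in C, inside the info program).  Everything in
it is an assumption for illustration: the function name find_manual,
the one-file-per-manual layout (NAME.html or NAME.info in each search
directory), and the rule that html wins when both exist.  It is not
how the info program searches today.

```python
from pathlib import Path

def find_manual(name, search_dirs, prefer_html=True):
    """Return (path, format) for the first manual found, or None.

    Hypothetical layout: each directory may hold NAME.html and/or
    NAME.info.  Directories are tried in order; within a directory the
    preferred format is tried first.
    """
    suffixes = [(".html", "html"), (".info", "info")]
    if not prefer_html:
        suffixes.reverse()
    for directory in search_dirs:
        for suffix, fmt in suffixes:
            candidate = Path(directory) / (name + suffix)
            if candidate.is_file():
                return candidate, fmt
    return None
```

The caller would then dispatch on the returned format: parse and
convert for "html", or use the existing code path for "info".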

The task includes stripping out html tags; line-wrapping when
appropriate; handling links; and applying minimal formatting.  The
easiest approach might be for the program to convert on-the-fly each
section ("node") of the html file to plain text similar to the info
format, since the info program already knows how to handle that, and so
you'd need minimal changes to the user interface.  A slight complication
is that one might want to include ansi escape sequences for highlighting
or colors.
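
As a rough illustration of the on-the-fly conversion, the following
Python sketch strips tags from one node's html, re-wraps paragraphs to
a given width, and collects link targets.  It uses only the standard
html.parser and textwrap modules.  The class and function names
(NodeTextExtractor, html_node_to_text) and the choice of block-level
tags are assumptions made for this example; the actual html emitted by
makeinfo would need its own handling, and preformatted text is not
preserved here.

```python
import textwrap
from html.parser import HTMLParser

class NodeTextExtractor(HTMLParser):
    """Collect paragraphs of visible text and link targets from html."""

    # Tags treated as paragraph boundaries (a simplification; text in
    # <pre> has its whitespace collapsed like any other block).
    BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "li", "pre"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []  # finished paragraphs, whitespace-normalized
        self.current = []     # text pieces of the paragraph being built
        self.links = []       # (label, href) pairs in document order
        self._href = None     # href of the <a> currently open, if any
        self._label = []      # text pieces inside the open <a>

    def flush(self):
        """Close the current paragraph, collapsing runs of whitespace."""
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._label = []

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.flush()
        elif tag == "a" and self._href is not None:
            self.links.append(("".join(self._label).strip(), self._href))
            self._href = None

    def handle_data(self, data):
        self.current.append(data)
        if self._href is not None:
            self._label.append(data)

def html_node_to_text(html, width=80):
    """Return (wrapped_text, links) for one node's html."""
    parser = NodeTextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    body = "\n\n".join(textwrap.fill(p, width=width)
                       for p in parser.paragraphs)
    return body, parser.links
```

For example, html_node_to_text('<p>See <a href="Nodes.html">Nodes</a>.</p>')
yields the text "See Nodes." together with the link list
[("Nodes", "Nodes.html")].  Because wrapping happens at display time,
the width can follow the terminal.  Note that textwrap counts every
character, so escape sequences inserted before wrapping would throw
off the line lengths; a real implementation would have to add them
after wrapping or measure widths without them.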

The converter does not need to handle generic html - just the html
generated by the conversion program (makeinfo) from texinfo to html.
Part of the task may be to suggest and maybe implement changes to the
generated html and thus the makeinfo program.
        --Per Bothner
address@hidden   http://per.bothner.com/
