bug#270: texinfo generates invalid html

From: xah lee
Subject: bug#270: texinfo generates invalid html
Date: Sat, 17 May 2008 08:49:04 -0700

The elisp document generated by texinfo in html is not valid html.

Here are the major problems:

Problems with texinfo generated html, with respect to html 4 transitional:

    * there's no doctype declaration.
    * when there's a footnote, it is generated as <p><hr></div> which is invalid.

Problems with respect to html4 strict:

    * “<ol type=1 start=1>” should just be “<ol>”.
    * sometimes there's “</p></blockquote>” but the opening “<p>” is missing.
    * whenever there's a “<b>Common Lisp note:</b>”, it should have a “<p>” wrapped around the block, since it's inside “<blockquote>” and html4strict requires it.
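Two of the items above (the missing doctype and the “<ol type=1 start=1>”) are mechanical enough to patch after the fact. Here is a rough Python sketch of such a post-processing step. The function name patch_page is made up for illustration, the choice of the transitional doctype is my assumption, and this is a workaround on the generated pages, not a fix to texinfo itself:

```python
import re

# Assumption: HTML 4.01 Transitional is the doctype wanted for these pages.
DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
           '"http://www.w3.org/TR/html4/loose.dtd">')

def patch_page(html):
    """Hypothetical helper: patch one generated page held in a string."""
    # Prepend a doctype only when the page does not already start with one.
    if not html.lstrip().lower().startswith("<!doctype"):
        html = DOCTYPE + "\n" + html
    # Reduce the redundant list attributes to a plain <ol>.
    html = re.sub(r"<ol\s+type=1\s+start=1>", "<ol>", html)
    return html
```

Running it twice on the same page gives the same result as running it once, so it should be safe to apply to a whole directory of pages.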

Other minor problems:

    * the css is plastered into every page. It should be one css file instead.
    * it should declare utf8 as the charset. (so that it doesn't need to do a lot of html character encoding)
    * the ending </p> is often not used.

Dead Links to external docs

In the elisp manual (one node per html page, roughly 850 html pages), there are 70 (local) links to other GNU documents. The local links are nice in that they provide cross-reference, but if one hosts only the elisp doc, all these local links will be dead.

Therefore, it would be nice to have, perhaps at the texinfo level, markers embedded on links that cross-reference external docs, or perhaps at the html conversion level, an option to filter local links, so that local links can be replaced with non-links (such as “See Emacs manual node on Abbrev”) or with full http links to the right uri at gnu.org.
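To illustrate the html-level filtering idea, here is a minimal Python sketch. It assumes the cross-manual links look like “../emacs/Abbrevs.html#Abbrevs” (a sibling directory per manual), and the BASE uri is only a guess at where the manuals live at gnu.org; both the uri and the function name are hypothetical and would need checking against the real pages:

```python
import re

# Assumption: every manual lives under this prefix; the real uris may differ.
BASE = "http://www.gnu.org/software/emacs/manual/html_node/"

# Assumption: cross-manual links have the form <a href="../MANUAL/NODE.html#ANCHOR">.
LINK = re.compile(r'<a href="\.\./([^/"]+)/([^"]+)">(.*?)</a>', re.S)

def rewrite_external_links(html, absolute=True):
    """Turn local cross-manual links into full http links, or into plain text."""
    def repl(m):
        manual, target, text = m.groups()
        if absolute:
            return '<a href="%s%s/%s">%s</a>' % (BASE, manual, target, text)
        # Non-link variant: keep only the link text.
        return text
    return LINK.sub(repl, html)
```

Links within the same manual (such as href="Lists.html") do not match the pattern and are left alone.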

Use of ascii...

texinfo still uses the convention of backtick ` and straight single quote ' to emulate curly ones “” and ‘’, and other ascii kludges such as “=>” instead of “⇒”. The ability to display these chars has been widely available on commercial platforms since the mid 1990s, and on linuxes since about 2003 or so (emacs itself has supported unicode to a practical degree since emacs 21, released in 2001). It is perhaps time to update the gnu doc convention to utf8 and use the proper characters.
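As a rough illustration of the substitution meant here, a Python sketch follows. The function name is made up, and it is deliberately naive: it would also rewrite backticks and “=>” inside code examples, where the ascii characters must stay, so a real converter would need to know about the markup:

```python
import re

def ascii_to_unicode(text):
    """Naive sketch: replace ascii quote emulation and => with unicode chars."""
    # ``word'' becomes “word” (done first, so it is not taken for single quotes)
    text = re.sub(r"``([^`'\n]+)''", "\u201c\\1\u201d", text)
    # `word' becomes ‘word’
    text = re.sub(r"`([^`'\n]+)'", "\u2018\\1\u2019", text)
    # => becomes ⇒
    return text.replace("=>", "\u21d2")
```

For example, the text "`car' => ``list''" comes out as "‘car’ ⇒ “list”".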


The HTML generated by texinfo is actually far superior to that of other orgs, such as perl, python, java, in the sense that when sending the html to w3c's validator, texinfo's html actually contains just a few errors, all of them fixable. Others, such as python's (which was generated from TeX), are so messy that they are not fixable.

Kudos to the texinfo developer(s).

PS I had problems with the quality of FSF's documentation uris. Namely, sometimes a doc's uri disappears, so that people cannot reliably link to it. Also, some links in the docs are dead due to the transformation scheme of links from texinfo to uri. For details, see: http://xahlee.org/emacs/gnu_doc.html (warning: rant)

∑ http://xahlee.org/
