[help-texinfo] Re: texinfo.xsl not working

From: Torsten Bronger
Subject: [help-texinfo] Re: texinfo.xsl not working
Date: Thu, 02 Jun 2005 18:44:09 +0200
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3.50 (gnu/linux)


"R. Mattes" <address@hidden> writes:

> On Thu, 02 Jun 2005 09:46:52 +0200, Torsten Bronger wrote:
>> [...]
>> Now we are talking about HTML output of makeinfo?
> No, all of my mail refers to the transformation of makeinfo's XML
> output by means of the xsl stylesheet in the texinfo distribution.

Sorry, I misremembered then.  I thought it was XSL (without "T").

>>> I'm not too happy with the current XML output: I'd expect an
>>> encoding attribute in the XML declaration (or, even better, a
>>> command-line switch to configure the encoding attribute). I really
>>> don't think there's a place for all these special entities that
>>> duplicate characters easily available in ISO-8859-1 and Unicode.
>> All this is not necessary in my opinion, because UTF-8 is the
>> default and makeinfo's ASCII output is a proper subset of it.
> What are you talking about?

You said you'd like to have an encoding attribute, I say that this
is not necessary because the default encoding (according to the XML
specs) is UTF-8 anyway.

You said you want to get rid of entities, I say they don't hurt
since XML is an intermediate format.

> Makeinfo unfortunately does not enforce any restrictions on the
> character encoding. Any 8-bit character in the Texinfo source gets
> passed into the XML output as is.

If @documentencoding is ignored at the moment, this is a bug, but a
completely separate issue.  Eventually makeinfo's XML will contain
either UTF-8 or entities (or both), whichever is easier to
implement.
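
To make the encoding point concrete, here is a small sketch using
Python's standard xml.etree.ElementTree (my choice for illustration,
not part of Texinfo).  The element name "para" and the sample text are
made up.  It shows that a parser assumes UTF-8 when there is no
encoding declaration, so raw ISO-8859-1 bytes passed through unchanged
are rejected unless the declaration names the encoding.

```python
import xml.etree.ElementTree as ET

# The same text encoded two ways: "é" is two bytes in UTF-8 and one
# byte (0xE9) in ISO-8859-1.
utf8_doc = "<para>caf\u00e9</para>".encode("utf-8")
latin1_doc = "<para>caf\u00e9</para>".encode("iso-8859-1")


def try_parse(data: bytes) -> str:
    """Return the element text, or a short marker if parsing fails."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return "<not well-formed>"


# No XML declaration: the parser assumes UTF-8.
print(try_parse(utf8_doc))
print(try_parse(latin1_doc))

# The same Latin-1 bytes are accepted once the encoding is declared.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_doc
print(try_parse(declared))
```

So pure ASCII output (with entities or character references) is safe
either way, while undeclared 8-bit characters are not.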

>> Besides, the XML output is not intended for human eyes and XML
>> parsers don't care at all.
> So they say. As i already wrote the use of character entities
> (besides the ones defined in the XML standard) requires parsers to
> be able to read (and find!) the defining DTD.

Okay, but above you gave the impression that you wanted to get rid
of entities in favour of direct encoding, and now you just object to
*named* entities.  I agree that they should be replaced in the
future.  On the other hand, this issue could also be solved by
copying the entity definitions into the XML document.
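
As a rough sketch of what copying the definitions into the document
could look like: the declarations go into the internal DTD subset, so
the parser no longer has to find an external DTD.  The example uses
Python's standard xml.etree.ElementTree purely for illustration; the
root element "texinfo", the "para" element and the sample text are
assumptions, and only two of the entities are shown.

```python
import xml.etree.ElementTree as ET

# Entity definitions carried inside the document (internal DTD subset).
with_subset = """<!DOCTYPE texinfo [
  <!ENTITY copyright "&#xa9;">
  <!ENTITY mdash     "&#x2014;">
]>
<texinfo><para>&copyright; 2005 &mdash; example</para></texinfo>"""

# The same kind of reference with no definition available to the parser.
without_subset = "<texinfo><para>&copyright; 2005</para></texinfo>"

root = ET.fromstring(with_subset)
text = root.find("para").text
print(repr(text))

try:
    ET.fromstring(without_subset)
    undefined_ok = True
except ET.ParseError:
    # A named entity without a visible definition stops the parse.
    undefined_ok = False
print(undefined_ok)
```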

> Given that all except two (sic!) entities are just named
> character entities, that looks like extreme overkill.

Well, for testing and development work, these entities are very
convenient.  And by the way, XML is overkill anyway.  ;-)

> [...]  Note: this is _not_ HTML 1.0 that inherited from SGML the
> requirement to be 7bit clean in certain environments.
>  Argh, i just had a look at the texinfo.dtd:
>  ....
>  <!ENTITY ellipsis   ""> 
>  [...]
>  <!ENTITY eosperiod  "">
>  ...
> VERY smart, indeed!

Not that stupid, though.  This is what it looks like on my hard disk
at the moment:

<!ENTITY ellipsis   "&#x2026;">
<!ENTITY lt         "&#x3c;">
<!ENTITY gt         "&#x3e;">
<!ENTITY bullet     "&#x2022;">
<!ENTITY copyright  "&#xa9;">
<!ENTITY registered "&#xae;">
<!ENTITY euro       "&#x20ac;">
<!ENTITY pounds     "&#xa3;">
<!ENTITY minus      "&#x2212;">
<!ENTITY linebreak  "<linebreak/>">
<!ENTITY space      " ">          <!-- Should become an element. -->
<!ENTITY dots       "<punct end-of-sentence='no'>&#x2026;</punct>">
<!ENTITY enddots    "<punct end-of-sentence='yes'>&#x2026;</punct>">
<!ENTITY amp        "&#x26;">
<!ENTITY ldquo      "&#x201c;">
<!ENTITY rdquo      "&#x201d;">
<!ENTITY mdash      "&#x2014;">
<!ENTITY ndash      "&#x2013;">
<!ENTITY period     "<punct end-of-sentence='no'>.</punct>">
<!ENTITY eosperiod  "<punct end-of-sentence='yes'>.</punct>">
<!ENTITY quest      "<punct end-of-sentence='no'>?</punct>">
<!ENTITY eosquest   "<punct end-of-sentence='yes'>?</punct>">
<!ENTITY excl       "<punct end-of-sentence='no'>!</punct>">
<!ENTITY eosexcl    "<punct end-of-sentence='yes'>!</punct>">

You see, I've already played with them intensively.  I couldn't have
done it with the source code, because I haven't understood it.  ;-)
(Probably I never will.)  Besides, the replacements would make the
code even longer.

So, I can ship my Texinfo transformer with *my* DTD, and I use the
entities as hooks for modifying makeinfo's behaviour without forcing
my users to recompile it.
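
A minimal sketch of the "entities as hooks" idea, again with Python's
standard xml.etree.ElementTree: the document body stays the same, and
only the entity definition decides whether a sentence-ending period
is plain text or a <punct> element.  The internal subset stands in
for the external DTD here, and the helper name parse_with and the
sample sentence are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# One fixed document body, as a makeinfo-like tool might emit it.
BODY = "<texinfo><para>Done&eosperiod; Next sentence</para></texinfo>"

# Two alternative definitions of the same entity.
PLAIN_DECL = '<!ENTITY eosperiod ".">'
MARKUP_DECL = (
    "<!ENTITY eosperiod \"<punct end-of-sentence='yes'>.</punct>\">"
)


def parse_with(entity_decl: str) -> ET.Element:
    """Parse BODY with the given entity declaration in the internal subset."""
    doc = f"<!DOCTYPE texinfo [ {entity_decl} ]>\n{BODY}"
    return ET.fromstring(doc)


plain = parse_with(PLAIN_DECL).find("para")
marked = parse_with(MARKUP_DECL).find("para")

# With the plain definition the paragraph has no child elements.
print(len(plain), repr(plain.text))

# With the markup definition a <punct> child appears in the tree.
print(marked[0].tag, marked[0].attrib)
```

Swapping the definition changes what a stylesheet sees without
touching the program that produced the body.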

> So, instead of taking advantage of the fact that Unicode has
> codepoints for almost(?) all these characters they just got
> removed from the XML output?

Eventually, yes.  They will look like my replacements above (which,
again, are pure ASCII).

>> [...]
> I'll have a look at the makeinfo sources later, i first
> concentrate on a working XSLT sheet (and fix those broken entities
> ..)

Related, but much more general:

How about putting Texinfo on an XML basis?  makeinfo would produce
info and XML, and anything else is realised with backends in XSLT.
The info readers should be made ready to read the XML so that
eventually the info file format would go.

David Kastrup talked with me about his ideas for a better
foundation for Texinfo a couple of weeks ago, but unfortunately he
had to leave early.

I've read a *long* "new Texinfo format" thread in an Emacs mailing
list archive (IIRC starting regarding ANSI sequences), but nothing


Torsten Bronger, aquisgrana, europa vetus
