emacs-devel

Re: enriched-mode and switching major modes.


From: Oliver Scholz
Subject: Re: enriched-mode and switching major modes.
Date: Tue, 14 Sep 2004 16:41:25 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3.50 (windows-nt)

Richard Stallman <address@hidden> writes:

>     Word
>     processors assign properties to paragraphs, including defaults for
>     character styles (like the font, the weight etc.); and they support
>     style sheets for that.
>
> Why can't we do something like that in Emacs using text properties?
> We could perhaps have a text property on the whole paragraph
> that indirects to a list of default properties, and then have other overriding
> properties on specific characters in the paragraph.
>
> Aside from data format, what would be the difference between a "style
> sheet" and that list of default properties?
>
>     It is very, very hairy to keep paragraphs,
>     their properties and their representation in an Emacs buffer in sync,
>     not to talk about style sheets.  In fact I do think that getting a WP
>     UI right in Emacs is currently impossible.

The impossible part is tables (which I consider to be important).
table.el is a very nifty package for tables in documents that are
basically text/plain; but think of a table where the table cells of a
row contain text with different character and paragraph formatting
properties, for instance, column 1 has text with height 24 pt, column
2 only 12 pt, both in a proportional font and both having the same
line spacing.

The hairy part is whitespace formatting. The problems arise from the
fact that I can't tell Emacs: "Display this text from position POS1 to
POS2 as a paragraph with a left margin of 20 pt and a right margin of
40 pt with 20 pt above and below -- *without* adding any character to
the buffer." I am going to expand a little bit on these difficulties
below with a practical example. Parts of it are probably solvable; I
am not yet sure how reliable those solutions are.

> Since you're saying something negative, I think you should fill in the
> argument for this conclusion.  What methods have you considered?
>
>     Indeed, I believe that in the long run Emacs' display engine should
>     support a real block model.
>
> Could you explain more clearly what you mean by that?

I actually meant "box model". I am thinking of something like what is
specified by CSS 2 or CSS 3
<URL: http://www.w3.org/TR/2002/WD-css3-box-20021024/> (draft).

In short: a box model is an abstract way to specify the formatting of
a piece of character data on screen. Emacs' text-properties (those
affecting the display of text) could be regarded as "inline boxes" in
the terminology of that model, because they do not force the text to
which they apply to be displayed as a block (a "paragraph").

Block boxes are missing. CSS's block box model specifies margins
(between borders and surrounding boxes), borders, padding (between
borders and content area) and content area as the four components of a
block box. In a picture:

+ - - - - - - - - - - - - - - - - - - - - - - - +
               Margin (Top)
|   +--------------------------------------+    |
    |          Padding (Top)               |
|   |   + - - - - - - - - - - - - - - - +  |    |
    |PL |                               |PR|
|ML |          Content Area                | MR |
    |   |                               |  |
|   |                                      |    |
    |   + - - - - - - - - - - - - - - - +  |
|   |          Padding (Bottom)            |    |
    +--------------------------------------+
|                                               |
               Margin (Bottom)
+ - - - - - - - - - - - - - - - - - - - - - - - +


If Emacs' display engine supported this, e.g. as a `block' text
property, then I could write:

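;; Hypothetical: `block' is not an existing text property.
(progn (switch-to-buffer (generate-new-buffer "*tmp*"))
       (insert "Example text. Example paragraph. Example text.")
       ;; Positions 15 to 33 cover "Example paragraph."
       (put-text-property 15 33
                          'block
                          '(:margin (4 1 1 1) :border nil :padding nil)))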

And then the text "Example paragraph." would be displayed as a
paragraph of its own, with a left margin of four canonical character
units and so on.  No newline characters and no spaces for the left
margin would have to be inserted.

Other box types of the CSS include `list-item' for numbered or
bulleted lists or various table-boxes for specifying tables.

I am not bound to this particular model of CSS.  But I do think that
in the long run Emacs' display engine should support a visual
formatting model that is equally powerful.  The reason is that I
envision Emacs-the-Word-Processor as an XML-centric application.
Even non-XML formats like RTF should be parsed into a data structure
that is an instance of the XML infoset (DOM or SXML, probably), so
that users have a nice API for writing extensions to that WP in Emacs
Lisp.

So much for what I mean by "box model".
Now for the more concrete problem of implementing WP functionality for
Emacs with its current capabilities. The difficult part here is the
relation between data structure ("the document"), visual appearance
("the formatting") and user interface.

With text/plain their relation is so simple that we hardly
distinguish them at all.  The visual appearance is determined by
control characters like space and newline, which are part of the
document (i.e. part of the data structure).  The user interface is
also simple: to change the (whitespace) formatting, we just insert
spaces and newlines where appropriate, which in turn become part of
the data structure.  To some extent this also works for
text/enriched.

But it stops working for more elaborate, more widely used and --
IMNSHO -- more interesting document types and document formats.

Consider the following RTF document:

{\rtf1\ansi\deff0
{\fonttbl{\f0\froman Times;}{\f1\fswiss Helvetica;}}
{\stylesheet{\s1\f0\fs24\snext1 Standard;}
{\s2\keepn\f1\sb400\sa200\fs48 Headline;}
{\s3\sbasedon1\i\sb100\sa100\fs20\lin709 Motto;}}
{\*\listtable
{\list\listtemplateid1
{\listlevel\levelnfc23\leveljc0\levelstartat1\levelfollow2
{\leveltext\'01\u8226 ?;}} \listid1}}
{\listoverridetable{\listoverride\listid1\listoverridecount0\ls0}}
{\s2 Lirum larum (A Headline)}
{\par\pard\s3 "Mariage is the chief cause of divorce."}
\par\pard\plain\s1 This is just ordinary {\fs48 paragraph} text.
 Nothing special here.
\par\pard\plain\ls0\ilvl0 This is a list item. It contains two subitems:
\par\pard\plain\ls0\ilvl1 One and
\par\pard\plain\ls0\ilvl1 Two.
\par\pard\plain\ls0\ilvl0 This is another list item.}

A short explanation: Braces group things together. Everything up to
line 10 of the RTF ("{\listoverridetable ...") is header information.
The \fonttbl group specifies the fonts to use in the document. Each
font definition starts with \fN, where N is a decimal number which is
used to refer to that font. The \stylesheet group defines stylesheets.
Here I only define paragraph stylesheets, whose definitions start with
\sN. I define three paragraph styles here, "Standard", "Headline" and
"Motto". For the "Headline" style, for example, this specifies that a
"Headline" paragraph should use the font "Helvetica" (\f1) with a
height of 24 pt (\fs48 -- the unit is half-points), that it should be
preceded by 20 pt of vertical whitespace (\sb400 -- the unit is
"twips", 1/20 pt) and followed by 10 pt of vertical whitespace
(\sa200).  The rest of the header is important for bulleted or
numbered lists; I won't go into details here, because that is a black
art which I have not yet fully mastered myself.
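
To make the arithmetic explicit, here is a minimal sketch of the unit
conversions. The function names are made up for this mail; the units
themselves (half-points for \fsN, twips of 1/20 pt for \sbN and \saN)
are those of RTF.

```elisp
;; Hypothetical helper names, used for illustration only.
(defun my-rtf-half-points-to-pt (n)
  "Convert N half-points (the unit of the fs control word) to points."
  (/ n 2.0))

(defun my-rtf-twips-to-pt (n)
  "Convert N twips (the unit of the sb and sa control words) to points."
  (/ n 20.0))

(my-rtf-half-points-to-pt 48)  ; => 24.0  (font height of "Headline")
(my-rtf-twips-to-pt 400)       ; => 20.0  (space before "Headline")
(my-rtf-twips-to-pt 200)       ; => 10.0  (space after "Headline")
```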

In the document itself \par starts a new paragraph and \sN refers to
a stylesheet.  \lsN\ilvlN is for list-items, again.

A text/plain approximation to the whitespace formatting of the
document (e.g. how it would be rendered on a tty) could look like
this:

---------------------- Start Document --------------------------------
Lirum larum (A headline)

    "Mariage is the chief cause of divorce."

This is just ordinary paragraph text. Nothing special here.
* This is a list item. It contains two subitems:
    1. One and
    2. Two
* This is another list item.
---------------------- End Document ----------------------------------

If Emacs' display engine supported a block model, we would just
tell the display engine how to render the paragraphs. Not a single
newline character and no space between paragraphs would be part of
the character data.  I.e.
`(buffer-substring-no-properties (point-min) (point-max))' would
return:

"Lirum larum (A headline)\"Mariage is the chief cause of divorce.\"\
This is just ordinary paragraph text. Nothing special here. This is\
 a list item. It contains two subitems:One and Two This is another \
list item."

(Note that the bullets and the numbers of the lists are not part of
the character data, either.)
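
For the bullets alone, something in this spirit can already be
approximated with an overlay's `before-string' property, which shows
text on screen without putting it into the buffer. The following is
only a sketch: the function name is made up, and it does nothing about
margins or any of the other block properties.

```elisp
;; Hypothetical helper name, used for illustration only.
(defun my-wp-add-bullet (beg end)
  "Show a bullet before the text from BEG to END without inserting it."
  (let ((ov (make-overlay beg end)))
    (overlay-put ov 'before-string "* ")
    ov))

(with-temp-buffer
  (insert "This is a list item.")
  (my-wp-add-bullet (point-min) (point-max))
  ;; The bullet exists only on display; the character data is unchanged.
  (buffer-substring-no-properties (point-min) (point-max)))
;; => "This is a list item."
```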

Without a block model supported by the display engine, we have to
fake it by inserting newline characters and spaces (probably with a
`display' property) where appropriate.

In this case we would have to make sure that the UI is right. For
instance, a user must not be able to insert characters in a place
where there is no character data.  For example, here:

Lirum larum (A Headline)
-!-
    "Mariage is the chief cause of divorce."


Or here:

* This is a list item. It contains two subitems:
-!- 1. One and
    2. Two.


The UI in typical word processors simply prevents the cursor from
moving to these places.  If the cursor is after "subitems:" and the
user hits <right>, the cursor moves to just before "One".

To get the same effect in Emacs we would have to make everything from
the newline after "subitems:" up to "1." intangible.  For this we
need a specialised fill function.  If we store the paragraph
properties in a text property, then this fill-function would


1) Determine how far the paragraph extends; this could be, for
   instance, all contiguous text with an `eq' paragraph text property.

2) Remove every newline or space character that was inserted
   programmatically by any previous filling.  Those newlines and
   spaces were not entered by the user and she does not want them to
   be part of her document.  They were added to the buffer only for
   visual rendering.

3) Determine the whitespace formatting properties of the paragraph.
   They may be specified via a stylesheet or directly or both (direct
   specification which overrides the defaults of a style sheet).

4) Add newline chars (word wrapping) and spaces (indentation) where
   appropriate to get a visual approximation to the paragraph
   properties specified in step 3). Those programmatically added
   spaces and newlines should probably be marked with a text property
   in order to make them distinguishable in step 2) from spaces that
   were entered by the user.
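
Steps 2) and 4) could be sketched roughly as follows. This is an
illustration under assumptions, not a worked-out fill function: the
names `my-wp-insert-filler', `my-wp-remove-filler' and the
`my-wp-filler' property are made up, and the `cursor-intangible'
property (the successor of `intangible' in newer Emacs versions) only
takes effect while `cursor-intangible-mode' is on.

```elisp
;; Hypothetical names throughout; a sketch of steps 2) and 4) only.
(defun my-wp-insert-filler (string)
  "Insert STRING at point, tagged as formatting-only whitespace."
  (insert (propertize string
                      'my-wp-filler t        ; lets step 2) find it again
                      'cursor-intangible t   ; keep the cursor out of it
                      'rear-nonsticky t)))   ; typed text does not inherit

(defun my-wp-remove-filler (beg end)
  "Delete every character between BEG and END tagged `my-wp-filler'."
  (save-excursion
    (let ((end (copy-marker end))   ; a marker follows the deletions
          (pos beg))
      (while (setq pos (text-property-any pos end 'my-wp-filler t))
        (delete-region pos (next-single-property-change
                            pos 'my-wp-filler nil end)))
      (set-marker end nil))))

(with-temp-buffer
  (insert "It contains two subitems:")
  (my-wp-insert-filler "\n    ")    ; line break plus indentation
  (insert "One and")
  (my-wp-remove-filler (point-min) (point-max))
  (buffer-substring-no-properties (point-min) (point-max)))
;; => "It contains two subitems:One and"
```

Whitespace typed by the user carries no `my-wp-filler' property and
therefore survives the removal.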


So far I have only talked about vertical and horizontal whitespace.
Character formatting information is another issue.  Take for example
this part from the RTF above:

\par\pard\plain\s1 This is just ordinary {\fs48 paragraph} text.

\s1 says: use paragraph stylesheet #1: Font: Times (\f0); font-height:
12 pt (\fs24).  But this default for the paragraph is overridden by
\fs48 for the single word "paragraph", which is meant to be displayed
with a font height of 24 pt. This overrides only the height, however;
all other properties of the stylesheet still apply.

I guess this is best solved by letting font-lock look at the
paragraph properties, resolve all style information and then put a
corresponding anonymous face on the `face' property.
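
A minimal sketch of that resolution step, leaving font-lock out of
the picture and setting `face' directly. The variable and function
names are hypothetical; the attribute values are my reading of the
\s1 and \s2 definitions in the RTF header above (\f0 is Times, \fs24
is 12 pt), and face heights are given in 1/10 pt.

```elisp
;; Hypothetical style sheet: style name -> default face attributes.
(defvar my-wp-stylesheet
  '((standard :family "Times" :height 120)
    (headline :family "Helvetica" :height 240)))

(defun my-wp-resolve-face (style overrides)
  "Return face attributes for STYLE, with OVERRIDES taking precedence."
  ;; Work on a copy so that the style sheet itself is never modified.
  (let ((result (copy-sequence (cdr (assq style my-wp-stylesheet)))))
    (while overrides
      (setq result (plist-put result (car overrides) (cadr overrides)))
      (setq overrides (cddr overrides)))
    result))

(with-temp-buffer
  (insert "This is just ordinary paragraph text.")
  ;; Paragraph default first, then the override for "paragraph" (23-32).
  (put-text-property (point-min) (point-max)
                     'face (my-wp-resolve-face 'standard nil))
  (put-text-property 23 32
                     'face (my-wp-resolve-face 'standard '(:height 240)))
  (get-text-property 23 'face))
;; => (:family "Times" :height 240)
```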

Large parts of a WP may be possible in this or similar ways. Tables,
borders (and border styles), embedded vector graphics and multi-column
text are probably not feasible; but with the exception of tables they
are IMO not /that/ important for now.

However, about one thing I am positive: there is absolutely no room
for a minor mode here.  That's why I say that enriched-mode (as a
minor mode) is a dead end.

    Oliver
--
Oliver Scholz               29 Fructidor an 212 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.



