
Re: enriched-mode and switching major modes.

From: Oliver Scholz
Subject: Re: enriched-mode and switching major modes.
Date: Wed, 22 Sep 2004 12:35:15 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3.50 (windows-nt)

I split my answer to this mail in order to address some issues

Richard Stallman <address@hidden> writes:

[nested blocks]
> What does it *mean* to copy a character from inside environment
> `larum' which is inside environment `lirum' and insert it somewhere
> else?  What should that character look like in its new location?

Two things could make sense here:

* Copy the properties of the /immediate/ containing block.

* Ignore the block formatting properties and copy only normal text

I definitely prefer the second one.  I think it would be the Right
Thing.  If I copy text from an H1 paragraph and insert it into an H2
paragraph, then it should get all character formatting properties that
are specified at the paragraph level from the H2 environment.  But if
the text has /additional/ character formatting properties specified,
for instance because it contains some italic words, those should be
preserved.
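
To illustrate the second option, here is a minimal sketch.  It is
plain Python, not Emacs code, and everything in it is an assumption
made for illustration: the names `Run', `PARAGRAPH_STYLES',
`effective' and `paste_into' are hypothetical, and the property
values are invented.

```python
from dataclasses import dataclass, field

# Hypothetical paragraph stylesheets: the character formatting
# properties that each block type implies for its whole contents.
PARAGRAPH_STYLES = {
    "h1": {"size": 24, "weight": "bold"},
    "h2": {"size": 18, "weight": "bold"},
}


@dataclass
class Run:
    """A stretch of text with its directly specified properties only."""
    text: str
    direct: dict = field(default_factory=dict)


def effective(run, block_type):
    """Paragraph-level properties, overridden by the run's direct ones."""
    props = dict(PARAGRAPH_STYLES[block_type])
    props.update(run.direct)
    return props


def paste_into(block_type, runs):
    """Paste runs into a block: only direct properties travel along."""
    return [(run.text, effective(run, block_type)) for run in runs]


# Text copied from an h1 paragraph; one word carries italic directly.
copied = [Run("Some "), Run("meaningless", {"slant": "italic"}), Run(" heading")]
pasted = paste_into("h2", copied)
```

After pasting into the h2 paragraph, every run takes the h2 size and
weight, and the italic word keeps its italic property in addition.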

>     <h1>Some <i>meaningless</i> heading</h1>
>     The <i> element maps directly to text properties, of course.  But the
>     h1 element both demands that its contents be rendered as a paragraph
>     (a block) /and/ specifies certain character formatting properties for
>     the whole of it, e.g. a large bold font.
>     When encoding a buffer, I need to identify the whole paragraph as
>     being of the type "h1".  I.e. I have to distinguish it from:
>     <p><font size=7><b>Some <i>meaningless</i> heading</b></font></p>
> Why do you have to distinguish them? 

It is about preserving the user's intent.

Word processors as well as the file formats used in word processing
typically provide several ways to apply character formatting
properties on text:

*  paragraph formatting stylesheets
   - RTF: \sN
   - HTML: block elements like h1, h2 ... 

*  character formatting stylesheets
   - RTF: \csN
   - HTML: inline elements like em

*  direct specification of character formatting properties
   - RTF: \fN, \fsN, \b ...
   - HTML: i, b, font ...

The first two provide a layer of indirection which allows the user to
specify her /semantical/ intent for the document text.  Some
users---well, /I/ for example---would prefer /not/ to work with direct
specification of formatting properties at all.

It is a matter of what is the intent that the user has expressed.  Did
she specify "I want this to be a top level headline" or did she
specify "I want this to be large, bold text"?
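
To make the two intents concrete, here is a small sketch in Python.
It is an illustration under assumptions, not a proposal for the actual
data structure: the dictionary layout, the name `appearance' and the
property values are all hypothetical.  Both paragraphs look the same
at first; they diverge once the stylesheet for "h1" is changed.

```python
# Hypothetical stylesheet: block type -> character formatting properties.
stylesheet = {"h1": {"size": 24, "weight": "bold"}}

# "I want this to be a top level headline."
semantic = {"text": "Some heading", "style": "h1", "direct": {}}
# "I want this to be large, bold text."
literal = {"text": "Some heading", "style": None,
           "direct": {"size": 24, "weight": "bold"}}


def appearance(paragraph, sheet):
    """Resolve what is displayed: stylesheet first, direct properties on top."""
    props = dict(sheet.get(paragraph["style"], {}))
    props.update(paragraph["direct"])
    return props


before = (appearance(semantic, stylesheet), appearance(literal, stylesheet))

# The user changes her mind about level 1 headlines.
stylesheet["h1"] = {"size": 30, "weight": "normal"}

after = (appearance(semantic, stylesheet), appearance(literal, stylesheet))
```

Only the paragraph that recorded the intent "h1" follows the new
stylesheet; the directly formatted one stays large and bold.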

The difference will show up when the document is transferred to
another rendering device or when the user changes her mind and changes
the stylesheet for "level 1 headlines".  We have to preserve that
intent of the user in the data structure.  That's why I introduced the
concept of the abstract document and distinguished it from the
appearance.  The abstract document is the aggregation of the user's

Specifying only the appearance ("This should be large, bold text") is
considered bad practice in word processing.  Some users do it this
way; but many, at least most people /I/ know, prefer stylesheets.  If
Emacs failed to preserve the semantical intent, it would get a
very bad reputation as a word processor.  Even worse, we would have to
expect that sophisticated users would recommend /not/ using Emacs for
document exchange.  This must not happen.  Emacs has the potential to
be much better than any existing word processor; I would be very sad
if it turned out to be worse.

> Why wouldn't it work simply to put these properties on the whole
> text of the paragraph?  What aspect would work differently as a
> result of doing one or the other, and why is it better if the
> properties are attached to paragraphs?

When encoding the document, I have to determine the type of a
paragraph, so that the encoded document file conserves the user's
semantical intent.  I have to get that information from somewhere.

If we can guarantee that text properties affecting the paragraph
/always/ cover the whole text of a paragraph, then this is o.k.  When
encoding, I first distinguish the paragraph; then I look at the text
property.  Kim has hinted at some ways of guaranteeing this.  Offhand
I believe that this would work for non-nested paragraphs (blocks).  I
dislike that approach, though, partly because I don't trust its
robustness, partly because it does not scale to handle nested blocks.
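
The encoding step under that approach could look roughly like the
following Python sketch.  It assumes a paragraph is given as a list of
(text, block type) runs, which stands in for text with a
paragraph-level text property; the function name `paragraph_type' is
hypothetical.  The point is that the encoder can only answer if the
guarantee really holds.

```python
def paragraph_type(runs):
    """Return the block type of one paragraph.

    `runs` is a list of (text, block_type) pairs.  The type is only
    well defined if the same block_type covers the whole paragraph;
    otherwise a ValueError is raised.
    """
    types = {block_type for _text, block_type in runs}
    if len(types) != 1:
        raise ValueError(
            "ambiguous paragraph type: %s" % sorted(types, key=str))
    return types.pop()
```

A paragraph whose runs all say "h1" encodes as h1; a paragraph with
mixed values has no single type, and the encoder has nothing sensible
to write.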

This whole affair is partly a UI problem.  The functions that encode
the document must be able to unambiguously determine the type of a
paragraph as well as its other features from the data structure.  But
the user must also get feedback on how her actions affected the
abstract document (as expressed in said data structure):

>     We have to deal with the case that a user deletes the hard newline (if
>     you evaluate the code above: just hit backspace).  Is the resulting
>     paragraph of type `h1' or of type `h2'?
> Why ask the question?  Why not just accept that it's a paragraph
> of partly h1 text and partly h2 text?

In HTML there is no such thing as a paragraph that is partly H1 and H2
text.  What you suggest would result in this:

<h1>lirum larum</h1><h2>lirum larum</h2>

Any user agent (web browser, another word processor) would render this
as two paragraphs (blocks).  But the user in Emacs saw it as a single
paragraph when she saved that document.  Due to the commands she has
issued (maybe accidentally) the data structure treats it as two
separate paragraphs and encodes it accordingly when writing to the
file; but the user does not get any visual feedback on this.  She will
be surprised.  If she knows that things like this can happen, she
may feel the urge to examine the encoded document file before she
transfers it to somebody else.  Eventually she might even stop using
the word processing facilities and edit the raw HTML from the
beginning; or use another word processor.
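
The surprise can be reproduced with a toy encoder.  This is a Python
sketch under assumptions, not the real encoding function: a document
is modelled as a list of paragraphs as the user sees them, each a list
of (text, block type) runs, and the name `encode' is hypothetical.

```python
from html import escape
from itertools import groupby


def encode(paragraphs):
    """Write one HTML block element per stretch of equal block type."""
    out = []
    for runs in paragraphs:
        # Adjacent runs with the same block type form one element.
        for block_type, group in groupby(runs, key=lambda run: run[1]):
            content = "".join(chunk for chunk, _type in group)
            out.append("<%s>%s</%s>" % (block_type, escape(content), block_type))
    return "".join(out)


# Two paragraphs on screen, separated by a hard newline.
two_paragraphs = [[("lirum larum", "h1")], [("lirum larum", "h2")]]
# After the hard newline is deleted: one paragraph on screen.
merged = [[("lirum larum", "h1"), ("lirum larum", "h2")]]

encoded_two = encode(two_paragraphs)
encoded_merged = encode(merged)
```

Both inputs encode to the same file contents, so the single paragraph
the user saw on screen is written out as two blocks.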

Of course, treating "h1" and "h2" always as character formatting types
only would avoid the "one paragraph that suddenly becomes two
paragraphs" effect:

<p><font size=7><b>lirum larum</b></font><font size=6><b>lirum 

But then we fail again to preserve any semantical intent.

Oliver Scholz               Jour de la Révolution de l'Année 212 de la 
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.       
