"special" spaces in Texinfo parsing and output
From: Karl Berry
Subject: "special" spaces in Texinfo parsing and output
Date: Tue, 26 Mar 2013 21:19:02 GMT
(Switching to texinfo-devel)
> I would expect any Unicode space to be
> treated as a space with respect to word and paragraph breaking.
Apparently Unicode agrees with you -- search for "breaking space" in
http://www.unicode.org/reports/tr14; all the Unicode space chars are
deemed breakpoints. That seems quite wrong to me -- as an author, I
would certainly not want a line break at, say, a thin space -- but
Unicode is what it is. Fine.
> Yet treating only [\r\n\t ] as space characters, and everything
> else as non-space characters handled like letters, would simplify my life.
I think that is actually better, because makeinfo is not a display
engine. In practice, makeinfo has never tried to implement full (or
most) Unicode semantics and I don't see any users wanting it, so I see
no problem with just saying "Unicode chars stay as is in utf-8 encoding,
all else is undefined".
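
To make the difference concrete, here is a minimal sketch in Python (not makeinfo's actual Perl code; the function names and the sample string are made up for illustration). It compares splitting on only [\r\n\t ] with splitting on everything Python's str.split() treats as whitespace, which is close to, but not the same as, the TR14 rules:

```python
import re

# The four ASCII characters proposed in the thread as the only "spaces".
ASCII_SPACE_RUN = re.compile(r"[\r\n\t ]+")

def ascii_words(text):
    """Split only on runs of CR, LF, TAB and ASCII space."""
    return [w for w in ASCII_SPACE_RUN.split(text) if w]

def unicode_words(text):
    """Split on anything Python's str.split() considers whitespace."""
    return text.split()

# Hypothetical sample: U+2009 THIN SPACE and U+00A0 NO-BREAK SPACE.
sample = "10\u2009km east\u00a0side"
print(ascii_words(sample))
print(unicode_words(sample))
```

With the ASCII-only rule the thin space and the no-break space stay inside their words, so "10 km" and "east side" can never be split at those characters; the Unicode-aware split yields four separate words.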
> Suppose we have a text with a '* SPACE'. What should be done at the
> end of a line? Could it be replaced by a newline?
I'm sorry, but I don't understand what you mean by '* SPACE'.
Do you mean three characters: an asterisk, a normal ASCII space, and
then an unusual Unicode space character? From the rest of
what you write, I don't think so, but I can't figure it out.
> Not necessarily, there is already some special handling of fullwidth
> East Asian characters,
Sure, I know. But there's a lot more to Unicode line breaking than East
Asian character widths. See the TR cited above. I would prefer that we
*not* implement it. No one is expecting it. I foresee it causing only
trouble to do so.
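
For reference, the kind of fullwidth handling mentioned in the quote can be sketched as follows. This is an illustration in Python using the standard unicodedata module, not what makeinfo does; display_width is a hypothetical helper, and it ignores combining characters and everything else in TR14:

```python
import unicodedata

def display_width(text):
    """Count columns, giving Wide (W) and Fullwidth (F) characters two columns."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )

print(display_width("abc"))
print(display_width("\u65e5\u672c\u8a9e"))  # three CJK ideographs
```

Width counting of this sort only affects where a filled line is considered full; it says nothing about which characters are legal breakpoints, which is the much larger part of the TR.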
> I'd say that we let Perl have its way.
Fine.
k