proposal for inter-word spaces
From: Mike Dowling
Subject: proposal for inter-word spaces
Date: Tue, 13 Feb 96 19:24 MET
>>> On Tue, 13 Feb 1996 10:45:06 +1000, address@hidden (Jeff Kingston) said:
Jeff> The first option would cause Lout to insert one extra space between two
Jeff> objects that are separated by white space characters that include at
Jeff> least one newline, provided that the first object is a word that ends in
Jeff> any one of a certain set of characters which would depend on the current
Jeff> language.
I for one am averse to ending sentences in my source files with newlines
('\n'). The logical structure of my input would then differ from its
intended purpose, as separating sentences with newlines makes each sentence
look like a paragraph. Besides, my editor (emacs) assumes sentences are
delimited by double spaces. I would have to dispense with lots of nice things
that emacs has on offer.
Several people have mentioned TeX's solution. I don't have the TeXbook handy,
but as I recall, the solution is different to what I have seen on this mailing
list. I think it goes something like this.
TeX assumes that a small letter followed by the appropriate punctuation (.?!),
followed by white space AND then a capital letter is a sentence ending, for
which extra space is to be added. TeX knows of several exceptions. If I
remember rightly, TeX knows about Dr., Mr. and Mrs. I don't think TeX knows about
expressions like "Cats and rats etc. I don't like.", which is the sort of
situation in which TeX requires the use of '~'. Such situations are comparatively
rare. (I think that should be "etc.,", but you get the gist.)
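To make the rule as recalled above concrete, here is a minimal Python sketch. It is an illustration of the heuristic described in this message, not TeX's actual algorithm; the function name sentence_end_positions and the ABBREVIATIONS set are hypothetical, and the set holds only the three abbreviations named above.

```python
import re

# Hypothetical exception list: only the abbreviations named in the message.
ABBREVIATIONS = {"Dr.", "Mr.", "Mrs."}

def sentence_end_positions(text, abbreviations=ABBREVIATIONS):
    """Return the index of each punctuation mark judged to end a sentence."""
    ends = []
    # A small letter, then one of . ? !, then white space, then a capital letter.
    for match in re.finditer(r"[a-z]([.?!])\s+(?=[A-Z])", text):
        punct = match.start(1)
        # Walk back to the start of the word that carries the punctuation.
        word_start = punct
        while word_start > 0 and not text[word_start - 1].isspace():
            word_start -= 1
        word = text[word_start:punct + 1]
        if word in abbreviations:
            continue  # known abbreviation: not a sentence end
        ends.append(punct)
    return ends

# "Dr." is skipped; the full stop after "today" is reported.
print(sentence_end_positions("I saw Dr. Kingston today. He was well."))
# "etc." is not in the list, so it is wrongly reported as a sentence end.
print(sentence_end_positions("Cats and rats etc. I don't like."))
```

The second call shows the kind of false positive that would need a manual override such as '~', unless the user adds "etc." to the list.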
It depends on how fussy you are. If you always want to have correct spacing,
then I think there will have to be something akin to TeX's '~', or else, as
Jeff suggests, requiring that all users use something like '\n' to mark the end
of sentences in the source file, which amounts to the same thing, except that
'\n' really changes the appearance of the source file, whereas '~' much less
so. We could also allow for improvements without being perfect.
Two proposals are:
(a) Like TeX, a small letter followed by sentence-ending punctuation (.?!),
white space and a capital letter indicates a new sentence, except after
widely used abbreviations (a list of which could be extended by the user).
(b) The user uses double spaces to indicate the end of sentences as is
currently the case. For the situation in which a sentence in the source
file ends at the end of a line, Lout checks whether the user normally
uses double spaces to terminate sentences, in which case Lout assumes
that a new sentence was intended, and adds extra space. Lout should know
about common (language-dependent) abbreviations, and so should not only not
treat "Dr.\nKingston" as the start of a new sentence, but also try its
damnedest not to break a line between "Dr." and "Kingston" in the Lout
output.
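Proposal (b) could be sketched roughly as follows in Python. This is only an illustration under assumptions: the names uses_double_spaces and newline_sentence_ends are hypothetical, "normally uses double spaces" is interpreted as "more double-spaced than single-spaced sentence gaps within lines", and the abbreviation set is a stand-in for a real language-dependent list. Nothing here reflects how Lout is actually implemented.

```python
import re

# Hypothetical stand-in for a language-dependent abbreviation list.
ABBREVIATIONS = {"Dr.", "Mr.", "Mrs."}

def uses_double_spaces(source):
    """Guess whether the author ends sentences with two spaces within a line."""
    double = len(re.findall(r"[.?!]  (?=\S)", source))
    single = len(re.findall(r"[.?!] (?=\S)", source))
    return double > single

def newline_sentence_ends(source, abbreviations=ABBREVIATIONS):
    """Return the index of each newline to be treated as a sentence end."""
    if not uses_double_spaces(source):
        return []  # author does not mark sentences, so add no extra space
    ends = []
    # A word ending in . ? or ! immediately followed by a newline.
    for match in re.finditer(r"(\S*[.?!])\n", source):
        if match.group(1) not in abbreviations:
            ends.append(match.end(1))  # position of the newline itself
    return ends

src = "First one.  Second one.  Third ends here.\nFourth follows Dr.\nKingston."
# Only the newline after "here." counts; the one after "Dr." is ignored.
print(newline_sentence_ends(src))
```

The sketch does not cover the second half of (b), keeping "Dr." and "Kingston" on the same output line, which would belong to the line breaker rather than to sentence detection.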
If we want to be perfect, then both (a) and (b) would require the user to be
able to tell Lout manually where a sentence ends in rare situations.
The choice (b) has the advantage that it is highly compatible with the current
settings, as anybody who currently wants double spacing is already indicating
this in his source files. It's also language independent, as German users,
for example, are not likely to delimit sentences with double spaces, despite
the advantages accruing from the use of emacs were they to do so.
Choice (b) has the disadvantage that, if you are unlucky enough to have all
your sentences accidentally ending with a '\n', or, indeed, if, like Jeff, you
do this deliberately, then Lout would not be able to tell whether or not you
want double spaces. This could be remedied with a configuration variable, but
then that would have to be language dependent for those of us who are
bilingual.
I think I like choice (b) as it stands, and am easy as to whether or not Lout
should understand TeX's '~' or equivalent for the perfectionists.
Cheers,
Mike Dowling