help-gnu-emacs

Re: How to compare strings?


From: Joost Kremers
Subject: Re: How to compare strings?
Date: 29 Apr 2007 23:06:12 GMT
User-agent: slrn/0.9.8.1 (Linux)

Jesper Harder wrote:
> "Lennart Borgman (gmail)" <lennart.borgman@gmail.com> writes:
>
>> But I think there are completely different problems too. Does not some 
>> languages sort partly depending the phonetics instead of the spelling?
>
> Yes. In Danish 'aa' is alphabetized according to how it's
> pronounced. 
>
> If it is pronounced as two vowels (e.g. ekstraarbejde), it's
> alphabetized as two a's. If it is pronounced as one vowel
> (e.g. afrikaans) is alphabetized as å (the last letter in the Danish
> alphabet).

technically, this is not a case of alphabetising according to pronunciation
(if i understand things correctly; i don't speak danish). when 'aa' is, as
you put it, pronounced as one vowel, it is a digraph, i.e. a combination of
two letters that indicates a single sound.

many languages have digraphs, e.g. english has th, ch, ph and ng, and quite
a few vowel combinations that are pronounced as one vowel (or diphthong);
dutch has quite a few vowel digraphs (with pronunciations that are somewhat
more regular than in english ;-), e.g. oe, eu, ui, au, ou, ei and ij.

in some languages, digraphs are treated as single letters for
alphabetisation. the 'aa' case in danish above is an example. sometimes,
digraphs present particularly interesting problems. in dutch dictionaries,
the digraph ij is treated as two letters, so words starting with ij appear
under i, but in phone books and the like, it's often treated as equivalent
to y, so that names starting with ij appear intermingled with y.
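
to make the two dutch conventions concrete, here is a minimal python sketch.
the function names and the sample names are made up for illustration, and
the phone-book rule is simplified to "replace every ij by y", which would
also hit an i followed by j that is not a digraph (e.g. across a morpheme
boundary). telling those apart needs a word list, not string matching.

```python
def dictionary_key(word):
    # dictionary style: "ij" is just i followed by j, so plain
    # letter-by-letter comparison of the lowercased word is enough
    return word.lower()


def phone_book_key(name):
    # phone-book style (simplified assumption): the digraph "ij"
    # sorts as if it were written "y"
    return name.lower().replace("ij", "y")


# illustrative sample data, not taken from any real directory
names = ["Ypma", "IJzerman", "Idema", "Yzerbyt", "Jansen", "IJsbrand", "Yildiz"]

print(sorted(names, key=dictionary_key))
# ['Idema', 'IJsbrand', 'IJzerman', 'Jansen', 'Yildiz', 'Ypma', 'Yzerbyt']

print(sorted(names, key=phone_book_key))
# ['Idema', 'Jansen', 'Yildiz', 'Ypma', 'IJsbrand', 'Yzerbyt', 'IJzerman']
```

in the second list the ij names end up intermingled with the y names, which
is the phone-book behaviour described above.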

and then there's the case of nahuatl, which has a bunch of consonant
digraphs (ch, cu/uc, hu/uh, qu, tl, tz). dictionaries often (though not
always, there's no "standard" here) have separate sections for words
starting with these digraphs, but otherwise treat them as two separate
letters for alphabetisation within a section. (well, there's of course the
whole issue of roots vs. stems and the fact that cu/uc and hu/uh change
based on their position in the word, but let's not get into that. ;-)
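
the nahuatl dictionary layout can be sketched the same way in python. this
is only an illustration under a few assumptions of mine: the names are
hypothetical, only word-initial digraphs open a section, each digraph
section is placed after the section of its first letter (dictionaries
differ on this), and the uc/uh variants are left out since they don't occur
word-initially in this simple model.

```python
# word-initial consonant digraphs that get their own dictionary section
DIGRAPHS = ("ch", "cu", "hu", "qu", "tl", "tz")


def section(word):
    # the section is the initial digraph if there is one,
    # otherwise just the first letter
    w = word.lower()
    for d in DIGRAPHS:
        if w.startswith(d):
            return d
    return w[:1]


def sort_key(word):
    # first order by section, then letter by letter within the section,
    # so digraphs inside a word count as two separate letters
    return (section(word), word.lower())


# illustrative sample words
words = ["tlalli", "cihuatl", "tzontli", "calli", "chantli",
         "tototl", "coatl", "cuahuitl", "tepetl"]

print(sorted(words))
# ['calli', 'chantli', 'cihuatl', 'coatl', 'cuahuitl',
#  'tepetl', 'tlalli', 'tototl', 'tzontli']

print(sorted(words, key=sort_key))
# ['calli', 'cihuatl', 'coatl', 'chantli', 'cuahuitl',
#  'tepetl', 'tototl', 'tlalli', 'tzontli']
```

with the plain sort, chantli lands between calli and cihuatl. with the
sectioned key, all the c words come first and ch gets its own block.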


-- 
Joost Kremers                                      joostkremers@yahoo.com
Selbst in die Unterwelt dringt durch Spalten Licht
EN:SiS(9)

