Re: Inputting special symbols

From: stk
Subject: Re: Inputting special symbols
Date: Wed, 4 Jan 2006 23:56:28 -0500 (EST)

> . . . If you don't mind, can I ask what code or language is this
> workaround based on? (Like why does one need to input two pairs of
> number combinations to obtain a symbol)

This is based on Unicode, but that doesn't really answer your question, as
Unicode is nothing but a catalogue of thousands of symbols, with each
symbol assigned a "Unicode number", which is just the catalogue-number of
the symbol.
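
To make the "catalogue" idea concrete, here is a small Python sketch (Python is my choice for illustration; it has nothing to do with how LilyPond works internally). The helper name catalogue_entry is made up for this example. It looks up a symbol's Unicode number and its official catalogue name:

```python
import unicodedata

def catalogue_entry(symbol):
    """Return (Unicode number as U+XXXX, official Unicode name) for one character."""
    # ord() gives the catalogue number; unicodedata.name() gives the catalogue name.
    return "U+%04X" % ord(symbol), unicodedata.name(symbol)

print(catalogue_entry("\u00e9"))  # e with acute accent
print(catalogue_entry("\u266d"))  # the musical flat sign
```

Note that nothing here says anything about bytes yet; the number is only the symbol's position in the catalogue.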

Starting from a symbol's Unicode number, one can use an "encoding
algorithm" to produce a sequence of bytes that represents the symbol in a
real text file.  The problem is that, historically, several *different*
encoding algorithms have been invented, and you have to know which one
you are using.  The two currently dominant encoding algorithms are
called UTF-16 and UTF-8.  LilyPond uses UTF-8.
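
As a rough illustration of why the choice of encoding matters, the following Python sketch prints the bytes that the same symbol turns into under UTF-8 and under UTF-16. The helper name encoded_bytes is invented for this example, and I use the "utf-16-le" variant (little-endian, no byte-order mark) purely to keep the output short:

```python
def encoded_bytes(symbol, encoding):
    """Return the bytes for symbol under the given encoding, as spaced hex."""
    return " ".join("%02x" % b for b in symbol.encode(encoding))

for symbol in ("\u00e9", "\u266d"):  # e-acute, musical flat sign
    print("U+%04X" % ord(symbol),
          "| UTF-8:", encoded_bytes(symbol, "utf-8"),
          "| UTF-16:", encoded_bytes(symbol, "utf-16-le"))
```

The same Unicode number comes out as different bytes under each encoding, which is why a file written with one and read with the other shows garbage.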

(To some extent, Microsoft products use "Unicode", but they use UTF-16.
At least that is the case in the old Microsoft software I use, but my
guess is that Microsoft will stick with UTF-16, because if it switched
to UTF-8, then that would invalidate a huge existing repository of Visual
Basic programs.)

A UTF-8 character takes up 1, 2, 3, or 4 bytes.  (The original design
allowed sequences of up to 6 bytes, but the current standard stops at 4.)
That fact alone will tell you that understanding UTF-8 is not easy.
If you really want to know the story, I recommend the following two Web
sites for starters:
On this site, the "Articles" box on the left of the page contains 9 links.
Click on and read the following 3:
This gives more information on UTF-8 and also presents an illuminating
comparison of UTF-8 to UTF-16.
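
For the curious, here is a minimal Python sketch of the UTF-8 encoding rule itself, written out by hand rather than using the built-in encoder. The function name utf8_encode is mine, and this is only an illustration of the bit layout, not production code. It shows how the number of bytes depends on the size of the Unicode number; presumably the symbol you asked about falls in the 2-byte range, which would explain why two numbers were needed.

```python
def utf8_encode(code_point):
    """Encode one Unicode number as UTF-8 bytes (hand-written illustration)."""
    if 0xD800 <= code_point <= 0xDFFF or not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not an encodable Unicode number")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx (plain ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

for cp in (0x41, 0xE9, 0x266D, 0x1D11E):
    print("U+%04X" % cp, "->", len(utf8_encode(cp)), "byte(s)")
```

The leading bits of the first byte announce how many bytes follow, and every continuation byte starts with the bits 10.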

-- Tom
