bug-hurd

Re: console plans


From: Marcus Brinkmann
Subject: Re: console plans
Date: Thu, 21 Feb 2002 16:21:21 +0100
User-agent: Mutt/1.3.27i

On Thu, Feb 14, 2002 at 11:38:30PM +0100, Niels Möller wrote:
> Marcus Brinkmann <Marcus.Brinkmann@ruhr-uni-bochum.de> writes:
> 
> > Esp with the input below, I feel much better about Unicode now (although
> > I would not like to support the whole lot of it right from the
> > start, esp the compose characters A + ["] = Ä and stuff like that).
> 
> Yes, unicode is a lot more complex than one might think before getting
> into the details. In particular combining characters and normalization
> issues.

However, we can just stick with Level 1 support and Normalization Form C.
This should make a lot of people happy without adding a lot of complexity to
the code.  (http://www.cl.cam.ac.uk/~mgk25/unicode.html)
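To make that concrete, here is a rough sketch (not code from the Hurd console) of
what "Level 1 plus Normalization Form C" could mean in practice: normalize to NFC
so that sequences like A + combining diaeresis collapse into a precomposed
character, then check whether any combining marks are left that a Level 1
implementation would not handle.  Python's unicodedata module stands in for
whatever the console would really use, and the function name is made up for
illustration.

```python
import unicodedata

def normalize_for_level1(text):
    """Return (NFC text, list of combining marks NFC could not absorb).

    Illustrative helper only: a Level 1 console would have to drop or
    replace whatever ends up in the second list.
    """
    nfc = unicodedata.normalize("NFC", text)
    # General categories Mn, Mc and Me are the combining marks.
    leftover = [ch for ch in nfc if unicodedata.category(ch).startswith("M")]
    return nfc, leftover

# A + COMBINING DIAERESIS has a precomposed form, so nothing is left over.
composed, rest = normalize_for_level1("A\u0308")

# q + COMBINING DIAERESIS has no precomposed form, so the mark survives NFC.
stuck, rest_q = normalize_for_level1("q\u0308")
```

With this approach the common Western European cases come out as single code
points, and only the leftovers need a policy decision.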

The implementation can grow with the actual support for UTF-8 in the
applications.  This is a process that will likely take a couple of years.

> For the input part, the complexity hits whatever component it is that
> converts unicode or utf8 to a local charset like latin1 (and given the
> current level of support for utf8 in tools like emacs and TeX, I don't
> think eightbit charsets will be abandoned very soon).

Actually, it's quite easy for the input part, as we control it.  We define
the mapping of scan codes to UCS-4 characters, and glibc's iconv will do
the conversion to the local charset.  I don't expect a lot of problems here.
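A minimal sketch of that input path, with loud assumptions: the keymap table
and the scan code values below are invented for illustration, and Python's
built-in codecs stand in for glibc's iconv.  The point is only that once a key
maps to a UCS-4 code point, getting to the local charset is a single conversion
call plus a fallback for characters the charset lacks.

```python
# Hypothetical keymap: scan code -> UCS-4 code point (values are made up).
KEYMAP = {
    0x1E: 0x0061,  # LATIN SMALL LETTER A
    0x28: 0x00E4,  # LATIN SMALL LETTER A WITH DIAERESIS
    0x12: 0x20AC,  # EURO SIGN
}

def key_to_bytes(scan_code, charset):
    """Translate one key press into bytes in the user's local charset."""
    code_point = KEYMAP.get(scan_code)
    if code_point is None:
        return b""  # unmapped key: produce no input
    try:
        # Stand-in for the iconv step (UCS-4 -> local charset).
        return chr(code_point).encode(charset)
    except UnicodeEncodeError:
        return b"?"  # character does not exist in the local charset

latin1_bytes = key_to_bytes(0x28, "latin-1")
utf8_bytes = key_to_bytes(0x28, "utf-8")
```

The same table serves every locale; only the charset argument changes.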

> For output, it hits the component that converts unicode/utf8 to
> glyphs or glyph indices.

Yes.  Again, glibc's iconv will do the mapping from local charset to UCS-4.
Here we would have to deal with normalization and all the other stuff. 
However, for the text console the idea of supporting arbitrary combinations
of combining characters makes me flinch.  It will definitely never happen.
We might support a subset, and maybe we can support some wide characters
(characters composed of two sub-characters) and things like that.  But
probably not much more.
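One way the output side could look, as a sketch under several assumptions:
Python's codecs again stand in for iconv, the "subset" of combining support is
taken to be whatever NFC can fold into a precomposed character, and "wide
characters" is read as characters that occupy two screen cells.  The function
name and the use of None as a filler cell are invented here.

```python
import unicodedata

def to_cells(data, charset):
    """Turn bytes in the local charset into a list of screen cells.

    Each cell holds a code point, or None for the second half of a
    double-width character.
    """
    # Stand-in for the iconv step (local charset -> UCS-4), then NFC.
    text = unicodedata.normalize("NFC", data.decode(charset, errors="replace"))
    cells = []
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            continue  # combining mark NFC could not absorb: not supported, drop it
        cells.append(ord(ch))
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            cells.append(None)  # reserve the neighbouring cell
    return cells

umlaut_cells = to_cells(b"A\xcc\x88", "utf-8")       # A + combining diaeresis
wide_cells = to_cells("\u65e5".encode("utf-8"), "utf-8")  # a double-width CJK character
```

Anything beyond that, such as stacking several marks on one base character,
is deliberately left out.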

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' Debian http://www.debian.org brinkmd@debian.org
Marcus Brinkmann              GNU    http://www.gnu.org    marcus@gnu.org
Marcus.Brinkmann@ruhr-uni-bochum.de
http://www.marcus-brinkmann.de
