Re: [Freefont-bugs] Discussion and questions on Unicode Han Unification

On Thu, Feb 10, 2011 at 9:42 AM, Steve White <address@hidden> wrote:

Hi Ange,

Sorry for the long delay. I can't go into a lot of detail, although I
could say a lot here. I'm just generally a little pressed for time
right now.

Sorry about my own (even longer) delay. I read your answer when it came in but never took the time to reply!
Anyway, if some day you find the time to answer the main questions, about how Unicode works as a decision-making body and why it makes the decisions it does, don't hesitate.

Moreover, another thought occurred to me recently: they regularly standardize icons whose use case seems really doubtful (animals, people in funny costumes, etc.), and they add a ton of them. So I guess that encoding separate kanji for each of the countries using them would indeed take even more space. But I would say that would also be a better use of the space (though I have nothing against it if they really want *also* an icon representing a cake or smiling rabbits or whatever).

At this time, I'll just answer your last (P.S.) question, briefly.

No, there are no immediate plans to support CJK scripts in FreeFont.

The problem has to do with balancing resources against quality. For me
to do the work myself is out of the question: any of these languages
would hugely increase the size of the project. Regarding quality, I have
looked at a lot of free CJK fonts, to see if I could simply drop them in,
but none has satisfied me. To maintain the level of technical quality of
the glyphs that we are trying to achieve with the font would involve a
glyph-by-glyph editing--months of work.

Another consideration is: what is really the point of having these
scripts in a single font file? As I wrote in the article about policy
http://www.gnu.org/software/freefont/articles/Why_Unicode_fonts.html
the point of such a family is a simple means of having mixed writing
systems (and symbols) that look pretty good together in some sense. This
is fairly meaningful for alphabetic scripts, but for mixed alphabetic and
ideogrammatic scripts, I have my doubts.

I see. I don't know if I fully agree, though. I guess you can still have some common style even between very different scripts (stroke width, a curved or square style, a handwritten or typographic style, exotic writing styles, and so on).

But I understand your point of view. Clearly, the need for an 'é' to look like an 'e' is not the same as the need for an alphabetic letter to share a style with an ideogram mixed into the same text.

These days, with font renderers that automatically find characters from
installed fonts, it seems to me to be of less importance.

So doesn't that partly negate your original goal of having a generic font? (OK, you answered this above, so ignore the question. I am teasing.)

Cheers!

Anyway thanks for the answer! :-)

Ange

On Wed, Jan 26, 2011 at 7:52 AM, Ange Gapes <address@hidden> wrote:
> Hello,
>
> sorry, this is not directly about bugs in Freefont, nor about development
> matters, but I could not find a more general mailing list for your project.
> I think this kind of discussion is still of interest. Hopefully you will
> think so too.
>
> I recently became interested in the Han unification project and the problems
> it implies for texts mixing languages. As you are a font project, I guess
> you know the issues, but for those who don't, I summarize them this way:
> for the 3 main languages (Chinese, Japanese, and Korean, hence CJK, though
> the last one doesn't use them much in modern writing) that use
> Chinese-originated characters (Han characters), the Unicode project has
> decided to unify the characters sharing a common origin (Han Unification:
> Unihan). This leads to problems when the actual written form differs
> between countries, sometimes slightly (style), sometimes in a more
> obvious way. The Wikipedia page has good examples of the issue:
> http://en.wikipedia.org/wiki/Unihan#Examples_of_language_dependent_characters
> (this is only meaningful if your computer has the right fonts installed to
> actually show the characters with their differences).
>
> The way it is dealt with is:
> - if you use only one of these languages, you don't care and simply pick
> fonts which display your chosen language's forms.
> - if you read texts in several languages, or even texts mixing them, the
> text can carry some kind of markup so that different fonts are selected.
> This is the way it is done in HTML, hence you can see different fonts for
> what is actually the same Unicode character in the Wikipedia page I linked
> above.
>
> But what about reading a raw text file without markup, for instance? The
> editor has no sure way to tell the language, so the mixed characters won't
> be displayed in their intended forms.
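
A minimal sketch of the point above, in Python. It assumes U+76F4 as the example (a character often cited as having region-dependent shapes), and the helper names code_points and tag_language are made up for illustration. It shows that the plain text holds a single code point whatever the language, and that only surrounding markup such as an HTML lang attribute carries the distinction:

```python
import unicodedata
from html import escape


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def tag_language(text, lang):
    """Wrap text in an HTML span with a lang attribute (hypothetical helper)."""
    return f'<span lang="{escape(lang)}">{escape(text)}</span>'


# Assumed example character; any unified Han character behaves the same way.
han = "\u76f4"

# The raw text is identical whether it was typed as Japanese or Chinese.
print(code_points(han), unicodedata.name(han))

# Only the markup differs, and that is what lets a renderer pick a font.
print(tag_language(han, "ja"))
print(tag_language(han, "zh-Hans"))
```

Strip the markup and the two tagged strings collapse back into the same character, which is the raw-text situation described above.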
>
> So why am I telling you all this? I would like to know your opinion, if not
> your position, on this Unicode decision. Do you have any remarks on it?
> Also, what does it mean for a project like yours? Is it possible within a
> single font family to provide several different fonts/designs for the same
> character with "context" information (= this font is preferably for Chinese
> display only, unless there is no other choice, this one for Japanese, and so
> on) and maybe a default one (in case no context is available, use this
> "generic" design)? That way, software using only your font could still
> display different designs depending on the displayed language (if it knows
> it), or a default version otherwise...
>
> On a side note, I read somewhere that similar problems may arise with some
> other kinds of characters. In particular, I read on a website about Arabic
> characters being used in several countries/languages but displayed slightly
> differently. Yet after some searching, I could not find actual information
> on this specific issue, so I don't know if it is true, or whether the
> Unicode project has since fixed it by assigning specific characters or
> control characters to change the display. (Arabic doesn't have as many
> characters as those East Asian languages, hence less of a space issue when
> duplicating characters.)
> Do you know about such a specific Arabic-character issue? Or other issues
> with other glyphs in other alphabets?
> Do you participate in Unicode standardization? Do you have details on what
> led to this decision internally? Is it really ONLY a space problem? Because
> even though there are certainly a lot of characters in these countries, it
> looks to me like there are still plenty of unassigned slots, more than
> enough (that's how Unicode was designed after all: with enough slots for
> all of history, as far as I know). So I don't see the point of keeping them
> unused for no reason (it's not as if new alphabets of hundreds of thousands
> of brand-new characters will suddenly be created in the next century).
> And in the worst case, Unicode may still be extended.
> So if you have any particularly interesting links to discussions within the
> Unicode project (mailing lists maybe?) about how we came to this, those
> would be welcome as well.
> I will also ask the Unicode people directly later, but I first wanted to
> know the opinion of a font project whose goal is to span all of
> Unicode. What does that imply for you?
>
> And at a second level, why am I asking all this? First of all, I am simply
> interested in Unicode and in such questions, for personal use but also out
> of pure intellectual interest (among other reasons, I am myself involved in
> standardization processes, though not directly in Unicode, for now at
> least). Also because I think this is pretty sad, and when I read about it,
> I didn't much agree with such moves (the prime goal of Unicode was to
> support any existing character, so this looks like a step backwards; and
> also because some countries, Japan at least as far as I know, are not very
> keen on standardization, so they make little use of Unicode encodings like
> UTF-8 and prefer localized encodings, and this kind of move won't make them
> want to change that).
> And also because I am currently beginning to write what-may-become-a-book,
> in some future, not on this in particular, but this kind of topic may be
> part of it.
> So thanks all. Any opinion or information on the topic would be greatly
> appreciated.
>
> Ange
>
> P.S.: and for personal use, a last question: do you plan on supporting these
> East Asian characters in the foreseeable future? In particular Japanese
> Hiragana, Katakana and Kanji, and the basic Korean alphabet?
>

From:	Ange Gapes
Subject:	Re: [Freefont-bugs] Discussion and questions on Unicode Han Unification
Date:	Sat, 23 Apr 2011 00:07:22 +0900