Re: describe-char and unicode data

From: Simon Josefsson
Subject: Re: describe-char and unicode data
Date: Tue, 13 May 2003 08:07:50 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3.50 (gnu/linux)

Richard Stallman <address@hidden> writes:

>       Recipient is granted the right to make copies in any form for
>       internal distribution and to freely use the information
>       supplied in the creation of products supporting the Unicode(TM)
>       Standard. The files in the Unicode Character Database can be
>       redistributed to third parties or other organizations (whether
>       for profit or not) as long as this notice and the disclaimer
>       notice are retained. Information can be extracted from these
>       files and used in documentation or programs, as long as there
>       is an accompanying notice indicating the source.
> Perhaps that last sentence gives us permission to release a free work
> containing the full information, but we had better check that with a
> lawyer first.

Below is the response from Unicode.  Is this sufficient?

Rick also made the following comment:

     Note that if you use any Unihan data, you should look at the UCD
     documentation and pay particular attention to which fields are
     normative and which are not; and to what is "provisional". A lot
     of the data in Unihan.txt is not normative, it is spotty and
     provisional and subject to change and improvement without notice.

So I think anyone working on this should separate the display into a
normative part and a "provisional" part, so the user isn't led to
believe that unchecked data are normative.
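
To make the idea concrete, here is a rough Python sketch of such a split.
It assumes the tab-separated "U+XXXX<TAB>field<TAB>value" layout of
Unihan.txt. The function name and the set of "normative" field names are
placeholders of mine; the real list has to come from the UCD documentation.

```python
from collections import defaultdict

# Placeholder only: which fields are actually normative must be looked up
# in the UCD documentation. This set is assumed purely for illustration.
ASSUMED_NORMATIVE = {"kRSUnicode"}


def split_unihan_fields(lines, normative=ASSUMED_NORMATIVE):
    """Group Unihan-style records into normative and provisional parts.

    Returns {code: {"normative": {field: value},
                    "provisional": {field: value}}}.
    """
    result = defaultdict(lambda: {"normative": {}, "provisional": {}})
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # not a code/field/value record
        code, field, value = parts
        part = "normative" if field in normative else "provisional"
        result[code][part][field] = value
    return dict(result)
```

A display routine could then print the two dictionaries under separate
headings, so the provisional data is always visibly marked as such.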

--- Begin Message ---
Subject: Re: draft-rmcgowan-unicode-procs-02.txt
Date: Mon, 12 May 2003 17:05:02 -0700

Hello Simon --

You asked:

> Has there been any progress?  I notice that the UCD-4.0.0.html says:
> |     Recipient is granted the right to make copies in any form ...

Yes, there has been progress. You can take this as an official response.

We intend for the Unihan database to have the same rights and restrictions
as all of the UCD. Therefore we inserted the revised clause into
UCD-4.0.0.html, intending that it override what is in the Unihan file.
The Unihan database has not yet been updated to a 4.0 version, so 3.2 is
the current one. But the 4.0 UCD clause overrides the older terms in the
Unihan 3.2 file.

When the Unihan database is (soon) updated to a 4.0 or later version, the
clause will be changed in the Unihan database itself to align with the new
intent, and to match the 4.0 UCD.

> In a discussion about adding support for this in the text editor
> application Emacs, Richard Stallman raised the following issue:
> ,----
> |     Unihan.txt> Recipient is granted the right ... to freely use
> |     Unihan.txt> the information supplied in the creation
> |     Unihan.txt> of products supporting Unicode.
> |
> | It is not clear that that gives us permission to transform
> | this data into anything that would be under a free license,
> | but we could ask those who released it whether they meant it
> | to allow that.  Would someone like to contact them and ask?
> |
> `----

Yes, we mean that. If people couldn't take our data and transform it by
compression, compilation, extraction, etc., then it wouldn't be very useful.
We definitely intend people to use it. What we don't really want is for
people to take our data and redistribute it verbatim, although
such use is definitely allowed explicitly. We would prefer that, if
people need to distribute our data verbatim, they do so by referring to
the "latest version" on the web, and point to our web site. That way
customers of the products can know how and where to get the latest

But to make a product that uses Unicode, almost everyone needs to take the
character information and properties and somehow distill that information
into a form that is suitable for use by a program during the course of
execution. When you do so, it is nice to also allow a means for end-users
to get an upgrade of the data by supplying some distillation mechanism, or
explaining your data format (if you use one) so that users can do manual
upgrades if needed.
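
As an illustration of that kind of distillation, here is a minimal Python
sketch. It assumes the semicolon-separated layout of UnicodeData.txt, where
the first three fields are the code point in hex, the character name, and
the general category. The function name is hypothetical, and a real tool
would also need to expand the "First>"/"Last>" range entries, which this
sketch stores as ordinary rows.

```python
def distill_unicode_data(lines):
    """Build {code_point: (name, general_category)} from UnicodeData.txt lines."""
    table = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # blank or malformed line
        code_point = int(fields[0], 16)  # field 0 is the code point in hex
        table[code_point] = (fields[1], fields[2])
    return table


if __name__ == "__main__":
    sample = [
        "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
        "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    ]
    for cp, (name, category) in sorted(distill_unicode_data(sample).items()):
        print(f"U+{cp:04X} {category} {name}")
```

Because the function takes any iterable of lines, a user can rerun it over
a newer copy of the data file to refresh the table, which is the sort of
manual upgrade path described above.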

A good example of using the Unicode data files is provided by this program:


That program comes with a compressed database, and has an option for live
update to the most recent Unicode data files by including a parser within
it. Any user can download the Unicode data files from our website, and ask
the program to upgrade itself from those files.

Please let me know if you have any further questions.

All the best,


--- End Message ---
