Re: idn.el and confusables.txt

From: Ted Zlatanov
Subject: Re: idn.el and confusables.txt
Date: Wed, 18 May 2011 13:15:05 -0500
User-agent: Gnus/5.110018 (No Gnus v0.18) Emacs/24.0.50 (gnu/linux)

On Tue, 17 May 2011 10:32:03 -0500 Ted Zlatanov <address@hidden> wrote: 

TZ> Here's the converter.  It reads the confusables.txt file and generates a
TZ> char-table with strings as values.  I'll package the converter and the
TZ> resulting uni-confusables.el library and put them on the GNU ELPA.

TZ> Could you tell me the best way to write uni-confusables.el?  In what
TZ> format should I provide the char-tables in the ELisp code?

The shortest format turned out to be a range enumeration, because the
native char-table dump was much bigger (700K vs. 100K).  So I wrote
`gen-confusables-write' to create the "uni-confusables.el" file that
defines the two char-tables and then populates them.
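
To make "range enumeration" concrete, here is a rough sketch of the idea.
It is a Python illustration, not the code from gen-confusables.el; the
names `to_ranges` and `lookup` and the sample data are made up for the
example.  Neighbouring code points that share a value collapse into one
(first, last, value) entry, so the file needs only one line per run.

```python
def to_ranges(mapping):
    """Collapse {codepoint: value} into sorted (first, last, value) ranges.

    Adjacent code points with the same value share a single entry.
    """
    ranges = []
    for cp in sorted(mapping):
        value = mapping[cp]
        # Extend the previous range if this code point continues it.
        if ranges and ranges[-1][1] == cp - 1 and ranges[-1][2] == value:
            first, _, _ = ranges[-1]
            ranges[-1] = (first, cp, value)
        else:
            ranges.append((cp, cp, value))
    return ranges


def lookup(ranges, cp):
    """Return the value for cp, or None if no range covers it."""
    for first, last, value in ranges:
        if first <= cp <= last:
            return value
    return None


# Made-up sample data, NOT taken from confusables.txt.
sample = {0x10: "a", 0x11: "a", 0x12: "a", 0x20: "b", 0x21: "c"}
ranges = to_ranges(sample)
```

In Emacs Lisp, an entry like this maps naturally onto a
`set-char-table-range' call with a (FROM . TO) range, though the exact
form used in the generated file is not shown in this message.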

As a bonus, two ERT tests (one each for the single and multiple types)
are also generated dynamically from the data found in the
confusables.txt file.

gen-confusables.el is a pretty unholy mix of Lisp and string
manipulations, but since I am the only real user I don't mind.  You can
test it with
http://www.unicode.org/Public/security/revision-04/confusables.txt (I'm
not including the resulting uni-confusables.el here because it's over


Attachment: gen-confusables.el
Description: application/emacs-lisp
