[Gnu-arch-users] Is tagline an attractive nuisance for international users?


From: Stephen J. Turnbull
Subject: [Gnu-arch-users] Is tagline an attractive nuisance for international users?
Date: Tue, 12 Oct 2004 10:39:11 +0900
User-agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.5 (chayote, linux)

The tagline tag-snarfing algorithm is

1. Restrict the file to its first 1K plus its last 1K.
2. Find "^[[:blank:][:punct:]]*arch-tag:[[:blank:]]*(.*?)[^[:graph:]]*$",
   where *? is the shy (non-greedy) repetition operator and the match
   is subject to the 1024-byte boundary.
3. Grab the group, and smash any octet in it outside of [33,126] to '_'.
4. Return the result of 3.
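
The four steps above can be sketched in Python roughly as follows.
This is not tla's code, only an illustration under assumptions: the
POSIX classes are expanded by hand for the C locale, the first 1K is
searched before the last 1K, and the names snarf_tag, smash, TAGLINE
and WINDOW are made up for the example.

```python
import re

# POSIX classes expanded by hand for single octets, assuming the C locale:
#   [:blank:] = space, tab   [:punct:] = ASCII punctuation   [:graph:] = 33..126
# "\n" is kept out of the trailing class so a match stays on one line.
TAGLINE = re.compile(
    rb"^[ \t\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e]*arch-tag:[ \t]*"
    rb"(.*?)[^\x21-\x7e\n]*$",
    re.MULTILINE,
)

WINDOW = 1024  # the "1K" of step 1


def smash(octets: bytes) -> bytes:
    """Step 3: replace every octet outside [33,126] with '_'."""
    return bytes(b if 33 <= b <= 126 else ord("_") for b in octets)


def snarf_tag(data: bytes):
    """Return the smashed tag, or None if no arch-tag line is found."""
    # Step 1: only the first and the last 1K of the file are examined.
    windows = [data[:WINDOW]]
    if len(data) > WINDOW:
        windows.append(data[-WINDOW:])
    for window in windows:
        match = TAGLINE.search(window)    # step 2
        if match:
            return smash(match.group(1))  # steps 3 and 4
    return None
```

Note that a blank inside the tag is octet 32, so it is smashed to '_'
as well.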

Anybody using a "human-readable" algorithm for tag construction in a
language other than English is liable to get lots of collisions.  For
example, in EUC or UTF-8 Japanese, the tag disappears (its octets are
all non-[:graph:], so it gets scarfed by the non-captured trailing
[^[:graph:]]* in the regexp) unless there's some stray non-blank
non-Japanese in the tag part.
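
To see the collision concretely, here is a small Python check.  It is
a sketch only: the pattern is a hand translation of the regexp above
that assumes the C locale, and the sample taglines and the helper
name captured are invented for the demonstration.

```python
import re

# Hand translation of the tagline regexp, assuming the C locale:
# [:blank:] = space/tab, [:punct:] = ASCII punctuation, [:graph:] = 33..126.
PATTERN = re.compile(
    rb"^[ \t\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e]*arch-tag:[ \t]*"
    rb"(.*?)[^\x21-\x7e\n]*$",
    re.MULTILINE,
)


def captured(line: str, encoding: str):
    """Return the octets captured by the shy group, or None if no match."""
    match = PATTERN.search(line.encode(encoding))
    return match.group(1) if match else None


# Two different human-readable tags written in Japanese (invented samples).
taglines = ["# arch-tag: 日本語のタグ", "# arch-tag: 別のファイル"]

for enc in ("utf-8", "euc_jp"):
    # Every octet of the tag has the high bit set, so the trailing
    # non-graph class swallows it and the shy group stays empty.
    print(enc, [captured(t, enc) for t in taglines])

# A stray ASCII word rescues the match: the group must now reach as far
# as "v2", so the Japanese octets are captured (and would later be
# smashed to '_').
print(captured("# arch-tag: 日本語 v2", "utf-8"))
```

In both encodings the two distinct tags come back as the same empty
capture, which is the collision.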

I think nowadays everybody is using uuidgen or the like, but this
probably should be documented.
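
For comparison, a uuidgen-style tag contains only hex digits and
hyphens, so nothing in it is touched by the smashing step.  A minimal
sketch using Python's uuid module in place of the uuidgen command;
the helper names and the "# " comment prefix are assumptions made for
the example.

```python
import uuid


def make_tagline(comment_prefix: str = "# ") -> str:
    """Build a tagline whose tag is a random UUID (like uuidgen output)."""
    return f"{comment_prefix}arch-tag: {uuid.uuid4()}"


def survives_smashing(tag: str) -> bool:
    """True if every octet of the UTF-8 encoded tag lies in [33,126]."""
    return all(33 <= b <= 126 for b in tag.encode("utf-8"))


line = make_tagline()
tag = line.split("arch-tag: ", 1)[1]
print(line)
print(survives_smashing(tag))       # UUID text is plain ASCII
print(survives_smashing("日本語"))  # every octet here would be smashed
```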

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.



