Re: regexp filter to match non-english characters

From: Ted Zlatanov
Subject: Re: regexp filter to match non-english characters
Date: Wed, 05 Nov 2008 16:18:41 -0600
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.0.60 (gnu/linux)

On Wed, 05 Nov 2008 14:14:31 -0600 "Robert D. Crawford" wrote:

RDC> Ted Zlatanov writes:
>> (string-match "[^\\000-\\1ff]" "hello")   ;; OK
>> (string-match "[^\\000-\\1ff]" "здрасти") ;; not OK (Unicode characters)
>> This will match character values over 0x1FF, which is the limit of
>> extended ASCII.  Does that work for you?

RDC> Will this match the Unicode double ">" and the like?  Some people feel
RDC> the need to use these in their breadcrumbs and such.  If there is no way
RDC> to just filter out the foreign characters, I will use it.

You can just try it!

(string-match "[^\\000-\\1ff]" "»") ;; returns 0, meaning it's a match
(string-match "[^\\000-\\1ff]" ">>") ;; returns nil, meaning it's not a match

Put the cursor after the closing parenthesis and hit C-x C-e in Emacs to
see the result.
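
If you want the test to be more explicit, here is a minimal sketch of two alternatives.  Emacs regexps treat a backslash inside a bracket expression as a literal character, so it may be worth double-checking that the range above covers what you intend.  The [:nonascii:] character class is a standard part of Emacs regexps; the helper name my-first-char-above is made up for this example and is not part of Emacs or Gnus.

```elisp
;; Sketch only: `my-first-char-above' is a hypothetical helper name.

;; [[:nonascii:]] matches any character outside the ASCII range.
(string-match "[[:nonascii:]]" "hello")    ; => nil
(string-match "[[:nonascii:]]" "здрасти")  ; => 0
(string-match "[[:nonascii:]]" "»")        ; => 0 (U+00BB is non-ASCII)

(defun my-first-char-above (limit string)
  "Return the index of the first character in STRING above LIMIT, or nil."
  (let ((i 0)
        (n (length string))
        found)
    ;; Walk the string until a character code exceeds LIMIT.
    (while (and (< i n) (not found))
      (if (> (aref string i) limit)
          (setq found i)
        (setq i (1+ i))))
    found))

(my-first-char-above #xff "»")        ; => nil (U+00BB is below the limit)
(my-first-char-above #xff "здрасти")  ; => 0
```

The regexp form is the one you could try in a score entry; the helper is only for checking by hand which characters fall above a given code point.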

RDC> The other possibility is to lower permanently on each character that is
RDC> read to me, but this seems tedious and time consuming on my part and
RDC> likely slow for gnus to score.

Nah, the above should work.  You will need a single backslash instead of
two, though (the doubling is needed to tell Emacs Lisp that's a real
backslash inside the string when it reads it in).
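
To see what the Lisp reader does with the backslashes, you can evaluate a few forms like these.  This is only an illustration of the string syntax, not something specific to Gnus scoring.

```elisp
;; In Lisp source, "\\" denotes a single backslash character in the string.
(length "\\000")                          ; => 4 (backslash, 0, 0, 0)
(length "\000")                           ; => 1 (octal escape for NUL)
(string= "\\000" (string ?\\ ?0 ?0 ?0))   ; => t

;; A regexp for a literal dot is \. and is written "\\." in Lisp source.
(string-match "\\." "a.b")                ; => 1
(string-match "." "a.b")                  ; => 0 (unescaped dot matches "a")
```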

