Re: Making re-search-forward search for \377

From: Xah
Subject: Re: Making re-search-forward search for \377
Date: Sun, 2 Nov 2008 12:32:53 -0800 (PST)
User-agent: G2/1.0

Xah Lee wrote:
> Xah<address@hidden> writes:
> > what's the C-q 377 char?
> > if i press Ctrl+q 377 Enter, i get this char: ÿ, which is LATIN SMALL
> > LETTER Y WITH DIAERESIS.
> > Then if i do:
> > (re-search-forward "ÿ")

Tyler Spivey wrote:
> I'm probably going to end up working with binary data in a temp
> buffer. Doing more research, I want enable-multibyte-characters to be
> off. Given that, if we go to *scratch*
> and run M-x toggle-enable-multibyte-characters until that variable
> becomes nil, doing C-q 377 RET gives 0xff, which is what I want
> (according to C-x =, C-u C-x = and M-x describe-char). Now to
> match it, I try:
> (re-search-forward "\xff") - no luck

Sorry, I can't help you much there. I don't have much experience
working with binary data.

> What did you use to figure out that the multibyte version of that
> character was 0x00FF? I found it out accidentally as a lisp error, but
> none of the previously described commands (C-x =, M-x describe-char or
> C-u C-x =) will show that it is 0x00ff, they just show FF.

Installing a Unicode data file is probably what you need.

Q: I have this character α on the screen. How do I find out its
Unicode hex value or name?

You can find out a character's decimal, octal, and hex values by
placing your cursor on the character and typing “Alt+x
what-cursor-position” (Ctrl+x =). For more info, place your cursor on
the character and press “Ctrl+u Ctrl+x =”.
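
If you would rather get the same numbers from lisp, here is a minimal
sketch. The command name my-show-char-code is made up for illustration;
only char-after and message are built in. Note that it prints emacs's
internal character code. In Emacs 23 and later that is the Unicode code
point, but in older versions it can differ, as the Emacs 22 output
further down shows.

```elisp
;; Sketch: report the code of the character after point.
;; `my-show-char-code' is a hypothetical name, not a built-in command.
(defun my-show-char-code ()
  "Show the character after point with its decimal, octal, and hex code."
  (interactive)
  (let ((c (char-after)))
    (if c
        ;; %c prints the character itself; the rest print its integer code.
        (message "%c  %d  #o%o  #x%X" c c c c)
      (message "No character after point (end of buffer)"))))
```

Evaluate it, put the cursor on a character, and run
“Alt+x my-show-char-code”.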

However, if you want the complete Unicode info for a character, you
need to download a Unicode data file and tell emacs where it is.
The Unicode data file can be downloaded at:
After you have downloaded it, place the following code in your “~/.emacs”
to tell emacs where it is:

;; Set the Unicode data file location (used by what-cursor-position).
(let ((x "~/Documents/emacs/UnicodeData.txt"))
  ;; Only set the variable when the file actually exists.
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))

Then restart emacs. Once you've done this, place your cursor on a
Unicode char and do “Ctrl+u Ctrl+x =”. Emacs will then give you all
the Unicode info about that char, including the code point in decimal,
octal, and hex notation, as well as the Unicode character name, category,
the font emacs is using, and more.

For example, here's the output on the character “α”:

      character: α (332721, #o1211661, #x513b1, U+03B1)
        charset: mule-unicode-0100-24ff
                 (Unicode characters of the range U+0100..U+24FF.)
     code point: #x27 #x31
         syntax: w      which means: word
       category: g:Greek
    buffer code: #x9C #xF4 #xA7 #xB1
      file code: #xCE #xB1 (encoded by coding system mule-utf-8-unix)
        display: by this font (glyph code)
   Unicode data:
       Category: lowercase letter
Combining class: Spacing
  Bidi category: Left-to-Right
      Uppercase: Α
      Titlecase: Α

There are text properties here:
  fontified            t
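
The “file code” line above is just the UTF-8 encoding of the
character. As a quick check, assuming your emacs has a coding system
named utf-8 (the output above calls it mule-utf-8), you can encode the
character yourself and list the bytes:

```elisp
;; Encode the one-character string "α" as UTF-8 and list the resulting bytes.
(mapcar (lambda (byte) (format "#x%X" byte))
        (string-to-list (encode-coding-string "α" 'utf-8)))
;; => ("#xCE" "#xB1")
```

This matches the #xCE #xB1 shown as the file code for α.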

This page might help you if you work with Unicode.
