emacs-devel

## Re: modify-syntax-entry and UTF8?

From: Geoffrey Alan Washburn
Subject: Re: modify-syntax-entry and UTF8?
Date: Wed, 23 May 2007 11:09:22 -0400
User-agent: Thunderbird 2.0.0.4pre (X11/20070522)

James Cloos wrote:

"Geoffrey" == Geoffrey Alan Washburn <address@hidden> writes:


Geoffrey> No, what I wrote is exactly what I meant, unless the author of
Geoffrey> the TeX-input method incorrectly defined \langle and \rangle.

Ah.  That does put a different spin on things.

And in fact, the UCS has expanded since that was written, and characters
were added for exactly TeX's \langle and \rangle (and a few others in
latin-ltx.el which currently point to CJK characters instead of math chars).

latin-ltx.el should be updated to use ⟨ U+27E8 MATHEMATICAL LEFT ANGLE
BRACKET for \langle and ⟩ U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET for \rangle.



Ah, that is good to know. Is there any straightforward way to override this in my .emacs file?


What does C-u C-x = output when point is on the characters in your
(modify-syntax-entry) calls and when point is on one of the characters
you are trying to match in the buffer you are editing?  What are the
mode and coding-system of the buffer you are editing?  What is the
coding-system of the .el file?


So when using the correct glyphs I get

character: ⟨ (10216, #o23750, #x27e8)
preferred charset: unicode (Unicode (ISO10646))
code point: 0x27E8
syntax: (⟩   which means: open, matches ⟩
buffer code: #xE2 #x9F #xA8
file code: #xE2 #x9F #xA8 (encoded by coding system utf-8-unix)
display: no font available
...

and

character: ⟩ (10217, #o23751, #x27e9)
preferred charset: unicode (Unicode (ISO10646))
code point: 0x27E9
syntax: )⟨   which means: close, matches ⟨
buffer code: #xE2 #x9F #xA9
file code: #xE2 #x9F #xA9 (encoded by coding system utf-8-unix)
display: no font available

...


which as I understand it means that they should already be treated as matching delimiters.
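
For reference, the syntax entries reported above correspond to descriptors of the form "(⟩" and ")⟨". The following is only a minimal sketch of how such entries could be set up and checked independently of any highlighting code. The function names `my-angle-bracket-table` and `my-angle-brackets-pair-p` are hypothetical, and the sketch assumes an Emacs build that can read these characters from a UTF-8 encoded file.

```elisp
;; Sketch only: the two function names below are made up for illustration.
;; A private syntax table is used so the standard table is left untouched.
(defun my-angle-bracket-table ()
  "Return a syntax table in which U+27E8 and U+27E9 are paired delimiters."
  (let ((table (make-syntax-table)))
    ;; "(⟩" means: open delimiter whose matching close is ⟩.
    (modify-syntax-entry ?⟨ "(⟩" table)
    ;; ")⟨" means: close delimiter whose matching open is ⟨.
    (modify-syntax-entry ?⟩ ")⟨" table)
    table))

(defun my-angle-brackets-pair-p ()
  "Return non-nil if `forward-sexp' moves from ⟨ to just past its ⟩."
  (with-temp-buffer
    (set-syntax-table (my-angle-bracket-table))
    (insert "⟨a b⟩")
    (goto-char (point-min))
    (forward-sexp)   ; should skip the whole bracketed group
    (eobp)))
```

If `(my-angle-brackets-pair-p)` returns non-nil, the syntax table itself is pairing the two characters, which would point toward the highlighting code rather than `modify-syntax-entry`.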


However, if I create an empty scratch buffer and move the cursor on top of either glyph, it is highlighted, but with the face used for matched delimiters rather than the face for mismatched/unmatched delimiters. Adding both glyphs to an empty buffer in correctly and incorrectly matching permutations gives the same behavior.


So I am inclined to believe Stefan's hypothesis that modify-syntax-entry is working correctly here, and that the bug lies instead in whatever code interprets the syntax table or applies the highlighting faces.


I'm also somewhat puzzled that Emacs tells me no font is available for these glyphs, while Thunderbird seems able to locate a font that can display them.