Re: [AUCTeX-devel] [AUCTeX-diffs] GNU AUCTeX branch, master, updated. 09

From: Ikumi Keita
Subject: Re: [AUCTeX-devel] [AUCTeX-diffs] GNU AUCTeX branch, master, updated. 097084443771d6716c6870f2f8d329e9c0949d97
Date: Wed, 31 Oct 2018 01:01:06 +0900

Hi David,

>>>>> David Kastrup <address@hidden> writes:
>> I changed it to "[\x00-\xFF]+" to process all the raw 8-bit bytes
>> together at decoding with the relevant coding system.

> That does not cover raw bytes since they are not in the range 00-ff in
> Emacs multibyte characters.  So that expression would only work for
> bytes in buffers decoded from files considered to be in Latin-1
> encoding.

If my memory serves, that was the behavior of non-Unicode Emacs
(mule-version < 6).  Current Emacs (mule-version = 6) actually has a
multibyte treatment smart (or confusing) enough to match a raw 8-bit byte
with the regexp "[\x00-\xFF]".  Both forms
(string-match "[\x00-\xFF]" (string-to-unibyte (byte-to-string #xab)))
(string-match "[\x00-\xFF]" (string-to-multibyte (byte-to-string #xab)))
return a non-nil value (0), at least on my Emacs 26.1.

Although it is true that raw 8-bit characters in a multibyte string are
not in the range 00-ff, current Emacs automatically (and implicitly)
converts them into 00-ff when matching against such regexps.  Even though
the form
(aref (string-to-multibyte (byte-to-string #xab)) 0)
returns #x3fffab, the string still matches "[\x00-\xFF]" in
`string-match'.  (I admit that this behavior is confusing.)

Ikumi Keita
