
bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combinati


From: Stephen Berman
Subject: bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
Date: Sat, 17 Aug 2019 16:40:44 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

On Sat, 17 Aug 2019 17:14:45 +0300 Eli Zaretskii <address@hidden> wrote:

>> From: Stephen Berman <address@hidden>
>> Cc: Kenichi Handa <address@hidden>,  address@hidden,
>>   address@hidden,  address@hidden
>> Date: Sat, 17 Aug 2019 15:50:20 +0200
>> 
>> Hm, I chose COMBINING ACUTE ACCENT and COMBINING CIRCUMFLEX ACCENT more
>> or less at random, but I do indeed see the sequence 'aU+0301U+0302' as
>> two grapheme clusters (also with -Q): 'a' with an acute accent over it
>> followed by a circumflex.  In contrast, the sequences 'aU+0301U+0317'
>> and 'aU+0302U+0317' are displayed as single grapheme clusters (317 is
>> COMBINING ACUTE ACCENT BELOW).  I also noticed that the sequence
>> '-U+0301U+0302' is displayed as a dash followed by a single grapheme
>> cluster of an acute accent and a circumflex; this holds for all
>> nonalphabetic ASCII characters I tried and for some but not all
>> non-ASCII alphabetic characters.  So there seems to be some
>> inconsistency in the display of combining characters.
>
> Is this in Emacs 27 built with HarfBuzz support?

Yes (both --with-cairo and without).

>                                                   If so, I think this
> just means that the default font you use doesn't support these
> combining accents, because on my system I see a single grapheme
> cluster in both of the above cases, when I select a suitable font.

My default font is DejaVu Sans Mono, but it seems there's something else
at play here: in contrast to 'aU+0301U+0302', I do see the sequence
'bU+0301U+0302' as a single grapheme cluster.  Maybe the difference is
because there is a glyph for 'a' with an acute accent and it doesn't
support further combining.  (But I have no idea if that makes sense.)
Here's what describe-char shows on both:

________________________________________________________________________
             position: 1 of 7 (0%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w        which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 61" or "C-x 8 RET LATIN SMALL LETTER A"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "á̂" (see below)

Composed with the following character(s) "́̂" using this font:
  xfthb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 2 97 163 9 0 8 12 0 nil]
  [0 2 769 650 9 2 7 12 -9 [0 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')

________________________________________________________________________
             position: 5 of 7 (57%), column: 0
            character: b (displayed as b) (codepoint 98, #o142, #x62)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x62
               script: latin
               syntax: w        which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 62" or "C-x 8 RET LATIN SMALL LETTER B"
          buffer code: #x62
            file code: #x62 (encoded by coding system utf-8-unix)
              display: composed to form "b́̂" (see below)

Composed with the following character(s) "́̂" using this font:
  xfthb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 2 98 69 9 1 9 11 0 nil]
  [0 2 769 649 9 3 7 12 -9 [-9 -3 0]]
  [0 2 770 650 9 2 7 12 -9 [-9 -3 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER B
  general-category: Ll (Letter, Lowercase)
  decomposition: (98) ('b')
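
The guess above, that 'a' plus an acute accent has its own precomposed character while 'b' plus an acute accent does not, can be checked against the Unicode character data alone. The following is a minimal sketch using Python's standard unicodedata module; the helper names nfc_codepoints and has_precomposed are made up for illustration. It says nothing about what HarfBuzz or DejaVu Sans Mono actually do when shaping the text; it only shows that the two sequences differ at the Unicode level.

```python
import unicodedata

ACUTE = "\u0301"        # COMBINING ACUTE ACCENT
CIRCUMFLEX = "\u0302"   # COMBINING CIRCUMFLEX ACCENT
ACUTE_BELOW = "\u0317"  # COMBINING ACUTE ACCENT BELOW


def nfc_codepoints(text):
    """Return the code points of text after NFC normalization as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


def has_precomposed(base, mark):
    """Return True if NFC composes base + mark into a single character."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


if __name__ == "__main__":
    # Compare the two sequences discussed in the message.
    for base in "ab":
        seq = base + ACUTE + CIRCUMFLEX
        print(base, has_precomposed(base, ACUTE), nfc_codepoints(seq))
    # Canonical combining classes: marks above the base vs. a mark below it.
    for mark in (ACUTE, CIRCUMFLEX, ACUTE_BELOW):
        print(unicodedata.name(mark), unicodedata.combining(mark))
```

Under NFC, 'a' followed by U+0301 composes to U+00E1 with U+0302 left as a separate mark, whereas the 'b' sequence stays as three code points. U+0301 and U+0302 share combining class 230 (above), while U+0317 has class 220 (below). Whether either fact explains the display difference is an open question here.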





