bug#11073: 24.0.94; BIDI-related crash in redisplay with certain byte se

bug-gnu-emacs

From:	Stefan Monnier
Subject:	bug#11073: 24.0.94; BIDI-related crash in redisplay with certain byte sequences
Date:	Tue, 03 Apr 2012 00:22:32 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.0.94 (gnu/linux)

>> > Usually, yes.  But as far as there is a code space in high
>> > area for a CJK charset, it is unavoidable to have a
>> > buffer/string that contains a character represented by a
>> > byte sequence in that high area as the test case of
>> > Bug#11073.  And, as "unification" means to treat such a
>> > character the same way as the unified character, I thought
>> > they both have the same character code.

>> Since there are two internal byte-sequence representations, I don't see
>> any good reason why we shouldn't have 2 internal int representations.
>> I.e. if unification failed for the byte-sequence (which might be the
>> result of a bug, for all I know), we may as well keep them non-unified
>> in the int representation.

> Please note that not all characters in the code-space of a
> CJK charset are unified.  For instance, Big5 has its own
> PUA (private use area), and characters in PUA are not
> unified by default.  So, if Emacs reads a Big5 file that
> contains PUA chars, those chars stay in high-area.   Then,
> one can provide his own unification map that also maps PUA
> chars to some Unicode chars as this:
>   (unify-charset 'big5 "MyBig5.map")
> After this, I thought that previously read PUA chars staying
> in the high-area should be treated as the corresponding
> Unicode chars (in displaying, search, etc).

But again, this unification takes place during decoding.  Whereas what
I'm talking about takes place when reading the internal utf-8
representation, which should be already unified.
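
To make the two positions concrete, here is a toy model in Python.  It is
only a sketch: HIGH_AREA_BASE, the table contents, and the function names
are invented for illustration and do not reflect Emacs's real internal
layout or API.  It contrasts unification done once at decode time with a
lookup repeated every time an already-decoded character is read.

```python
# Toy model only: the offset, table entries and names are hypothetical.
HIGH_AREA_BASE = 0x200000  # assumed start of the non-unified "high area"


def decode(big5_code, table):
    """Decode-time unification: use the table if it has an entry,
    otherwise park the character in the high area."""
    if big5_code in table:
        return table[big5_code]
    return HIGH_AREA_BASE + big5_code


def read_internal(char_code, table, unify_on_read):
    """Read back a character that is already in the buffer.

    unify_on_read=False: the internal code is taken as-is.
    unify_on_read=True:  high-area codes are looked up again, so a map
                         added later affects text decoded earlier.
    """
    if unify_on_read and char_code >= HIGH_AREA_BASE:
        return table.get(char_code - HIGH_AREA_BASE, char_code)
    return char_code


table = {0xA140: 0x3000}        # an ordinary, unified entry
pua = decode(0xFA40, table)     # no entry yet, so it lands in the high area
table[0xFA40] = 0xE000          # a user-supplied map is added afterwards

as_is = read_internal(pua, table, unify_on_read=False)     # still high-area
remapped = read_internal(pua, table, unify_on_read=True)   # now 0xE000
```

In this model, only the second mode makes previously read PUA chars behave
like the newly mapped Unicode chars; the first mode leaves them distinct
until the text is decoded again.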


        Stefan
