From: Stefan Monnier
Subject: bug#11073: 24.0.94; BIDI-related crash in redisplay with certain byte sequences
Date: Tue, 03 Apr 2012 00:22:32 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.94 (gnu/linux)

>> > Usually, yes. But as long as there is a code space in the high
>> > area for a CJK charset, it is unavoidable to have a
>> > buffer/string that contains a character represented by a
>> > byte sequence in that high area as the test case of
>> > Bug#11073. And, as "unification" means to treat such a
>> > character the same way as the unified character, I thought
>> > they both have the same character code.
>> Since there are two internal byte-sequence representations, I don't see
>> any good reason why we shouldn't have 2 internal int representations.
>> I.e. if unification failed for the byte-sequence (which might be the
>> result of a bug, for all I know), we may as well keep them non-unified
>> in the int representation.
> Please note that not all characters in the code-space of a
> CJK charset are unified. For instance, Big5 has its own
> PUA (private use area), and characters in PUA are not
> unified by default. So, if Emacs reads a Big5 file that
> contains PUA chars, those chars stay in high-area. Then,
> one can provide his own unification map that also maps PUA
> chars to some Unicode chars, like this:
> (unify-charset 'big5 "MyBig5.map")
> After this, I thought that previously read PUA chars staying
> in the high-area should be treated as the corresponding
> Unicode chars (in display, search, etc.).
But again, this unification takes place during decoding. Whereas what
I'm talking about takes place when reading the internal utf-8
representation, which should be already unified.
Stefan
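
Editorial sketch, not part of the original message: the decode-time step discussed above can be probed from Lisp. This assumes an Emacs in which the `big5` charset is defined; the code point #xA440 is an arbitrary illustrative choice, and no particular result is claimed.

```elisp
;; Illustrative only: check whether a Big5 code point decodes to a
;; character in the Unicode range or stays in the "high area".
(let* ((code #xA440)                   ; arbitrary Big5 code point (assumption)
       (ch (decode-char 'big5 code)))  ; nil if CODE is not valid in big5
  (when ch
    (list :char ch
          ;; t when the decoded character lies within Unicode
          ;; (at or below #x10FFFF), nil when it is in the high area.
          :in-unicode-range (<= ch #x10FFFF)
          ;; Map the character back to its Big5 code point.
          :round-trip (encode-char ch 'big5))))
```

A code point in Big5's private use area would be expected, per the discussion above, to give nil for `:in-unicode-range` unless a user-supplied unification map covers it.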