Re: Deleting supplementary characters leaves corruption
From: Kenichi Handa
Subject: Re: Deleting supplementary characters leaves corruption
Date: Mon, 19 Apr 2004 13:19:30 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, Alexander Winston <address@hidden> writes:
> M-x shell-command <RET> echo -e "\360\235\204\242" > SupplementaryChar
> <RET>
> M-x find-file <RET> SupplementaryChar <RET>
> M-x delete-char <RET>
> At this point, the first byte that comprises the supplementary character
> is deleted, leaving the last three bytes' octal escape sequences. The
> entire supplementary character should have been deleted.
The current Emacs still does not support Unicode characters
outside the BMP. So what the utf-8 decoder does is read each
byte of such a character as is, and display the sequence by
composing it into the single Unicode character U+FFFD.
Thus, for instance, delete-char deletes only the first
character of the sequence (i.e. the first byte; \360 in the
above case), and the remaining bytes are shown as is
(because the original composition becomes invalid).
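As a rough illustration of the byte-level picture (a standalone Python sketch, not Emacs code; the helper name decode_utf8_4byte is made up for this example), the four bytes from the report decode by hand to one code point above U+FFFF, and dropping only the first byte leaves three continuation bytes that no longer form valid UTF-8:

```python
# The four bytes from the report, written with the same octal escapes
# as in the echo command.
data = b"\360\235\204\242"


def decode_utf8_4byte(b: bytes) -> int:
    """Decode one 4-byte UTF-8 sequence by hand and return its code point."""
    # Lead byte must look like 11110xxx, the rest like 10xxxxxx.
    assert len(b) == 4 and (b[0] & 0xF8) == 0xF0
    assert all((x & 0xC0) == 0x80 for x in b[1:])
    return (
        ((b[0] & 0x07) << 18)
        | ((b[1] & 0x3F) << 12)
        | ((b[2] & 0x3F) << 6)
        | (b[3] & 0x3F)
    )


code_point = decode_utf8_4byte(data)
is_outside_bmp = code_point > 0xFFFF

# What is left when only the first byte is removed.
remainder = data[1:]
remainder_octal = "".join(f"\\{x:03o}" for x in remainder)

remainder_is_valid_utf8 = True
try:
    remainder.decode("utf-8")
except UnicodeDecodeError:
    remainder_is_valid_utf8 = False

print(f"U+{code_point:04X}", is_outside_bmp, remainder_octal, remainder_is_valid_utf8)
# prints: U+1D122 True \235\204\242 False
```

The leftover \235\204\242 matches the three octal escapes described in the original report.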
To fix this problem, I tried putting a `modification-hooks'
text property on the sequence that deletes the whole
sequence when any part of it is being deleted. But that
revealed a bug in Emacs.
Please try this:
% emacs -q --no-site-file
;; Load the attached file
M-x load-file RET ..../atomic.el RET
;; Make "buffer" atomic.
M-b C-SPC M-f M-x set-region-atomic RET
;; Move point to the head of "buffer"
C-b
;; Delete "buffer"
M-d
;; Yank it twice.
C-y C-y
Then Emacs crashes as below:
(gdb) bt
#0 0x4030f781 in kill () from /lib/libc.so.6
#1 0x080e7a5a in abort () at emacs.c:433
#2 0x08193109 in Fremove_list_of_text_properties (start=190, end=197,
list_of_properties=-1471838288, object=-2009135880) at textprop.c:1619
#3 0x0814ca47 in Ffuncall (nargs=4, args=0xbfffeeb4) at eval.c:2737
#4 0x0817cabc in Fbyte_code (bytestr=1746734708, vector=-2011361536,
maxdepth=5) at bytecode.c:689
#5 0x0814cf31 in funcall_lambda (fun=-2011361720, nargs=2,
arg_vector=0xbfffefc8) at eval.c:2913
#6 0x0814caf1 in Ffuncall (nargs=3, args=0xbfffefc4) at eval.c:2783
#7 0x0817cabc in Fbyte_code (bytestr=1746735308, vector=-2011360984,
maxdepth=5) at bytecode.c:689
#8 0x0814cf31 in funcall_lambda (fun=-2011361112, nargs=1,
arg_vector=0xbffff0d8) at eval.c:2913
#9 0x0814caf1 in Ffuncall (nargs=2, args=0xbffff0d4) at eval.c:2783
#10 0x0817cabc in Fbyte_code (bytestr=1746735100, vector=-2011361232,
maxdepth=4) at bytecode.c:689
#11 0x0814cf31 in funcall_lambda (fun=-2011361320, nargs=1,
arg_vector=0xbffff1d8) at eval.c:2913
#12 0x0814caf1 in Ffuncall (nargs=2, args=0xbffff1d4) at eval.c:2783
#13 0x0817cabc in Fbyte_code (bytestr=1747075004, vector=-2011021300,
maxdepth=4) at bytecode.c:689
#14 0x0814cf31 in funcall_lambda (fun=-2011021428, nargs=1,
arg_vector=0xbffff318) at eval.c:2913
#15 0x0814caf1 in Ffuncall (nargs=2, args=0xbffff314) at eval.c:2783
#16 0x08149546 in Fcall_interactively (function=675491296,
record_flag=675170448, keys=-2009127256) at callint.c:862
#17 0x080f4a19 in Fcommand_execute (cmd=675491296, record_flag=675170448,
keys=675170448, special=675170448) at keyboard.c:9670
#18 0x080eb0fc in command_loop_1 () at keyboard.c:1727
#19 0x0814b01d in internal_condition_case (bfun=0x80ea410 <command_loop_1>,
handlers=675231320, hfun=0x80ea014 <cmd_error>) at eval.c:1333
#20 0x080ea2d8 in command_loop_2 () at keyboard.c:1264
#21 0x0814ab95 in internal_catch (tag=675225384, func=0x80ea2b4
<command_loop_2>, arg=675170448) at eval.c:1094
#22 0x080ea283 in command_loop () at keyboard.c:1243
#23 0x080e9dd8 in recursive_edit_1 () at keyboard.c:959
#24 0x080e9f00 in Frecursive_edit () at keyboard.c:1015
#25 0x080e8d72 in main (argc=3, argv=0xbffffab4) at emacs.c:1692
#26 0x402ff14f in __libc_start_main () from /lib/libc.so.6
(gdb) xba
"remove-list-of-text-properties"
"remove-yank-excluded-properties"
"insert-for-yank-1"
"insert-for-yank"
"yank"
"call-interactively"
(gdb)
---
Ken'ichi HANDA
address@hidden
;; atomic.el
(defun find-atomic-region (from to)
  "Return an atomic region at FROM or after FROM and before TO.
The value is (START . END)."
  (if (< from to)
      (if (get-text-property from 'atomic)
          (cons (previous-single-property-change (1+ from) 'atomic)
                (next-single-property-change from 'atomic nil (point-max)))
        (setq from (next-single-property-change from 'atomic nil to))
        (if (< from to)
            (find-atomic-region from to)))))

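(defun fix-atomic-region (from to)
  "Delete every atomic region that the change FROM..TO only partly covers.
Intended to be used as a `modification-hooks' function."
  (let (atomic)
    (while (setq atomic (find-atomic-region from to))
      (if (or (and (> from (car atomic)) (< from (cdr atomic)))
              (and (> to (car atomic)) (< to (cdr atomic))))
          (let ((inhibit-modification-hooks t))
            (delete-region (car atomic) (cdr atomic))
            ;; Continue from the start of the deleted region, and
            ;; shrink TO by the length of the deleted text.
            (setq from (car atomic)
                  to (- to (- (cdr atomic) (car atomic)))))
        (setq from (cdr atomic))))))
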
(defun set-region-atomic (from to)
  "Set the specified region atomic."
  (interactive "r")
  (put-text-property from to 'intangible t)
  (put-text-property from to 'atomic t)
  (put-text-property from to 'modification-hooks '(fix-atomic-region)))