emacs-devel

regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault


From: Kenichi Handa
Subject: regex.c bug? - Re: HTML Mode and Turkish Locale - Segfault
Date: Tue, 28 Nov 2006 10:17:29 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.91 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

It seems that I have found the cause of the attached crash.

Currently we have this code in regex.c.

                        if (multibyte)
                          SET_RANGE_TABLE_WORK_AREA_BIT (range_table_work,
                                                         re_wctype_to_bit (cc));

                        for (ch = 0; ch < 1 << BYTEWIDTH; ++ch)
                          {
                            int translated = TRANSLATE (ch);
                            if (re_iswctype (btowc (ch), cc))
                              SET_LIST_BIT (translated);
                          }

In tr_TR.UTF-8, 'I' is translated to #x51051 (U+0131).  But
it seems that SET_LIST_BIT assumes that its argument is less
than 256 (or 128), so a translated value that large is out of
range for it.  So, I've just installed the following change.

@@ -2939,7 +2939,8 @@
                         for (ch = 0; ch < 1 << BYTEWIDTH; ++ch)
                          {
                            int translated = TRANSLATE (ch);
-                           if (re_iswctype (btowc (ch), cc))
+                           if (translated < (1 << BYTEWIDTH)
+                               && re_iswctype (btowc (ch), cc))
                              SET_LIST_BIT (translated);
                          }

If translated is set to a multibyte character, I think the
above SET_RANGE_TABLE_WORK_AREA_BIT already handles that case.
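
To illustrate why the guard matters, here is a minimal standalone
sketch.  It is not the regex.c code: charset_bitmap, toy_translate,
set_list_bit, list_bit_p and fill_upper are hypothetical names made
up for this example, and the bitmap is assumed to hold exactly
1 << BYTEWIDTH bits.  Only the value #x51051 and the shape of the
loop are taken from the code above.

```c
#include <assert.h>
#include <string.h>

#define BYTEWIDTH 8                                  /* bits per byte */
#define BITMAP_BYTES ((1 << BYTEWIDTH) / BYTEWIDTH)  /* 32 bytes = 256 bits */

/* Hypothetical stand-in for the bitmap that SET_LIST_BIT writes into. */
struct charset_bitmap
{
  unsigned char bits[BITMAP_BYTES];
};

/* Toy translation: maps 'I' to a code far above 255, as reported for
   tr_TR.UTF-8; every other character maps to itself.  */
int
toy_translate (int ch)
{
  return ch == 'I' ? 0x51051 : ch;
}

/* Set bit C.  The caller must guarantee 0 <= C < 256; otherwise the
   byte index C / BYTEWIDTH would fall outside the 32-byte array.  */
void
set_list_bit (struct charset_bitmap *b, int c)
{
  assert (c >= 0 && c < (1 << BYTEWIDTH));
  b->bits[c / BYTEWIDTH] |= (unsigned char) (1 << (c % BYTEWIDTH));
}

/* Return nonzero if bit C is set; C must be in range as above.  */
int
list_bit_p (const struct charset_bitmap *b, int c)
{
  return (b->bits[c / BYTEWIDTH] >> (c % BYTEWIDTH)) & 1;
}

/* Fill B with the translations of 'A'..'Z' (a toy class test), applying
   the same kind of range guard as the patch.  Returns how many
   translated codes were skipped because they do not fit.  */
int
fill_upper (struct charset_bitmap *b)
{
  int ch, skipped = 0;

  memset (b->bits, 0, sizeof b->bits);
  for (ch = 0; ch < (1 << BYTEWIDTH); ++ch)
    {
      int translated = toy_translate (ch);

      if (ch < 'A' || ch > 'Z')
        continue;
      if (translated < (1 << BYTEWIDTH))
        set_list_bit (b, translated);
      else
        ++skipped;   /* left to the multibyte path in the real code */
    }
  return skipped;
}
```

In this sketch, #x51051 / 8 is byte index 41482, far past the
32-byte array.  Without the guard the write would land in unrelated
memory, which would be consistent with a crash that shows up only
later, during garbage collection, though that link is my reading
and not something verified here.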

Stefan, could you please confirm that my guess above is
correct?

---
Kenichi Handa
address@hidden

In article <address@hidden>, Kenichi Handa <address@hidden> writes:

> [1  <text/plain; US-ASCII (7bit)>]
> In article <address@hidden>, Eli Zaretskii <address@hidden> writes:

> > > From: address@hidden (Cafer Şimşek)
> > > Date: Sun, 26 Nov 2006 22:58:29 +0200
> > > 
> > > Emacs crashes randomly (seg fault) in html-mode when using the
> > > tr_TR.UTF-8 locale. I've tried the en_US.UTF-8 locale and it
> > > seems to work there.
> > > 
> > > I've tried both builds (from CVS and from the Debian repository).
> > > 
> > > Version: 22.0.91.1

> > Thank you for your report.

> > However, there's not enough information in this for us to try to find
> > out what is wrong.  Please use "M-x report-emacs-bug RET" to provide
> > the information.  Also, since this is a segfault, please run GDB on
> > the core file, type the command "bt" inside GDB, and post the
> > resulting backtrace here.

> I can reproduce it with the following scenario (on Debian
> testing) using the attached temp.html.  But I have not yet
> found what is wrong.  I suspect that case-table handling has
> a problem, because it happens only in tr_TR.UTF-8.

> (gdb) set env LANG=tr_TR.UTF-8
> (gdb) run -Q temp.html

> ESC : (garbage-collect) RET

> Then Emacs crashes like this:

> Program received signal SIGSEGV, Segmentation fault.
> mark_object (arg=139689009) at alloc.c:5717
> (gdb) bt
> #0  mark_object (arg=139689009) at alloc.c:5717
> #1  0x0813ab66 in mark_object (arg=139272845) at alloc.c:5825
> #2  0x0813ab66 in mark_object (arg=141201765) at alloc.c:5825
> #3  0x0813af6e in mark_object (arg=138980860) at alloc.c:5700
> [...]
> #119 0x0813aa7f in mark_object (arg=139883241) at alloc.c:5714
> #120 0x0813af6e in mark_object (arg=137465060) at alloc.c:5700
> #121 0x0813e8ff in Fgarbage_collect () at alloc.c:5156
> #122 0x081522b3 in Feval (form=141197693) at eval.c:2325
> #123 0x08152da7 in Ffuncall (nargs=2, args=0xafcfabb0) at eval.c:2997
> #124 0x0817d61a in Fbyte_code (bytestr=136311491, vector=136311508, 
> maxdepth=40) at bytecode.c:679
> #125 0x08152844 in funcall_lambda (fun=136311436, nargs=2, 
> arg_vector=0xafcface4) at eval.c:3184
> #126 0x08152c5b in Ffuncall (nargs=3, args=0xafcface0) at eval.c:3054
> #127 0x08154523 in Fapply (nargs=2, args=0xafcfad30) at eval.c:2485
> #128 0x08154654 in apply1 (fn=137689233, arg=141197565) at eval.c:2749
> #129 0x0814fdf7 in Fcall_interactively (function=137689233, 
> record_flag=137464009, keys=137504524) at callint.c:406
> #130 0x080f09c3 in Fcommand_execute (cmd=137689233, record_flag=137464009, 
> keys=137464009, special=137464009) at keyboard.c:9867
> #131 0x080fc00a in command_loop_1 () at keyboard.c:1858
> #132 0x0815187b in internal_condition_case (bfun=0x80fbc90 <command_loop_1>, 
> handlers=137508713, hfun=0x80f66a0 <cmd_error>) at eval.c:1481
> #133 0x080f5a7e in command_loop_2 () at keyboard.c:1326
> #134 0x0815193c in internal_catch (tag=137504921, func=0x80f5a50 
> <command_loop_2>, arg=137464009) at eval.c:1222
> #135 0x080f64ee in command_loop () at keyboard.c:1305
> #136 0x080f6878 in recursive_edit_1 () at keyboard.c:1003
> #137 0x080f6966 in Frecursive_edit () at keyboard.c:1064
> #138 0x080ecbb2 in main (argc=1526726658, argv=0xafcfb5c4) at emacs.c:1794

> Lisp Backtrace:
> "garbage-collect" (0x2)
> "eval" (0x86a817d)
> "eval-expression" (0x86a817d)
> "call-interactively" (0x834f891)
> (gdb) 

> ---
> Kenichi Handa
> address@hidden

> [2  <text/html; US-ASCII (7bit)>]
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> <html>

> <head>
>    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>    <title> Sample </title>
> </head>

> <body>
> </body>
> </html>



