bug-gnu-emacs

bug#61514: 30.0.50; sadistically long xml line hangs emacs


From: Stefan Monnier
Subject: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Mon, 20 Feb 2023 11:47:38 -0500
User-agent: Gnus/5.13 (Gnus v5.13)

> I don't know... but I observe that this alone:
>
> (with-current-buffer (get-buffer-create "*bug*")
>   (insert "<id name=\"")
>   (insert (make-string 250000 ?n))
>   (goto-char 5)
>   (looking-at
>    "[^<>\n]+?\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"))
>
> doesn't fail, so I don't think it's this regexp which causes the overflow.

Indeed, there's still something unclear about how the overflow occurs,
but at least it seems my analysis doesn't match regex-emacs.c's because
I can get a stack overflow using the first part of the regexp:

    (with-current-buffer (get-buffer-create "*bug*")
      (erase-buffer)
      (insert "<id name=\"")
      (insert (make-string 2500000 ?n))
      (goto-char (+ (point-min) 10))
      (looking-at
       "\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*="))

where I can even reduce the regexp down to "[-._[:alnum:]]*\t*=".
Looks like we're missing a case in our backtracking-elimination code.
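
For reference, the reduced case can be packaged as a small helper. This
is only a sketch: the function name is made up for illustration, it uses
a temporary buffer rather than "*bug*", and it assumes the overflow
surfaces as an ordinary Lisp error that `condition-case' can catch.
Whether the error actually triggers will likely depend on the Emacs
version and build.

```elisp
;; Hypothetical helper, not part of the original report.
(defun bug61514-try-reduced-regexp (len)
  "Match the reduced regexp against LEN copies of ?n in a temp buffer.
Return the symbol `overflow' if the matcher signals an error,
otherwise return the value of `looking-at'."
  (with-temp-buffer
    (insert (make-string len ?n))
    (goto-char (point-min))
    (condition-case nil
        ;; Same reduced regexp as above: a char-class loop followed by
        ;; an optional tab loop and a literal "=".
        (looking-at "[-._[:alnum:]]*\t*=")
      ;; Assumption: the overflow is signaled as a plain `error'.
      (error 'overflow))))

;; Same length as in the recipe above.
(bug61514-try-reduced-regexp 2500000)
```

Since the buffer holds no "=", a return value of nil means the matcher
backtracked all the way without overflowing, and `overflow' means the
error was signaled.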


        Stefan





