bug#61514: 30.0.50; sadistically long xml line hangs emacs
From: Stefan Monnier
Subject: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Mon, 20 Feb 2023 09:59:30 -0500
User-agent: Gnus/5.13 (Gnus v5.13)
Eli Zaretskii [2023-02-20 15:54:52] wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: mah@everybody.org, 61514@debbugs.gnu.org
>> Date: Mon, 20 Feb 2023 08:19:26 -0500
>>
>> >
>> > "\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"
>> >
>> > As you can see, the prepended "[^<>\n]+?" in the regexp which "hangs"
>> > makes all the difference. So the looking-at which fails reasonably
>> > quickly is the first call to looking-at above, whereas the one that
>> > "hangs" is the second one.
>>
>> Yes, it makes a lot of sense now.
>>
>> > Maybe this points out a way out of this misery?
>>
>> I think it does. E.g. there's a chance that using "[^<>\n]+?\\<"
>> instead of "[^<>\n]+?" avoids the hang
>
> It does, thanks.
>
>> (not sure if it's the right thing to do for all the regexps that can
>> be returned by `xmltok-attribute`, tho).
>
> How would we go about finding out? Because other than that, changing
> the regexp solves this nasty problem, and all the tests in
> test/lisp/nxml/ still pass.
I did find out: we'll always get the same regexp here, so it's OK.
It turns out that (xmltok-attribute regexp) doesn't mean to return "the
something of `regexp`" but to return "the regexp named
`xmltok-attribute`".
`xmltok-attribute` is a funny macro built by `xmltok-defregexp`.
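
To make the effect of the `\\<` change concrete, here is a minimal sketch
rather than the actual nxml code. `demo-attr-regexp` and `demo-match-end`
are hypothetical names, and the regexp is a much simplified stand-in for
the one shown above (a name, optional whitespace, then `=`). The sketch
only shows that both prefixes find the same attribute on well-formed
input, while with `\\<` the attribute part can only be attempted at the
beginning of a word.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical, simplified stand-in for the attribute regexp:
;; an attribute name, optional whitespace, then `='.
(defconst demo-attr-regexp "[_[:alpha:]][-._[:alnum:]]*[ \r\t\n]*=")

(defun demo-match-end (prefix text)
  "Match PREFIX followed by `demo-attr-regexp' at the start of TEXT.
Return the buffer position where the match ends, or nil if it fails."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (and (looking-at (concat prefix demo-attr-regexp))
         (match-end 0))))

;; Both prefixes stop right after the `=' of the first attribute.
(demo-match-end "[^<>\n]+?" "a foo='1'")      ; => 7
(demo-match-end "[^<>\n]+?\\<" "a foo='1'")   ; => 7

;; With no word boundary after the first character, the `\\<' variant
;; never gets as far as trying the attribute part.
(demo-match-end "[^<>\n]+?\\<" (make-string 1000 ?x)) ; => nil
```

Whether this pruning is the whole reason the hang goes away is not
something the sketch establishes.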
>> And for the stack overflow I haven't yet found its origin.
>
> Not sure what is the mystery here. AFAIU, we look for the closing
> ">", don't find it, and then start looking for fewer and fewer non-'>'
> characters followed by '>'. Isn't that what happens here?
Right, but the stack overflows always come from repetitions where
our `mutually_exclusive_p` test fails. Let's see:
\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=
The first two `*` should be non-backtracking because they repeat
[-._[:alnum:]] which is mutually-exclusive with what follows (either `:`
or whitespace, or `=`). Similarly the third `*` should be
non-backtracking because its body can't match the `=` that must follow.
\\(?:[\s\r\t\n]*
There isn't enough whitespace in the input, so even if this can
backtrack, it shouldn't be the source of our current problems.
\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'
Neither `*` here should backtrack.
\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)
Same here.
\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"
And here we're back to only repeating whitespace.
What am I missing?
Stefan