[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: |
Stefan Monnier |
Subject: |
bug#58558: 29.0.50; re-search-forward is slow in some buffers |
Date: |
Tue, 13 Dec 2022 12:38:13 -0500 |
User-agent: |
Gnus/5.13 (Gnus v5.13) |
>>> I will look how to do it. Maybe perf probe.
>>> I guess, it will be useful to compile Emacs with debug symbols at this
>>> point.
>>
>> AFAIR, you can ask perf to profile a single function, and you can ask
>> it to annotate the profile with the source code.
>
> I now compiled Emacs with debug symbols, waited enough to see observable
> increase in the benchmark-run timing, and recorded the perf data.
>
> buf_bytepos_to_charpos is still on the top
>
> 78.06% emacs emacs [.]
> buf_bytepos_to_charpos
> 3.00% emacs emacs [.] re_match_2_internal
> 1.05% emacs emacs [.] find_interval
> 1.04% emacs emacs [.] CHAR_TABLE_REF_ASCII
> 0.85% emacs emacs [.] make_lisp_symbol
> 0.80% emacs emacs [.] re_search_2
> 0.76% emacs emacs [.] builtin_lisp_symbol
> 0.62% emacs emacs [.] PSEUDOVECTORP
AFAIK the main places where we call `buf_bytepos_to_charpos` from
`re_match_2_internal` is via the `SYNTAX_TABLE_BYTE_TO_CHAR` macro, used
for regexp elements that depend on syntax tables (i.e. \<, \>, \_<, ...).
But I'd expect those to be executed "frequently&closely" enough that the
`cached_(byte|char)pos` data should almost always be nearby, making the
call to `buf_bytepos_to_charpos` fairly cheap (more specifically
the `for (tail = BUF_MARKERS (b);...` loop should not iterate many
times, regardless how many markers there are).
> My guess: number of markers is growing somehow?
`buf_bytepos_to_charpos` itself creates markers (using them as a cache
of previous conversions), so that might be why.
But we only look at the first N markers where N*50 is the distance to
the closest marker found so far. So growth is not sufficient (it's
clearly a part of the reason, tho).
Regarding growth: could you call `garbage-collect` between the calls to
`re-search-forward` to see if that avoids the accumulation?
[ I presume here that those markers are created/added by
`buf_bytepos_to_charpos` itself, so they should be recovered by the GC
because they're not referenced from anywhere else. ]
I'd be interested to know how many iterations of the `for (tail =
BUF_MARKERS (b);...` loop are executed on average during your
`re-search-forward` (and how that average changes between runs of
`re-search-forward`).
Stefan
PS: Of course, another approach would be to replace this code with
something else. Using markers as a cache of bytepos/charpos conversions
has been a source of a few performance issues over the year.
Another approach could be to use a "vector with gap" managed alongside
the actual buffer text. It could be indexed by "charpos divided by
1024", so conversion from charpos to bytepos could be a simple vector
lookup followed by scanning at most 1kB, and conversion in the other
direction would use a binary search in that same vector (or we could use
2 "vectors with gap", one per direction of conversion).
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, (continued)
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Stefan Monnier, 2022/12/13
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/13
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Stefan Monnier, 2022/12/13
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/13
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers,
Stefan Monnier <=
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Eli Zaretskii, 2022/12/14
- bug#58558: 29.0.50; re-search-forward is slow in some buffers, Ihor Radchenko, 2022/12/14
bug#58558: 29.0.50; re-search-forward is slow in some buffers, Stefan Monnier, 2022/12/13