bug-gnu-emacs

bug#58558: 29.0.50; re-search-forward is slow in some buffers


From: Eli Zaretskii
Subject: bug#58558: 29.0.50; re-search-forward is slow in some buffers
Date: Tue, 13 Dec 2022 15:11:17 +0200

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca,
>  58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 10:28:57 +0000
> 
> Ok. I got around to trying perf, and it turned out to be very easy to
> get started.
> 
> perf record -p <PID> + perf report already appear to give a clue:
> 
>     88.27%  emacs    emacs-30-vcs                      [.] buf_bytepos_to_charpos
>      3.75%  emacs    emacs-30-vcs                      [.] re_match_2_internal
>      1.35%  emacs    emacs-30-vcs                      [.] scan_sexps_forward
>      1.03%  emacs    emacs-30-vcs                      [.] re_search_2
>      0.65%  emacs    emacs-30-vcs                      [.] find_interval
>      0.56%  emacs    emacs-30-vcs                      [.] sub_char_table_ref
>      0.55%  emacs    emacs-30-vcs                      [.] lookup_char_property
> 
> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
> runs.

So buf_bytepos_to_charpos is the main suspect now, I guess.  This
could happen either because (a) buf_bytepos_to_charpos is called more
times as session uptime progresses, or because (b) each call to
buf_bytepos_to_charpos becomes more and more expensive.  So I think
the first question is: how many times is buf_bytepos_to_charpos called
for each search, or, equivalently, does the CPU time per call used up
by buf_bytepos_to_charpos stay stable or go up?  I think perf can
answer these questions if you ask nicely.
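
As an illustration of how the two cases differ, here is a minimal
sketch in C of the bookkeeping involved.  It is not perf output and not
Emacs code: the struct and function names are hypothetical, and the
caller is assumed to supply the duration of each call from whatever
clock it has.  Case (a) shows up as a growing call count with a stable
mean, case (b) as a stable call count with a growing mean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-benchmark-run statistics for one function.  */
struct call_stats
{
  uint64_t calls;     /* number of calls seen in this run */
  uint64_t total_ns;  /* time spent in those calls, in nanoseconds */
};

/* Record one call that took DURATION_NS nanoseconds.  */
void
call_stats_record (struct call_stats *s, uint64_t duration_ns)
{
  assert (s != NULL);
  s->calls++;
  s->total_ns += duration_ns;
}

/* Mean time per call in nanoseconds, or 0 if there were no calls.  */
double
call_stats_mean_ns (const struct call_stats *s)
{
  assert (s != NULL);
  if (s->calls == 0)
    return 0.0;
  return (double) s->total_ns / (double) s->calls;
}
```

Comparing the calls field and the mean between an early run and a late
run is enough to tell which of the two cases applies.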

If the number of calls is the same, but each call becomes more and
more expensive, then the next step is to ask perf to produce a
detailed profile for each line of buf_bytepos_to_charpos, and see
which parts of it become more expensive.  I can think of a couple
of possible reasons for that, but I'd rather not speculate about
profiles, as speculation is known to produce wrong guesses.

Is the buffer in question being edited as time advances?  Or are the
buffer text and everything else in the buffer left unchanged?

> In contrast, using find-file-literally produces
> 
>     34.44%  emacs    emacs-30-vcs                             [.] re_match_2_internal
>     25.55%  emacs    emacs-30-vcs                             [.] scan_sexps_forward
>     11.09%  emacs    emacs-30-vcs                             [.] re_search_2
>     ...
>     0.59%  emacs    emacs-30-vcs                             [.] buf_bytepos_to_charpos
> 
> with buf_bytepos_to_charpos taking a diminishing fraction of CPU samples.

That find-file-literally yields a buffer with a much faster
buf_bytepos_to_charpos is not surprising: when each character is a
single byte, the conversion is trivial, and buf_bytepos_to_charpos
returns immediately.  The puzzling part is not that
buf_bytepos_to_charpos is much more expensive in a buffer with
non-ASCII text; the puzzle is why it becomes more and more expensive
with time.
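
To illustrate why the single-byte case is trivial, here is a minimal
sketch in C.  It is not the Emacs implementation of
buf_bytepos_to_charpos: the function name, its parameters, the 0-based
offsets, and the assumption of plain UTF-8 text are all hypothetical.
When the byte count equals the character count the answer is the byte
offset itself; otherwise the text before the offset has to be scanned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the Emacs function: convert a 0-based byte
   offset BYTEPOS into a 0-based character offset, for UTF-8 TEXT of
   NBYTES bytes holding NCHARS characters.  */
size_t
sketch_bytepos_to_charpos (const unsigned char *text, size_t nbytes,
                           size_t nchars, size_t bytepos)
{
  assert (bytepos <= nbytes);

  /* Fast path: every character is one byte, so offsets coincide.  */
  if (nbytes == nchars)
    return bytepos;

  /* Slow path: count the bytes that start a character, i.e. every
     byte that is not a UTF-8 continuation byte (10xxxxxx).  */
  size_t charpos = 0;
  for (size_t i = 0; i < bytepos; i++)
    if ((text[i] & 0xC0) != 0x80)
      charpos++;
  return charpos;
}
```

The slow path here is linear in the distance scanned, so in a sketch
like this the cost of a call depends on where the scan starts, not only
on the text itself.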

Thanks.




