[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Markers in a gap array

From: Stefan Monnier
Subject: Markers in a gap array
Date: Thu, 04 Jul 2024 00:59:56 -0400
User-agent: Gnus/5.13 (Gnus v5.13)

I've just pushed to `scratch/markers-as-gap-array` code that seems to be
working correctly (passes all the tests, for example).
The code is not ready for `master` (there are some cleanups to do), but
if you use code that uses many markers, I'd be interested to hear about
your experience with it.

What it does: replace the unordered singly-linked list of markers that
we keep in every buffer, with an "ordered array (with gap) of markers".
The main purpose is to try and eliminate some pathological behavior in
the bytepos<->charpos conversion routines (which "abuse" markers as
a cache of bytepos<->charpos conversions) when you have many markers in
large buffers.

There's no free lunch, so this comes at the cost of slowing down other
marker operations, which is why I'd like to hear about your experience.
Here are those operations and how the performance is expected to compare:

The good:
- Conversion bytepos<->charpos used to be O(N), now it's O(lg N).
  The frequency of such conversions can vary drastically depending on
  the circumstances, so a lot of use-cases will see no difference at all,
  whereas some particular use-cases should see a significant speed up.

The indifferent:
- `make-marker`: Was and is still O(1).
- `marker-position`: Was and is still O(1).
- Adjustments when performing text insertion/deletion: Used to be O(N)
  and is still O(N), but with a smaller constant factor because
  it can fetch the markers in parallel whereas that used to be
  sequentialized by the pointer chasing of the singly-linked list and
  its branch instructions should be easier to predict.
  It's unlikely you'll see the gain, tho, because those adjustments are
  usually drowned in the noise of everything else we do upon text

The bad:
- `copy-marker`: Used to be O(1), now will cost time O(M + lg N) where
  N is the number of markers and M is the distance between where this
  marker is inserted and the previous position of the gap.
  The `lg N` factor should be hopefully cheap enough.
  The `M` factor could be more worrisome, since M can be as large as N,
  but there are two reasons to be optimistic:
  - The locality which makes our "gap buffer" usually efficient should
    hopefully work its same magic here keeping M usually small.
  - Moving the gap is a `memmove` and this can be performed quite fast
    even for fairly large M (i.e. the constant factor applied to
    M should be quite small).
- `(set-marker m nil)`: Used to be O(N), now costs the same as
  `copy-marker`, hence O(M + lg N).
  Even when M=N, this should be much better than the previous worst case
  because the constant factor applied to M should be *much* smaller.
  So it could be considered to be part of "The good" except that
  (set-marker m nil) was often near O(1) in practice if `m` was recently
  created (in which case it was found near the beginning of the
  singly-linked list).

So, for instance, `save-excursion` used to do a `copy-marker` followed
by a (set-marker m nil), both of which were, typically, O(1) and are now
made slower.

Side note: IIUC this design is similar to the one used in XEmacs, so
we're slowly catching up to them.  🙂


reply via email to

[Prev in Thread] Current Thread [Next in Thread]