bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash t


From: Eric Abrahamsen
Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables
Date: Thu, 06 Dec 2018 14:39:07 -0800

Here's the next thing: turning Gnus' obarrays-as-hash-tables into real
hash tables. Gnus currently stores information about groups by coercing
group names to unibyte, interning them in custom obarrays, and then
setting their symbol-value to whatever value needs to be stored. I think
all this was written before Emacs had actual hash tables.

I think real hash tables are better for a few reasons:

1. Hash table lookups seem to be marginally faster than obarray lookups,
2. It's "less weird" for contributors and bug hunters,
3. It allows us to reduces the amount of encoding/decoding going on:
group names can stay strings instead of being forced into symbols.

I've pushed a branch, scratch/gnus-hashtables, with two commits in it.
The first changes all the obarrays to hash tables. Apart from simply
replacing function calls, there were a few bumps.

1. Gnus uses `text-property-any' to find and compare group-name text
   properties; I made a new `gnus-text-property-search' which behaves
   similarly, but compares with equal.
2. Some of the "hash tables" were simply storing the value t, ie just
   used for membership. I left these as hash tables, but I think in many
   cases simple string-in-list membership tests would suffice.
3. The hash table in gnus-async.el didn't appear to be doing anything --
   it seemed to be redundant with `gnus-async-article-alist'. I've
   removed it, but it might need to be put back if I'm missing something.
4. The creation of `gnus-newsgroup-dependencies' was the most fiddly,
   and I added tests for that. I'm not entirely convinced that
   `gnus-thread-loop-p' behaves the way it's meant to, it appears to
   only check for direct loops between parent and child, not parent and
   descendant.
5. The old return value of (gnus-gethash <group-name>
   gnus-newsrc-hashtb) was kind of a slice into `gnus-newsrc-alist': it
   behaved a bit like `member', but also included the group *before* the
   group you wanted, as well as all those after, so you could traverse
   the list in either direction. It now no longer preserves the order of
   `gnus-newsrc-alist' (this ordering wasn't actually used in many
   places), and instead there's a new variable `gnus-group-list' which
   records the proper sort order of the groups.

I feel fairly confident that all this is working okay. The second commit
I do *not* feel very confident about, and it's more of a "let's see how
things break" attempt. In theory, we should now be able to limit group
name encoding/decoding to the boundaries of Gnus -- reading/writing
active files, and talking to servers. Within Gnus, the group names can
remain decoded strings.

In the second commit I've just gone in and pulled out all the decoding I
can find, changed all the 'raw-text encoding options to 'undecided, and
deleted the `mm-disable-multibyte's. I have no confidence that I've
covered all the bases, though I have been using this branch for a couple
of weeks, with nnml, nnmaildir, and nnimap groups with multibyte names,
and so far nothing has broken. I'm less confident about nntp.

I'll continue working on this, and I hope I can get some feedback here,
particularly on the second commit.

Thanks, Eric





reply via email to

[Prev in Thread] Current Thread [Next in Thread]