[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug#39258] [PATCH v2 0/3] Xapian for Guix package search

From: Arun Isaac
Subject: [bug#39258] [PATCH v2 0/3] Xapian for Guix package search
Date: Sun, 08 Mar 2020 14:31:42 +0530

>> It turns out that most of the time is spent in printing and texinfo
>> rendering of the search results.

Also, when we put all package metadata into the Xapian index, we don't
have to look up any of the package variables in (gnu packages *) during
`guix search` time. This also contributes substantially to the speedup.

> In general, pre-rendering doesn’t seem practical to me: the output of
> ‘guix search’ is locale-dependent (it speaks the user’s language) and

Note that we already need to index package synopses and descriptions in
all languages. I still haven't implemented this, though.

> adjusts to the terminal width (well, this is temporarily broken on
> Guile 3.0.0, but see ‘%text-width’ in (guix ui)).

This could be accomplished even with pre-rendering. Xapian provides
"slots" to store arbitrary strings with a document. Instead of storing
the pre-rendered document as a whole, we could store pre-rendered fields
in separate slots. Then, during `guix search` time, we can assemble the
result from these pre-rendered fields.

> Also, if the 12K+ descriptions need to be rendered at the time the user
> runs ‘guix pull’, the experience may not be great, because it could take
> a bit of time.

This is a problem, but I would see it as a necessary "compilation"
step. :-P In fact, this whole patchset speeds up `guix search` by doing
part of the work of `guix search` ahead of time. So, some such cost is

> What I like about the recutils format in this context is that it’s both
> human- and machine-readable.  The examples in the manual show how it can
> be useful to select the information displayed or to refine the search
> (info "(guix) Invoking guix package").

Xapian's query language is much more natural (as in natural language)
than the regexp based techniques we need to use with recutils. I have
hardly ever used the regexp based search and I suspect many others
haven't either. Also, refining the search query should be easier to do
with Xapian. We could even use Xapian's query expansion feature to
suggest improved queries to the user.

That said, if we want the recutils format, we can still keep it in a
simplified form like so.

name: inkscape
version: 0.92.4
synopsis: Vector graphics editor

name: inklingreader
version: 0.8
synopsis: Wacom Inkling skecth format conversion and manipulation

> Also: I’d recommend tackling one thing at a time.  :-)

I totally agree, but I'm tempted to say that pre-rendering would be a
lot cheaper with the simplified form of search results. :-)

> Were you able to measure the cost of rendering specifically?

generate-package-search-index takes around 50 seconds. If I modify
generate-package-search-index to not pre-render but simply store the
package description alone, it takes around 20 seconds. That gives us a
rough idea of the cost of pre-rendering.

> I think we should look at a profile of ‘package->recutils’, there’s
> probably room for improvement there.

On quick inspection, most of the time in package->recutils is spent in
texinfo rendering the description. Unless we use the simplified search
results format as discussed above, we cannot avoid it.

Attachment: signature.asc
Description: PGP signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]