[bug#39258] [PATCH v2 0/3] Xapian for Guix package search

From: Ludovic Courtès
Subject: [bug#39258] [PATCH v2 0/3] Xapian for Guix package search
Date: Sat, 07 Mar 2020 21:33:16 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)


Arun Isaac <address@hidden> skribis:

> Here is the second iteration of my Xapian Guix package search patchset. I have
> found the reason the earlier patchset did not show significant speedup. It
> turns out that most of the time is spent in printing and texinfo rendering of
> the search results. So, in this patchset, I pre-render the search results
> while building the Xapian index and stuff them into the Xapian database
> itself. Therefore, during `guix search`, I just pull out the pre-rendered
> search results and print them on the screen. This is much faster. See comparison
> below.
> With a warm cache,
> $ time guix search inkscape
> real  0m1.787s
> user  0m1.745s
> sys   0m0.111s
> $ time /tmp/test/bin/guix search inkscape
> real  0m0.199s
> user  0m0.182s
> sys   0m0.024s


In general, pre-rendering doesn’t seem practical to me: the output of
‘guix search’ is locale-dependent (it speaks the user’s language) and
adjusts to the terminal width (well, this is temporarily broken on
Guile 3.0.0, but see ‘%text-width’ in (guix ui)).
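
To illustrate the terminal-width point, here is a minimal sketch in Python
(not Guile, and not how (guix ui) actually does it). The description text and
the ‘render’ helper are made up for the example; the only assumption about
the real output is that long fields are wrapped with ‘+’ continuation lines.

```python
import textwrap

# Made-up description, long enough to wrap differently at 40 and 80 columns.
DESCRIPTION = ("A made-up description that is long enough to be wrapped "
               "in one way on a narrow terminal and in another way on a "
               "wide terminal.")

def render(description, width):
    """Wrap a recutils-style 'description:' field to WIDTH columns."""
    return textwrap.fill("description: " + description, width=width,
                         subsequent_indent="+ ")

narrow = render(DESCRIPTION, 40)
wide = render(DESCRIPTION, 80)

# The same package yields different text for each width, so one
# pre-rendered string cannot serve every terminal.
print(narrow != wide)
```

A pre-rendered cache would need one entry per (locale, width) pair, which is
what makes it look impractical.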

Also, if the 12K+ descriptions need to be rendered at the time the user
runs ‘guix pull’, the experience may not be great, because it could take
a bit of time.


> Why not use a simpler package search results format like Arch Linux or Debian
> does? We could just display the package name, version and synopsis like so.
> inkscape 0.92.4
>     Vector graphics editor
> inklingreader 0.8
>     Wacom Inkling sketch format conversion and manipulation
> Why do we need the entire recutils format? If the user is interested, they can
> always use `guix package --show` to get the full recutils formatted
> info. Having shorter search results will make everything even faster and much
> more readable. WDYT?

What I like about the recutils format in this context is that it’s both
human- and machine-readable.  The examples in the manual show how it can
be useful to select the information displayed or to refine the search
(info "(guix) Invoking guix package").
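
As a rough sketch of the machine-readable side, the Python below parses a
simplified subset of the format (blank-line separated records, ‘field: value’
lines, ‘+’ continuation lines) and keeps only two fields. ‘parse_records’ is
a hypothetical helper, not part of Guix or recutils, and the sample reuses
the two packages quoted above.

```python
def parse_records(text):
    """Parse recutils-style text into a list of dicts, one per record.

    Simplified: records are separated by blank lines, fields look like
    'name: value', and a line starting with '+' continues the previous field.
    """
    records = []
    for block in text.strip().split("\n\n"):
        record, last = {}, None
        for line in block.splitlines():
            if line.startswith("+") and last is not None:
                record[last] += "\n" + line[1:].lstrip(" ")
            elif ":" in line:
                last, _, value = line.partition(":")
                record[last] = value.strip()
        if record:
            records.append(record)
    return records

SAMPLE = """\
name: inkscape
version: 0.92.4
synopsis: Vector graphics editor

name: inklingreader
version: 0.8
synopsis: Wacom Inkling sketch format conversion and manipulation
"""

# Select only the name and version of each record.
for rec in parse_records(SAMPLE):
    print(rec["name"], rec["version"])
```

The short name/version/synopsis listing can thus be derived from the recutils
output, whereas the reverse is not possible.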

Also: I’d recommend tackling one thing at a time.  :-)

> Ludovic Courtès <address@hidden> writes:
>> Note that ‘guix search’ time is largely dominated by I/O.
> Yes, `guix search` is I/O intensive. That is why I expect Xapian to do better
> since it only needs to access matching packages, not all packages. Also, the
> Xapian index is fast at all times. It is not very dependent on a warm
> filesystem cache.

Yes, indeed.

>> On my laptop,
>> I get (first measurement is cold cache, second one is warm cache):
>> --8<---------------cut here---------------start------------->8---
>> $ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
>> $ time guix search foo >/dev/null
>> real    0m2.631s
>> user    0m1.134s
>> sys     0m0.124s
>> $ time guix search foo >/dev/null
>> real    0m0.836s
>> user    0m1.027s
>> sys     0m0.053s
>> --8<---------------cut here---------------end--------------->8---
>> It’s hard to do better on the warm cache case because at this level,
>> there may be other things to optimize having little to do with searching
>> itself.
>> Note that this is on an SSD; the cold-cache case must be worse on NFS or
>> on a spinning disk, and there we could gain a lot.
> My laptop is quite old with a particularly slow HDD. Hence my motivation to
> improve guix search performance!

Were you able to measure the cost of rendering specifically?

Here’s what I see when I turn ‘package->recutils’ into a no-op:

--8<---------------cut here---------------start------------->8---
$ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
$ time ./pre-inst-env guix search foo 

real    0m1.617s
user    0m0.812s
sys     0m0.094s
$ time ./pre-inst-env guix search foo 

real    0m0.595s
user    0m0.747s
sys     0m0.043s
--8<---------------cut here---------------end--------------->8---

To compare with:

--8<---------------cut here---------------start------------->8---
$ time ./pre-inst-env guix search foo >/dev/null

real    0m0.829s
user    0m1.026s
sys     0m0.046s
--8<---------------cut here---------------end--------------->8---
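
Taking the two warm-cache wall-clock times above at face value, the rendering
cost can be estimated by subtraction. This is only a back-of-the-envelope
calculation: each figure is a single run, and the no-op run was not
redirected to /dev/null.

```python
# Warm-cache wall-clock times (seconds) copied from the runs above.
full = 0.829       # unmodified 'guix search foo'
no_render = 0.595  # with 'package->recutils' turned into a no-op

rendering = full - no_render
share = rendering / full
print(f"rendering: {rendering:.3f}s ({share:.0%} of the warm-cache run)")
```

So rendering would account for roughly a quarter of the warm-cache time on
this machine, with the rest spent elsewhere.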

I think we should look at a profile of ‘package->recutils’; there’s
probably room for improvement there.


