Re: NLNet grant "Next Generation Internet -- Search & discovery": I'm in
Sat, 14 Dec 2019 18:04:13 +0100
On Fri, 13 Dec 2019 at 15:49, Pierre Neidhardt <address@hidden> wrote:
> 2. File search
> (Previous discussion:
Yes, it is really lacking. For example, if one wants to use the 'hg'
version control system, then one will naively run "guix search hg"
and this will return "Human Genome" packages (useful in
bioinformatics). Worse, because the description/synopsis of the
package which provides the command 'hg' does not mention the term
'hg', it is impossible to find it unless one already knows that it is
provided by the mercurial package. So, what I personally do is: search
with DuckDuckGo, look at some Debian packages and hope the name is the
same in Guix.
File search is also super useful for finding headers. You have code
with "#include <name-it.h>" and it is hard to know which Guix package
provides this header 'name-it.h'.
However, IMHO, the "filesearch" should be part of the existing
commands rather than a separate command. I mean:
guix package --file-search=hg
guix search hg --file-search
That seems to me a better UI than adding yet another subcommand. :-)
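To make the idea concrete, here is a minimal sketch of the kind of
index a file search would need: a map from file basename to the
packages that install it. The package-to-files table is hypothetical
example data, and the function names are mine; nothing here is an
existing Guix interface.

```python
from collections import defaultdict
import posixpath

# Hypothetical example data: package name -> files it installs,
# relative to the package's store directory. Not real Guix metadata.
PACKAGE_FILES = {
    "mercurial": ["bin/hg", "share/man/man1/hg.1.gz"],
    "git": ["bin/git", "share/man/man1/git.1.gz"],
    "zlib": ["include/zlib.h", "lib/libz.so"],
}

def build_file_index(package_files):
    """Map each file basename to the sorted list of packages providing it."""
    index = defaultdict(set)
    for package, paths in package_files.items():
        for path in paths:
            index[posixpath.basename(path)].add(package)
    return {name: sorted(packages) for name, packages in index.items()}

def file_search(index, name):
    """Return the packages that install a file named NAME (exact match)."""
    return index.get(name, [])

INDEX = build_file_index(PACKAGE_FILES)
print(file_search(INDEX, "hg"))      # ['mercurial']
print(file_search(INDEX, "zlib.h"))  # ['zlib']
```

With such an index, the 'hg' and header use cases become a plain
lookup, whatever the final command-line spelling turns out to be.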
> 5. Social integration with the Guix catalogue
> Previous discussions:
> - Adding wikidata, wikipedia & screenshot-url fields to
> - Re-approaching package tagging:
> - New library: guile-wikidata:
> - Guix <-> Wikidata:
> - Guix Wikidata module - next steps:
> There were also a few discussions regarding package search improvements, in
> which has Zimoun participated quite a bit if I recall correctly. Feel free
> to share all your precious links! :)
Firstly, IMHO tagging, i.e., assigning to each package specific words
taken from a fixed set of words, is not a good approach. My main
argument is: the set of words is arbitrary, so in the end it leads to
bikeshedding and/or it is not really useful, because it is not
self-organised by the data themselves. As a Debian user, I do not use
their tag system; and I am almost sure they have documented the
"usefulness" of their tagging system and the feedback (I have in mind
talks at DebConf but I am not able to find them now).
However, grouping packages by similar topic is important for
discoverability. The question is: how is the grouping done? Instead of
manual tagging, I propose to first compare clustering methods based
on synopsis+description and Natural Language Processing (NLP).
It is what I had in mind when I answered in the thread "Re-approaching
package tagging" but life intervened and I did nothing on this front.
Well, the Python ecosystem provides nice packages (most not yet in
Guix last time I checked) to ease a first exploration and see whether
it will pay off.
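As an illustration only, here is a small sketch of grouping packages
from their text, using plain bag-of-words cosine similarity and
single-linkage grouping with the standard library. The descriptions
are hypothetical snippets and the 0.3 threshold is an arbitrary
assumption; a real comparison would use proper NLP packages and the
actual synopsis+description fields.

```python
import math
import re
from collections import Counter

# Hypothetical synopsis+description snippets, not the real Guix metadata.
DESCRIPTIONS = {
    "mercurial": "distributed version control system for source code",
    "git": "distributed version control system",
    "samtools": "tools for manipulating genome sequence alignments",
    "bedtools": "tools for genome arithmetic on sequence intervals",
}

def vectorize(text):
    """Bag-of-words vector: word -> number of occurrences."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(descriptions, threshold=0.3):
    """Single-linkage grouping: similar enough packages share a group."""
    names = sorted(descriptions)
    vectors = {n: vectorize(descriptions[n]) for n in names}
    parent = {n: n for n in names}  # simple union-find forest

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

print(cluster(DESCRIPTIONS))
# [['bedtools', 'samtools'], ['git', 'mercurial']]
```

The groups come out of the data themselves, with no predefined word
list, which is the point of preferring clustering over manual tags.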
Not about tagging but close enough to be maybe relevant:
Secondly, instead of manual tagging, I propose to work on the
relevance scoring. Basically, "guix search" should act as a
recommendation system IMHO. Then the questions are: where are the
indexing computations done? Locally? By the Guix Data Service, with
"guix pull" fetching this index? Can the index be merged with data
from other distros or upstreams (CRAN, GitHub) via Wikidata or an
API? etc.
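A rough sketch of what I mean by relevance scoring, assuming a simple
TF-IDF style weighting (this is not how "guix search" scores results
today, and the descriptions are hypothetical). The indexing step is
kept separate from the query step on purpose, since the index is the
part that could be computed elsewhere and fetched.

```python
import math
import re
from collections import Counter

# Hypothetical descriptions, not the real Guix metadata.
PACKAGES = {
    "mercurial": "decentralized version control system",
    "git": "distributed version control system",
    "rcs": "per-file revision control system",
    "emacs": "extensible text editor",
}

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def build_index(packages):
    """Indexing step: per-package term counts and document frequencies."""
    term_counts = {name: Counter(tokens(text)) for name, text in packages.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return term_counts, doc_freq

def search(query, term_counts, doc_freq):
    """Query step: rank packages by summed tf * idf of the query terms."""
    n_docs = len(term_counts)
    results = []
    for name, counts in term_counts.items():
        score = sum(
            counts[t] * math.log(1 + n_docs / doc_freq[t])
            for t in tokens(query)
            if t in counts
        )
        if score > 0:
            results.append((score, name))
    # Highest score first; ties are broken alphabetically.
    results.sort(key=lambda r: (-r[0], r[1]))
    return [name for _, name in results]

term_counts, doc_freq = build_index(PACKAGES)
print(search("version control", term_counts, doc_freq))
# ['git', 'mercurial', 'rcs']
```

Rarer terms weigh more than common ones, so "version" counts for more
than "control" here, and packages matching both terms rank first.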
Well, thirdly I also think that Guix lacks tools to navigate its
Git history. Now that we have "guix time-machine", it appears to me
that finding a specific package version back in the history is
complicated (basically git checkout + git log + grep: ouch! not
user-friendly). I have tried to describe use cases in this message.
IMO, something similar to "git tag" should be added (in "guix pull"?).
But one can also think of integrating such historical information in
Wikidata, so that for example "guix search emacs --all" would return
all the versions and commits present in Guix; then it is easy to run
"guix time-machine --commit=1234 -- install emacs".
These are just ideas... and not yet fully clear in my mind. ;-)
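To sketch the history idea, assume some table mapping each package to
the (version, commit) pairs seen in the Guix history. The table below
is hypothetical, the commit ids are placeholders, and the function
names are mine; only the shape of the "guix time-machine" command
comes from the example above.

```python
# Hypothetical history table: package -> [(version, commit), ...].
# The commit ids are placeholders, not real Guix commits.
HISTORY = {
    "emacs": [("25.3", "aaaa111"), ("26.1", "bbbb222"), ("26.3", "cccc333")],
}

def all_versions(history, package):
    """Return every (version, commit) pair recorded for PACKAGE."""
    return list(history.get(package, []))

def time_machine_command(history, package, version):
    """Build the time-machine command for VERSION, or None if unknown."""
    for known_version, commit in history.get(package, []):
        if known_version == version:
            return f"guix time-machine --commit={commit} -- install {package}"
    return None

print(all_versions(HISTORY, "emacs"))
print(time_machine_command(HISTORY, "emacs", "26.1"))
# guix time-machine --commit=bbbb222 -- install emacs
```

Where such a table would live (Wikidata, the Guix Data Service, or
something fetched by "guix pull") is exactly the open question.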
Last about UI:
Currently "guix search" supports regexps, but part of the filtering
has to be done with recsel, and I do not find that handy.
Hope that these words make sense.