guix-devel

Re: File search


From: Ludovic Courtès
Subject: Re: File search
Date: Sat, 05 Feb 2022 12:18:44 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hi,

Ryan Prior <rprior@protonmail.com> skribis:

> On Friday, January 21st, 2022 at 9:03 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> The database for 18K packages is quite big:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ du -h /tmp/db*
>> 389M /tmp/db
>> 82M /tmp/db.gz
>> 61M /tmp/db.zst
>> --8<---------------cut here---------------end--------------->8---
>> [snip]
>> In terms of privacy, I think it’s better if we can avoid making
>> one request per file searched for. Off-line operation would be
>> sweet, and it comes with responsiveness; fast off-line search is
>> necessary for things like ‘command-not-found’ (where the shell
>> tells you what package to install when a command is not found).
>
> Offline operation is crucial, and I don't think it's desirable to download 
> tens or hundreds of megabytes. What about creating & distributing a bloom 
> filter per package, with members being file names? This would allow us to 
> dramatically reduce the size of data we distribute, at the cost of not giving 
> 100% reliable answers. We've established, though, that some information is 
> better than none, and the uncertainty can be resolved by querying a web 
> service or building the package locally and searching its directory.

My understanding is that Bloom filters are essentially sets that only
answer membership queries, but here we need more than that: we need to
map file names to package names.

Or am I misunderstanding what you have in mind?
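
For the sake of discussion, here is a minimal sketch of how one filter
per package might still answer the file-to-package question: keep a
mapping from package name to filter and test the file name against every
filter. It is written in Python rather than Guile, and everything in it
is an assumption made for illustration: the `BloomFilter` class, the
`build_index` and `candidate_packages` helpers, and the sample package
and file names are hypothetical and do not exist in Guix.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter over strings (illustrative only)."""

    def __init__(self, capacity, error_rate=0.01):
        capacity = max(1, capacity)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.size = max(8, math.ceil(-capacity * math.log(error_rate)
                                     / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def build_index(package_files):
    """Map each (hypothetical) package name to a filter of its file names."""
    index = {}
    for package, files in package_files.items():
        bloom = BloomFilter(len(files))
        for name in files:
            bloom.add(name)
        index[package] = bloom
    return index


def candidate_packages(index, file_name):
    """Return packages whose filter may contain FILE_NAME.

    The result can include false positives but never misses a package
    that really provides the file.
    """
    return sorted(pkg for pkg, bloom in index.items() if file_name in bloom)


if __name__ == "__main__":
    index = build_index({
        "guile": ["bin/guile", "bin/guild"],
        "coreutils": ["bin/ls", "bin/cat"],
    })
    print(candidate_packages(index, "bin/guile"))
```

Note that with this layout a lookup probes every package's filter, so its
cost grows linearly with the number of packages, and any candidate it
returns would still have to be confirmed by other means because of false
positives.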

Thanks,
Ludo’.


