Re: [GNUnet-developers] Approximate Searches

From: Christian Grothoff
Subject: Re: [GNUnet-developers] Approximate Searches
Date: Wed, 24 Jun 2009 12:15:41 -0600
User-agent: KMail/1.11.4 (Linux/2.6.29-2-686; KDE/4.2.4; i686; ; )

I like this idea (at least as an option, and it should probably be the
default) and have added it to the list of things to change for 0.9.x.  What I
wonder is whether the sorting of the consonants should be omitted.  Some
statistics on bad collisions with and without sorting would be nice to have...
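
A rough sketch of how such statistics could be gathered, assuming the
normalization from the quoted message below (uppercase, drop the vowels
A/E/I/O/U, optionally sort what is left).  The names consonant_key and
count_collisions and the sample word list are made up for illustration; a
real measurement would need an actual keyword corpus.

```python
from collections import defaultdict

VOWELS = set("AEIOU")  # assumption: Y counts as a consonant


def consonant_key(word, sort_consonants=True):
    """Uppercase the word, drop vowels, optionally sort the remaining letters."""
    letters = [c for c in word.upper() if c.isalpha() and c not in VOWELS]
    if sort_consonants:
        letters.sort()
    return "".join(letters)


def count_collisions(words, sort_consonants=True):
    """Count the words that share their key with at least one other word."""
    groups = defaultdict(set)
    for w in words:
        groups[consonant_key(w, sort_consonants)].add(w.lower())
    return sum(len(g) for g in groups.values() if len(g) > 1)


# Tiny illustrative sample, not real data.
sample = ["calm", "clam", "milk", "mike", "alice", "lace", "cola"]

print("with sorting:   ", count_collisions(sample, sort_consonants=True))
print("without sorting:", count_collisions(sample, sort_consonants=False))
```

On this toy list sorting adds one colliding word ("cola" joins "alice" and
"lace" under CL); whether that effect matters on real keywords is exactly
what would have to be measured.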


On Tuesday 23 June 2009 07:27:17 leo stone wrote:
> I believe the biggest factor on how we judge a system for future usability
> is how many results we get if we are looking for "something" like
> "something".
> Imagine a shoe shop with only two pairs of shoes in it, and one with a few
> hundred.
> The result in the end might be the same: you leave both shops without
> finding what you want. But most people will consider
> the shop with a hundred pairs more promising and worth spending time in the
> next time they try to find some shoes.
> So making sure people are getting results in their searches is probably one
> of the more important issues, after
> my doubts about how the routing is handled.
> Even though it might mean some significant overhead, I would consider doing
> something like normalizing keywords,
> per language if it must be, but in the beginning English should be enough.
> So if I wanted to share the following file, and I wanted it public so that
> people can find it, why not store it like this:
> "Woh_the.fuck_is ALICe(2008).divx.avi.WMV"  =>  { HW , HT , CFK , S , CL ,
> 2008 , DVX , V ,  MVW }
> Put the file under the hashes of those nine "keywords".
> When I now search for "fuck alice"  =>   { CFK , CL }
> a search for h(CFK) AND h(CL) will return a lot of wrong, similar results,
> but then one can filter them locally in a more elaborate way.
> It might even be more selective than a search for h(video/x-msvideo).
> At least it returns results, whereas "Woh_the.fuck_is
> ALICe(2008).divx.avi.WMV" as a keyword is something hardly anyone
> would think to search for, so the file would never be found, never be spread
> ....., except by chance of course.
> regards leo
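
The scheme in the quoted message can be sketched roughly as follows.  This is
an illustration under assumptions, not GNUnet code: tokens are split on
anything that is not a letter or digit, purely numeric tokens are kept
unchanged (as 2008 is in the example), Y is treated as a consonant, SHA-256
stands in for h(), a dict stands in for the network, and the "more elaborate"
local filter is reduced to a case-insensitive substring check.  All function
names are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

VOWELS = set("AEIOU")  # assumption: Y counts as a consonant


def normalize_token(token):
    """Uppercase, drop vowels, sort the rest; numeric tokens stay unchanged."""
    token = token.upper()
    if token.isdigit():
        return token
    return "".join(sorted(c for c in token if c not in VOWELS))


def keywords(text):
    """Split text into alphanumeric tokens and return their distinct keys."""
    keys = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        key = normalize_token(token)
        if key and key not in keys:
            keys.append(key)
    return keys


def h(key):
    """Stand-in hash function (assumption: SHA-256, chosen only for the sketch)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def publish(index, filename):
    """Store the file under the hash of each of its normalized keywords."""
    for key in keywords(filename):
        index[h(key)].add(filename)


def search(index, query, local_filter=True):
    """AND-search over the hashed keys, then optionally filter locally."""
    keys = keywords(query)
    if not keys:
        return []
    candidates = set.intersection(*(index.get(h(k), set()) for k in keys))
    if local_filter:
        words = [w.lower() for w in re.findall(r"[A-Za-z0-9]+", query)]
        candidates = {f for f in candidates
                      if all(w in f.lower() for w in words)}
    return sorted(candidates)


index = defaultdict(set)
publish(index, "Woh_the.fuck_is ALICe(2008).divx.avi.WMV")
publish(index, "fcuk_lace.avi")  # hypothetical file that also maps to CFK and CL

print(keywords("Woh_the.fuck_is ALICe(2008).divx.avi.WMV"))
print(search(index, "fuck alice", local_filter=False))  # both files match
print(search(index, "fuck alice"))                      # only the intended file
```

The second file shows the kind of wrong-but-similar result the AND-search
returns; the local filter is what removes it again.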
