[help-GIFT] A flexible security solution for searching the desktop
From: Wolfgang Müller
Subject: [help-GIFT] A flexible security solution for searching the desktop
Date: Fri, 4 Jan 2002 21:43:42 +0100
Hi,
I am the maintainer of the GNU Image Finding Tool and active in the
Fer-de-Lance project, which has been in (not very loud, but
behind-the-scenes-active) existence since April last year. Within this
project we work towards integrating search services into the desktop.
I am mailing our list and a couple of other developer lists, because I think
I have found an architecture that provides security while keeping most of
the advantages of daemon-based search engine architectures. I think this
architecture and the associated tricks are flexible enough to encompass
different search engines, so this mail is not about Medusa vs. htdig vs. GIFT,
but rather about how to work together to solve our common security problems
for desktop integration of our engines.
And, of course, I would be happy to get some suggestions for improvement
and/or some developer time. I would be less happy if someone found a
fundamental flaw, but even that would be better than wasting my time trying
to develop this stuff further.
Now let's go into more detail.
GOAL:
The goal is to provide search services to the desktop user. These search
services should not only encompass web-visible URLs, but rather all files
accessible to the user as well as http/ftp/etc. accessible items.
ISSUE:
The first issue is -privacy-: the system should not tell us the locations of
files that we cannot otherwise read. For example: when looking for some
correspondence with the health insurance, we should not learn that our
colleague wrote three letters last month that match our search.
Second -memory consumption-: All indexes for similarity searching use memory
which is either proportional to the size of each indexed file, or quite big
to begin with. We do not want every user to roll their own index; we want one
shared index, otherwise we are likely to spend a multiple of our useful
disk space on indexes.
SUGGESTION: Use a daemon and make sure that authentication is good. :-)
Too easy? Of course the problem lies in providing the authentication.
What I suggest is to run a daemon which creates for each user U a Unix
domain socket which is readable *and* writable *only* by this one user U (and
root, of course). All instructions to the indexing daemon, such as
  add item to index
  delete item from index
  move item within index (new URL for same file)
  block item/subdirectory/pattern (don't index *.o files for example)
  process query
would go through the socket. By knowing which socket received the request, we
automatically know the user, and then we just have to check, for each result
item, whether it can be read by the user who issued the query. Of course we
give back only the readable items.
We can create the sockets as user "nemo", and then chown them using a very
small script running as root. So we would be root for a couple of
seconds on startup; afterwards everything would happen as a user (nemo) who
has write rights on one directory tree which is unreadable to everyone else.
This avoids the issue of a big indexing program running as root for days
and days in a row.
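A minimal sketch of the socket setup, in Python rather than as the small root script described above. The function name create_user_socket and the ".sock" naming scheme are assumptions for illustration; the chown step is only attempted when the process actually runs as root.

```python
import os
import socket
import stat

def create_user_socket(directory, username, uid=None):
    """Create a listening Unix domain socket usable only by its owner.

    Hypothetical helper: the file name <username>.sock is an assumption.
    """
    path = os.path.join(directory, username + ".sock")
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket from an earlier run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    old_umask = os.umask(0o177)  # never expose the socket to group/other
    try:
        server.bind(path)
    finally:
        os.umask(old_umask)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # rw for the owner only
    if uid is not None and os.geteuid() == 0:
        os.chown(path, uid, -1)  # hand the socket over to user U
    server.listen(1)
    return server, path
```

The daemon would keep a table from each listening socket to its user, so that an accepted connection identifies who is asking.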
Adding an item is a (small) issue. We probably have to pipe the uuencoded (or
something equivalent) binary through the socket in order to have it indexed
on the other side of the socket. However, I guess the efficiency overhead is
small compared to the indexing cost.
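As an illustration of the "something equivalent" option, here is a sketch that uses base64 instead of uuencode. The "ADD <url> <length>" header line is a made-up wire format, not an existing GIFT protocol, and it assumes the URL contains no spaces.

```python
import base64

def encode_add_request(url, data):
    """Build an ADD message: one header line, then the base64 payload."""
    payload = base64.b64encode(data)
    header = "ADD {} {}\n".format(url, len(payload)).encode("utf-8")
    return header + payload + b"\n"

def decode_add_request(message):
    """Parse a message built by encode_add_request into (url, data)."""
    header, _, rest = message.partition(b"\n")
    command, url, length = header.decode("utf-8").split(" ", 2)
    if command != "ADD":
        raise ValueError("not an ADD request")
    payload = rest[:int(length)]
    return url, base64.b64decode(payload)
```

base64 inflates the data by about one third, which supports the guess that the transport overhead is small next to the indexing cost.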
Things become a trifle more complex for adding items which are found on the
web. Somebody indexing a web page should probably indicate who else (group,
all) is allowed to know that somebody has indexed that page. If several users
publish a URL, the least restrictive rights are taken into account.
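The "least restrictive rights win" rule can be sketched as below. The Scope levels and the publish function are hypothetical names, and the sketch simplifies by not distinguishing between the groups of different publishers.

```python
from enum import IntEnum

class Scope(IntEnum):
    """Who may learn that a URL is indexed (assumed levels)."""
    OWNER = 0   # only the user who indexed it
    GROUP = 1   # the publisher's group
    ALL = 2     # everybody

def publish(rights, url, scope):
    """Record a publication of url; the least restrictive scope is kept."""
    previous = rights.get(url)
    rights[url] = scope if previous is None else max(previous, scope)
    return rights[url]
```

A later, more restrictive publication of the same URL therefore never narrows what earlier publishers already allowed.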
WHAT'S THERE? WHAT'S NEEDED?
Basically, I have tried out the socket stuff with a small test program.
Works. Now I am starting to integrate that with the GIFT (which involves
cleaning up some of my internet socket code).
What's still needed is the filter that stores which URLs are indexed under
which owner, and with which rights. On each query GIFT can ask this filter
whether a list of URLs can be given out as the query result. Currently, I
would like to base this filter on MySQL.
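To give an idea of what such a filter might look like, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, so that it runs without a server. The table layout, column names and function names are all assumptions, not an existing schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS indexed_urls (
    url       TEXT NOT NULL,
    owner_uid INTEGER NOT NULL,
    group_gid INTEGER,
    scope     TEXT NOT NULL CHECK (scope IN ('owner', 'group', 'all'))
)
"""

def open_filter(path=":memory:"):
    """Open (and if needed create) the rights database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def record(conn, url, owner_uid, group_gid, scope):
    """Remember that owner_uid indexed url with the given scope."""
    conn.execute("INSERT INTO indexed_urls VALUES (?, ?, ?, ?)",
                 (url, owner_uid, group_gid, scope))

def allowed(conn, urls, uid, gids):
    """Return the subset of urls (order preserved) that uid may see."""
    result = []
    for url in urls:
        rows = conn.execute(
            "SELECT owner_uid, group_gid, scope FROM indexed_urls "
            "WHERE url = ?", (url,)).fetchall()
        # One matching row is enough: the least restrictive entry wins.
        if any(owner == uid or scope == "all"
               or (scope == "group" and gid in gids)
               for owner, gid, scope in rows):
            result.append(url)
    return result
```

The search engine would pass its raw result list through allowed() before anything is written back to the user's socket.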
When that filter is in place, writing a medusa-plugin for the GIFT would be
easy. I just finished a primitive htdig GIFT plugin which soon goes to CVS,
so that one just needs some fleshing out.
CONCLUSION
I hope to have convinced you that we can fairly easily get a secure yet
memory-efficient indexing solution for the desktop. If this has already been
done, please tell me where. If my mail is a stupid suggestion, please tell me
that, too. However, if you would like to participate in the coding and design
effort or simply to share your opinion, please do not hesitate to subscribe
to the fer-de-lance-development list.
Cheers,
Wolfgang