Re: [Social-mediagoblin] Tahoe-LAFS as a document-oriented database
From: Christopher Allan Webber
Subject: Re: [Social-mediagoblin] Tahoe-LAFS as a document-oriented database
Date: Sat, 09 Apr 2011 10:20:53 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)
Zooko, thanks for this email. It's actually incredibly timely: I had
just sketched up an idea of how the storage API was to work. I want to
have a generic storage API so we can plug in multiple backends (from
just simple local file storage, to eucalyptus, to tahoe-lafs even!)
So, before I reply to your email directly, let me paste in that API.
#+BEGIN_SRC python
storage_handler.file_exists(['dir1', 'dir2', 'filename.jpg'])
# True / False
storage_handler.get_unique_filename(['dir1', 'dir2', 'filename.jpg'])
# Possibly returns either:
# - 'filename.jpg' # if no such file yet exists
# - '%s-filename.jpg' % uuid.uuid4() # if another file of this name exists
#
# You would then use this to call
# this_file = storage_handler.get_file(['dir1', 'dir2', our_unique_filename])
storage_handler.get_file(['dir1', 'dir2', 'filename.jpg'])
# Returns a read/writeable file-like object
# I guess makes the directory if necessary?
storage_handler.write_file(['dir1', 'dir2', 'filename.jpg'], data)
# Convenience shortcut: writes data to this file in one call.
# I guess makes the directory if necessary?
storage_handler.delete_file(['dir1', 'dir2', 'filename.jpg'])
# Deletes the file here
storage_handler.file_path(['dir1', 'dir2', 'filename.jpg'])
# Only for appropriate local filestores
#+END_SRC
This API is inspired by Django's filesystem API:
http://docs.djangoproject.com/en/dev/topics/files/
...where you get back a Python file-like object that you .read() and
.write() to, but which may not be a real file object, since you might
actually be writing to some remote server, etc. Note that we're not
using path strings like 'dir1/dir2/filename.jpg' but rather a list of
components. This way we can be sure which directories the author
*intended*, while still stripping evil things out of each component via
werkzeug.utils.secure_filename().
If anyone has comments on this I'd greatly appreciate hearing them.
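To make the API above concrete, here is a minimal sketch of what a local
filesystem backend might look like. The class name BasicFileStorage, the
base_dir argument, and the mode argument to get_file() are assumptions for
illustration only, and clean_component() is a simplified stand-in for
werkzeug.utils.secure_filename() so that the example needs nothing beyond
the standard library.

```python
import os
import re
import uuid


def clean_component(component):
    """Simplified stand-in for werkzeug.utils.secure_filename() (assumption):
    replace anything outside a conservative character set and strip
    leading dots, so a component can never be '..' or contain a slash."""
    cleaned = re.sub(r'[^A-Za-z0-9_.-]', '_', component).lstrip('.')
    if not cleaned:
        raise ValueError('empty path component: %r' % (component,))
    return cleaned


class BasicFileStorage(object):
    """Hypothetical local-filesystem backend for the storage API."""

    def __init__(self, base_dir):
        self.base_dir = base_dir

    def file_path(self, filepath):
        # Each component is cleaned on its own, so the result stays
        # underneath base_dir.
        cleaned = [clean_component(c) for c in filepath]
        return os.path.join(self.base_dir, *cleaned)

    def file_exists(self, filepath):
        return os.path.exists(self.file_path(filepath))

    def get_unique_filename(self, filepath):
        # Only prefix a UUID when the plain name is already taken.
        if self.file_exists(filepath):
            return '%s-%s' % (uuid.uuid4(), filepath[-1])
        return filepath[-1]

    def get_file(self, filepath, mode='rb'):
        path = self.file_path(filepath)
        directory = os.path.dirname(path)
        # Create missing directories before opening the file.
        if not os.path.isdir(directory):
            os.makedirs(directory)
        return open(path, mode)

    def write_file(self, filepath, data):
        with self.get_file(filepath, 'wb') as f:
            f.write(data)

    def delete_file(self, filepath):
        os.remove(self.file_path(filepath))
```

A remote backend would implement the same methods except file_path(),
which only makes sense for local filestores.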
Okay, now I'll respond to your email inline:
"Zooko O'Whielacronx" <address@hidden> writes:
> Folks:
>
> There is a discussion on the Tahoe-LAFS mailing list about how
> Tahoe-LAFS could be used for storing assets such as media files.
>
> http://tahoe-lafs.org/pipermail/tahoe-dev/2011-April/006257.html
>
> I want to emphasize that I'm not "pushing" for social-mediagoblin to
> use Tahoe-LAFS instead of using MongoDB or postgresql or whatever. I'm
> sure you folks have good reasons for your choices (Chris has written
> some notes about this issue) and I don't want Tahoe-LAFS to get used
> in ways that it is ill-suited -- that would just cause headaches for
> everybody including me.
Interesting post, thanks for sharing.
And yeah, I don't intend to use tahoe-lafs where mongodb is currently,
as the "database". If tahoe-lafs ever becomes a backend to mongodb,
maybe, and then I won't even need to write support for that myself! ;)
But as a media storage system I think it might work out well.
In the case of a backend like tahoe-lafs, I figure we can leave room in
the database to record where these paths actually map to, if necessary,
but I'm not sure.
> Rather, I think there are some important fundamental architectural
> issues which are revealed in this conversation on the Tahoe-LAFS
> mailing list. Regardless of which actual technologies we use, we
> should understand and make conscious decisions about these
> architectural issues.
>
> Namely:
>
> 1. On what do you rely for the guarantee that the file is uncorrupted?
> There are basically two use cases: you store a file yourself and get
> it back later, or you share a file with someone else. In the former
> you want to be sure that you get back the same file you put in. In the
> latter the recipient wants to be sure that they get the same file the
> sharer sent.
So for case 1 (getting your own file back), I guess we can store sha1
hashes in the database and check against them if necessary?
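As a rough sketch of that idea, assuming the hex digest is stored next to
the media entry at upload time: the helper names sha1_hexdigest() and
file_matches() are made up for illustration, and the file object could be
anything a storage backend returns that supports .read().

```python
import hashlib


def sha1_hexdigest(fileobj, chunk_size=64 * 1024):
    """Hash a binary file-like object in chunks so that large media
    files never have to be loaded into memory all at once."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: fileobj.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()


def file_matches(fileobj, expected_hexdigest):
    """Compare a file's current contents against the digest recorded
    in the database when the file was stored."""
    return sha1_hexdigest(fileobj) == expected_hexdigest
```

Note that a hash kept in our own database only catches accidental
corruption; it does not help against someone who can modify both the file
and the database.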
So, in case 2, I'm really not sure what kind of problem you're
anticipating. Maybe more examples would be helpful. Are you talking
about like, a cryptographic integrity check to make sure yes, this is
the right file, no fooling, nobody's going to goatse.cx me?
I'd be interested in some cases where you think this will become a
problem.
> 2. Is there a guarantee of confidentiality -- that people who weren't
> intended to see the file can't see it? If so, on what do you rely for
> that guarantee?
Initially I think we're just going to handle things where everything is
public to everyone. But eventually I'd like to be able to support
sharing certain files with only family and friends, but I think that's a
way off.
One way of doing this when *not* using something like tahoe-lafs might
be to use the X-Sendfile response header. This way we can authenticate
that it's okay to send this file and then let apache/nginx/whatever
actually do the serving.
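A minimal sketch of the X-Sendfile approach as a plain WSGI app. The names
make_media_app(), is_allowed() and resolve_path() are hypothetical; the
two callbacks stand in for whatever permission check and path lookup the
application ends up with. This assumes a front-end server that honors the
header (Apache needs mod_xsendfile for this; nginx uses the similar
X-Accel-Redirect header instead).

```python
def make_media_app(is_allowed, resolve_path):
    """Build a WSGI app that authorizes a request and then lets the
    front-end web server send the file itself.

    is_allowed(environ, path) -> bool      (hypothetical permission check)
    resolve_path(environ) -> path or None  (hypothetical path lookup)
    """
    def app(environ, start_response):
        path = resolve_path(environ)
        if path is None:
            start_response('404 Not Found',
                           [('Content-Type', 'text/plain')])
            return [b'Not found']
        if not is_allowed(environ, path):
            start_response('403 Forbidden',
                           [('Content-Type', 'text/plain')])
            return [b'Forbidden']
        # Empty body: the front-end server sees the header and streams
        # the file at `path` to the client on our behalf.
        start_response('200 OK', [('X-Sendfile', path)])
        return []
    return app
```

The application only decides yes or no; the file bytes never pass through
Python.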
> 3. Performance questions about clusters of servers and clients. May or
> may not be relevant to socialmediagoblin.
>
> Thank you for your attention.
>
> Regards,
>
> Zooko
>
Thank you, zooko, for the useful email :)
--
Christopher Allan Webber