Re: nnml article filenames
Pranav K. Tiwari
Mon, 31 Oct 2005 14:43:53 +0530
Gnus/5.110003 (No Gnus v0.3) Emacs/21.3 (windows-nt)
Steve Youngs <firstname.lastname@example.org> writes:
> * Pranav K Tiwari <email@example.com> writes:
> > Steve Youngs <firstname.lastname@example.org> writes:
> >> * Pranav K Tiwari <email@example.com> writes:
> >> > To allow desktop search programs to go through nnml articles, I would
> >> > like to give an extension like .xyz, and tell these programs to
> >> > treat these files like email.
> >> I think this is the wrong approach. Instead of modifying the
> >> filenames to suit the search program, find a way to make the search
> >> program work properly.
> >> It's really not that difficult, see...
> >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
> > The question is not about 'finding' these files, but about
> > associating a 'type' with the file.
> But if you can find them, there's really no point in associating a
> "type" to them.
> $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
> xargs some_app_needing_mail_files_as_input
> > Most indexing programs (google/yahoo/microsoft desktop search
> > engines, X1) rely on file extensions to determine the filetype,
> > and then index the contents of the file accordingly. It'll be good
> > if they could deal with files with no extensions, but they don't
> > (afaik).
> Yes they do. For example:
> > So - with that in mind, the easiest way would be to change the way gnus
> > nnml stores files, or write another backend that allows changing
> > filenames.
> Maybe you should say what it is exactly that you want to do with your
> nnml files.
Swish is fine; it's what I've used until now. But I've been unable to use
it to index all of my email periodically. I would like to say: here's
the top directory under which all my nnml mail lives, index it
periodically. But swish runs out of memory (even with the -e option,
on my 512 MB Win2k machine) when trying to index my mail (some 35-40
nnml folders, each with 2000-5000 emails). So the way I use swish is to
have one index file per nnml folder, and I have modified the swish
search function to search a list of index files.
It works, but as you can see, it's not optimal. Maybe my usage of swish
is not correct, and if so, I'll be glad to be corrected.
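
To make the per-folder layout concrete, here is a rough Python sketch of the grouping step only. It does not call swish at all. It assumes that an nnml article is any file whose name consists solely of digits (stricter than the regex quoted above), and the `index_path_for` naming scheme is purely hypothetical, not something swish or Gnus prescribes.

```python
import os
import re

# Assumption: nnml article files are named by article number only.
ARTICLE_NAME = re.compile(r"^[0-9]+$")


def articles_by_folder(top_dir):
    """Map each folder under top_dir to its article files, in numeric order."""
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(top_dir):
        # Keep only numeric names; this skips files such as .overview.
        numbers = sorted((n for n in filenames if ARTICLE_NAME.match(n)), key=int)
        if numbers:
            groups[dirpath] = [os.path.join(dirpath, n) for n in numbers]
    return groups


def index_path_for(folder, top_dir, index_dir):
    """Hypothetical naming scheme: one index file per nnml folder."""
    rel = os.path.relpath(folder, top_dir)
    # Articles stored directly in top_dir get a fixed placeholder name.
    name = "top-level" if rel == os.curdir else rel.replace(os.sep, ".")
    return os.path.join(index_dir, name + ".index")
```

Each entry of `articles_by_folder(top)` could then be handed to the indexer as one small job, with `index_path_for` deciding where that folder's index goes, so no single run has to hold the whole mail store in memory.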
The desktop search programs that I mentioned all support a 'crawl' type of
indexing, where they keep track of what has changed and update their
indices accordingly. And I have never had any memory trouble with
them. That's why I'd like to use one of them to index my mail, instead
of the swish setup I'm using at present.
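
For illustration, the kind of change tracking I mean can be sketched in a few lines of Python. This is not how any of those products is implemented. It is only an assumed approach: remember each article's modification time and size, then compare against the previous run. The function names and the JSON state file are my own invention, and articles are again assumed to be files with all-digit names.

```python
import json
import os
import re

ARTICLE_NAME = re.compile(r"^[0-9]+$")  # assumption: article files are numbered


def snapshot(top_dir):
    """Record [mtime, size] for every article file under top_dir."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(top_dir):
        for name in filenames:
            if ARTICLE_NAME.match(name):
                path = os.path.join(dirpath, name)
                st = os.stat(path)
                state[path] = [st.st_mtime, st.st_size]
    return state


def diff(old, new):
    """Return (added, removed, changed) article paths between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed


def load_state(state_file):
    """Load the previous snapshot, or an empty one on the first run."""
    if not os.path.exists(state_file):
        return {}
    with open(state_file, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_state(state_file, state):
    """Persist the snapshot so the next run can compare against it."""
    with open(state_file, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
```

A periodic job would call `load_state`, take a fresh `snapshot`, pass only the `added` and `changed` paths from `diff` to the indexer, and finish with `save_state`.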