Re: Indexed search with grep-like output

From: Eli Zaretskii
Subject: Re: Indexed search with grep-like output
Date: Tue, 04 Jan 2011 03:11:06 -0500

> From: Lennart Borgman <address@hidden>
> Date: Tue, 4 Jan 2011 08:22:09 +0100
> Cc: address@hidden, address@hidden, address@hidden
> > In the directory where you installed docindexer, there's a file named
> > config.py, a piece of Python code that describes the docindexer parser
> > configuration.  Its syntax should be self-explanatory; you can add
> > entries there for whatever source files you'd like to index.
> No, you do not have that file if you used the installer and installed
> the binary version.

Well, I certainly did use the installer, and I do have that file.  Are
you sure you don't have it?

In any case, you can find it in the docindexer source distribution.

> If you want to use that installer you cannot
> change how files with different extensions are parsed by docindexer.

But I just did change that.  Here's the exact recipe:

 . Find config.py in the docindexer installation directory and edit it
   to add a line for *.el files.
 . Find a file named library.zip in the docindexer installation
   directory.  This is the class library used by docindexer.
 . Replace the file docindexer/config.pyc in library.zip with the
   edited docindexer/config.py.  Note: the .pyc extension means that
   the file was compiled by Python; the corresponding .py file is not
   compiled, but it will be used anyway -- this is similar to what
   Emacs does with *.el and *.elc files.
 . Run "docindexer --config" and make sure you see the *.el line in
   the output.
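
For those who would rather script the library.zip step than do it by
hand, here is a minimal sketch in Python.  It assumes the archive layout
described above (docindexer/config.pyc inside library.zip); the function
name is made up for illustration, and the paths in the usage comment are
placeholders.  It writes a fresh archive instead of editing the old one
in place, and leaves a .bak copy next to it.

```python
import shutil
import zipfile
from pathlib import Path


def swap_compiled_for_source(archive, source,
                             compiled_name="docindexer/config.pyc"):
    """Rewrite `archive` without `compiled_name`, adding `source` as its .py sibling.

    Hypothetical helper: the member name is taken from the recipe above.
    """
    archive = Path(archive)
    source_name = compiled_name[:-1]  # ".../config.pyc" -> ".../config.py"
    tmp = archive.with_name(archive.name + ".tmp")

    with zipfile.ZipFile(archive) as old, \
            zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as new:
        for item in old.infolist():
            if item.filename in (compiled_name, source_name):
                continue  # drop the compiled file (and any earlier .py copy)
            new.writestr(item, old.read(item.filename))
        new.write(source, source_name)  # add the edited, uncompiled file

    # Keep the original archive as library.zip.bak, then swap in the new one.
    shutil.copy2(archive, archive.with_name(archive.name + ".bak"))
    tmp.replace(archive)


# Usage (placeholder paths; adjust to your installation directory):
# swap_compiled_for_source("library.zip", "config.py")
```

Afterwards, "docindexer --config" should still be the final check that
the *.el line is picked up.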

After performing the above procedure, I have just indexed the entire
Emacs lisp/ directory.  It took 3 minutes (yes, the indexer is not
very fast, which is why it's scheduled to run at night when I'm away;
mkid does the same job 3 times faster).

Moral: Never underestimate the power of Free Software!  When you have
sources, _you_ are in control, not the software developer.  This is
what Free Software is all about.

> > Having said that, I don't think docindexer is the right tool for
> > indexing program source files.  Lucene text analyzers are biased
> > towards indexing plain text, so they typically ignore one-letter
> > words, like "a" and "i", words like "the", "in", "on", "some", etc. --
> > which could well be valid identifiers in a program.  It really isn't
> > the tool for this job.
> It does not give an index of the kind you want, that is correct.
> However I might still find it handy to quickly find parts of the code.

Is it really handy?  Lisp identifiers include punctuation characters
such as `-', `>', `:', etc.  I'd guess that plain text indexing will
not index these identifiers as you'd want to.
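
To illustrate the concern, here is a toy tokenizer in Python.  It is NOT
the Lucene analyzer docindexer actually uses, only a rough stand-in that
splits on punctuation and drops one-letter words and a few stop words,
as described above.  The Lisp form fed to it is invented for the example.

```python
import re

# A few of the stop words mentioned above; real analyzers use longer lists.
STOP_WORDS = {"a", "i", "the", "in", "on", "some"}


def plain_text_tokens(text):
    """Toy plain-text analysis: split on non-alphanumerics, lowercase,
    then drop stop words and one-letter words."""
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [w for w in words if len(w) > 1 and w not in STOP_WORDS]


# Hypothetical Lisp form with `-', `>' and `:' in the identifier.
print(plain_text_tokens("(my-lib:item->string i)"))
# prints ['my', 'lib', 'item', 'string']
```

The identifier is broken into four unrelated words and the variable `i'
vanishes, so a search for the full symbol cannot match it as a single
indexed term.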

> If you want to then feel free to add support for ID-utils to
> idxsearch.el. It should typically be a file on its own. The file
> idxdocindex.el is a good starting example.

I'd rather extend id-utils.el, and eventually add that to Emacs.
