Re: [help-GIFT] Re: Clarification on inverted file

From: David Squire
Subject: Re: [help-GIFT] Re: Clarification on inverted file
Date: Wed, 22 Aug 2001 16:32:38 +1000

(forwarded for Wolfgang)

On Monday 20 August 2001 12:10, David Squire wrote:
> Wolfgang Mueller wrote:
> > MARS is strongly inspired by text retrieval,
> > but modifies the retrieval scheme, basing the weighting not on the
> > document frequency but on the standard deviation of the term frequency.
> I haven't got the article in front of me, but if I recall correctly they
> didn't use standard deviations of term frequencies, but rather std. devs.
> of continuous-valued features. This would mean that features that took on a
> wide range of values in the query would get a low weight.

OK. I was not precise enough:

The continuous feature values are treated as "pseudo tf", and he then looks
at the stdevs of these pseudo tfs.

> This is clearly related to the term frequency idea, since if the features
> were quantized a la Viper, then features with low std. dev. would tend to
> get high term frequencies for the quantiles around the mean.

He uses the log standard deviation as the equivalent of the log inverse
*document* frequency. By design, this goes in the same direction as Viper's
tf.idf weighting, but it does not capture multimodal distributions and, more
importantly, multimodal feedback.
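
To make the multimodal point concrete, here is a rough sketch in Python. It is
only an illustration under stated assumptions: `inverse_std_weight` is a
hypothetical stand-in for a log-standard-deviation weight (not the exact MARS
formula), and `quantize` is a plain equal-width histogram (not Viper's actual
quantization). The feature values are made up for the example.

```python
import math
import statistics


def inverse_std_weight(values, eps=1e-6):
    """Hypothetical weight: the negative log of the standard deviation.

    A small spread gives a large weight, a wide spread gives a small one.
    eps only guards against log(0) for constant features.
    """
    return -math.log(statistics.pstdev(values) + eps)


def quantize(values, n_bins=10, lo=0.0, hi=1.0):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up values of one feature over six positively marked images.
unimodal = [0.44, 0.45, 0.46, 0.45, 0.44, 0.46]  # one tight cluster
bimodal = [0.14, 0.15, 0.16, 0.84, 0.85, 0.86]   # two tight clusters

w_uni = inverse_std_weight(unimodal)
w_bi = inverse_std_weight(bimodal)

hist_uni = quantize(unimodal)
hist_bi = quantize(bimodal)

print("std-based weight, unimodal:", round(w_uni, 2))
print("std-based weight, bimodal: ", round(w_bi, 2))
print("quantized counts, unimodal:", hist_uni)
print("quantized counts, bimodal: ", hist_bi)
```

The single standard deviation makes the bimodal feature look uninformative, so
it gets a much lower weight than the unimodal one. After quantization, the same
feature shows up as two bins with high counts, so both modes remain visible to
a tf.idf-style weighting.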


Wolfgang Müller, 
assistant-doctorant ==  PhD student (2001), teaching assistant
Personal page: 
Maintainer, GNU Image Finding Tool (
