help-gnu-emacs

Re: Any faster way to find frequency of words?


From: Emanuel Berg
Subject: Re: Any faster way to find frequency of words?
Date: Sun, 09 May 2021 20:00:30 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Jean Louis wrote:

> I think that your (4) is not necessary, as counting is
> not necessary.

Some counting is necessary if you want to learn the frequency.

How about using `forward-word' to walk the whole buffer and,
for every word, feeding it to a data structure that keeps one
record per word with a counter, and increasing that counter
by 1?

Then the challenge would be to pick a data structure where
searching is fast and, in particular, where search time doesn't
_grow_ fast as its overall size grows (size = the number of
unique words).
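
The idea above can be sketched in Emacs Lisp roughly as
follows. This is only a minimal sketch, not a tuned
implementation: the function name `my-word-frequencies' is
made up for illustration, a hash table is just one possible
choice of data structure, and folding case with `downcase' is
an assumption about what should count as the same word.

```elisp
(defun my-word-frequencies (&optional buffer)
  "Return a hash table mapping each word in BUFFER to its count.
BUFFER defaults to the current buffer.  Hypothetical helper,
written only to illustrate the `forward-word' approach."
  (with-current-buffer (or buffer (current-buffer))
    ;; `equal' is needed so that strings are compared by content.
    (let ((counts (make-hash-table :test #'equal)))
      (save-excursion
        (goto-char (point-min))
        ;; `forward-word' returns nil once no further word is found.
        (while (forward-word 1)
          (let* ((end (point))
                 (beg (save-excursion (backward-word 1) (point)))
                 ;; Assumption: "The" and "the" are the same word.
                 (word (downcase
                        (buffer-substring-no-properties beg end))))
            ;; Look up the current count (default 0) and add 1.
            (puthash word (1+ (gethash word counts 0)) counts))))
      counts)))
```

With that, something like (gethash "emacs" (my-word-frequencies))
would give the count for one word in the current buffer, or nil
if the word does not occur. What counts as a word depends on the
buffer's syntax table, so results can differ between modes.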

BTW the theoretical worst case would be a buffer where all
words are unique. Buffer cost is about 1 per word, so n in
total for n words. In that worst case, the data structure
cost per lookup would be, if lookup is linear, like this

if we denote buffer cost : data structure cost

1: 0      <-- first word
1: 1
1: 2
1: 3
..
1: n - 1  <-- last word

Linear per lookup! (Summed over all n words, that is about
n^2/2 comparisons in total.)
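
To make the linear case concrete, here is a small sketch that
counts with an alist, where every `assoc' call may scan all
entries seen so far. The function name
`my-word-frequencies-alist' is hypothetical, and taking a list
of strings as input is a simplification so the lookup cost
stands out.

```elisp
(defun my-word-frequencies-alist (words)
  "Count WORDS, a list of strings, using an alist.
Return a list of (WORD . COUNT) pairs.  Hypothetical helper:
each `assoc' call is linear in the number of unique words."
  (let ((counts nil))
    (dolist (word words)
      ;; Linear search through the entries collected so far.
      (let ((cell (assoc word counts)))
        (if cell
            ;; Known word: bump its counter in place.
            (setcdr cell (1+ (cdr cell)))
          ;; New word: add a fresh entry with count 1.
          (push (cons word 1) counts))))
    counts))
```

If all words are unique, the k-th word is compared against the
k - 1 entries already stored before it is added, which matches
the 0, 1, 2, ... pattern in the table.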

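But probably data structure cost is less than linear, say
logarithmic, then we would have

linear(n) + n * logarithmic(n)

Here n * logarithmic(n) grows the faster, so the total is
O(n log n), which is only slightly worse than linear. With a
hash table, where a lookup takes constant time on average,
the total is linear.

So with any sensible data structure, it'll be fast enough!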

-- 
underground experts united
https://dataswamp.org/~incal



