Re: Any faster way to find frequency of words?

help-gnu-emacs

From:	Emanuel Berg
Subject:	Re: Any faster way to find frequency of words?
Date:	Sun, 09 May 2021 20:00:30 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Jean Louis wrote:

> I think that your (4) is not necessary, as counting is
> not necessary.

Some counting is necessary if you want to learn the frequency.

How about moving through the whole buffer with `forward-word'
and feeding every word to a data structure that keeps one
record per word with a counter, and increasing that counter
by 1 each time the word shows up?
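
A minimal sketch of that idea, written in Python rather than Emacs Lisp
so it can be run outside Emacs. The function name `word_frequencies`,
the `\w+` definition of a word and the lowercasing are assumptions made
for illustration; inside Emacs the loop would walk the buffer with
`forward-word' instead of a regexp over a string.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in text, ignoring case."""
    counts = Counter()
    # Visit the words one at a time, left to right, and bump the
    # counter stored under each word.
    for match in re.finditer(r"\w+", text):
        counts[match.group().lower()] += 1
    return counts


freq = word_frequencies("the cat and the hat and the bat")
print(freq.most_common(2))  # [('the', 3), ('and', 2)]
```

Counter is backed by a hash table, so each update takes constant time
on average no matter how many unique words have been seen.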

Then the challenge would be to pick a data structure where
searching is fast and in particular where search time doesn't
_grow_ fast as its overall size grows (size = the number of
unique words).

BTW the theoretical worst case would be a buffer where all
words are unique. The buffer cost is about 1 per word, so n in
total. In that worst case the data structure cost would be,
if lookup is linear, like this

if we denote buffer cost : data structure cost for each word

1: 0      <-- first word
1: 1
1: 2
1: 3
..
1: n - 1  <-- last word

linear per word, which sums to about n^2/2 over the whole buffer!
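
To put numbers on the worst case, here is a small Python sketch that
counts words with a plain list and tallies the comparisons. It assumes
a lookup compares the new word against every stored record in order;
`count_with_list` and the generated words are made up for illustration.

```python
def count_with_list(words):
    """Count words using a plain list of [word, count] records.

    Returns the records and the number of word comparisons made.
    """
    records = []
    comparisons = 0
    for word in words:
        for record in records:
            comparisons += 1
            if record[0] == word:
                record[1] += 1
                break
        else:
            # No record matched, so this is a new unique word.
            records.append([word, 1])
    return records, comparisons


n = 1000
unique_words = [f"w{i}" for i in range(n)]  # worst case: all unique
_, comparisons = count_with_list(unique_words)
print(comparisons, n * (n - 1) // 2)  # 499500 499500
```

The first word costs 0 comparisons and the last one n - 1, so the
total is n(n - 1)/2.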

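But probably the data structure cost per lookup is less than
linear, say logarithmic, then we would have

linear(n) + n * logarithmic(n)

n * logarithmic(n) grows the faster of the two, but only
slightly faster than linear, so almost linear! And with a hash
table, where a lookup takes constant time on average, it is
linear.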

So with anything better than a plain linear list for the data
structure, it'll be fast enough!

-- 
underground experts united
https://dataswamp.org/~incal
