Re: Any faster way to find frequency of words?
From: Eric Abrahamsen
Subject: Re: Any faster way to find frequency of words?
Date: Sun, 09 May 2021 07:56:09 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Jean Louis <bugs@gnu.support> writes:
> I am interested if there is some better way for Emacs Lisp to find
> frequency of words.
>
> Purpose is to create HTML clickable tag clouds similar to image tag
> clouds. But I will invoke Perl from Emacs to generate it. For that, I
> have to analyze the text first.
Is there any particular improvement you're trying to make?
> (setq text "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam
> lectus. Sed sit amet ipsum mauris. Maecenas congue ligula ac quam
> viverra nec consectetur ante hendrerit. Maecenas congue ligula ac quam
> viverra nec consectetur ante hendrerit..")
>
> (defun text-alphabetic-only (text)
>   "Return alphabetic characters from TEXT."
>   (replace-regexp-in-string "[^[:alpha:]]" " " text))
>
> (defun word-frequency (text &optional length)
>   "Returns word frequency as hash from TEXT."
>   (let* ((hash (make-hash-table :test 'equal))
>          (text (text-alphabetic-only text))
>          (words (split-string text " " t " ")))
I guess I'd suggest using Emacs's syntax parsing functions, i.e.
`forward-word' and `buffer-substring'. Then you can fine-tune the
definition of a word using the local syntax table.
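As a rough sketch of that idea (not tested against large texts), the counting could be done in a temporary buffer. The function name `my-word-frequency-in-string' and the MIN-LENGTH argument are made up for illustration; `buffer-substring-no-properties' is used in place of `buffer-substring' only so that no text properties end up in the hash keys.

```elisp
(defun my-word-frequency-in-string (text &optional min-length)
  "Return a hash table mapping downcased words in TEXT to their counts.
Words shorter than MIN-LENGTH (default 3) are skipped.
Hypothetical helper: word boundaries come from the buffer's syntax table."
  (let ((counts (make-hash-table :test #'equal))
        (min-length (or min-length 3)))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Skip anything that is not word syntax; stop at end of buffer.
      (while (progn (skip-syntax-forward "^w") (not (eobp)))
        (let ((start (point)))
          ;; Point is on a word constituent, so this always advances.
          (forward-word 1)
          (let ((word (downcase
                       (buffer-substring-no-properties start (point)))))
            (when (>= (length word) min-length)
              ;; A missing key reads as 0, so the first hit stores 1.
              (puthash word (1+ (gethash word counts 0)) counts))))))
    counts))
```

With the standard syntax table an apostrophe is punctuation, so "don't" is counted as "don" and "t". Changing that would mean adjusting the syntax table inside the temporary buffer, for example with `modify-syntax-entry'.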
>     (mapc (lambda (word)
>             (when (> (length word) 2)
>               (let ((word (downcase word)))
>                 (if (numberp (gethash word hash))
>                     (puthash word (1+ (gethash word hash)) hash)
>                   (puthash word 1 hash)))))
While hash tables are probably best for very large texts, alists are
nice because `alist-get' works as a generalized place with a default
value, which simplifies the above to:

  (cl-incf (alist-get word frequency-alist 0 nil #'equal))
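To show that form in context, here is a small sketch that takes a list of words and returns the alist sorted by count. The function name `my-word-frequency-alist' is made up for illustration, and it assumes an Emacs recent enough for `alist-get' to accept a TESTFN argument (26 or later).

```elisp
(require 'cl-lib)

(defun my-word-frequency-alist (words)
  "Return an alist of (WORD . COUNT) for the list of strings WORDS.
Hypothetical helper illustrating `alist-get' as a generalized place."
  (let (frequency-alist)
    (dolist (word words)
      ;; A missing key reads as the default 0, so `cl-incf' stores 1
      ;; the first time a word is seen.  `equal' is needed as TESTFN
      ;; because the keys are strings.
      (cl-incf (alist-get (downcase word) frequency-alist 0 nil #'equal)))
    ;; Most frequent words first.
    (sort frequency-alist (lambda (a b) (> (cdr a) (cdr b))))))

;; Example:
;; (my-word-frequency-alist '("sit" "amet" "Sit"))
;; => (("sit" . 2) ("amet" . 1))
```

Each lookup walks the list, so the cost grows with the number of distinct words. That is fine for a tag cloud over a short text, and it is the reason a hash table is likely the better choice for very large inputs.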
Eric