Re: Language identification

From: Alex Ott
Subject: Re: Language identification
Date: Fri, 28 Aug 2009 08:45:05 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)


N-gram algorithms could be used to identify languages: they are simpler than
a Bayesian classifier and require a smaller database.
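
A minimal sketch of the n-gram idea, assuming a rank-order profile comparison
(the "out-of-place" measure known from Cavnar and Trenkle's text
categorization work). The message above does not name a specific algorithm,
so this choice, the function names, the profile size and the toy training
samples are all assumptions made for illustration. A usable "database" would
be built from a much larger corpus per language.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top=300):
    """Return the most frequent character n-grams (lengths 1..n_max), most common first."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    # Sort by descending count, then alphabetically, so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams absent from lang_profile get the maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def identify(text, profiles):
    """Return the language whose stored profile is closest to the profile of text."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training samples (hypothetical, far too small for real use).
SAMPLES = {
    "c": """#include <stdio.h>
int main(void) {
    printf("hello");
    return 0;
}
""",
    "python": """import sys
def main():
    print("hello")
    return 0
if __name__ == "__main__":
    sys.exit(main())
""",
    "elisp": """(defun hello ()
  (interactive)
  (message "hello"))
(provide 'hello)
""",
}

# The "database" is just one ranked n-gram list per language.
PROFILES = {lang: ngram_profile(src) for lang, src in SAMPLES.items()}
```

Calling `identify(buffer_text, PROFILES)` returns the key of the closest
profile. Each language needs only a few hundred short strings, which is why
the stored data can stay small.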

Juri Linkov  at "Fri, 28 Aug 2009 03:27:35 +0300" wrote:
 >> I often wish that files would open in Emacs with correct mode
 >> more often when there is no file extension.

 JL> In `auto-mode-alist' you can see that with the exception of
 JL> `archive-mode', `doc-view-mode' and `image-mode', all remaining
 JL> modes are programming text modes.  It would be more useful
 JL> to identify file types for these modes that libmagic can't do.
 JL> Do you know a library that identifies programming languages?
 JL> Such a library might be implemented using a Bayesian classifier
 JL> trained on a sufficiently large corpus of different programming
 JL> languages.

With best wishes, Alex Ott, MBA
http://alexott.blogspot.com/           http://xtalk.msk.su/~ott/
