aspell-devel

Re: Nynorsk dictionary severely incomplete


From: Morten Bo Johansen
Subject: Re: Nynorsk dictionary severely incomplete
Date: Fri, 23 Aug 2024 16:01:07 +0200
User-agent: slrn/1.0.3 (Linux)

On 2024-08-22 Laura Orvokki Kursula wrote:

> Hi all,
>
> I wish to use aspell with the Nynorsk (nn) dictionary, but it is proving
> difficult owing to the extreme number of false positives it emits. The
> dictionary seemingly does not recognize, for instance, a-infinitives, or a
> variety of alternative noun declensions.
>
> A comprehensive list of Nynorsk words with all recognized declined forms may
> easily be downloaded from the website of the National Library of Norway[1]. I
> would convert this into an aspell dictionary file myself if I only could figure
> out how to, but the documentation[2] instructs me to download something via FTP
> or CVS, or to post to a mailing list, aspell-dict, which, according to
> Savannah's index of lists[3], does not exist.
>
> I am happy to help with this, and appreciate any advice on how to proceed.

Hi Laura

Take a look at https://github.com/mortenivar/aspell-da

It is a ready-made archive for creating a Danish aspell
dictionary. All you need to do is

- clone the archive

  git clone https://github.com/mortenivar/aspell-da

- rename the aspell-da directory to aspell-nn

- cd into that directory

- download the fullformer file from Språkbanken and extract the
  words to the file "nn.wl" in the aspell-nn directory.
  I did it with this command

  cut -f3 fullformer_2012.txt | iconv -f ISO-8859-1 -t UTF-8 >nn.wl

  cut splits on the <tab> character by default, so no -d option
  is needed.
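
The extraction step can also be wrapped in a small shell function.
This is only a sketch: the name extract_words is made up for
illustration, and it assumes, as in the cut command above, that the
word forms are in the third tab-separated column of an ISO-8859-1
file. The -s flag and the sort -u step are additions that are not in
the original command: they skip lines without any tab and remove
duplicate entries.

```shell
# Hypothetical helper: extract the third tab-separated column of an
# ISO-8859-1 file and write it as a deduplicated UTF-8 word list.
#   $1: input file (e.g. fullformer_2012.txt)
#   $2: output word list (e.g. nn.wl)
extract_words() {
    # cut uses <tab> as its default delimiter; -s drops lines
    # that contain no delimiter at all (assumed to be non-data lines).
    cut -s -f3 "$1" \
        | iconv -f ISO-8859-1 -t UTF-8 \
        | LC_ALL=C sort -u > "$2"
}
```

It would be called as: extract_words fullformer_2012.txt nn.wl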

- change all instances of "da" to "nn" in the files

  - info
  - Makefile.pre
  - da.dat - which must also be renamed to "nn.dat"
  - da.multi - which must also be renamed to "nn.multi"
  - dansk.alias - which must also be renamed to "nynorsk.alias"
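
The renaming and substitution above could be scripted roughly as
follows. This is a sketch under assumptions: rename_lang is a made-up
helper name, it relies on GNU sed (for -i and \b), and it only
replaces "da" where it stands as a whole word, so that strings such
as "data" or "da_phonet" are left alone. Occurrences of "Danish" or
"dansk" in the files are not touched and still have to be edited by
hand.

```shell
# Hypothetical helper: rename the Danish files and replace the
# language code "da" with "nn" in the files listed above.
#   $1: path to the renamed archive directory (e.g. aspell-nn)
rename_lang() {
    dir=$1
    mv "$dir/da.dat" "$dir/nn.dat"
    mv "$dir/da.multi" "$dir/nn.multi"
    mv "$dir/dansk.alias" "$dir/nynorsk.alias"
    for f in info Makefile.pre nn.dat nn.multi nynorsk.alias; do
        # \b is a GNU sed extension matching a word boundary.
        sed -i 's/\bda\b/nn/g' "$dir/$f"
    done
}
```

It would be called from the parent directory as: rename_lang aspell-nn
It is worth checking the result with "git diff" before running
configure.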
  
The file da_phonet.dat should then be renamed to
"nn_phonet.dat". Note that this file contains the phonetic rules
for Danish, which may not be appropriate for Nynorsk. For now that
doesn't matter much. You may edit the file later, using the
instructions in http://aspell.net/man-html/Phonetic-Code.html.

Finally, run

  ./configure
  make
  make install (as root)
  
and hopefully you should be good to go. Note that the make
process spews out a lot of warnings about entries with e.g. spaces
in them. You may ignore them.
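
If the warnings about entries containing spaces get in the way, one
optional extra step, which is not part of the procedure above, is to
filter such entries out of the word list before running make. The
sketch below uses a made-up helper name, drop_multiword, and simply
removes every line that contains whitespace. Whether dropping those
entries is acceptable for Nynorsk is an assumption that should be
checked against the data.

```shell
# Hypothetical helper: copy a word list, leaving out every entry
# that contains a space or tab.
#   $1: input word list
#   $2: filtered output word list
drop_multiword() {
    # grep -v prints only the lines that do NOT match the pattern.
    grep -v '[[:space:]]' "$1" > "$2"
}
```

It could be used as: drop_multiword nn.wl nn.filtered.wl
and the result then moved over nn.wl once it looks right.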

HTH,
Morten




