Re: Nynorsk dictionary severely incomplete
From: Morten Bo Johansen
Subject: Re: Nynorsk dictionary severely incomplete
Date: Fri, 23 Aug 2024 16:01:07 +0200
User-agent: slrn/1.0.3 (Linux)
On 2024-08-22 Laura Orvokki Kursula wrote:
> Hi all,
>
> I wish to use aspell with the Nynorsk (nn) dictionary, but it is proving
> difficult owing to the extreme number of false positives it emits. The
> dictionary seemingly does not recognize, for instance, a-infinitives, or a
> variety of alternative noun declensions.
>
> A comprehensive list of Nynorsk words with all recognized declined forms may
> easily be downloaded from the website of the National Library of Norway[1]. I
> would convert this into an aspell dictionary file myself if I only could
> figure out how to, but the documentation[2] instructs me to download
> something via FTP or CVS, or to post to a mailing list, aspell-dict, which,
> according to Savannah's index of lists[3], does not exist.
>
> I am happy to help with this, and appreciate any advice on how to proceed.
Hi Laura
Take a look at https://github.com/mortenivar/aspell-da
It is a ready-made archive for creating a Danish aspell
dictionary. All you need to do is clone the archive:
git clone https://github.com/mortenivar/aspell-da
- then rename the aspell-da directory to aspell-nn
- cd into that directory
- download the fullformer file from Språkbanken and extract the
  words to the file "nn.wl" in the aspell-nn directory.
  I did it with this command:
  cut -f3 fullformer_2012.txt | iconv -f iso-8859-1 -t utf8 >nn.wl
  (the field delimiter is the <tab> character, which is cut's default)
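
The extraction step can be tried out on a small sample before running it on the full file. This is only a sketch: the sample rows and their column layout are made up for illustration (the only thing taken from the command above is that the word form sits in the third tab-separated field and that the file is ISO-8859-1), and the `sort -u` to drop duplicate forms is an optional addition, not part of the original command.

```shell
# Scratch directory so nothing in the real tree is touched.
tmp=$(mktemp -d)

# Hypothetical sample: tab-separated, ISO-8859-1 encoded (\345 is "å").
# The third field holds the inflected form; one form is repeated on purpose.
printf '1\tb\345t\tb\345ten\tsubst\n'  >  "$tmp/sample.txt"
printf '2\tkaste\tkasta\tverb\n'       >> "$tmp/sample.txt"
printf '3\tkaste\tkasta\tverb\n'       >> "$tmp/sample.txt"

# cut splits on tab by default, so no -d option is needed.
# iconv converts the result to UTF-8; sort -u removes repeated forms.
cut -f3 "$tmp/sample.txt" \
    | iconv -f iso-8859-1 -t utf-8 \
    | LC_ALL=C sort -u > "$tmp/nn.wl"

cat "$tmp/nn.wl"
```

If the sample output looks right, the same pipeline can be pointed at fullformer_2012.txt with nn.wl in the aspell-nn directory as the target.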
- change all instances of "da" to "nn" in the files
    - info
    - Makefile.pre
    - da.dat - which must also be renamed to "nn.dat"
    - da.multi - which must also be renamed to "nn.multi"
    - dansk.alias - which must also be renamed to "nynorsk.alias"
The file da_phonet.dat should then be renamed to
"nn_phonet.dat", but this file contains the phonetic rules for
Danish, which may not be appropriate for Nynorsk. For now it
doesn't matter so much. You may edit that file later, using the
instructions in http://aspell.net/man-html/Phonetic-Code.html.
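
The renames listed above could be scripted roughly as follows. This is a sketch under assumptions: `rename_da_to_nn` is a hypothetical helper, not something shipped in the repository, and the demonstration runs on empty stand-in files in a scratch directory rather than on a real clone.

```shell
# Hypothetical helper: rename the Danish-specific files in directory $1
# to their Nynorsk names. Stops at the first rename that fails.
rename_da_to_nn() {
    dir=$1
    mv "$dir/da.dat"        "$dir/nn.dat"        &&
    mv "$dir/da.multi"      "$dir/nn.multi"      &&
    mv "$dir/dansk.alias"   "$dir/nynorsk.alias" &&
    mv "$dir/da_phonet.dat" "$dir/nn_phonet.dat"
}

# Demonstration on a scratch directory with empty stand-in files.
scratch=$(mktemp -d)
touch "$scratch/da.dat" "$scratch/da.multi" \
      "$scratch/dansk.alias" "$scratch/da_phonet.dat"

rename_da_to_nn "$scratch"
ls "$scratch"
```

On a real clone the call would be `rename_da_to_nn aspell-nn`. The renames do not touch the contents of the files; running `grep -n 'da' info Makefile.pre nn.dat nn.multi nynorsk.alias` afterwards lists the lines that may still need "da" changed to "nn" by hand, which is safer than a blind search-and-replace.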
Finally, run
./configure
make
make install (as root)
and hopefully you should be good to go. Note that the make
process spews out a lot of warnings about entries with e.g. spaces
in them. You may ignore them.
HTH,
Morten