aspell-user

Information on Portuguese


From: Rodrigo Severo
Subject: Information on Portuguese
Date: Fri, 12 Mar 1999 11:2:14

I am sending information on Portuguese to be included in Aspell.

Character set: ISO 8859-1

Vowels: a e i o u

Note: Portuguese uses letters like "LATIN SMALL LETTER E WITH ACUTE". Do you
need them listed as separate vowels? If you do, here they are:

"Accented" vowels: 
    a?    00e3 LATIN SMALL LETTER A WITH TILDE
    o?    00f5 LATIN SMALL LETTER O WITH TILDE
    a'    00e1 LATIN SMALL LETTER A WITH ACUTE
    e'    00e9 LATIN SMALL LETTER E WITH ACUTE
    i'    00ed LATIN SMALL LETTER I WITH ACUTE
    o'    00f3 LATIN SMALL LETTER O WITH ACUTE
    u'    00fa LATIN SMALL LETTER U WITH ACUTE
    a!    00e0 LATIN SMALL LETTER A WITH GRAVE
    a>    00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
    e>    00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
    o>    00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
    u:    00fc LATIN SMALL LETTER U WITH DIAERESIS
    o!    00f2 LATIN SMALL LETTER O WITH GRAVE
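
As a quick cross-check of the table above, here is a minimal Python sketch
that maps each listed code point to its ISO 8859-1 byte and its Unicode
name. The names CODE_POINTS and describe are only illustrative and have
nothing to do with Aspell itself.

```python
import unicodedata

# Code points copied from the table above, in the same order.
CODE_POINTS = [0xE3, 0xF5, 0xE1, 0xE9, 0xED, 0xF3, 0xFA,
               0xE0, 0xE2, 0xEA, 0xF4, 0xFC, 0xF2]


def describe(code_point):
    """Return (hex code, Unicode name) for one character.

    Encoding to ISO 8859-1 raises UnicodeEncodeError if the character
    is not representable in that character set.
    """
    ch = chr(code_point)
    byte = ch.encode("iso-8859-1")[0]
    return f"{byte:04x}", unicodedata.name(ch)


if __name__ == "__main__":
    for cp in CODE_POINTS:
        code, name = describe(cp)
        print(code, name)
```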

Additional characters: just "-", as in French

Additional considerations:
    
    From what I understand about affix compression, it is EXTREMELY
important that affix compression be implemented for Aspell to deal well
with Portuguese and, AFAIK, with all Latin languages: French, Spanish and
Italian come to mind right now.

    Let me briefly explain one example to be sure everybody understands
why.

    In Portuguese, 80% of all verbs follow one single pattern, for
example:

    CANTAR (to sing)

    You keep the stem CANT, which never changes, and join it with:

O     - CANTO
AS    - CANTAS
A     - CANTA
AMOS  - CANTAMOS
AIS   - CANTAIS
AM    - CANTAM
AVA
AVAS
AVAMOS
AVEIS
AVAM

    And the list goes on, up to 56 different affixes. So I believe there
are 2 options: either Aspell uses some kind of automatic termination
method with "affix groups" (is this affix compression?), or each verb to
be included has to be stored as 56 different words.
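
    To make the "affix groups" idea concrete, here is a minimal Python
sketch of how one stem plus a shared list of endings could generate the
forms. It is only an illustration of the idea, not how Aspell works; the
names AR_ENDINGS and expand are made up, and only the 11 endings listed
above are included, not all 56.

```python
# Hypothetical "affix group" for regular verbs ending in AR.
# Only the endings quoted in the message are listed here.
AR_ENDINGS = ["O", "AS", "A", "AMOS", "AIS", "AM",
              "AVA", "AVAS", "AVAMOS", "AVEIS", "AVAM"]


def expand(infinitive, endings=AR_ENDINGS):
    """Strip the final AR and join the unchanged stem with each ending."""
    if not infinitive.endswith("AR"):
        raise ValueError("expected an infinitive ending in AR")
    stem = infinitive[:-2]  # CANTAR -> CANT
    return [stem + ending for ending in endings]


if __name__ == "__main__":
    for form in expand("CANTAR"):
        print(form)
```

    With a scheme like this, the word list would store one entry per
verb plus one shared group of endings, instead of every conjugated form.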

Word lists:

     I can create one. My idea is to choose 2 different pocket-size
dictionaries and use only the words appearing in both. This would solve
the mistyping problem and, I believe, even the copyright one.
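
     A minimal sketch of that intersection step in Python, assuming each
dictionary has already been typed in as a list of words. The function name
common_words and the lowercase normalization are my own assumptions.

```python
def common_words(list_a, list_b):
    """Return the sorted words that appear in both lists.

    Words are stripped and lowercased before comparison, so a word
    mistyped in only one of the lists is dropped from the result.
    """
    set_a = {w.strip().lower() for w in list_a if w.strip()}
    set_b = {w.strip().lower() for w in list_b if w.strip()}
    return sorted(set_a & set_b)


if __name__ == "__main__":
    first = ["cantar", "Casa", "caza"]      # "caza" is a typo
    second = ["casa", "cantar", "livro"]
    print(common_words(first, second))
```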

Soundlike code:

     I can try, but I can't promise any success with that. If there is
anybody around willing to invest time in soundlike code for Portuguese,
please let me know.

     I believe that's all,

     Rodrigo Severo


