
[Tetum-translators] Re: Tetum wordlist "official"


From: Peter Gossner
Subject: [Tetum-translators] Re: Tetum wordlist "official"
Date: Tue, 25 May 2004 00:13:44 +0930

Wow Hi Kevin !
Mate that is really impressive... FanBloodyTastic
'Scuse the strange mail format; this seemed easier at the time...

<quote who=KevinPS>
Pete,

   Great news -- your word list was more than sufficient
and the crawler bootstrapped without a hitch -- it's grabbed
more than 400,000 words of Tetum already.   
  Here are the docs after just 24 hours online:

 http://borel.slu.edu/tet.html

Perhaps, since you already have this clean list of
5000+ words, I should just run the corpus through
my filters and generate some candidate words for
you to look over -- by the looks of things, without
too much effort on your part we could easily have
a working aspell-tet package sometime this week!
</quote>
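
The candidate-word step Kevin describes above could look roughly like the sketch below. This is only an illustration under assumptions, not Kevin's actual filters: the function name `candidate_words`, the tokenizing regex, and the `min_count` threshold are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters, allowing internal ' or -.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def candidate_words(corpus_text, known_words, min_count=2):
    """Return (word, count) pairs for corpus words missing from the clean list.

    Words are lowercased, counted, and listed most frequent first, so a
    reviewer can check the likeliest real words before the rare ones.
    """
    known = {w.lower() for w in known_words}
    counts = Counter(m.group(0).lower() for m in WORD_RE.finditer(corpus_text))
    return [
        (word, n)
        for word, n in counts.most_common()
        if word not in known and n >= min_count
    ]


if __name__ == "__main__":
    sample = "Bondia maun. Bondia mana. Obrigadu barak, maun."
    print(candidate_words(sample, ["bondia"]))
```

Raising `min_count` trims one-off typos and stray foreign words at the cost of missing rare but valid words, so the threshold is a judgment call for whoever reviews the list.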
<reply who=pete>

Have everything ready to go from ASPELL CVS here... (I think)
The "other Kevin" (Aspell Kevin) is building a demo from the original
5000 list :) I think I get it. Some of the soundslike mappings are going
to be ... interesting. (Tetum with an Au. accent ...lol  nah I won't )

I also now remember why I never bothered to learn C++ :)
(how slow is the compiler !)

</reply>
<quote who=KevinPS>
Let me know and I'll fire off some lists to you.
I guess since you aren't a native speaker this
could be a bit harder than usual (i.e. maybe you'll
need to consult a dictionary occasionally) but the filters
and the frequency counts I'll give you will make things
much easier.
</quote>
<reply who=pete>
Sounds really good.
Please fire away... what else do I need ?
Um 400 thousand might be a good place to STOP :)
LOL. 
</reply>
<quote who=KevinPS>
-Kevin

PS if you see any non-Tetum docs on the page above,
or repeats, please let me know since they'll skew the
stats a bit...
</quote>
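
One simple way to spot the exact repeats Kevin mentions is to fingerprint each document after normalizing whitespace and case. The sketch below is an assumption for illustration only; it is not the crawler's own method, and the names `fingerprint` and `find_repeats` are hypothetical. It catches only verbatim repeats, not near-duplicates.

```python
import hashlib
import re


def fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def find_repeats(docs):
    """Given {name: text}, return (repeat, first_seen) pairs of names."""
    seen = {}
    repeats = []
    for name, text in docs.items():
        fp = fingerprint(text)
        if fp in seen:
            repeats.append((name, seen[fp]))
        else:
            seen[fp] = name
    return repeats


if __name__ == "__main__":
    docs = {"a": "Bondia  maun", "b": "bondia maun\n", "c": "Obrigadu"}
    print(find_repeats(docs))
```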

<reply who=pete>

Looks very clean to me. I will have another look tomorrow.
I thought some may be Indonesian or Portuguese but they seem pretty
clean. 
Mate that's VERY impressive. I guess some may be duplicate
content (the wiki entry seems to be everywhere for instance.. though not
on your list which is also impressive... ) Is the code Open Source?

The original 5000 was fairly carefully gathered from "reputable /
official sources"; looks like the few random pages I sampled were
mostly pure "Dili-Tetum".. some Indonesian borrowed words but that is
real life. That is how the real Timorese use the language. So
for a GP dictionary... Excellent !

I wish my Tetum were that good, but to me it seems great.
I will CC some real Tetum speakers and beta test the aspell dictionary
(of course) with them and some written (paper) references as well.

Kevin this is unbelievably great news ! I thought the 5000 was good :> !

Thank you, thank you.

Pete
</reply>


-- 
Today's fortune:
Today is the first day of the rest of your life.
     
< http://www.gnu.org/software/tetum/ >
< http://bigbutton.com.au/~gossner >
< address@hidden >




