tetum-translators

[Tetum-translators] Fw: Tetun word list


From: Peter Gossner
Subject: [Tetum-translators] Fw: Tetun word list
Date: Thu, 20 May 2004 09:38:49 +0930

Hi all,
Mr Scannell has developed a "crawler" that can be trained to hunt down
words in a language and build a corpus of that language from the
results. (!)

Lev, you would be the best person to communicate with this guy :)
(he is really smart too :)


(Guys, I have had this bloody flu for about a week now. It seems to be
going, but it SUCKS... grrrrrr)


Forwarded message:

Date: Wed, 19 May 2004 08:33:34 -0500
From: Kevin Patrick Scannell <address@hidden>
To: address@hidden
Subject: Re: Tetun word list



Dear Pete,
   I'm sorry for taking so long to respond; I was
away from email for a few days.
   Getting the crawler up and running should be
easy. I suspect it will be hard to find
enough texts but, as you mention
in your message, what is nice is that the
crawler can be run periodically and will
slowly build up a corpus with little or no
human intervention needed.
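
As a rough illustration of how a seed word list can steer such a crawler, here is a minimal Python sketch. It is not Kevin's actual crawler; the function names, the tokenization rule, and the 0.5 threshold are all assumptions made for the example. The idea is to keep a fetched text only when a large enough share of its words appear in the seed list.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into runs of letters.

    This is a deliberately simple rule (assumption): digits, punctuation
    and apostrophes all act as word separators.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def coverage(text, seed_words):
    """Return the fraction of tokens in `text` found in `seed_words`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in seed_words)
    return hits / len(tokens)


def looks_like_target(text, seed_words, threshold=0.5):
    """Guess whether `text` is in the target language.

    `threshold` is a hypothetical cut-off; a real run would tune it
    against texts whose language is already known.
    """
    return coverage(text, seed_words) >= threshold
```

Texts that pass the check could be added to the corpus, and their new words reviewed and folded back into the seed list before the next periodic run.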

  Naturally there is no rush. Whenever you can find
time to send me a word list or some raw texts, I can
get the crawler going with just an hour or so of work.
If you or someone from your team could extract
the headwords from the dictionary I see at
http://www.gnu.org/software/tetum/contributors/cliffMorris-xhtml/ch06.html
that would be an ideal start.
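
One possible way to pull headwords out of an XHTML dictionary page is sketched below in Python, using only the standard library. The markup of the page above has not been checked here: the sketch assumes, purely for illustration, that each headword sits inside a `<b>` element, and the names `HeadwordExtractor` and `extract_headwords` are hypothetical. The tag would need to be changed to match whatever the real page uses.

```python
from html.parser import HTMLParser


class HeadwordExtractor(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, tag="b"):
        super().__init__()
        self.tag = tag          # assumed headword tag, not verified
        self.depth = 0          # > 0 while inside the headword tag
        self.buffer = []        # text pieces of the current headword
        self.headwords = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                word = "".join(self.buffer).strip()
                if word:
                    self.headwords.append(word)
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_headwords(markup, tag="b"):
    """Return the headwords in `markup`, in order, without duplicates."""
    parser = HeadwordExtractor(tag)
    parser.feed(markup)
    parser.close()
    # dict.fromkeys keeps the first occurrence of each word, in order.
    return list(dict.fromkeys(parser.headwords))
```

The resulting list, written out one word per line, would be the kind of word list Kevin asks for.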


> My primary aim is to sit on the egg until a real Tetum Speaker comes
> forward with even basic coding skills.. There are such beasts but they
> are either:
> - without internet access. (Man the Militia really trashed Dili... and
> beyond)
> - really busy with the UN
> - really busy doing various doctorates :)

I've been working with many, many language
groups over the past few months and you're not alone in
"sitting on the egg". Indeed, this was my motivation for
starting this project (the people in a position to make a real
linguistic contribution for these languages rarely have the time
to invest in building the necessary computational infrastructure).

Hope to hear from you again soon.
Best,
Kevin



