Unless somebody else is nearly done with Turkish, Gokalp and I will
probably be working on improving Aspell for Turkish. (That is, we have
already been working on it, and are just waiting on some administrivia
before we start again.)
We'd love to collaborate with anybody else interested in it, or to get feedback on our approach.
Here's some background, and then our approach, if you are interested.
Turkish is an "agglutinative" language, like Finnish, Estonian,
Hungarian, Japanese, and Korean. That means that suffixes convey
a lot more information than in Indo-European languages, and that any
complete list of "surface forms" of words has to be enormously
longer. Though the suffix trees are big, they're quite regular,
so Turkish fits reasonably well into Aspell's structure. (It would fit
better into Hunspell, but for various reasons we can't go there.)
There's a good implementation of Aspell for Finnish which proves the
concept.
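
To make the "surface forms" point concrete, here is a rough Python
sketch of how regular suffix rules multiply one stem into several
forms. It covers only two suffixes (plural -lar/-ler and locative
-da/-de/-ta/-te) using the standard vowel-harmony and consonant-voicing
rules. The function names are made up for illustration; this is not
how Aspell represents affixes.

```python
# Illustrative sketch only: two Turkish suffixes with vowel harmony.
# Function names are hypothetical and unrelated to Aspell's affix format.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
VOICELESS = set("çfhkpsşt")  # after these the locative uses t, not d

def last_vowel(word):
    """Return the last vowel of the word, which drives vowel harmony."""
    for ch in reversed(word):
        if ch in BACK_VOWELS or ch in FRONT_VOWELS:
            return ch
    raise ValueError("no vowel in %r" % word)

def plural(word):
    """Append -lar after a back vowel, -ler after a front vowel."""
    return word + ("lar" if last_vowel(word) in BACK_VOWELS else "ler")

def locative(word):
    """Append -da/-de, or -ta/-te after a voiceless consonant."""
    consonant = "t" if word[-1] in VOICELESS else "d"
    vowel = "a" if last_vowel(word) in BACK_VOWELS else "e"
    return word + consonant + vowel

def surface_forms(stem):
    """Bare stem, plural, locative, and plural + locative."""
    return [stem, plural(stem), locative(stem), locative(plural(stem))]

if __name__ == "__main__":
    for stem in ("ev", "kitap"):
        print(" ".join(surface_forms(stem)))
```

Even with just two suffixes, each stem already yields four forms; real
Turkish words chain many more, which is why a flat word list blows up.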
We hope to take the existing Turkish Aspell word list, or maybe even a
longer word list, if we have time to generate it, and apply a stemmer
to it to come up with a list of the represented stem forms. We'll
connect those up with tables of suffixes we've collected from the web.
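
As a rough idea of the first step, here is a minimal Python sketch
that reduces a word list to its stems by naive longest-suffix
stripping. The suffix sample and function names are invented for
illustration; a real stemmer and real suffix tables would be far more
careful than this.

```python
# Illustrative sketch only: naive longest-suffix stripping.
# SUFFIXES is a tiny hypothetical sample, not a real suffix table.
SUFFIXES = ["lerden", "lardan", "ler", "lar", "den", "dan", "de", "da"]

def strip_suffixes(word, suffixes=SUFFIXES, min_stem=2):
    """Strip the longest matching suffix repeatedly, keeping at least
    min_stem characters of stem."""
    by_length = sorted(suffixes, key=len, reverse=True)
    changed = True
    while changed:
        changed = False
        for suffix in by_length:
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
                word = word[:-len(suffix)]
                changed = True
                break
    return word

def stems_from_word_list(words):
    """Collapse a list of surface forms into a sorted list of unique stems."""
    return sorted({strip_suffixes(w) for w in words})

if __name__ == "__main__":
    words = ["ev", "evler", "evlerden", "kitap", "kitaplar"]
    print(stems_from_word_list(words))
```

Blind stripping like this over-stems words that merely end in
something suffix-like (e.g. "dede"), so the output would still need
checking against a known stem list.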