I'm currently working on my Master's thesis on context-sensitive spell
checking. This topic covers one point of your to-do list, "Rank suggestions
based on frequency information. Both global frequency and document specific
frequency can be used. The latter will require that the whole document be
made available to the spell checker. Also use frequency information to flag
words which are found in the dictionary but not in common usage, and thus
might not be what was intended."
However, I have to use co-occurrence information, which I extract from larger
text corpora. I think this should work much better than frequency information
alone.
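
A minimal Python sketch of the idea, assuming sentence-level co-occurrence counts over a toy corpus. The function names (build_cooccurrence, cooccurrence, rank_suggestions) and the example sentences are made up for illustration and are not taken from any existing spell checker or from the thesis itself.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count how often two distinct words appear in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        words = set(sentence.lower().split())
        # Store each pair once, in alphabetical order.
        for a, b in combinations(sorted(words), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence(counts, a, b):
    """Look up the count for an unordered word pair (0 if never seen)."""
    key = (a, b) if a <= b else (b, a)
    return counts[key]


def rank_suggestions(candidates, context, counts):
    """Order candidates by total co-occurrence with the context words."""
    def score(candidate):
        return sum(cooccurrence(counts, candidate, w)
                   for w in context if w != candidate)
    return sorted(candidates, key=score, reverse=True)


# Toy corpus (an assumption; a real system would use a large corpus).
corpus = [
    "we ate a piece of cake for dessert",
    "she wants a piece of cake",
    "the treaty brought peace to the region",
    "peace and war",
]
counts = build_cooccurrence(corpus)

# Both candidates are valid dictionary words; the context decides.
ranked = rank_suggestions(["peace", "piece"], ["a", "of", "cake"], counts)
print(ranked)
```

With this toy data, "piece" is ranked ahead of "peace" because only "piece" shares sentences with the context words. A frequency-only ranking would tie here, since both words occur twice in the corpus.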