Re: [Freecats-Dev] First review of Free CATS specification docs by Yves Savourel


From: Stanislav Visnovsky
Subject: Re: [Freecats-Dev] First review of Free CATS specification docs by Yves Savourel
Date: Tue, 24 Jun 2003 14:34:17 +0200 (CEST)

Hi all!

I've glanced at the specification quickly. Just a technical note:
Windows-1252 is not acceptable at all. The only way to go is Unicode
(UTF-8 preferably). Even Windows systems run in Unicode now.
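
To illustrate the encoding point, here is a minimal Python sketch (the helper name `can_encode` and the sample word are assumptions made for this example, not part of the specification). It checks whether a string survives a given encoding: Windows-1252 (Python codec `cp1252`) cannot represent many Central European letters, while UTF-8 can encode any Unicode text.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# "k\u016f\u0148" is the Czech word for "horse"; U+016F and U+0148
# are not part of the Windows-1252 character set.
sample = "k\u016f\u0148"

if __name__ == "__main__":
    for codec in ("cp1252", "utf-8"):
        print(codec, can_encode(sample, codec))
```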

Where can I download the prototype to comment on the interface if necessary?

I can't really speak about matching stuff, since you are the expert in the 
area...

Stanislav

BTW, the Berkeley Database you mentioned in another mail is quite 
effective, but a total nightmare for development management: they change 
the file structure with each major release and break binary compatibility 
with the library very often (even in the latest upgrade, 4.0 -> 4.1!)

On Sun, 22 Jun 2003, Henri Chorand wrote:

> Hi all,
> 
> Here is some food for thought: Yves reviewed our specification documents 
> and came up with a first feed-back.
> 
> I hope the attachment will not get lost (I allowed them in the mailing 
> list setup).
> 
> 
> Cheers,
> 
> Henri
> 
> -------- Original Message --------
> Subject: RE: Free CATS specification documents
> Date: Sun, 22 Jun 2003 09:55:11 -0600
> From: "Yves Savourel" <address@hidden>
> To: <address@hidden>
> 
> 
> 
> Hi Henri,
> 
> Thanks for the documents. I've looked at them and come up with a few
> notes and ideas below, in no specific order. Please take them for what
> they are: just 'thinking out loud' type of notes.
> 
> 
> One thing I would suggest for the TM is to take into account a possible
> third type of match: an exact match when the two source segments are
> 100% identical, a fuzzy match for any segment under 100%, and a 'perfect
> match' (I'm not sure what to call it) for the case where it's an exact
> match and you are able, somehow, to detect that the context is also
> identical. This is an important distinction because perfect matches
> normally don't have to be looked at, while exact matches have to be
> looked at by the translator in context. A perfect match could be tied
> to an ID, for example, like a string in a resource file. Having an
> identical source text doesn't mean it's the same instance of text,
> so the context may be different, and so may the translation. But
> having the same text and the same ID makes the match much 'safer'.
> This could obviously come at a later stage of development.
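
The three match types described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Segment` class, its `context_id` field (standing in for something like a resource string ID) and the function `classify_match` are hypothetical names, and the fuzzy score comes from Python's standard `difflib`, not from anything in the Free CATS specification.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional, Tuple


@dataclass
class Segment:
    text: str
    # Hypothetical context marker, e.g. the ID of a resource string.
    context_id: Optional[str] = None


def classify_match(new: Segment, stored: Segment) -> Tuple[str, float]:
    """Return (match type, similarity between 0.0 and 1.0)."""
    if new.text == stored.text:
        # Same text AND same known ID: safe to reuse without review.
        if new.context_id is not None and new.context_id == stored.context_id:
            return ("perfect", 1.0)
        # Same text, unknown or different context: translator should check.
        return ("exact", 1.0)
    # Anything below 100% is a fuzzy match with a similarity score.
    score = SequenceMatcher(None, new.text, stored.text).ratio()
    return ("fuzzy", score)


if __name__ == "__main__":
    stored = Segment("Open", context_id="IDS_MENU_OPEN")
    print(classify_match(Segment("Open", "IDS_MENU_OPEN"), stored))
    print(classify_match(Segment("Open", "IDS_STATUS_OPEN"), stored))
    print(classify_match(Segment("Open file", "IDS_MENU_OPEN"), stored))
```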
> 
> Linked to this, there is a huge aspect of TM that should really be done
> without TM. In my opinion a large part of what TMs do is just a patch, a
> remedy for the symptom of the real problem: the fact that you don't know
> what part of the document has changed between version 1 and 2. An updater
> module would be a huge step forward: a way to compare source doc version
> 1, source doc version 2, and translated doc version 1, and create the
> translated version 2 with the delta left to edit or translate (and then,
> at that point, the translator+TM takes over).
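
A minimal sketch of such an updater, assuming the documents are already split into segments and that translated version 1 is aligned one-to-one with source version 1 (both are assumptions of this example, as is the function name `update_translation`). Unchanged segments carry their old translation over; everything else is left as `None` for the translator and the TM.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def update_translation(
    src_v1: List[str], src_v2: List[str], tgt_v1: List[str]
) -> List[Optional[str]]:
    """Build translated v2: reuse translations of unchanged segments,
    leave None where the source is new or modified."""
    if len(src_v1) != len(tgt_v1):
        raise ValueError("source v1 and translated v1 must be aligned 1:1")
    result: List[Optional[str]] = [None] * len(src_v2)
    matcher = SequenceMatcher(None, src_v1, src_v2, autojunk=False)
    for tag, i1, i2, j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            # Segments i1..i2 of v1 reappear unchanged at j1.. in v2.
            for offset in range(i2 - i1):
                result[j1 + offset] = tgt_v1[i1 + offset]
    return result


if __name__ == "__main__":
    v1 = ["Open the file.", "Click Save.", "Close the window."]
    v2 = ["Open the file.", "Click Save As.", "Choose a folder.",
          "Close the window."]
    fr = ["Ouvrez le fichier.", "Cliquez sur Enregistrer.",
          "Fermez la fenêtre."]
    print(update_translation(v1, v2, fr))
```

Segments that come back as `None` are the delta; a real updater would also need to handle moved and split segments, which this sketch ignores.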
> 
> A note on the TM server repository: You seem to be looking into XML
> databases with an XML-based indexing engine. It's certainly a
> possibility, but don't discard simpler classic databases either.
> Something like MySQL, for example, is free and performs very well. SQL
> offers operators such as LIKE that are quite powerful and already give
> very good query results for fuzzy matching (without you doing anything).
> If on top of that you run such queries on a key field generated from
> the text of the TU (translation unit), it can be quite efficient.
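
The LIKE-on-a-key-field idea might look like the sketch below. To keep it self-contained it uses Python's built-in `sqlite3` module instead of MySQL; the table layout, the column `skey`, and the helpers `make_key`, `build_demo_db` and `candidates` are all assumptions made for illustration. The key here is simply the lowercased words of the source text with digits and punctuation dropped, which is one possible choice among many.

```python
import re
import sqlite3
from typing import List


def make_key(text: str) -> str:
    """Hypothetical key: lowercase words only, no digits or punctuation."""
    return " ".join(re.findall(r"[^\W\d_]+", text.lower()))


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table of translation units with a key column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tu (id INTEGER PRIMARY KEY, source TEXT, "
        "target TEXT, skey TEXT)"
    )
    units = [
        ("Delete the selected file.", "Supprimer le fichier sélectionné."),
        ("Delete file 12.", "Supprimer le fichier 12."),
        ("Save the file.", "Enregistrer le fichier."),
    ]
    conn.executemany(
        "INSERT INTO tu (source, target, skey) VALUES (?, ?, ?)",
        [(s, t, make_key(s)) for s, t in units],
    )
    return conn


def candidates(conn: sqlite3.Connection, text: str) -> List[str]:
    """Return stored sources whose key contains the query words in order."""
    words = make_key(text).split()
    if not words:
        return []
    # e.g. "Delete file 3?" becomes the pattern "%delete%file%"
    pattern = "%" + "%".join(words) + "%"
    rows = conn.execute(
        "SELECT source FROM tu WHERE skey LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(candidates(build_demo_db(), "Delete file 3?"))
```

This only produces a candidate list; the candidates would still need to be scored and ranked by a proper fuzzy-matching step.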
> 
> Related to fuzzy matching: I've also attached an old article
> (Waikoloa.zip) that explains one way to create a simple TM engine. It's
> certainly nothing fancy, but it may help a little in understanding how
> calculating fuzzy matches can work. You can download the source code and
> the executable of the sample at http://www.opentag.com/Waikoloa.zip.
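
As a rough idea of how a fuzzy match percentage can be calculated, here is a sketch based on word-level edit (Levenshtein) distance. This is a commonly used technique chosen for illustration and is not claimed to be the method of the attached article; the function names `edit_distance` and `fuzzy_score` are assumptions.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of words."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def fuzzy_score(a: str, b: str) -> int:
    """Similarity as a percentage: 100 means identical word sequences."""
    words_a, words_b = a.lower().split(), b.lower().split()
    longest = max(len(words_a), len(words_b))
    if longest == 0:
        return 100
    return round(100 * (1 - edit_distance(words_a, words_b) / longest))


if __name__ == "__main__":
    print(fuzzy_score("Click the Save button", "Click the Cancel button"))
```

Working on words rather than characters keeps the score closer to what a translator perceives as "one word changed", but the choice of unit and of normalization is a design decision, not a given.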
> 
> On the topic of interfaces (meaning 'API', not UI): you have probably
> heard of T-Remote, a product from Telelingua, which is basically the
> same thing as your TM server. They have developed an interface and the
> 'connectors' to plug their workbench client into various existing TM
> suites. Maybe they would be interested in some form of collaboration.
> They may see FreeCATS as a threat to their own solution, but even so.
> See for example Philippe Mercier's article in one of the LISA newsletters
> (http://www.lisa.org/archive_domain/newsletters/2003/1.4/mercier.html).
> 
> 
> One other minor thing: Getting files in SXW format made me smile. While
> OO is a good application and a free one, very few people in the Windows
> world (which is, whether we like it or not, the world where by far most
> people in our industry work) have it installed. It's no big deal: they
> are really ZIP files with XML documents inside, but I would have
> expected an open source project to have its documentation output in a
> very common format like HTML. This is nothing against OO as an authoring
> tool, but it's probably not the best format to use for distribution if
> you publish the documents to a wider public :)
> 
> This actually made me think about a possible problem of open source
> projects. Many are a little biased toward Linux, Java, etc., often in
> reaction against Microsoft. But the mainstream of possible users are on
> Windows, and expect Windows-like applications. Cross-platform
> applications often have the drawback of being generic, and therefore not
> exactly what users expect. Those aspects don't seem very important at
> first: in Windows, for example, it would be that you can't use Ctrl+C to
> copy to the Clipboard, or can't use Alt+key to access a menu, etc. On
> the Mac, the user will expect to use whatever special Mac key sequences
> he/she usually uses, etc. In the long run, not having such expected
> behavior becomes very annoying to a user, and it becomes a reason not to
> use the application. I guess my point is that as soon as you talk about
> UI, it is very important to have a UI that really fits the platform it
> targets, or the users will give up. That means a cross-platform
> application may sometimes need to have parts developed specifically for
> each targeted platform.
> 
> 
> OK, that's all for today. It's too nice outside to keep typing on a
> keyboard :)
> 
> Kenavo,
> -yves
> 
> <<...>>
> 
> 
