[Freecats-Dev] OmegaT (cont., from Marc)

From:	Henri Chorand
Subject:	[Freecats-Dev] OmegaT (cont., from Marc)
Date:	Thu, 27 Mar 2003 21:35:34 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Hi all,

Marc recently sent this message, which was intended for our mailing list. Here it is for all to read. Good food for thought, indeed!



Cheers,

Henri

-------- Original Message --------
From: Marc Prior <address@hidden>

Re splitting OmegaT into modular components:

I have been thinking about a modular TM application for some time. Keith and I had in fact already discussed this briefly. The background to this was as follows:

I have been using Linux and promoting its use among other translators for three years now. The absence of a CAT tool for Linux was a serious deficit, both for my own use and for my efforts to promote Linux. I had been in contact with a large number of TM application vendors to discuss the possibility of either a native Linux version, or a version which could be used on Linux (e.g. by using a macro language such as OpenOffice.org's). When it became clear that no vendor was interested in supporting a user base of one, I took the plunge and began learning to program.

My first efforts, around the end of 2001, were in ELF. This is the integral macro language of the Applixware Office suite. I began by "cloning" Wordfast's segmentation function. Initial progress was good (the code may have been a disaster, but I got results that worked), as I was able to use keystroke recording in order to produce routines without any programming skills, and Applixware Office contains a huge library of ready-made macros. When proper programming became necessary, though, I ground to a halt. The reason is that there is only one manual for ELF, and it is not written for beginners. In any case, I had to spend more time translating in order to eat.

I made a couple of forays into Java and Star Basic, but gave up on both. Java I found much too difficult; Star Basic suffered from an almost total lack of documentation.


In the summer of 2002, I discovered tcl/tk. tcl/tk had a number of advantages. Firstly, everyone seemed to think it was easy to learn. Secondly, there were a number of free tutorials on the web, including manuals suitable for beginners. Thirdly, and crucially, I had discovered several tcl/tk applications, free and open-source, that I could use for my purposes. These were:

RSTool - a text segmenter, actually designed for text parsing for linguistic purposes
tkXMLiVE - an XML editor
DING - a dictionary, essentially a GUI for grep
en-rus - a GUI for a Russian dictionary
glimpse - a text indexing and retrieval utility
tkglim - a tcl/tk UI for glimpse

With the exception of glimpse, all of these applications are tcl/tk, free and open-source. My first encounter with tcl/tk was when I discovered that with a little modification, DING could be used to access Wordfast translation memories. A developer may well say "so what", but you must remember that at this point, I couldn't program.

That's how I became hooked on tcl/tk. I had the fantastically naive idea that I could learn it sufficiently well to glue these applications together, and so produce a translation memory without having to do much programming. The user interface would be tkXMLiVE (there was the minor drawback there that I would have to brush up my Russian, as all the documentation was in that language...). This I would modify to produce a dual-window interface.

RSTool would segment the text, and the user would simply overwrite the segmented text in one of the two windows. I would add a routine to extract the segment pairs from the two windows and convert them to Wordfast TM format, and the resulting memories could then be accessed with either DING or glimpse+tkglim.
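
The planned pipeline (segment the text, pair each source segment with its overwritten translation, write the pairs out as a memory) can be sketched roughly as follows. This is only an illustration in Python, not the tcl/tk code discussed here: the function names are hypothetical, the sentence-splitting rule is a naive stand-in for what RSTool does, and the tab-separated output is a simplified placeholder, not the actual Wordfast TM format.

```python
import re


def segment(text):
    """Split text into sentence-like segments.

    Naive rule (an assumption for this sketch): a segment ends at
    '.', '!' or '?' followed by whitespace.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def pair_segments(source_segments, target_segments):
    """Pair each source segment with the translation typed over it."""
    if len(source_segments) != len(target_segments):
        raise ValueError("source and target must have the same number of segments")
    return list(zip(source_segments, target_segments))


def to_tm_lines(pairs):
    """Serialise pairs as 'source<TAB>target' lines (placeholder format)."""
    return [f"{src}\t{tgt}" for src, tgt in pairs]


if __name__ == "__main__":
    source = segment("Guten Morgen. Wie geht es dir?")
    target = ["Good morning.", "How are you?"]
    for line in to_tm_lines(pair_segments(source, target)):
        print(line)
```

A plain line-per-pair text file of this kind is what makes grep-style tools such as DING or glimpse usable as the lookup engine.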


This was, of course, utopian. It may well be that the task could be accomplished with little effort by an experienced tcl developer, but if one has no experience, the effort needed to analyse someone else's code is just as great as, if not greater than, learning to write it oneself. I learnt quite a bit of tcl/tk in the process but made little progress with an application. I also had to do some more translation, in order to eat again.

Then I discovered some of the features of OpenOffice.org, in particular the Sections function. By this time, I had re-discovered the original incarnation of OmegaT (this was before my contact with Keith) and was trying to get it to import OOo files, so I was also familiarizing myself with OOo's XML structure. The solution for a very basic TM was there for the taking. In tcl/tk, I wrote a routine for detecting OOo paragraph boundaries and inserting interleaved OOo sections between them. The text in each paragraph was then copied into the section. Result: when OOo was opened again, each paragraph was there twice; thanks to OOo's sections function, the first paragraph was protected and could be hidden if desired. The translator simply overwrote the second paragraph. Revision was easy, as the paragraphs were interleaved; reading through the final text was also easy as the source text could be hidden. Another tcl routine stripped the source segments from the file to produce the final version.
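
The interleave-then-clean idea can be illustrated with a minimal sketch. It is written in Python rather than tcl/tk and works on a plain list of paragraphs instead of real OOo XML; the `Block` type, the `is_source` flag (standing in for a protected, hideable OOo section) and the function names are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    is_source: bool  # True = protected copy of the original paragraph


def interleave(paragraphs):
    """Emit every paragraph twice: a protected source copy, then an
    editable copy that the translator overwrites."""
    blocks = []
    for paragraph in paragraphs:
        blocks.append(Block(paragraph, is_source=True))
        blocks.append(Block(paragraph, is_source=False))
    return blocks


def clean(blocks):
    """Strip the source copies, leaving only the final (translated) text."""
    return [b.text for b in blocks if not b.is_source]


if __name__ == "__main__":
    doc = interleave(["Erster Absatz.", "Zweiter Absatz."])
    doc[1].text = "First paragraph."   # translator overwrites the editable copies
    doc[3].text = "Second paragraph."
    print(clean(doc))
```

Before cleaning, the interleaved list holds each source paragraph immediately followed by its translation, which is what allows the same file to serve as both working document and memory.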

The translation memory was the "uncleaned" file, and since source and target were kept together, kfind or glimpse were suitable ways of accessing these files in the file system.
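
Why a plain search tool is enough here can be shown with a small sketch. It assumes, purely for illustration, that an uncleaned file has been reduced to alternating lines (source, then target); the function names are hypothetical, and the case-insensitive substring match is a stand-in for what kfind or glimpse would do across many files.

```python
def pairs_from_uncleaned(lines):
    """Read alternating source/target lines as (source, target) pairs.

    Assumes an even number of lines: source first, then its translation.
    """
    if len(lines) % 2 != 0:
        raise ValueError("expected alternating source/target lines")
    return [(lines[i], lines[i + 1]) for i in range(0, len(lines), 2)]


def lookup(pairs, query):
    """Return every pair whose source text contains the query (case-insensitive)."""
    needle = query.casefold()
    return [(src, tgt) for src, tgt in pairs if needle in src.casefold()]


if __name__ == "__main__":
    uncleaned = [
        "Die Datei wurde gespeichert.",
        "The file has been saved.",
        "Die Datei wurde nicht gefunden.",
        "The file was not found.",
    ]
    for src, tgt in lookup(pairs_from_uncleaned(uncleaned), "gespeichert"):
        print(src, "->", tgt)
```

Because each hit already carries its translation next to it, no separate index or database is needed for a very basic memory.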


Once again, I ran out of time, but this time I did manage to produce a
prototype. It doesn't work properly, but the code is fully annotated and
should be comprehensible to anyone who knows tcl/tk, and anyone who is
interested is welcome to it.

The development of OmegaT has made this application largely superfluous for me personally, but I still see great benefits in the basic concept. Separating the user interface from the search/indexing engine means that new user groups can be supported without the whole code having to be re-written.

So this aspect of Free CATS is one which I am very hopeful about. In particular, providing an in-line TM application for OpenOffice.org would meet a demand which already exists, as some translators would like such a product. Most are using OOo on Windows, but it would provide a good introduction to open-source software in general, in the way that OmegaT is already doing.

Having said all that, it has to be technically possible. As I have already discovered, just because something is a good idea doesn't mean that it will work. In the meantime, OmegaT does work, and in my opinion very well. So that, at the moment, is where I'm directing my efforts. I'll say more about that in another message.


Marc
