[Savannah-register-public] [task #5083] Submission of Romenagri Translit

From: Abhishek Choudhary
Subject: [Savannah-register-public] [task #5083] Submission of Romenagri Transliteration System
Date: Fri, 30 Dec 2005 00:25:02 +0000
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; digit_may2002)


                 Summary: Submission of Romenagri Transliteration System
                 Project: Savannah Administration
            Submitted by: hi_pedler
            Submitted on: Fri 12/30/05 at 00:25
         Should Start On: Fri 12/30/05 at 00:00
   Should be Finished on: Mon 01/09/06 at 00:00
                Category: Project Approval
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
        Percent Complete: 0%
             Open/Closed: Open
                  Effort: 0.00



A new project has been registered at Savannah.
The project account will remain inactive until a site admin approves or
discards the registration.


While this item will be useful to track the registration process, approving
or discarding the registration must be done using the specific "Group
Administration" page, accessible only to site administrators who are
logged in as site administrators (superuser):


######### REGISTRATION DETAILS ######### 

Full Name:
  Romenagri Transliteration System

System Group Name:

  non-GNU software & documentation

  GNU General Public License V2 or later

  Romenagri is a GPL'd, non-ambiguous, invertible, case- and
diacritic-independent, compiler-acceptable transliteration system, with the
associated algorithms implemented using GCC for high portability. It may be
used for developing vernacular compilers, besides regular transliteration
work. The authors have independently developed it and demonstrated it to be
applicable to all languages using the North Indian composite syllabic
scripts, viz. Assomiya, Bangla, Devanagari, Gujrati, Oriya and Punjabi.
Romenagri utilises
syllabic complements in Roman script for the symbols of the North Indian
scripts. The mapping for a specific script may be a subset of the complete
mapping owing to the absence of certain characters in the specific case, e.g.
the wa and ba of Devanagari map to a single symbol, ba, in Bangla. The words
are formed by actively concatenating successive syllabic complements, looked
up from a table through an O(n) lookup achieved by using the normalised codes
for the Indian script symbols as an array index. The process of active
concatenation uses a 'de-voweling' operator, the caret (^), which forms an
equivalent of the halanta or hasanta of the Indian scripts and distinguishes
the matra of the vowels by preceding the syllabic complement of their akshara
form. The de-voweling operator, however, does not appear in the output. The
syllabic complement looked up from the mapping table is pushed onto a stack.
On encountering a caret as part of a looked-up complement, the last pushed
vowel character 'a' is popped off the stack and discarded. The remaining
part of the complement, after the caret, is then pushed onto the stack. On
encountering the end of a word, the content of the stack is popped to obtain
the required transliteration, after which the stack is flushed.
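
The stack procedure described above can be sketched roughly as follows. This
is only an illustration in Python, not the authors' GCC implementation: the
table COMPLEMENTS, its five entries and the function name transliterate_word
are hypothetical, and the "normalised codes" are assumed here to be plain
list indices 0..4.

```python
# Hypothetical mapping table: the list index stands in for the normalised
# code of an Indian script symbol, the value is its syllabic complement.
# The real Romenagri table is not reproduced here.
COMPLEMENTS = ["ka", "^ra", "^i", "ya", "^aa"]


def transliterate_word(codes, table=COMPLEMENTS):
    """Concatenate the complements for one word using a stack."""
    stack = []
    for code in codes:
        complement = table[code]  # array-index lookup per symbol
        for ch in complement:
            if ch == "^":
                # De-voweling operator: discard the last pushed 'a'.
                # The caret itself is never pushed, so it cannot
                # appear in the output.
                if stack and stack[-1] == "a":
                    stack.pop()
            else:
                stack.append(ch)
    # End of word: read the stack from bottom to top, then flush it.
    word = "".join(stack)
    stack.clear()
    return word


print(transliterate_word([0, 1, 2, 3, 4]))
```

With the assumed table, the codes 0..4 select ka, ^ra, ^i, ya and ^aa in
turn, so the sketch reproduces the sequence used in the submission's own
example.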

The process of converting Romenagri back to the Indian script representation
is more complex and is achieved by using a recursive descent parser. The
authors have designed the syllabic complements so as to facilitate O(n log n)
parsing. The parser operates at five levels. The word is submitted at level 1,
and the initial syllabic complements are consumed. Successive levels are
entered when there are multiple possibilities, with the final level
identifying a matra. All other symbols are identified at earlier levels. After each
production the parser enters level 1 with the non-consumed part of the

The only phonetic modifier used in Romenagri is the underscore '_' character,
which generally forms a part of the input set of most compilers. This allows
rule-adherent transliteration for keywords written in Indian scripts. The
underscore characters present in the original Indian script text are expanded
to two underscore characters. Hence, the inversion parser treats every paired
underscore as a character and every unpaired underscore as a phonetic
modifier. An instance of Romenagri transliteration with corresponding
syllabic complements is given below.
ka + ^ra + ^i + ya + ^aa = kriyaa
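
The underscore convention described above might be sketched as below. This is
an assumption-laden illustration, not code from the Romenagri sources: the
function names escape_underscores and split_underscores and the token labels
are hypothetical, and pairing underscores greedily from left to right is an
assumed rule for runs of three or more underscores, which the description
does not cover.

```python
def escape_underscores(text):
    """Forward direction: double every underscore in the original text."""
    return text.replace("_", "__")


def split_underscores(romenagri):
    """Inverse direction: tell literal underscores from modifiers.

    A paired underscore becomes ("literal", "_"), a single one becomes
    ("modifier", "_"), and any other character becomes ("text", ch).
    Pairs are taken greedily from the left (an assumption).
    """
    tokens = []
    i = 0
    while i < len(romenagri):
        if romenagri.startswith("__", i):
            tokens.append(("literal", "_"))
            i += 2
        elif romenagri[i] == "_":
            tokens.append(("modifier", "_"))
            i += 1
        else:
            tokens.append(("text", romenagri[i]))
            i += 1
    return tokens


print(escape_underscores("a_b"))
print(split_underscores("a__b_c"))
```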

The source code is currently available along with the Hindawi source code,
but this project has a separate standing technically and academically. The
sources may be downloaded from


Reply to this item at:


  Message sent via/by Savannah
