[Savannah-register-public] [task #5083] Submission of Romenagri Translit
From: Abhishek Choudhary
Subject: [Savannah-register-public] [task #5083] Submission of Romenagri Transliteration System
Date: Fri, 30 Dec 2005 00:25:02 +0000
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; digit_may2002)
URL:
<http://savannah.gnu.org/task/?func=detailitem&item_id=5083>
Summary: Submission of Romenagri Transliteration System
Project: Savannah Administration
Submitted by: hi_pedler
Submitted on: Fri 12/30/05 at 00:25
Should Start On: Fri 12/30/05 at 00:00
Should be Finished on: Mon 01/09/06 at 00:00
Category: Project Approval
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Percent Complete: 0%
Open/Closed: Open
Effort: 0.00
_______________________________________________________
Details:
A new project has been registered at Savannah
The project account will remain inactive until a site admin approves or
discards the registration.
######### REGISTRATION ADMINISTRATION #########
While this item is useful for tracking the registration process, approving
or discarding the registration must be done on the specific "Group
Administration" page, which is accessible only to site administrators who
are logged in as site administrators (superuser):
<https://savannah.gnu.org/admin/groupedit.php?group_id=8230>
######### REGISTRATION DETAILS #########
Full Name:
----------
Romenagri Transliteration System
System Group Name:
-----------------
romenagri
Type:
-----
non-GNU software & documentation
License:
--------
GNU General Public License V2 or later
Description:
------------
Romenagri is a GPL'd transliteration system that is unambiguous, invertible,
independent of case and diacritics, and acceptable as compiler input. The
associated algorithms are implemented using GCC for high portability. It may
be used for developing vernacular compilers, besides regular transliteration
work. The authors have developed it independently and demonstrated it to be
applicable to all languages using the North Indian composite syllabic
scripts, viz. Assomiya, Bangla, Devanagari, Gujarati, Oriya and Punjabi.
Romenagri uses syllabic complements in Roman script for the symbols of the
North Indian scripts. The mapping for a specific script may be a subset of
the complete mapping, because that script lacks certain characters; e.g. the
wa and ba of Devanagari both correspond to the single Bangla symbol ba. The
words are
formed by actively concatenating successive syllabic complements. Each
complement is looked up from a table in O(n) time for a word of n symbols,
by using the normalised codes for the Indian script symbols as an array
index. The process of active concatenation uses a 'de-voweling' operator,
the caret (^), which forms an equivalent of the halanta or hasanta of the
Indian scripts and distinguishes the matra of a vowel by preceding the
syllabic complement of its akshara form. The de-voweling operator, however,
does not appear in the output. The syllabic complement looked up from the
mapping table is pushed onto a stack. On encountering a caret as part of a
looked-up complement, the last pushed vowel character 'a' is popped off the
stack and discarded. The remaining part of the complement, after the caret,
is then pushed onto the stack. On encountering the end of a word, the
content of the stack is read out in order to obtain the required
transliteration, after which the stack is flushed.
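The stack procedure described above can be sketched roughly as follows. This is only an illustration in Python, not the project's GCC-based implementation. The table contents, the integer codes, and the function name transliterate_word are assumptions made for the sketch; only the complements ka, ^ra, ^i, ya and ^aa and the result kriyaa come from the example given later in this description.

```python
# Hypothetical mapping table: the list index stands in for the normalised
# code of an Indian script symbol. The real codes and table are not given
# in this description; these five entries are for illustration only.
TABLE = ["ka", "^ra", "^i", "ya", "^aa"]


def transliterate_word(codes):
    """Concatenate the syllabic complements of one word using a stack."""
    stack = []
    for code in codes:
        complement = TABLE[code]  # array-index lookup, one step per symbol
        before, caret, after = complement.partition("^")
        stack.extend(before)  # push any characters preceding the caret
        if caret:
            # De-voweling: discard the last pushed inherent vowel 'a'.
            if stack and stack[-1] == "a":
                stack.pop()
            stack.extend(after)  # push the part after the caret
    word = "".join(stack)  # read the stack out from bottom to top
    stack.clear()  # flush the stack at the end of the word
    return word


print(transliterate_word([0, 1, 2, 3, 4]))  # kriyaa
```

The caret itself is never pushed, which is why it does not appear in the output.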
The process of converting Romenagri back to the Indian script representation
is more complex and is achieved by using a recursive descent parser. The
authors have designed the syllabic complements so as to facilitate
O(n log n) parsing. The parser operates at 5 levels. The word is submitted
at level 1, and the initial syllabic complements are consumed. Successive
levels are entered when several possibilities remain, with the final level
identifying a matra. All other symbols are identified at earlier levels.
After each production the parser re-enters level 1 with the unconsumed part
of the input.
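The general shape of such a parser can be hinted at with a much smaller sketch. It is not the authors' five-level grammar, which is not spelled out in this description. It is a two-level simplification in Python, and the symbol inventory, the unit names and the function names are all assumptions. Level 1 consumes a consonant, level 2 decides between the inherent 'a', a matra, or a halanta, and after each production the parser re-enters level 1 with the unconsumed input.

```python
# Hypothetical, tiny symbol inventory; the real Romenagri tables are not
# given in this description.
CONSONANTS = ("k", "r", "y")
VOWELS = ("aa", "a", "i")  # longest first, so "aa" is tried before "a"


def parse(word):
    """Level 1: consume one leading symbol, then descend if needed."""
    if not word:
        return []
    for c in CONSONANTS:
        if word.startswith(c):
            return parse_after_consonant(c, word[len(c):])
    for v in VOWELS:
        if word.startswith(v):
            # A vowel not preceded by a consonant is in its akshara form.
            return [("akshara", v)] + parse(word[len(v):])
    raise ValueError("unrecognised input: " + word)


def parse_after_consonant(c, rest):
    """Level 2: inherent 'a', a matra, or no vowel at all (halanta)."""
    for v in VOWELS:
        if rest.startswith(v):
            units = [("consonant", c)]
            if v != "a":  # the inherent vowel needs no matra
                units.append(("matra", v))
            return units + parse(rest[len(v):])  # re-enter level 1
    return [("consonant", c), ("halanta", "^")] + parse(rest)


for unit in parse("kriyaa"):
    print(unit)
```

Each returned unit names one symbol of the Indian script text, so a final table lookup (not shown) would produce the script output.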
The only phonetic modifier used in Romenagri is the underscore '_' character,
which is generally part of the input character set of most compilers. This
allows rule-adherent transliteration of keywords written in Indian scripts.
Each underscore character present in the original Indian script text is
expanded to two underscore characters. Hence, the inversion parser treats
every paired underscore as a literal character and every lone underscore as
a phonetic modifier. An instance of Romenagri transliteration with the
corresponding syllabic complements is given below.
ka + ^ra + ^i + ya + ^aa = kriyaa
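The underscore rule described above can also be sketched briefly. This is a Python illustration, not the project's code; the function names and token labels are assumptions. It also assumes that underscores are paired from left to right, which this description does not state.

```python
def escape_underscores(text):
    """Forward direction: expand each literal underscore to two."""
    return text.replace("_", "__")


def classify_underscores(romenagri):
    """Inverse direction: tell literal underscores from phonetic modifiers."""
    tokens = []
    i = 0
    while i < len(romenagri):
        if romenagri.startswith("__", i):
            tokens.append(("literal", "_"))  # paired underscore
            i += 2
        elif romenagri[i] == "_":
            tokens.append(("modifier", "_"))  # lone underscore
            i += 1
        else:
            tokens.append(("text", romenagri[i]))
            i += 1
    return tokens


print(escape_underscores("x_y"))
print(classify_underscores("a__b_c"))
```

Because literal underscores are always doubled on the way in, a single underscore in Romenagri text can only be a modifier, which keeps the mapping invertible.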
The source code is currently available along with the Hindawi source code,
but this project stands separately, both technically and academically. The
sources may be downloaded from http://www.indicybers.com
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/task/?func=detailitem&item_id=5083>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/