[Savannah-register-public] [task #5082] Submission of Hindawi Vernacular


From: Abhishek Choudhary
Subject: [Savannah-register-public] [task #5082] Submission of Hindawi Vernacular Programming System
Date: Sun, 8 Jan 2006 09:31:35 +0530
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; digit_may2002)

Follow-up Comment #9, task #5082 (project administration):

To be read in conjunction with comment #8

Hi,

Here are some clarifications to issues pointed out by a member of a Hindawi
related discussion group regarding comment #8, with respect to the last
paragraph: (these are my replies to his queries and some which you could
possibly have.)

The final para of comment #8 was:
> Finally, as about Hindawi being *complete* - well, I say 
> *complete* because I have *implemented* or *originally* (*not* 
> forked) localised even *lex* and *yacc*. So we can now have 
> *any* programming language written in Indian languages (Hindi, 
> Bangla, Gujrati, Tamil, Kannada, etc.) This method can also be 
> extended to *every* other human language, but my *personal* 
> focus is on the India languages *for now*.

First, the reference to *lex* here is a reference to the standard family of
*lexical analyser generation tools* or, more appropriately, to the standard
*pattern-action language lex*, which automates the construction of lexical
analysers. This family includes *GNU flex*. For all purposes here, the
compiler for the *lex language* is *flex* and not the original *lex*, which
(probably) is a proprietary program originally written by Eric Schmidt and
Mike Lesk, as per the wiki page http://en.wikipedia.org/wiki/Lex
Please note that Aho, Sethi and Ullman (authors of the "Dragon" book,
Compilers: Principles, Techniques and Tools) refer to the tool or program as
"the Lex compiler", and to its input specification as "the Lex language" (pg
105, sec 3.5, 13th Indian reprint). I choose to adhere to their definition
and refer to the *lex language* by the term *lex*.
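
To make the term "pattern-action language" concrete, here is a minimal
sketch in Python (not lex, flex or Shaili Shabda syntax). The rule list, the
token names and the tokenize() function are made up for illustration: each
rule pairs a pattern with an action, and the longest match wins, with ties
going to the earlier rule.

```python
import re

# Each rule pairs a pattern with an action. The rules and token names
# are hypothetical; they only illustrate the pattern-action idea.
RULES = [
    (re.compile(r"[0-9]+"), lambda text: ("NUMBER", int(text))),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), lambda text: ("IDENT", text)),
    (re.compile(r"[-+*/=]"), lambda text: ("OP", text)),
    (re.compile(r"\s+"), lambda text: None),  # skip whitespace
]


def tokenize(source):
    """Apply the longest-matching rule at each position and run its action."""
    pos = 0
    tokens = []
    while pos < len(source):
        best = None
        for pattern, action in RULES:
            match = pattern.match(source, pos)
            # Strict '>' keeps the earlier rule when two matches tie.
            if match and match.end() > pos and (
                best is None or match.end() > best[0].end()
            ):
                best = (match, action)
        if best is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        match, action = best
        token = action(match.group(0))
        if token is not None:
            tokens.append(token)
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x = 42 + y"))
```

A lex compiler turns a rule set like this into a table-driven scanner
instead of trying each pattern in turn at run time.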

Similarly, the reference to *yacc* is a reference to the standard family of
*parser generation tools*, which includes *GNU bison*. The original *AT&T
yacc*, developed by Stephen C. Johnson, is *not* proprietary any more, as an
open source version of the *original* AT&T yacc is now available with the
standard distributions of Plan 9 and OpenSolaris, as per the wiki page
http://en.wikipedia.org/wiki/Yacc However, for all purposes here, by yacc I
refer to the *program* GNU bison, which accepts *the input specification* for
yacc. The original yacc source code, distributed under the Common
Development and Distribution License, is available at
http://cvs.opensolaris.org/source/xref/on/usr/src/cmd/sgs/yacc/

Secondly, since I am referring to the *flex*, *yacc* and *bison* programs
here, I need to clarify another point, which possibly relates to *forking*. I
have *not* performed any source code modification on these programs. If (and
whenever) I modify the source for these programs, I shall be very happy to
submit the diffs to the *original* authors for inclusion in the original
sources.

Then what have I done and why do I say that I have *implemented* or
*originally* (*not* forked) localised even *lex* and *yacc*?

I have *originally* localised *the language lex* and *the input specification
for yacc* to Indian languages (and similarly for at least one language
belonging to each of the definitive programming paradigms, but we are only
discussing lex and yacc here, as there may be some confusion regarding their
proprietary nature or concerning the issue of forking). Lex and yacc
languages are called "Shaili Shabda" (the language lex in Hindi, Bangla,
Gujarati etc.; Shabda means word, which is used here to imply "token") and
"Shaili Vyaaka" (the input specification for yacc in Hindi, Bangla, Gujarati
etc.; it is also called Shaili Vyaakaran in full; Vyaakaran means grammar).
These are *original* because, obviously, these are *not* copied or forked
from any other programming system, and *no* other similar system exists, to
the best of my knowledge.

What I mean by *implemented* is that I have implemented the languages Shaili
Shabda and Shaili Vyaaka, as described above, along with other vernacular
languages. The current implementation of the tools for these languages is as
a front-end compiler. This is similar to the way in which C++ was first
implemented as a front-end compiler to C, called CFront. These front-end
compilers generate intermediate code which can be accepted by various
back-ends. For my distribution of Hindawi, I have chosen GCC as the back-end.
Someone else may choose some other back-end, as the GPL allows them to do so.
The benefits of having separate front-ends and back-ends for compilers
certainly do not need to be over-emphasised! (With issues such as
optimisation, a new optimising compiler would not be worth the effort when we
already have GCC as a state-of-the-art optimising compiler as a back-end.)
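
The front-end idea can be sketched in a few lines of Python. This is only
an illustration and not the Hindawi implementation: the romanised keywords
in KEYWORDS are made up for this example (they are *not* the actual Shaili
Guru vocabulary), and translate_to_c() is a hypothetical helper that does a
plain keyword substitution, so it would also rewrite keywords inside string
literals and comments.

```python
import re

# Hypothetical romanised keywords, invented for this sketch only.
KEYWORDS = {
    "purnank": "int",
    "yadi": "if",
    "anyatha": "else",
    "vapas": "return",
}

# One whole identifier-like word at a time, so "yadi_x" is not touched.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def translate_to_c(source):
    """Replace each localised keyword with its C equivalent."""
    return WORD.sub(lambda m: KEYWORDS.get(m.group(0), m.group(0)), source)


if __name__ == "__main__":
    program = (
        "purnank max(purnank a, purnank b) "
        "{ yadi (a > b) vapas a; anyatha vapas b; }"
    )
    # The printed text is ordinary C, ready for a back-end such as GCC.
    print(translate_to_c(program))
```

Run as a script, this prints
"int max(int a, int b) { if (a > b) return a; else return b; }", which any
C back-end can then compile. A real front-end would of course need a proper
lexer and parser instead of word substitution.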

Third, there is an issue that flex and bison could themselves be localised to
support Indian languages. Well, yes, but how much, without a major rewrite?
Let me try to explain this, though I am not sure that I can recollect every
practical problem I faced. We can certainly write a lexer with flex which
accepts 8-bit Indic code, but for even a moderately sized programming
language, this 8-bit lexer would be tremendously huge if optimised for speed,
and very slow if optimised for size. I tried this with flex initially, but
when it kept crashing for a moderate set of tokens for Indic programming
languages, I decided to adopt a new approach. I could have continued with
modifying flex, but the effort required would be orders of magnitude greater,
requiring changes to internal structures and much more. Along with this,
consider the fact that Shabda and Vyaaka are not intended as replacements for
lex and yacc. These are intended to support Indic programming languages.
Hence, even the action statements and other things such as buffer functions,
error handlers, startup code etc. are to be written in Indic programming
languages (in this case Shaili Guru, which is the C programming language
localised to Indian languages), so a new language was inevitable. Similar
reasons can be stated for yacc as well.
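
As a rough illustration of the size problem (not a model of flex
internals), the following Python sketch counts the states of a byte-level
trie, which is a simple DFA, for a set of keywords, and the number of
entries in an uncompressed transition table with one column per byte value.
The function names are hypothetical, and UTF-8 is assumed as the encoding
purely for the example; the same counting works for any byte encoding.

```python
def trie_state_count(keywords, encoding="utf-8"):
    """Count the states of a byte-level trie that recognises the keywords."""
    root = {}
    states = 1  # the start state
    for word in keywords:
        node = root
        for byte in word.encode(encoding):
            if byte not in node:
                node[byte] = {}
                states += 1  # every new byte on a new path adds a state
            node = node[byte]
    return states


def full_table_entries(states, alphabet_size=256):
    """Entries in an uncompressed table: one row per state, one column per byte."""
    return states * alphabet_size


if __name__ == "__main__":
    # A single three-character Devanagari word, given as escapes.
    words = ["\u092f\u0926\u093f"]
    states = trie_state_count(words)
    print(states, full_table_entries(states))
```

A word that is three characters long on screen becomes nine bytes under
this assumption, so the state count, and with it the fast uncompressed
table, grows much faster than the visible length of the tokens suggests.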

Regards,
Abhishek Choudhary

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/task/?func=detailitem&item_id=5082>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




