From: Golam Mortuza Hossain
Subject: [Savannah-register-public] [task #7015] Submission of Enhanced Brill's Parts-of-Speech Tagger
Date: Sun, 17 Jun 2007 15:00:19 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)

URL:
  <http://savannah.nongnu.org/task/?7015>

                 Summary: Submission of Enhanced Brill's Parts-of-Speech Tagger
                 Project: Savannah Administration
            Submitted by: golam
            Submitted on: Sunday 06/17/2007 at 15:00
         Should Start On: Sunday 06/17/2007 at 00:00
   Should be Finished on: Wednesday 06/27/2007 at 00:00
                Category: Project Approval
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
        Percent Complete: 0%
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________

Details:

A new project has been registered at Savannah.
This project account will remain inactive until a site admin approves or
discards the registration.


= Registration Administration =

While this item is useful for tracking the registration process, *approving
or discarding the registration must be done using the specific Group
Administration
<https://savannah.nongnu.org/siteadmin/groupedit.php?group_id=9342> page*,
accessible only to site administrators who are effectively *logged in as site
administrators* (superuser):

* Group Administration
<https://savannah.nongnu.org/siteadmin/groupedit.php?group_id=9342>


= Registration Details =

* Name: *Enhanced Brill's Parts-of-Speech Tagger*
* System Name:  *gposttl*
* Type: non-GNU software & documentation
* License: Other (The following licence which is GPL-compatible, applies to
the part originally written by Eric Brill. This part is
marked with copyright notices. The rest of the program is
licensed under GPL v2, see further down.
______________________________________________________________________

        License for the part of the program written by Eric Brill
______________________________________________________________________

This software was written by Eric Brill.

This software is being provided to you, the LICENSEE, by the 
Massachusetts Institute of Technology (M.I.T.) under the following 
license.  By obtaining, using and/or copying this software, you agree 
that you have read, understood, and will comply with these terms and 
conditions:  

Permission to [use, copy, modify and distribute, including the right to 
grant others rights to distribute at any tier, this software and its 
documentation for any purpose and without fee or royalty] is hereby 
granted, provided that you agree to comply with the following copyright 
notice and statements, including the disclaimer, and that the same 
appear on ALL copies of the software and documentation, including 
modifications that you make for internal use or for distribution:


Copyright 1993 by the Massachusetts Institute of Technology and the
University of Pennsylvania.  All rights reserved.  

THIS SOFTWARE IS PROVIDED "AS IS", AND M.I.T. MAKES NO REPRESENTATIONS 
OR WARRANTIES, EXPRESS OR IMPLIED.  By way of example, but not 
limitation, M.I.T. MAKES NO REPRESENTATIONS OR WARRANTIES OF 
MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF 
THE LICENSED SOFTWARE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY 
PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.   

The name of the Massachusetts Institute of Technology or M.I.T. may NOT 
be used in advertising or publicity pertaining to distribution of the 
software.  Title to copyright in this software and any associated 
documentation shall at all times remain with M.I.T., and USER agrees to 
preserve same.  
______________________________________________________________________

        License for the rest of the program 
______________________________________________________________________

                    GNU GENERAL PUBLIC LICENSE
                       Version 2)

----

==== Description: ====
                    
                        GPoSTTL 

(Brill's Parts-of-Speech Tagger, with built-in Tokenizer and Lemmatizer)

GPoSTTL is an enhanced version of Brill's rule-based Parts-of-Speech Tagger
for English, with a built-in Tokenizer and Lemmatizer. It reads from FILE or
STDIN and writes to STDOUT. It is based on the LPost package by Jimmy Lin
(jimmylin at umd.edu). LPost itself is based on Benjamin Han's ePost package,
which is a cleaned-up version of Eric Brill's original code. The primary lemma
list was taken from e_lemma.txt (Ver.1), compiled by Prof. Yasumasa Someya
(someya at someya-net.com), with permission. It has since been extended with
hundreds of additional entries, and that work is ongoing.

Motivations:

    * GPoSTTL has been developed as a free software alternative to
TreeTagger [1], a non-free Penn Treebank tagger developed by Prof. Helmut
Schmid. GPoSTTL can be used as a drop-in substitute for TreeTagger.

As a concrete example, GPoSTTL is used as a crucial component of Anubadok [2],
a GPL'ed English-to-Bengali machine translator.


The default mode of GPoSTTL uses an enhanced Penn tagset to make its output
compatible with the output of TreeTagger. In particular, the second letter of
the verb tags distinguishes between "be" verbs (B), "have" verbs (H) and other
verbs (V). The enhancement is done at the last step of the tagging procedure,
because the lexicon itself contains the original Penn tagset.
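
The verb-tag enhancement described above can be sketched roughly as follows.
This is only an illustration and is not taken from the GPoSTTL sources: the
function name enhance_verb_tag is hypothetical, and the sketch assumes that
the Penn verb tags are exactly those beginning with "VB" and that the
lemmatizer has already reduced the token to its lemma ("be", "have", or
anything else).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of GPoSTTL: rewrite the second letter of a
 * Penn verb tag according to the lemma of the tagged word.
 *   lemma "be"   -> second letter B (the tag is left as it was)
 *   lemma "have" -> second letter H
 *   any other    -> second letter V
 * Tags that do not start with "VB" are copied to `out` unchanged.
 */
static void enhance_verb_tag(const char *penn_tag, const char *lemma,
                             char *out, size_t out_size)
{
    assert(penn_tag != NULL && lemma != NULL && out != NULL);

    /* Copy the original tag; snprintf always leaves room for the NUL. */
    snprintf(out, out_size, "%s", penn_tag);

    /* Assumption: Penn verb tags are VB, VBD, VBG, VBN, VBP and VBZ. */
    if (strncmp(penn_tag, "VB", 2) != 0 || out_size < 3)
        return;

    if (strcmp(lemma, "be") == 0)
        out[1] = 'B';
    else if (strcmp(lemma, "have") == 0)
        out[1] = 'H';
    else
        out[1] = 'V';
}
```

Under these assumptions, the pair ("VBD", "have") gives VHD and ("VBG", "run")
gives VVG, while a non-verb tag such as NN passes through unchanged. Doing the
rewrite as a final pass keeps the lexicon and the tagging rules on the
original Penn tagset.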


GPoSTTL is written in C and the source code is available
from http://www.imsc.res.in/~golam/gposttl/

Ref:

[1] http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html

[2] http://www.imsc.res.in/~golam/anubadok/





==== Other Software Required: ====
The program has no external dependencies apart from
those available in a free operating system.


==== Other Comments: ====
The project is currently hosted on my personal webpage at IMSc[1].
I recently left IMSc after completing my PhD, so the current
web space will expire within a few months.

[1] http://www.imsc.res.in/~golam/







    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/task/?7015>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/




