Online Lexical Knowledge Base for Marathi
 

The Prime Minister of India will release the Marathi CD, which contains software for Marathi language processing developed at IIT Bombay and CDAC. IIT Bombay's contribution has been the creation of high-end tools and resources, including the Marathi Wordnet, a spell checker and e-dictionaries. These tools and resources have been developed at the Centre for Indian Language Technologies (CFILT) in the Computer Science and Engineering Department at IIT Bombay, under the project Technology Development for Indian Languages. The CD will be distributed through CDAC.

An online lexical knowledge base for the Marathi language, the Marathi Wordnet contains sets of synonymous words, called synsets, linked by semantic relations such as hypernymy, meronymy and cross-part-of-speech linkage. Currently, there are about 10,000 synsets, corresponding to about 22,000 unique words. On average, 23 synsets are created and 12 corrected every day. The web interface is at www.cfilt.iitb.ac.in/wordnet/webmwn. When completed, with about 25,000 synsets, the resource is expected to prove indispensable for machine translation from and into Marathi, language teaching, text mining and information extraction. The resource has already been linked with the Global Wordnet Grid: www.globalwordnet.org.
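
For readers unfamiliar with wordnets, the structure described above (synsets connected by typed semantic relations) can be sketched roughly as follows. This is only an illustration, not the Marathi Wordnet's actual data format or API: the class names, field names and relation labels are hypothetical, and the English sample words are placeholders rather than entries taken from the resource.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """A set of synonymous words plus typed links to other synsets (hypothetical layout)."""
    synset_id: int
    pos: str                       # part of speech, e.g. "noun"
    words: List[str]
    relations: Dict[str, List[int]] = field(default_factory=dict)

    def link(self, relation: str, target: "Synset") -> None:
        # Record a directed relation such as "hypernym" or "meronym".
        self.relations.setdefault(relation, []).append(target.synset_id)


class Wordnet:
    """A tiny in-memory store with a word-to-synset index."""

    def __init__(self) -> None:
        self.synsets: Dict[int, Synset] = {}
        self.index: Dict[str, List[int]] = {}

    def add(self, synset: Synset) -> Synset:
        self.synsets[synset.synset_id] = synset
        for word in synset.words:
            self.index.setdefault(word, []).append(synset.synset_id)
        return synset

    def lookup(self, word: str) -> List[Synset]:
        # One word may belong to several synsets (one per sense).
        return [self.synsets[i] for i in self.index.get(word, [])]

    def hypernym_chain(self, synset: Synset) -> List[Synset]:
        # Follow the first hypernym link upward until none remains.
        chain: List[Synset] = []
        seen = {synset.synset_id}
        current = synset
        while True:
            parents = current.relations.get("hypernym", [])
            if not parents or parents[0] in seen:
                break
            current = self.synsets[parents[0]]
            seen.add(current.synset_id)
            chain.append(current)
        return chain


# Placeholder data for illustration only.
wn = Wordnet()
plant = wn.add(Synset(1, "noun", ["plant", "flora"]))
tree = wn.add(Synset(2, "noun", ["tree"]))
leaf = wn.add(Synset(3, "noun", ["leaf", "foliage"]))
tree.link("hypernym", plant)   # a tree is a kind of plant
tree.link("meronym", leaf)     # a leaf is a part of a tree
```

Calling `wn.lookup("flora")` returns the synset that also contains "plant", and `wn.hypernym_chain(tree)` walks from the more specific synset to the more general one. Applications such as machine translation or information extraction typically rely on this kind of traversal.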

Contact: pb@cse.iitb.ac.in