28 research outputs found
Evaluation of Computational Grammar Formalisms for Indian Languages
Natural language parsing has been the most prominent research area since the genesis of Natural Language Processing. Probabilistic parsers are being developed to make parser development easier, more accurate, and faster. In the Indian context, which computational grammar formalism to use is still an open question. In this paper we focus on this problem and analyze different formalisms for Indian languages.
Input Scheme for Hindi Using Phonetic Mapping
Written communication on computers requires knowing how to enter text in the desired language. Most people do not use any language other than English on a computer, which creates a barrier. To resolve this issue, we have developed a scheme for entering Hindi text using phonetic mapping. Using this scheme, we generate intermediate code strings and match them with the pronunciations of the input text. Our system shows significant success over other available input systems.
Plagiarism Detection: Keeping Check on Misuse of Intellectual Property
Today, plagiarism has become a menace. Every journal editor and conference organizer has to deal with this problem. Simply copying or rephrasing text without giving due credit to the original author has become more common. This is considered intellectual property theft. We are developing a plagiarism detection tool to deal with this problem. In this paper we discuss the common tools available to detect plagiarism, their shortcomings, and the advantages of our tool over them.
Design of English-Hindi Translation Memory for Efficient Translation
Developing parallel corpora is an important and difficult activity for
Machine Translation. It requires manual annotation by human translators.
Translating the same text again is wasted effort. Tools that avoid this are
available for European languages, but no such tool is available for Indian
languages. In this paper we present a tool for Indian languages which not only
automatically supplies previously stored translations but also, where a
sentence has multiple translations, presents them as a ranked list of
suggested translations. Moreover, the tool offers translators global and local
saving options for their work, so that they may share it with others, which
further lightens the task.
Comment: Proceedings of National Conference in Recent Advances in Computer
Engineering, 201
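The ranked-suggestion behaviour described in the translation memory abstract above can be illustrated with a minimal sketch. The abstract does not say how the tool stores or ranks translations, so everything here is an assumption: the `TranslationMemory` class, its `add` and `suggest` methods, the 0.6 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all hypothetical, and the example sentences are invented.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: source sentence -> list of stored translations."""

    def __init__(self):
        self._store = {}

    def add(self, source, target):
        """Store a translation; one source may have several translations."""
        targets = self._store.setdefault(source, [])
        if target not in targets:
            targets.append(target)

    def suggest(self, sentence, threshold=0.6):
        """Return (score, source, translation) tuples, best match first.

        Similarity is difflib's character-level ratio (an assumption; the
        paper's ranking method is not given in the abstract).
        """
        scored = []
        for source, targets in self._store.items():
            score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
            if score >= threshold:
                scored.extend((score, source, t) for t in targets)
        # sort() is stable, so translations with equal scores keep insertion order
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored


tm = TranslationMemory()
tm.add("Thank you", "धन्यवाद")
tm.add("Thank you", "शुक्रिया")
tm.add("How are you?", "आप कैसे हैं?")

for score, source, translation in tm.suggest("Thank you!"):
    print(round(score, 2), source, translation)
```

An exact match scores 1.0, and a near match such as "Thank you!" still retrieves both stored translations of "Thank you", which is the reuse the abstract aims at.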
OntoAna: Domain Ontology for Human Anatomy
Today, we can find many search engines which provide us with information
which is more operational in nature. None of the search engines provide domain
specific information. This becomes very troublesome to a novice user who wishes
to have information in a particular domain. In this paper, we have developed an
ontology which can be used by a domain specific search engine. We have
developed an ontology on human anatomy, which captures information regarding
the cardiovascular system, digestive system, skeleton and nervous system. This
information can be used by people working in the medical and health care domains.
Comment: Proceedings of 5th CSI National Conference on Education and Research.
Organized by Lingayay University, Faridabad. Sponsored by Computer Society of
India and IEEE Delhi Chapter. Proceedings published by Lingayay University
Pres
Part of Speech Tagging of Marathi Text Using Trigram Method
In this paper we present a part-of-speech tagger for Marathi, a morphologically rich language spoken by the native people of Maharashtra. The tagger follows a statistical approach based on the trigram method. The main idea of the trigram method is to find the most likely POS tag for a token given the previous two tags, computing probabilities to determine the best tag sequence. We describe the development of the tagger and present its evaluation.
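The trigram idea in the abstract above, choosing each tag from the two preceding tags and keeping the most probable tag sequence, can be sketched as follows. This is a generic illustration, not the paper's implementation: the function names `train` and `viterbi_tag`, the add-one smoothing, and the tiny English toy corpus are all assumptions made for the example.

```python
from collections import defaultdict
import math

START = "<s>"  # padding tag for the two positions before the sentence


def train(tagged_sentences):
    """Count tag trigrams, their two-tag contexts, and word/tag emissions."""
    tri, ctx = defaultdict(int), defaultdict(int)
    emit, tag_count = defaultdict(int), defaultdict(int)
    vocab = set()
    for sentence in tagged_sentences:
        t2, t1 = START, START
        for word, t in sentence:
            tri[(t2, t1, t)] += 1
            ctx[(t2, t1)] += 1
            emit[(t, word)] += 1
            tag_count[t] += 1
            vocab.add(word)
            t2, t1 = t1, t
    return tri, ctx, emit, tag_count, vocab


def viterbi_tag(words, model):
    """Return the most probable tag sequence under the trigram model."""
    tri, ctx, emit, tag_count, vocab = model
    tags = sorted(tag_count)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unknown words

    def log_trans(t2, t1, t):  # add-one smoothed P(t | t2, t1)
        return math.log((tri[(t2, t1, t)] + 1) / (ctx[(t2, t1)] + n_tags))

    def log_emit(t, w):  # add-one smoothed P(w | t)
        return math.log((emit[(t, w)] + 1) / (tag_count[t] + n_words))

    # best[(t1, t)] = (log score, path) of the best path ending in tags t1, t
    best = {(START, START): (0.0, [])}
    for w in words:
        new_best = {}
        for (t2, t1), (score, path) in best.items():
            for t in tags:
                s = score + log_trans(t2, t1, t) + log_emit(t, w)
                if (t1, t) not in new_best or s > new_best[(t1, t)][0]:
                    new_best[(t1, t)] = (s, path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Invented toy corpus (English, for readability); not data from the paper.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train(TOY_CORPUS)
print(viterbi_tag(["the", "cat", "barks"], model))
```

Keeping one best path per pair of trailing tags is what lets the search stay exact while conditioning on two previous tags; a real tagger would also need a stronger smoothing or back-off scheme than add-one.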
A Lightweight Stemmer for Gujarati
Gujarati is a resource-poor language with almost no language processing tools available. In this paper we present an implementation of a rule-based stemmer for Gujarati. We describe the creation of the stemming rules and the morphological richness of Gujarati. We have also evaluated our results by having them verified by a human expert.
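A rule-based stemmer of the kind the abstract above describes is often built as longest-match suffix stripping. The sketch below shows that general technique only. The abstract does not list the paper's rules, so the suffix list (a few common Gujarati case and plural markers), the `stem` function, and the minimum stem length are assumptions chosen for illustration.

```python
# Illustrative suffixes only (an assumption, NOT the paper's rule set):
# a few common Gujarati case markers and the plural marker.
SUFFIXES = ["માં", "નું", "ને", "થી", "નો", "ની", "ના", "ઓ"]


def stem(word, suffixes=SUFFIXES, min_stem_len=2):
    """Strip the longest matching suffix, keeping at least min_stem_len code points."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no rule applied


for w in ["ઘરમાં", "ઘરનું", "ઘર"]:
    print(w, "->", stem(w))
```

Trying longer suffixes first avoids stripping only part of a longer ending, and the minimum stem length guards against reducing short words to nothing. Lengths here are counted in Unicode code points, which is a simplification for a script with combining vowel signs.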