Named Entity Recognizer for Telugu language using Hybrid approach
The main goal of Named Entity Recognition (NER) is to classify all Named Entities (NEs) in a document into predefined classes such as Person name, Location name, Organization name, and Miscellaneous. This paper outlines a Named Entity Recognizer that uses a hybrid approach, i.e., a combination of a rule-based approach and a machine learning technique, Conditional Random Fields (CRF). In the rule-based approach, we prepared gazetteer lists for names of persons, locations, and organizations, some suffix and prefix features, and a dictionary of 350,266 words to recognize the category of named entities. If ambiguity arises while using the rule-based approach, we use the machine learning technique, CRF, to improve the accuracy.
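The control flow described in this abstract (rules first, a statistical model only when the rules are ambiguous) can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the gazetteer entries, the suffix cues, and the `stub_statistical_tagger` function are all hypothetical, and the stub merely stands in for a trained CRF. Treating "no rule fired" as a case for the fallback is also an assumption.

```python
from typing import Callable, Dict, Set

# Hypothetical toy gazetteers (romanized); the paper's real lists are not shown.
GAZETTEERS: Dict[str, Set[str]] = {
    "PERSON": {"ravi", "sita", "ramapur"},
    "LOCATION": {"hyderabad", "guntur"},
    "ORGANIZATION": {"infosys"},
}

# Hypothetical suffix cues mapping a word ending to a class.
SUFFIX_RULES: Dict[str, str] = {"pur": "LOCATION", "abad": "LOCATION"}


def rule_candidates(token: str) -> Set[str]:
    """Collect every class suggested by a gazetteer hit or a suffix rule."""
    t = token.lower()
    labels = {label for label, names in GAZETTEERS.items() if t in names}
    labels |= {label for suffix, label in SUFFIX_RULES.items() if t.endswith(suffix)}
    return labels


def stub_statistical_tagger(token: str, candidates: Set[str]) -> str:
    """Placeholder for a trained CRF: picks a candidate deterministically."""
    return sorted(candidates)[0] if candidates else "MISC"


def hybrid_tag(token: str, fallback: Callable[[str, Set[str]], str]) -> str:
    """Use the rules when they agree on one class; otherwise defer to the fallback."""
    candidates = rule_candidates(token)
    if len(candidates) == 1:
        return next(iter(candidates))
    # Ambiguous (several classes) or unknown (no class): assumption, both defer.
    return fallback(token, candidates)
```

In a real system the fallback would be a sequence model that sees the whole sentence rather than a single token; the stub only marks where that model would plug in.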
Hybrid Approach to English-Hindi Name Entity Transliteration
Machine translation (MT) research in Indian languages is still in its infancy. Not much work has been done on the proper transliteration of named entities in this domain. In this paper we address this issue. We used the English-Hindi language pair for our experiments and followed a hybrid approach. First, we processed English words with a rule-based approach that extracts individual phonemes from the words; we then applied a statistical approach that converts each English phoneme into its equivalent Hindi phoneme and, in turn, the corresponding Hindi word. Through this approach we attained 83.40% accuracy.
Comment: Proceedings of IEEE Students' Conference on Electrical, Electronics and Computer Sciences 201
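The two-stage pipeline in this abstract (rule-based segmentation, then a statistical mapping to Hindi units) might look something like the sketch below. Everything specific here is an assumption for illustration: the consonant-plus-vowel regular expression is a crude stand-in for the paper's phoneme extraction rules, and the `MAPPING` probabilities are invented, not learned values from the paper.

```python
import re
from typing import Dict, List

# Crude stand-in rule: optional consonants followed by vowels, or leftover consonants.
SEGMENT_RE = re.compile(r"[^aeiou]*[aeiou]+|[^aeiou]+")

# Hypothetical probability table P(hindi_unit | english_segment).
MAPPING: Dict[str, Dict[str, float]] = {
    "ra": {"रा": 0.7, "र": 0.3},
    "m": {"म": 1.0},
}


def segment(word: str) -> List[str]:
    """Stage 1 (rule-based): split a lowercase word into phoneme-like segments."""
    return SEGMENT_RE.findall(word.lower())


def transliterate(word: str) -> str:
    """Stage 2 (statistical): choose the most probable Hindi unit per segment."""
    out = []
    for seg in segment(word):
        options = MAPPING.get(seg)
        # Segments missing from the toy table are passed through unchanged.
        out.append(max(options, key=options.get) if options else seg)
    return "".join(out)
```

Choosing the best unit independently for each segment is the simplest possible decoder; a fuller system would typically score whole sequences instead.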
Rule Based Transliteration Scheme for English to Punjabi
Machine transliteration has emerged as an important research area in the field of machine translation. Transliteration aims to preserve the phonological structure of words. Proper transliteration of named entities plays a significant role in improving the quality of machine translation. In this paper we perform machine transliteration for the English-Punjabi language pair using a rule-based approach. We constructed rules for syllabification, the process of separating words into syllables. We calculate probabilities for named entities (proper names and locations). For words that do not fall under the category of named entities, separate probabilities are calculated using relative frequency through a statistical machine translation toolkit known as MOSES. Using these probabilities, we transliterate the input text from English to Punjabi.
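Relative-frequency estimation, as mentioned in this abstract, amounts to dividing the count of each (source, target) pair by the total count of the source unit. The sketch below shows only that calculation on a handful of invented syllable alignments; it does not use or imitate the MOSES toolkit's interface, and the function name and sample pairs are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def relative_frequencies(
    pairs: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Estimate P(target | source) = count(source, target) / count(source)."""
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    source_counts = Counter(src for src, _ in pairs)
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return dict(table)


# Invented aligned English-syllable / Punjabi-unit pairs, for illustration only.
SAMPLE_PAIRS = [("ra", "ਰਾ"), ("ra", "ਰਾ"), ("ra", "ਰ"), ("m", "ਮ")]
TABLE = relative_frequencies(SAMPLE_PAIRS)
```

With these sample pairs, `TABLE["ra"]` assigns two thirds of the probability mass to the first Punjabi unit and one third to the second, since the source syllable was seen three times.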