41 research outputs found

    Part of Speech Tagging of Marathi Text Using Trigram Method

    In this paper we present a part-of-speech (POS) tagger for Marathi, a morphologically rich language spoken natively in Maharashtra. The tagger takes a statistical approach based on the trigram method. The core idea of the trigram method is to find the most likely POS tag for a token given the previous two tags, computing probabilities to determine the best tag sequence. We describe the development of the tagger and also report its evaluation
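
    The trigram idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the add-one smoothing, the greedy left-to-right decoding (rather than a full Viterbi search), and the names `TOY_CORPUS`, `train`, and `tag_sentence` are all assumptions made for illustration.

```python
from collections import Counter

START = "<s>"  # padding tag for the two positions before the sentence

# Hypothetical toy corpus; a real tagger would train on annotated Marathi text.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train(tagged_sentences):
    """Count tag trigrams, their two-tag contexts, and word/tag emissions."""
    trigrams, contexts = Counter(), Counter()
    emissions, tag_counts = Counter(), Counter()
    for sentence in tagged_sentences:
        prev2, prev1 = START, START
        for word, tag in sentence:
            trigrams[(prev2, prev1, tag)] += 1
            contexts[(prev2, prev1)] += 1
            emissions[(word, tag)] += 1
            tag_counts[tag] += 1
            prev2, prev1 = prev1, tag
    return trigrams, contexts, emissions, tag_counts

def tag_sentence(words, model):
    """Greedily pick, for each word, the tag maximizing
    P(tag | previous two tags) * P(word | tag), both add-one smoothed."""
    trigrams, contexts, emissions, tag_counts = model
    tags = sorted(tag_counts)
    vocab_size = len({word for word, _ in emissions})
    prev2, prev1 = START, START
    result = []
    for word in words:
        def score(tag):
            transition = (trigrams[(prev2, prev1, tag)] + 1) / (
                contexts[(prev2, prev1)] + len(tags))
            # The extra +1 reserves probability mass for unseen words.
            emission = (emissions[(word, tag)] + 1) / (
                tag_counts[tag] + vocab_size + 1)
            return transition * emission
        best = max(tags, key=score)
        result.append(best)
        prev2, prev1 = prev1, best
    return result
```

    For example, `tag_sentence(["a", "cat", "barks"], train(TOY_CORPUS))` tags each word using the two tags chosen before it. A greedy decoder can commit to an early mistake, which is why trigram taggers are usually decoded with a search over whole tag sequences.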

    Context-based Sentiment analysis of Indian Marathi Text using Deep Learning

    In Digital India, the Internet plays a crucial role in communication. Although English is widely used for this purpose, the Internet imposes no language barrier. India is a multilingual country with great linguistic and social diversity, and a growing trend is for people to post their views, thoughts, feedback, and comments on social media and blogs in their mother tongue. These views matter to organizations of every size (small, medium, and large enterprises) seeking to improve their products or services. Such data accumulates rapidly every day and needs to be identified and processed. Little work has been done on processing Indian languages, and the traditional approaches used so far largely ignore the context of the text. In this research, sentiment analysis is performed on a Marathi dataset using a supervised algorithm, Multinomial Naïve Bayes. In addition, the deep learning NLP model Bidirectional Encoder Representations from Transformers (BERT) is fine-tuned for this task with the aim of achieving higher accuracy and state-of-the-art results

    Exploring sentence level query expansion in language modeling based information retrieval

    We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query; the second adds a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher mean average precision (MAP) than extracting terms from full documents (up to 5.9% improvement in MAP for English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP across parameter settings; iv) query expansion based on sentences can improve performance even for topics with low initial retrieval precision, where standard BRF fails
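
    The two expansion strategies described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the abstract does not say how sentence similarity is measured or how the sentence count decreases, so the sketch assumes cosine similarity over raw term counts, naive whitespace tokenization, and a count that drops by one per document rank. The function names are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts using naive whitespace tokenization."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def top_sentences(query, document, k):
    """Return the k sentences of a document most similar to the query."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    query_vec = term_vector(query)
    ranked = sorted(sentences,
                    key=lambda s: cosine(query_vec, term_vector(s)),
                    reverse=True)
    return ranked[:k]

def expand_query(query, ranked_docs, k, decreasing=False):
    """Append similar sentences from pseudo-relevant documents to the query.

    decreasing=False: take k sentences from every document (fixed variant).
    decreasing=True: take k from the top document, k - 1 from the next,
    and so on (assumed schedule for the variable variant).
    """
    added = []
    for rank, doc in enumerate(ranked_docs):
        n = max(k - rank, 0) if decreasing else k
        added.extend(top_sentences(query, doc, n))
    return " ".join([query] + added)
```

    Here `ranked_docs` stands for the top documents from an initial retrieval run, ordered by rank. The expanded string would then be issued as a new query against the same index.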