
    Survey on Publicly Available Sinhala Natural Language Processing Tools and Research

    Sinhala is the native language of the Sinhalese people, who make up the largest ethnic group of Sri Lanka. The language belongs to the globe-spanning language tree, Indo-European. However, due to poverty in both linguistic and economic capital, Sinhala, from the perspective of Natural Language Processing tools and research, remains a resource-poor language which has neither the economic drive its cousin English has nor the sheer push of the law of numbers a language such as Chinese has. A number of research groups from Sri Lanka have noticed this dearth and the resultant dire need for proper tools and research for Sinhala natural language processing. However, for various reasons, these attempts seem to lack coordination and awareness of each other. The objective of this paper is to fill that gap with a comprehensive literature survey of the publicly available Sinhala natural language tools and research, so that researchers working in this field can better utilize the contributions of their peers. As such, we shall be uploading this paper to arXiv and updating it periodically to reflect the advances made in the field.

    Automatic Keyword Extraction from Dravidian Language

    Keywords are significant words in a document that give the reader a description of its content. They provide a summary of the document. Nowadays, the amount of electronic text is increasing rapidly in all languages, so text mining applications take advantage of keywords for processing documents. There are few proposed methods for keyword extraction, and not much work has been done on keyword extraction for Indian languages. With the exponential increase in information in Indian languages on the web, automatic information processing and retrieval have become an urgent need. Text mining is essential for knowledge discovery from the valuable texts available in many Indian languages. This paper introduces a method that extracts keywords from Dravidian languages of India such as Tamil, Telugu, and Kannada. We made an attempt to extract the keywords by using the word statistics of a document

    A Comprehensive Review of Sentiment Analysis on Indian Regional Languages: Techniques, Challenges, and Trends

    Sentiment analysis (SA) is the process of understanding emotion within a text. It helps identify the opinion, attitude, and tone of a text, categorizing it as positive, negative, or neutral. SA is frequently used today as more and more people get a chance to share their thoughts thanks to the advent of social media. Sentiment analysis benefits industries around the globe, such as finance, advertising, marketing, travel, and hospitality. Although the majority of work done in this field is on global languages like English, in recent years the importance of SA in local languages has also been widely recognized. This has led to considerable research on the analysis of Indian regional languages. This paper comprehensively reviews SA in the following major Indian regional languages: Marathi, Hindi, Tamil, Telugu, Malayalam, Bengali, Gujarati, and Urdu. Furthermore, this paper presents techniques, challenges, findings, recent research trends, and future scope for enhancing the accuracy of results.