1,513 research outputs found
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
Sinhala is the native language of the Sinhalese people who make up the
largest ethnic group of Sri Lanka. The language belongs to the globe-spanning
language tree, Indo-European. However, due to poverty in both linguistic and
economic capital, Sinhala, in the perspective of Natural Language Processing
tools and research, remains a resource-poor language which has neither the
economic drive its cousin English has nor the sheer push of the law of numbers
a language such as Chinese has. A number of research groups from Sri Lanka have
noticed this dearth and the resultant dire need for proper tools and research
for Sinhala natural language processing. However, due to various reasons, these
attempts seem to lack coordination and awareness of each other. The objective
of this paper is to fill that gap of a comprehensive literature survey of the
publicly available Sinhala natural language tools and research so that the
researchers working in this field can better utilize contributions of their
peers. As such, we shall be uploading this paper to arXiv and perpetually
update it periodically to reflect the advances made in the field
Automatic Keyword Extraction from Dravidian Language
Keywords are significant words in a document that gives description of its content to the reader. They provide the summary of a document. Now a days the amount of electronic text increases rapidly in all the languages. So the text mining applications take the advantage of keywords for processing documents. There are few proposed methods for keyword extraction. But not much work has been done in keyword extraction for Indian languages. With exponential increase in the information in Indian languages on the web, automatic information processing and retrieval become an urgent need. Text Mining is essential for knowledge discovery from valuable texts available in many Indian languages. This paper introduces a method, which extracts the keywords from Dravidian languages of India like Tamil, Telugu, and Kannada. We made an attempt to extract the keywords by using words statistics of a documen
A Comprehensive Review of Sentiment Analysis on Indian Regional Languages: Techniques, Challenges, and Trends
Sentiment analysis (SA) is the process of understanding emotion within a text. It helps identify the opinion, attitude, and tone of a text categorizing it into positive, negative, or neutral. SA is frequently used today as more and more people get a chance to put out their thoughts due to the advent of social media. Sentiment analysis benefits industries around the globe, like finance, advertising, marketing, travel, hospitality, etc. Although the majority of work done in this field is on global languages like English, in recent years, the importance of SA in local languages has also been widely recognized. This has led to considerable research in the analysis of Indian regional languages. This paper comprehensively reviews SA in the following major Indian Regional languages: Marathi, Hindi, Tamil, Telugu, Malayalam, Bengali, Gujarati, and Urdu. Furthermore, this paper presents techniques, challenges, findings, recent research trends, and future scope for enhancing results accuracy
- …