
Using sentiment analysis technique for analyzing Thai customer satisfaction from social media

Author: Chumwatana Todsanai
Publication date: 11/08/2015

With the rapidly increasing number of Thai online customer reviews available on social media and websites, sentiment analysis, also called opinion mining, has become an important task in the past few years. This technique aims to analyze people’s emotions, opinions, attitudes, and sentiments. The classical approaches to opinion mining represent reviews as bags of words, since many words can be used to identify positive or negative feedback. These methods therefore work well with reviews in European languages, which are segmented texts. However, bag-of-words methods face a problem with Thai customer reviews, which are non-segmented text: Thai is written as a long sequence of characters without word boundaries. To date, little research has been conducted on sentiment analysis for Thai customer reviews. This paper proposes a sentiment analysis technique for Thai customer reviews. The proposed technique integrates Thai word extraction with sentiment analysis to mine Thai customers’ opinions. To demonstrate the proposed technique, the paper presents experimental studies analyzing Thai customer reviews from social media. The results show that the proposed method provides significant benefits for mining Thai customers’ opinions from social media.
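The pipeline outlined in this abstract (extract words from unsegmented text, then score them as a bag of words) might be sketched roughly as follows. This is only an illustration, not the paper's method: the greedy longest-match segmenter, the names `VOCAB`, `POLARITY`, `segment`, and `sentiment_score`, and the tiny word lists are all assumptions made up for this example.

```python
# Toy vocabulary and sentiment lexicon. These entries are illustrative
# assumptions, not resources taken from the paper.
VOCAB = {"อาหาร", "อร่อย", "บริการ", "แย่", "ดี", "มาก"}
POLARITY = {"อร่อย": 1, "ดี": 1, "แย่": -1}


def segment(text, vocab):
    """Greedy longest-match segmentation of text without word boundaries."""
    longest = max(map(len, vocab))
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when no vocabulary word starts at position i.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


def sentiment_score(text):
    """Sum lexicon polarities over the extracted words (bag-of-words)."""
    return sum(POLARITY.get(tok, 0) for tok in segment(text, VOCAB))
```

With this toy lexicon, `segment("อาหารอร่อยมาก", VOCAB)` splits the review into three words and `sentiment_score("บริการแย่")` evaluates to -1. A real system would need a far larger dictionary and some handling of negation and unknown words, which this sketch ignores.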

UUM Repository

Resource Generation from Structured Documents for Low-density Languages

Author: Karagol-Ayan Burcu
Publication date: 27/08/2007

The availability and use of electronic resources for both manual and automated language-related processing has increased tremendously in recent years. Nevertheless, many resources still exist only in printed form, restricting their availability and use. This holds especially true for low-density languages, that is, languages with limited electronic resources. For these documents, automated conversion into electronic resources is highly desirable. This thesis focuses on the semi-automated conversion of printed structured documents (dictionaries in particular) to usable electronic representations. In the first part we present an entry tagging system that recognizes, parses, and tags the entries of a printed dictionary to reproduce the representation. The system uses the consistent layout and structure of the dictionaries, and the features that impose this structure, to capture and recover lexicographic information. We accomplish this by adapting two methods: rule-based and HMM-based. The system is designed to produce results quickly with minimal human assistance and reasonable accuracy. The use of adaptive transformation-based learning as a post-processor at two points in the system yields significant improvements, even with an extremely small amount of user-provided training data. The second part of this thesis presents Morphology Induction from Noisy Data (MIND), a natural language morphology discovery framework that operates on information from limited, noisy data obtained from the conversion process. To use the resulting resources effectively, however, users must be able to search them using the root form of a morphologically deformed variant found in the text. Stemming and data-driven methods are not suitable when data are sparse. The approach is based on the novel application of string searching algorithms.
The evaluations show that MIND can segment words into roots and affixes from the noisy, limited data contained in a dictionary, and it can extract prefixes, suffixes, circumfixes, and infixes. MIND can also identify morphophonemic changes, i.e., phonemic variations between allomorphs of a morpheme, specifically point-of-affixation stem changes. This, in turn, allows non-native speakers to perform multilingual tasks for applications where response must be rapid and their knowledge is limited. In addition, this analysis can feed other natural language processing tools requiring lexicons.
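To give a feel for how string matching can separate a root from its affixes, here is a minimal sketch. It is not MIND's algorithm, which the abstract does not specify: it swaps in a plain longest-common-substring alignment from Python's standard `difflib`, and the function name `split_affixes` and the `root_residue` field are hypothetical.

```python
from difflib import SequenceMatcher


def split_affixes(root, variant):
    """Align a variant with its root via their longest common substring."""
    matcher = SequenceMatcher(None, root, variant, autojunk=False)
    m = matcher.find_longest_match(0, len(root), 0, len(variant))
    return {
        # Shared material between root and variant.
        "stem": variant[m.b:m.b + m.size],
        # Whatever the variant adds before and after the shared stem.
        "prefix": variant[:m.b],
        "suffix": variant[m.b + m.size:],
        # Root characters missing from the variant; a non-empty value
        # hints at a stem change at the point of affixation.
        "root_residue": root[:m.a] + root[m.a + m.size:],
    }
```

For the pair `("spiel", "gespielt")` this yields prefix `ge` and suffix `t`, a circumfix pattern, while `("happy", "happiness")` leaves `y` as root residue, pointing to a stem change. A single contiguous match cannot represent infixes, so this sketch covers only part of what the abstract attributes to MIND.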

Digital Repository at the University of Maryland