4 research outputs found

    English and Malay cross-lingual sentiment lexicon acquisition and analysis

    Get PDF
    Sentiment analysis finds opinions, sentiments or emotions in user-generated contents. Most efforts are focusing on the English language, for which a large amount of sources and tools for sentiment analysis are available. The objective of this paper is to introduce a cross-lingual sentiment lexicon acquisition method for the Malay and English languages and further being test on a set of news test collections. Several part of speech tags are being experimented using the Word Score Summation technique in order to classify the sentiment of the news articles. This method records up to 50% as experimental accuracy result and works better for verbs and negations in both the English and Malay news articles

    Generating a Malay sentiment lexicon based on wordnet

    Get PDF
    Sentiment lexicon is a list of vocabularies that consists of positive and negative words. In opinion mining, sentiment lexicon is one of the important source in text polarity classification task in sentiment analysis model. Studies in Malay sentiment analysis is increasing since the volume of sentiment data is growing on social media. Therefore, requirement in Malay sentiment lexicon is high. However, Malay sentiment lexicon development is a difficult task due to the scarcity of Malay language resource. Thus, various approaches and techniques are used to generate sentiment lexicon. The objective of this paper is to develop Malay sentiment lexicon generation algorithm based on WordNet. In this study, the method is to map the WordNet Bahasa with English WordNet to get the offset value of a seed set of sentiment words. The seed set is used to generate the synonym and antonym semantic relation in English WordNet. The highest result achives 86.58% agreement with human annotators and 91.31% F1-measure in word polarity classification. The result shows the effectiveness of the proposed algorithm to generate Malay sentiment lexicon based on WordNet

    Developing Cross-Lingual Sentiment Analysis Of Malay Twitter Data Using Lexicon-Based Approach

    Get PDF
    Sentiment analysis is a process of detecting and classifying sentiments into positive, negative or neutral. Most sentiment analysis research focus on English lexicon vocabularies. However, Malay is still under-resourced. Research of sentiment analysis in Malaysia social media is challenging due to mixed language usage of English and Malay. The objective of this study was to develop a cross-lingual sentiment analysis using lexicon based approach. Two lexicons of languages are combined in the system, then, the Twitter data were collected and the results were determined using graph. The results showed that the classifier was able to determine the sentiments. This study is significant for companies and governments to understand people's opinion on social network especially in Malay speaking regions

    MELex: a new lexicon for sentiment analysis in mining public opinion of Malaysia affordable housing projects

    Get PDF
    Sentiment analysis has the potential as an analytical tool to understand the preferences of the public. It has become one of the most active and progressively popular areas in information retrieval and text mining. However, in the Malaysia context, the sentiment analysis is still limited due to the lack of sentiment lexicon. Thus, the focus of this study is to a new lexicon and enhance the classification accuracy of sentiment analysis in mining public opinion for Malaysia affordable housing project. The new lexicon for sentiment analysis is constructed by using a bilingual and domain-specific sentiment lexicon approach. A detailed review of existing approaches has been conducted and a new bilingual sentiment lexicon known as MELex (Malay-English Lexicon) has been generated. The developed approach is able to analyze text for two most widely used languages in Malaysia, Malay and English, with better accuracy. The process of constructing MELex involves three activities: seed words selection, polarity assignment and synonym expansions, with four different experiments have been implemented. It is evaluated based on the experimentation and case study approaches where PR1MA and PPAM are selected as case projects. Based on the comparative results over 2,230 testing data, the study reveals that the classification using MELex outperforms the existing approaches with the accuracy achieved for PR1MA and PPAM projects are 90.02% and 89.17%, respectively. This indicates the capabilities of MELex in classifying public sentiment towards PRIMA and PPAM housing projects. The study has shown promising and better results in property domain as compared to the previous research. Hence, the lexicon-based approach implemented in this study can reflect the reliability of the sentiment lexicon in classifying public sentiments
    corecore