143 research outputs found

    A context based model for sentiment analysis in twitter for the italian language

    Recent work on sentiment analysis over Twitter assumes that the polarity of a tweet can be determined by looking at each message in isolation. However, tweets are embedded in streams of posts and conversations, which makes a wider context available and can improve the quality of automatic analysis. The contribution of this information was recently investigated for English by modeling polarity detection as a sequential classification task over streams of tweets (Vanzo et al., 2014). Here, we verify the applicability of this method to a morphologically richer language, namely Italian.

    Opinion mining: Reviewed from word to document level

    Opinion mining is one of the most challenging tasks in the field of information retrieval. The research community has published a number of articles on this topic, but a significant increase in interest has been observed during the past decade, especially after the launch of several online social networks. In this paper, we provide a detailed overview of related work on opinion mining. The following features distinguish our review from similar works: (1) it presents a different perspective on the field by discussing work at different granularity levels (word, sentence, and document levels); (2) it discusses related work in terms of the challenges of opinion mining; (3) its document-level discussion gives an overview of the opinion mining task in the blogosphere, one of the most popular online social networks; and (4) it highlights the importance of online social networks for opinion mining and related sub-tasks.

    A Supervised Approach for Sentiment Analysis using Skipgrams and its Application to Sentiment Visualisation in Social Media

    In this Ph.D. thesis we propose, as fundamental research, the design, development and evaluation of a supervised approach for sentiment analysis. This work is based on the hypothesis that an efficient use of skipgram modelling can improve sentiment analysis tasks and reduce the resources they need. In summary, it consists of a supervised approach that uses machine learning techniques and skipgrams as information units, focused mainly on skipgram selection and filtering. This approach will be evaluated and compared to current state-of-the-art techniques. In addition, as applied research we propose a sentiment visualisation tool, strongly integrated with our sentiment analysis approach. This tool is oriented towards social media, measuring reputation and user interactions in real time. This research work has been partially funded by Generalitat Valenciana through project "SIIA: Tecnologías del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible" with grant reference PROMETEU/2018/089, and by the Spanish Government and FEDER through the project RTI2018-094653-B-C22: "Modelang: Modeling the behavior of digital entities by Human Language Technologies" ("LIVING-LANG: Living Digital Entities by Human Language Technologies").

    Cross-domain polarity classification using a knowledge-enhanced meta-classifier

    Current approaches to single and cross-domain polarity classification usually use bag of words, n-grams or lexical resource-based classifiers. In this paper, we propose the use of meta-learning to combine and enrich those approaches by also adding other knowledge-based features. In addition to the aforementioned classical approaches, our system uses the BabelNet multilingual semantic network to generate features derived from word sense disambiguation and vocabulary expansion. Experimental results show state-of-the-art performance on single and cross-domain polarity classification. Contrary to other approaches, ours is generic. These results were obtained without any domain adaptation technique. Moreover, the use of meta-learning allows our approach to obtain the most stable results across domains. Finally, our empirical analysis provides interesting insights on the use of semantic network-based features. Funding: European Commission WIQ-EI IRSES (No. 269180); Ministerio de Economía y Competitividad TIN2012-38603-C02-01; Ministerio de Economía y Competitividad TIN2012-38536-C03-02; Junta de Andalucía P11-TIC-7684 M

    A Supervised Approach for Sentiment Lexicon Generation using Word Skipgrams

    This Ph.D. thesis proposes the design, development and evaluation of a supervised approach for sentiment lexicon generation. It is based on the hypothesis that an efficient use of skipgram modelling can improve sentiment analysis tasks and reduce the resources needed while maintaining an acceptable level of quality. In summary, the novelty of this approach lies in the use of skipgrams as information units and the way they are efficiently generated, weighted and filtered, taking advantage of the useful information they provide about the sequentiality of the language. This research work has been supported by TRIVIAL (PID2021-122263OB-C22) funded by MCIN/AEI/10.13039/501100011033 and by "European Union Regional Development Fund (ERDF) A way of making Europe", by the "European Union NextGenerationEU/PRTR".
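
As a rough illustration of the skipgrams mentioned in this abstract, the sketch below enumerates k-skip-n-grams, assuming the common definition (n-token subsequences with at most k skipped tokens in total). The function name `skipgrams` and its parameters are hypothetical and are not taken from the thesis; the weighting and filtering steps it describes are not reproduced here.

```python
from itertools import combinations


def skipgrams(tokens, n=2, k=1):
    """Return all k-skip-n-grams of `tokens` as tuples.

    Assumed definition: an n-token subsequence, in original order,
    that skips at most k tokens in total between its first and last token.
    """
    result = []
    for start in range(len(tokens)):
        # The last token may sit at most (n - 1 + k) positions after the first.
        window_end = min(start + n + k, len(tokens))
        # Choose the remaining n - 1 positions inside that window.
        for rest in combinations(range(start + 1, window_end), n - 1):
            result.append(tuple(tokens[i] for i in (start,) + rest))
    return result


if __name__ == "__main__":
    # With one skip allowed, the pair ("not", "good") is captured even though
    # "very" sits between the two words.
    print(skipgrams(["not", "very", "good"], n=2, k=1))
```

Setting `k=0` reduces the output to ordinary n-grams, which makes the extra pairs contributed by skipping easy to inspect.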

    A Hybrid Framework for Sentiment Analysis Using Genetic Algorithm Based Feature Reduction

    © 2019 IEEE. Due to the rapid development of Internet technologies and social media, sentiment analysis has become an important opinion mining technique. Recent research work has described the effectiveness of different sentiment classification techniques ranging from simple rule-based and lexicon-based approaches to more complex machine learning algorithms. While lexicon-based approaches have suffered from the lack of dictionaries and labeled data, machine learning approaches have fallen short in terms of accuracy. This paper proposes an integrated framework which bridges the gap between lexicon-based and machine learning approaches to achieve better accuracy and scalability. To solve the scalability issue that arises as the feature-set grows, a novel genetic algorithm (GA)-based feature reduction technique is proposed. By using this hybrid approach, we are able to reduce the feature-set size by up to 42% without compromising the accuracy. The comparison of our feature reduction technique with the more widely used principal component analysis (PCA) and latent semantic analysis (LSA) based feature reduction techniques has shown up to 15.4% increased accuracy over PCA and up to 40.2% increased accuracy over LSA. Furthermore, we also evaluate our sentiment analysis framework on other metrics including precision, recall, F-measure, and feature size. In order to demonstrate the efficacy of GA-based designs, we also propose a novel cross-disciplinary area of geopolitics as a case study application for our sentiment analysis framework. The experimental results show that the framework accurately measures public sentiments and views regarding various topics such as terrorism, global conflicts, and social issues. We envisage the applicability of our proposed work in various areas including security and surveillance, law-and-order, and public administration.

    Improving Spanish Polarity Classification Combining Different Linguistic Resources

    Sentiment analysis is a challenging task which is attracting the attention of researchers. However, most work focuses only on English documents, perhaps due to the lack of linguistic resources for other languages. In this paper, we present several Spanish opinion mining resources in order to develop a polarity classification system. In addition, we propose the combination of different features extracted from each resource in order to train a classifier over two different opinion corpora. We show that the integration of knowledge from several resources can improve the final Spanish polarity classification system. The good results encourage us to continue developing sentiment resources for Spanish, and studying the combination of features extracted from different resources. Funding: Ministerio de Economía y Competitividad TIN2012-38536-C03-0; Junta de Andalucía P11-TIC-7684; Universidad de Jaén CEATIC-2013-0

    ADAPT at IJCNLP-2017 Task 4: a multinomial naive Bayes classification approach for customer feedback analysis task

    In this age of the digital economy, organisations do their best to engage customers in the feedback provisioning process. With the assistance of customer insights, an organisation can develop a better product and provide a better service to its customers. In this paper, we analyse real-world samples of customer feedback from Microsoft Office customers in four languages (English, French, Spanish and Japanese) and arrive at a five-plus-one-class categorisation (comment, request, bug, complaint, meaningless and undetermined) for meaning classification. The task is to determine which class(es) the customer feedback sentences should be annotated with in the four languages. We propose the following approaches to accomplish this task: (i) a multinomial naive Bayes (MNB) approach for multilabel classification, (ii) MNB with a one-vs-rest classifier approach, and (iii) a combination of the multilabel classification based and the sentiment classification based approaches. Our best system produces F-scores of 0.67, 0.83, 0.72 and 0.7 for English, Spanish, French and Japanese, respectively. The results are competitive with the best ones for all languages and secure the 3rd and 5th positions for Japanese and French, respectively, among all submitted systems.

    A review of key planning and scheduling in the rail industry in Europe and UK

    Planning and scheduling activities within the rail industry have benefited from developments in computer-based simulation and modelling techniques over the last 25 years. Increasingly, the use of computational intelligence in such tasks is featuring more heavily in research publications. This paper examines a number of common rail-based planning and scheduling activities and how they benefit from five broad technology approaches. Summary tables of papers are provided relating to rail planning and scheduling activities and to the use of expert and decision systems in the rail industry. EPSR

    Contextual semantics for sentiment analysis of Twitter

    Sentiment analysis on Twitter has attracted much attention recently due to its wide applications in both commercial and public sectors. In this paper we present SentiCircles, a lexicon-based approach for sentiment analysis on Twitter. Unlike typical lexicon-based approaches, which offer fixed and static prior sentiment polarities of words regardless of their context, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly. Our approach allows for the detection of sentiment at both entity level and tweet level. We evaluate our proposed approach on three Twitter datasets using three different sentiment lexicons to derive word prior sentiments. Results show that our approach significantly outperforms the baselines in accuracy and F-measure for entity-level subjectivity (neutral vs. polar) and polarity (positive vs. negative) detection. For tweet-level sentiment detection, our approach performs better than the state-of-the-art SentiStrength by 4–5% in accuracy on two datasets, but falls marginally behind by 1% in F-measure on the third dataset.