29 research outputs found
Recommended from our members
Semantic Sentiment Analysis of Microblogs
Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people's opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians to support their assessment of public opinions about their policies or political issues.
A wide range of approaches to sentiment analysis on Twitter, and other similar microblogging platforms, have been recently built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment (e.g., "great'', "terrible''). However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. This is problematic since the sentiment of words, in many cases, is associated with their semantics, either along the context they occur within (e.g., "great'' is negative in the context "pain'') or the conceptual meaning associated with the words (e.g., "Ebola" is negative when its associated semantic concept is "Virus").
This thesis investigates the role of words' semantics in sentiment analysis of microblogs, aiming mainly at addressing the above problem. In particular, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, several approaches are proposed in this thesis for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words' co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources).
Experiments are conducted with both types of semantics by assessing their impact in three popular sentiment analysis tasks on Twitter; entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. Evaluation under each sentiment analysis task includes several sentiment lexicons, and up to 9 Twitter datasets of different characteristics, as well as comparing against several state-of-the-art sentiment analysis approaches widely used in the literature.
The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter. The proposed approaches, which consider words' semantics for sentiment analysis at both, entity and tweet levels, surpass non-semantic approaches in most datasets
Multilingual sentiment analysis in social media.
252 p.This thesis addresses the task of analysing sentiment in messages coming from social media. The ultimate goal was to develop a Sentiment Analysis system for Basque. However, because of the socio-linguistic reality of the Basque language a tool providing only analysis for Basque would not be enough for a real world application. Thus, we set out to develop a multilingual system, including Basque, English, French and Spanish.The thesis addresses the following challenges to build such a system:- Analysing methods for creating Sentiment lexicons, suitable for less resourced languages.- Analysis of social media (specifically Twitter): Tweets pose several challenges in order to understand and extract opinions from such messages. Language identification and microtext normalization are addressed.- Research the state of the art in polarity classification, and develop a supervised classifier that is tested against well known social media benchmarks.- Develop a social media monitor capable of analysing sentiment with respect to specific events, products or organizations
EVALITA Evaluation of NLP and Speech Tools for Italian Proceedings of the Final Workshop
Editor of the proceedings of EVALITA 2016
Real-Time Stock Market Recommendation & Prediction using Multi Source Data
Stock investors must be cognizant of both the current price of their stock and the price at which they want to sell it in the future. This does not stop investors to monitor past price patterns and apply their knowledge to the present. ’Past performance is not an indicator of future success’, as the saying goes. To put it another way, historical stock data alone isn’t enough to forecast future stock prices. Another key factor to consider in a trading strategy is the impact of market psychology. Financial data, which is a type of multimedia data, provides a wealth of information that has been widely used for data analysis tasks. However, predicting stock prices remains a popular study topic for investors and financial scholars. Forecasting stock prices has become an extremely difficult undertaking because of the significant noise, nonlinearity, and volatility of stock price statistic data
Leveraging Longitudinal Data for Personalized Prediction and Word Representations
This thesis focuses on personalization, word representations, and longitudinal dialog. We first look at users expressions of individual preferences. In this targeted sentiment task, we find that we can improve entity extraction and sentiment classification using domain lexicons and linear term weighting. This task is important to personalization and dialog systems, as targets need to be identified in conversation and personal preferences affect how the system should react. Then we examine individuals with large amounts of personal conversational data in order to better predict what people will say. We consider extra-linguistic features that can be used to predict behavior and to predict the relationship between interlocutors. We show that these features improve over just using message content and that training on personal data leads to much better performance than training on a sample from all other users. We look not just at using personal data for these end-tasks, but also constructing personalized word representations. When we have a lot of data for an individual, we create personalized word embeddings that improve performance on language modeling and authorship attribution. When we have limited data, but we have user demographics, we can instead construct demographic word embeddings. We show that these representations improve language modeling and word association performance. When we do not have demographic information, we show that using a small amount of data from an individual, we can calculate similarity to existing users and interpolate or leverage data from these users to improve language modeling performance. Using these types of personalized word representations, we are able to provide insight into what words vary more across users and demographics. The kind of personalized representations that we introduce in this work allow for applications such as predictive typing, style transfer, and dialog systems. Importantly, they also have the potential to enable more equitable language models, with improved performance for those demographic groups that have little representation in the data.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/167971/1/cfwelch_1.pd
Sentiment Analysis of Textual Content in Social Networks. From Hand-Crafted to Deep Learning-Based Models
Aquesta tesi proposa diversos mètodes avançats per analitzar automà ticament el contingut textual compartit a les xarxes socials i identificar les opinions, emocions i sentiments a diferents nivells d’anà lisi i en diferents idiomes.
Comencem proposant un sistema d’anà lisi de sentiments, anomenat SentiRich, basat en un conjunt ric d’atributs, inclosa la informació extreta de lèxics de sentiments i models de word embedding pre-entrenats. A continuació, proposem un sistema basat en Xarxes Neurals Convolucionals i regressors XGboost per resoldre una sèrie de tasques d’anà lisi de sentiments i emocions a Twitter. Aquestes tasques van des de les tasques tÃpiques d’anà lisi de sentiments fins a determinar automà ticament la intensitat d’una emoció (com ara alegria, por, ira, etc.) i la intensitat del sentiment dels autors a partir dels seus tweets. També proposem un nou sistema basat en Deep Learning per solucionar el problema de classificació de les emocions múltiples a Twitter. A més, es va considerar el problema de l’anà lisi del sentiment depenent de l’objectiu. Per a aquest propòsit, proposem un sistema basat en Deep Learning que identifica i extreu l'objectiu dels tweets. Tot i que alguns idiomes, com l’anglès, disposen d’una à mplia gamma de recursos per permetre l’anà lisi del sentiment, a la majoria de llenguatges els hi manca. Per tant, utilitzem la tècnica d'anà lisi de sentiments entre idiomes per desenvolupar un sistema nou, multilingüe i basat en Deep Learning per a llenguatges amb pocs recursos lingüÃstics. Proposem combinar l’ajuda a la presa de decisions multi-criteri i anà lisis de sentiments per desenvolupar un sistema que permeti als usuaris la possibilitat d’explotar tant les opinions com les seves preferències en el procés de classificació d’alternatives. Finalment, vam aplicar els sistemes desenvolupats al camp de la comunicació de les marques de destinació a través de les xarxes socials. Amb aquesta finalitat, hem recollit tweets de persones locals, visitants i els gabinets oficials de Turisme de diferents destinacions turÃstiques i es van analitzar les opinions i les emocions compartides en ells. En general, els mètodes proposats en aquesta tesi milloren el rendiment dels enfocaments d’última generació i mostren troballes apassionants.Esta tesis propone varios métodos avanzados para analizar automáticamente el contenido textual compartido en las redes sociales e identificar opiniones, emociones y sentimientos, en diferentes niveles de análisis y en diferentes idiomas. Comenzamos proponiendo un sistema de análisis de sentimientos, llamado SentiRich, que está basado en un conjunto rico de caracterÃsticas, que incluyen la información extraÃda de léxicos de sentimientos y modelos de word embedding previamente entrenados. Luego, proponemos un sistema basado en redes neuronales convolucionales y regresores XGboost para resolver una variedad de tareas de análisis de sentimientos y emociones en Twitter. Estas tareas van desde las tÃpicas tareas de análisis de sentimientos hasta la determinación automática de la intensidad de una emoción (como alegrÃa, miedo, ira, etc.) y la intensidad del sentimiento de los autores de los tweets. También proponemos un novedoso sistema basado en Deep Learning para abordar el problema de clasificación de emociones múltiples en Twitter. Además, consideramos el problema del análisis de sentimientos dependiente del objetivo. Para este propósito, proponemos un sistema basado en Deep Learning que identifica y extrae el objetivo de los tweets.
Si bien algunos idiomas, como el inglés, tienen una amplia gama de recursos para permitir el análisis de sentimientos, la mayorÃa de los idiomas carecen de ellos. Por lo tanto, utilizamos la técnica de Análisis de Sentimiento Inter-lingual para desarrollar un sistema novedoso, multilingüe y basado en Deep Learning para los lenguajes con pocos recursos lingüÃsticos. Proponemos combinar la Ayuda a la Toma de Decisiones Multi-criterio y el análisis de sentimientos para desarrollar un sistema que brinde a los usuarios la capacidad de explotar las opiniones junto con sus preferencias en el proceso de clasificación de alternativas. Finalmente, aplicamos los sistemas desarrollados al campo de la comunicación de las marcas de destino a través de las redes sociales. Con este fin, recopilamos tweets de personas locales, visitantes, y gabinetes oficiales de Turismo de diferentes destinos turÃsticos y analizamos las opiniones y las emociones compartidas en ellos. En general, los métodos propuestos en esta tesis mejoran el rendimiento de los enfoques de vanguardia y muestran hallazgos interesa.This thesis proposes several advanced methods to automatically analyse textual content shared on social networks and identify people’ opinions, emotions and feelings at a different level of analysis and in different languages.
We start by proposing a sentiment analysis system, called SentiRich, based on a set of rich features, including the information extracted from sentiment lexicons and pre-trained word embedding models. Then, we propose an ensemble system based on Convolutional Neural Networks and XGboost regressors to solve an array of sentiment and emotion analysis tasks on Twitter. These tasks range from the typical sentiment analysis tasks, to automatically determining the intensity of an emotion (such as joy, fear, anger, etc.) and the intensity of sentiment (aka valence) of the authors from their tweets. We also propose a novel Deep Learning-based system to address the multiple emotion classification problem on Twitter. Moreover, we considered the problem of target-dependent sentiment analysis. For this purpose, we propose a Deep Learning-based system that identifies and extracts the target of the tweets.
While some languages, such as English, have a vast array of resources to enable sentiment analysis, most low-resource languages lack them. So, we utilise the Cross-lingual Sentiment Analysis technique to develop a novel, multi-lingual and Deep Learning-based system for low resource languages. We propose to combine Multi-Criteria Decision Aid and sentiment analysis to develop a system that gives users the ability to exploit reviews alongside their preferences in the process of alternatives ranking. Finally, we applied the developed systems to the field of communication of destination brands through social networks. To this end, we collected tweets of local people, visitors, and official brand destination offices from different tourist destinations and analysed the opinions and the emotions shared in these tweets