298 research outputs found

    A Multilevel Approach to Sentiment Analysis of Figurative Language in Twitter

    [EN] A commendable amount of work has been done in the field of sentiment analysis, or opinion mining, on natural language texts and Twitter texts. One of the main goals in such tasks is to assign a polarity (positive or negative) to a piece of text. At the same time, an important and difficult issue is how to assign a degree of positivity or negativity to a given text. The problem becomes more complex when the same task is performed on figurative language texts collected from Twitter. Figurative language devices such as irony and sarcasm contain an intentional secondary or extended meaning hidden within the expressions. In this paper we present a novel approach to identifying the degree of sentiment (fine-grained, on an 11-point scale) for figurative language texts. We used several semantic features, such as sentiment and intensifiers, and we introduced sentiment abruptness, which measures the variation of sentiment from positive to negative or vice versa. We trained our systems at multiple levels and achieved a maximum cosine similarity of 0.823 and a minimum mean square error of 2.170.
    The work reported in this paper is supported by a grant from the project “CLIA System Phase II” funded by the Department of Electronics and Information Technology (DeitY), Ministry of Communications and Information Technology (MCIT), Government of India. The work of the fourth author is also supported by the SomEMBED TIN2015-71147-C2-1-P MINECO research project and by the Generalitat Valenciana under the grant ALMAPATER (PrometeoII/2014/030).
    Gopal Patra, B.; Mazumda, S.; Das, D.; Rosso, P.; Bandyopadhyay, S. (2018). A Multilevel Approach to Sentiment Analysis of Figurative Language in Twitter. Lecture Notes in Computer Science. 9624:281-291. https://doi.org/10.1007/978-3-319-75487-1_22
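
    The two evaluation figures quoted in the abstract above (cosine similarity and mean square error) and the idea of sentiment abruptness can be illustrated with a minimal sketch. The function names, the per-word score lists, and the -5 to 5 range used for the 11-point scale are assumptions made for illustration. The abruptness function in particular is only one plausible reading of "variation of sentiment from positive to negative or vice versa", not the authors' definition.

```python
import math


def cosine_similarity(pred, gold):
    """Cosine of the angle between two equal-length score vectors."""
    if len(pred) != len(gold):
        raise ValueError("vectors must have the same length")
    dot = sum(p * g for p, g in zip(pred, gold))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gold))
    return dot / norm if norm else 0.0


def mean_squared_error(pred, gold):
    """Average squared difference between predicted and gold scores."""
    if len(pred) != len(gold) or not pred:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred)


def sentiment_abruptness(word_scores):
    """Hypothetical feature: number of polarity flips between consecutive
    non-neutral word scores (an assumption, not the paper's formula)."""
    signs = [1 if s > 0 else -1 for s in word_scores if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


if __name__ == "__main__":
    # Assumed 11-point scale: integers from -5 to 5, one score per tweet.
    gold = [-3, -2, 4, 0, -5]
    pred = [-2, -2, 3, 1, -4]
    print(cosine_similarity(pred, gold))
    print(mean_squared_error(pred, gold))
    # Per-word polarity scores for one tweet (made-up values).
    print(sentiment_abruptness([0.5, 0.0, -0.3, -0.2, 0.8]))
```

    Under this reading, a tweet whose word-level polarity flips often would receive a high abruptness value, which is the kind of signal one might expect to correlate with irony or sarcasm.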

    Dependency Syntax in the Automatic Detection of Irony and Stance

    [EN] This thesis belongs to the broad field of Natural Language Processing (NLP). In particular, it is a work of Computational Linguistics (CL) designed to study in depth the contribution of syntax to sentiment analysis and, therefore, to study texts extracted from social media or, more generally, online content. Furthermore, given the recent interest of the scientific community in the Universal Dependencies (UD) project, which proposes a morphosyntactic annotation format aimed at creating a "universal" representation of morphology and syntax across many languages, this work uses that format with a view to a multilingual study (Italian, English, French and Spanish). The thesis provides an exhaustive presentation of the UD morphosyntactic annotation format, underlining in particular the most relevant issues in applying it to user-generated content (UGC). The final aim is to analyze and verify whether these morphosyntactic annotations yield useful information for models that detect irony and stance. Two tasks are presented and used as case studies to test the research hypotheses: the first in automatic irony detection and the second in stance detection. In both cases, historical notes are provided as context for the reader, the problems faced are introduced, and the activities proposed in the computational linguistics community are described. Particular attention is paid to the resources currently available as well as to those developed specifically for the study of these phenomena. Finally, through the description of a series of experiments, both within evaluation campaigns and in independent studies, I try to describe the contribution that syntax can make to solving such tasks.
    This thesis is a revised collection of the work of my three-year PhD and sits within the growing trend of studies devoted to making Artificial Intelligence results more explainable, going beyond achieving the highest scores on tasks to make the motivations behind them understandable to domain experts. The novel contribution of this work consists mainly in the use of features based on morphology and dependency syntax, which were used to create vector representations of social media texts in various languages and for two different tasks. These features were then paired with a variety of machine learning classifiers, with some neural networks, and with the language model BERT. Results suggest that fine-grained dependency-based syntactic information is highly informative for the detection of irony, and less informative for stance detection. Nonetheless, dependency syntax might still prove useful for stance detection if irony detection is first applied as a preprocessing step. I also believe that the dependency syntax approach I propose could shed some light on the explainability of a difficult pragmatic phenomenon such as irony.
    Cignarella, AT. (2021). Dependency Syntax in the Automatic Detection of Irony and Stance [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/177639
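
    As a rough illustration of how dependency annotations in the Universal Dependencies format could be turned into a vector representation of a text, the sketch below counts dependency relation labels in a CoNLL-U sentence. It relies only on the standard CoNLL-U layout (ten tab-separated columns, with the relation label in the eighth). The function names, the toy annotation, and the choice of simple relation counts are assumptions for illustration and are not the feature set used in the thesis.

```python
from collections import Counter


def deprel_counts(conllu_sentence):
    """Count dependency relation labels (CoNLL-U column 8) in one sentence."""
    counts = Counter()
    for line in conllu_sentence.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        counts[cols[7]] += 1
    return counts


def to_vector(counts, vocabulary):
    """Project relation counts onto a fixed, ordered list of relation labels."""
    return [counts.get(rel, 0) for rel in vocabulary]


# Hand-written toy annotation, not taken from any UD treebank.
EXAMPLE = "\n".join([
    "# text = Great, another Monday",
    "\t".join(["1", "Great", "great", "ADJ", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["2", ",", ",", "PUNCT", "_", "_", "1", "punct", "_", "_"]),
    "\t".join(["3", "another", "another", "DET", "_", "_", "4", "det", "_", "_"]),
    "\t".join(["4", "Monday", "Monday", "NOUN", "_", "_", "1", "parataxis", "_", "_"]),
])

if __name__ == "__main__":
    counts = deprel_counts(EXAMPLE)
    print(counts)
    print(to_vector(counts, ["root", "nsubj", "det", "punct"]))
```

    A vector of this kind could then be passed to any standard classifier. Richer features, such as relation paths or head-dependent pairs, would follow the same pattern.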

    Getting emotional or cognitive on social media? Analyzing renewable energy technologies in Instagram posts

    Renewable energy development is a widely and intensively discussed topic, though it is still unclear exactly which variables influence people's evaluation of the phenomenon. There is a need to study the general public's knowledge, emotions, and cognitions linked to energy technologies, especially in the context of advanced inventions. Social media is a powerful communication tool that has a huge impact on the study of public opinion. This study aims to describe linguistic connections through an analysis of 1500 Instagram posts, assuming and interpreting emotional and/or cognitive words. Using a socio-cognitive approach, this research explores the salient words under a set of pre-specified renewable energy technology (RET) hashtags. Building on the appraisal theories of emotions, this research investigates the coexistence of several energy technologies (solar, wind, biomass, and geothermal) and powerlines. The results showed the highest linguistic interconnection between solar and wind energy posts. Furthermore, powerlines were not linguistically connected to the RETs, as they are not included in the schema or are not salient when people write posts about renewable energy. Solar, wind, and geothermal posts were more emotional and more positive than posts about the other RETs and powerlines. Biomass posts, by contrast, had a high frequency of cognitive-process and causal words. Powerline posts were linked to words of risk, body, health, and biological process, showing great concern for health and perceived threat. These differences in the words used can be a guide to understanding people's reactions and communication for each of the energy sources. This study, taking both emotions and cognitions into account, explains different types of considerations towards energy projects.

    Interpreting text and image relations in violent extremist discourse: A mixed methods approach for big data analytics

    This article presents a mixed methods approach for analysing text and image relations in violent extremist discourse. The approach involves integrating multimodal discourse analysis with data mining and information visualisation, resulting in theoretically informed empirical techniques for automated analysis of text and image relations in large datasets. The approach is illustrated by a study which aims to analyse how violent extremist groups use language and images to legitimise their views, incite violence, and influence recruits in online propaganda materials, and how the images from these materials are re-used in different media platforms in ways that support and resist violent extremism. The approach developed in this article contributes to what promises to be one of the key areas of research in the coming decades, namely the interdisciplinary study of big (digital) datasets of human discourse, and the implications of this for terrorism analysis and research.

    'Bigger than football': A capacities and signals approach to the NFL kneeling protests

    The ‘kneeling protests’ happening in the National Football League (NFL) have transformed football stadiums across the country into unlikely, yet impactful, spaces of resistance to racist rhetoric and racial violence. The reactions to the protests have been split, to say the least. Some have praised the kneeling as a powerful and moving display of civil resistance, culminating in the most high-profile protester, Colin Kaepernick, being recognized as Amnesty International’s 2018 Ambassador of Conscience. Others have interpreted the protests as a sign of disrespect towards the American flag, national anthem, and military. Now well into its third season, the symbolic power associated with the act of kneeling in the NFL may have run its course. Broadcasters made clear their decision not to televise the anthems before the games, in a sense choking the kneeling protests of the oxygen that made for their fiery support and opposition in the first place: their circulation via traditional mass media broadcast. However, Kaepernick and #TakeAKnee are as widely discussed today as they were almost three years ago. In theorizing the athlete/activist in the digital age, the aim of this research is to answer the following central research question: How was visibility maintained and the narrative of the kneeling protests controlled through deliberate image making and circulation, considering the ever-shifting, yet overlaid, physical and digital sites of resistance? The primary focus of this paper is the ability of the social movement to adapt strategy and tactics when space/place is denied or limited. It references a theoretical model (Tufekci, 2017) that measures a social movement’s power in terms of its i) narrative, ii) disruptive, and iii) electoral/institutional “capacities,” and how it “signals” to them.

    Detection of emotion by text analysis using machine learning

    Emotions are an integral part of human life. There are many different definitions of emotions. They are most often defined as a complex pattern of reactions, and they can be confused with feelings or moods. They are the way in which individuals cope with matters or situations that they find personally significant. Emotion can also be characterized as a conscious mental reaction (such as anger or fear) subjectively experienced as a strong feeling, usually directed at a specific object. Emotions can be communicated in different ways. Having a machine understand the emotions conveyed in a person's text or speech is one of the challenges in the field of human-machine interaction. The article proposes an artificial intelligence approach to automatically detecting human emotions, enabling a machine (i.e., a chatbot) to accurately assess the emotional state of a human and to adapt its communication accordingly. Complete automation of this process is still a problem. This gap can be filled with machine learning approaches based on automatic learning from experience represented by text data from conversations. We conducted experiments with a lexicon-based approach, with classic machine learning methods appropriate for text processing, such as Naïve Bayes (NB) and support vector machines (SVM), and with deep learning using neural networks (NN), to develop a model for detecting emotions in text. We compared the effectiveness of these models. The NN detection model performed particularly well in a multi-class classification task involving six emotions from the text data. It achieved an F1-score of 0.95 for sadness, along with high scores for other emotions. We also verified the best model in use through a web application and in chatbot communication with a human. We created a web application based on our detection model that can analyze text entered by a web user and detect the emotions expressed in a post or a comment. The emotion detection model was also used to improve the chatbot's communication with a human, since the chatbot has information about the human's emotional state during communication. Our research demonstrates the potential of machine learning approaches to detect emotions from text and improve human-machine interaction. However, it is important to note that full automation of emotion detection is still an open research question, and further work is needed to improve the accuracy and robustness of this system. The paper also describes new aspects of automated emotion detection from a philosophical-psychological point of view.
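
    The per-emotion F1-score mentioned in the abstract above can be computed from gold and predicted labels as in the following minimal sketch. The function name and the label values are made up for illustration. The abstract names only some of the six emotions and does not describe its evaluation code.

```python
def f1_per_class(gold, pred):
    """Return a dict mapping each label to its F1-score.

    F1 is the harmonic mean of precision and recall, computed here
    one label at a time (one-vs-rest).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


if __name__ == "__main__":
    # Hypothetical labels for four short texts.
    gold = ["sadness", "sadness", "joy", "anger"]
    pred = ["sadness", "joy", "joy", "anger"]
    for label, score in f1_per_class(gold, pred).items():
        print(f"{label}: {score:.3f}")
```

    Reporting F1 per emotion, rather than a single accuracy figure, makes it visible when a model does well on frequent classes but poorly on rare ones.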

    Social media mental health analysis framework through applied computational approaches

    Studies have shown that mental illness burdens not only public health and productivity but also established market economies throughout the world. However, mental disorders are difficult to diagnose and monitor through traditional methods, which rely heavily on interviews, questionnaires and surveys, resulting in high under-diagnosis and under-treatment rates. The use of online social media, such as Facebook and Twitter, is now a common part of people’s everyday life. The continuous, real-time user-generated content often reflects the feelings, opinions, social status and behaviours of individuals, creating an unprecedented wealth of person-specific information. With advances in data science, social media has been increasingly employed in population health monitoring and, more recently, in mental health applications to understand mental disorders as well as to develop online screening and intervention tools. However, existing research efforts are still in their infancy, primarily aimed at highlighting the potential of employing social media in mental health research. The majority of work is developed on ad hoc datasets and lacks a systematic research pipeline. [Continues.]

    Communication in the Gig Economy: Buying and Selling in Online Freelance Marketplaces

    The proliferating gig economy relies on online freelance marketplaces, which support relatively anonymous interactions through text-based messages. Informational asymmetries thus arise that can lead to exchange uncertainties between buyers and freelancers. Conventional marketing thought recommends reducing such uncertainty. However, uncertainty reduction and uncertainty management theories indicate that buyers and freelancers might benefit more from balancing, rather than reducing, uncertainty, such as by strategically adhering to or deviating from common principles. With dyadic analyses of calls for bids and bids from a leading online freelance marketplace, this study reveals that buyers attract more bids from freelancers when they provide moderate degrees of task information and concreteness, avoid sharing personal information, and limit the affective intensity of their communication. Freelancers’ bid success and price premiums increase when they mimic the degree of task information and affective intensity exhibited by buyers. However, mimicking a lack of personal information and concreteness reduces freelancers’ success, so freelancers should always be more concrete and offer more personal information than buyers do. These contingent perspectives offer insights into buyer–seller communication in two-sided online marketplaces; they clarify that despite, or sometimes due to, communication uncertainty, both sides can achieve success in the online gig economy.

    Análisis discursivo de las vacunas anticovid-19 en Twitter

    This report presents the results of a descriptive case study of the tweets published under the hashtags #vacunacoronavirus and #coronavirusvaccine. A key contribution of the research is a proposed methodology for discourse analysis assisted by data mining in a setting of multi-format messages. The study, which takes a mixed approach, was applied to a sample of 1 million tweets extracted at random with the applications Stela and Brand24. We show the necessary methodological interplay between software and quantitative tools and the qualitative perspective in interpreting the symbolic representation conveyed about COVID-19 vaccination. The results present the framings, the polarity, and the perceptions that users adopt regarding internationally marketed vaccines.