    IEST: WASSA-2018 Implicit Emotions Shared Task

    Past shared tasks on emotions use data with both overt expressions of emotions (I am so happy to see you!) and subtle expressions where the emotions have to be inferred, for instance from event descriptions. Further, most datasets do not focus on the cause or the stimulus of the emotion. Here, for the first time, we propose a shared task where systems have to predict the emotions in a large automatically labeled dataset of tweets without access to words denoting emotions. Based on this intention, we call this the Implicit Emotion Shared Task (IEST) because the systems have to infer the emotion mostly from the context. Every tweet has an occurrence of an explicit emotion word that is masked. The tweets are collected in a manner such that they are likely to include a description of the cause of the emotion, the stimulus. Altogether, 30 teams submitted results, which range from macro F1 scores of 21% to 71%. The baseline (MaxEnt with bag-of-words and bigram features), which was available to the participants during the development phase, obtains an F1 score of 60%. A study with human annotators suggests that automatic methods outperform human predictions, possibly by homing in on subtle textual clues not used by humans. Corpora, resources, and results are available at the shared task website at http://implicitemotions.wassa2018.com. Comment: Accepted at Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
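
    The two mechanics this abstract relies on, masking the explicit emotion word and scoring with macro F1, can be illustrated with a minimal sketch. The word list, the `[MASK]` token, and the function names are assumptions made for illustration; the task's actual trigger words and placeholder are not given in the abstract.

```python
import re
from collections import Counter

# Hypothetical trigger-word list and mask token (assumptions, not the task's own).
EMOTION_WORDS = {"happy", "sad", "angry", "afraid", "surprised", "disgusted"}
MASK = "[MASK]"


def mask_emotion_words(tweet, words=EMOTION_WORDS, mask=MASK):
    """Replace every whole-word occurrence of an emotion word with the mask token."""
    alternatives = "|".join(re.escape(w) for w in sorted(words))
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(mask, tweet)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


masked = mask_emotion_words("I am so happy to see you!")
score = macro_f1(["joy", "joy", "fear", "anger"], ["joy", "fear", "fear", "anger"])
```

    Because macro F1 averages the per-class scores without weighting by class frequency, a system that does poorly on a rare emotion is penalised as much as one that does poorly on a frequent one.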

    Comprehensive comparative content analysis of the speeches of female leaders (2009–2013)

    The research is devoted to a comprehensive content analysis of the political discourse of US Secretary of State Hillary Rodham Clinton, German Chancellor Angela Merkel, and Australian Prime Minister Julia Eileen Gillard (2009–2013). It classifies and analyses the terminology of the speeches and highlights the stylistic features of the political discourse.

    Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text

    Sentiment analysis is one of the recent, highly dynamic fields in Natural Language Processing. Most existing approaches are based on word-level analysis of texts and are mostly able to detect only explicit expressions of sentiment. However, in many cases, emotions are not expressed by using words with an affective meaning (e.g. happy), but by describing real-life situations, which readers (based on their commonsense knowledge) detect as being related to a specific emotion. Given the challenges of detecting emotions from contexts in which no lexical clue is present, in this article we present a comparative analysis between the performance of well-established methods for emotion detection (supervised and lexical knowledge-based) and a method we propose and extend, which is based on commonsense knowledge stored in the EmotiNet knowledge base. Our extensive evaluations show that, in the context of this task, the approach based on EmotiNet is the most appropriate. JRC.G.2-Global security and crisis management

    Detecting Event-Related Links and Sentiments from Social Media Texts

    Nowadays, the importance of Social Media is constantly growing, as people often use such platforms to share mainstream media news and comment on the events that they relate to. As such, people no longer remain mere spectators to the events that happen in the world, but become part of them, commenting on their developments and the entities involved, sharing their opinions and distributing related content. This paper describes a system that links the main events detected from clusters of newspaper articles to tweets related to them, detects complementary information sources from the links they contain and subsequently applies sentiment analysis to classify them as positive, negative or neutral. In this manner, readers can follow the main events happening in the world from the perspective of both mainstream and social media, together with the public's perception of them. This system is part of a media monitoring framework working live and it will be demonstrated using Google Earth. JRC.G.2-Global security and crisis management

    Definition of culture-dependent emotion triggers and their application to the classification of valence and emotion in texts

    This paper presents a method to automatically spot and classify the valence and emotions present in written text, based on a concept we introduce: emotion triggers. The first step consists of incrementally building a culture-dependent lexical database of emotion triggers, drawing on three theories: the theory of relevance from pragmatics, Maslow's theory of human needs from psychology and Neef's theory of human needs from economics. We start from a core of terms and expand them using lexical resources such as WordNet, complemented by NomLex, with sense numbers disambiguated using the Relevant Domains concept. The mapping among languages is accomplished using EuroWordNet, and the completion and projection to different cultures is done through language-specific commonsense knowledge bases. Subsequently, we show how the constructed database can be used to mine texts for valence (polarity) and affective meaning. An evaluation is performed on the Semeval Task No. 14: Affective Text test data and their corresponding translation to Spanish. The results and improvements are presented together with a discussion of the strong and weak points of the method and the directions for future work.

    Sentiment Analysis in Social Media Texts

    This paper presents a method for sentiment analysis specifically designed to work with Twitter data (tweets), taking into account their structure, length and specific language. The approach employed makes it easily extensible to other languages and able to process tweets in near real time. The main contributions of this work are: a) the pre-processing of tweets to normalize the language and generalize the vocabulary employed to express sentiment; b) the use of minimal linguistic processing, which makes the approach easily portable to other languages; c) the inclusion of higher-order n-grams to spot modifications in the polarity of the sentiment expressed; d) the use of simple heuristics to select the features to be employed; e) the application of supervised learning using a simple linear Support Vector Machines classifier on a set of realistic data. We show that using the training models generated with the method described we can improve the sentiment classification performance, irrespective of the domain and distribution of the test sets. JRC.G.2-Global security and crisis management
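
    Contributions a) and c) can be pictured with a small sketch of tweet normalization followed by n-gram extraction. The abstract does not list the concrete normalization rules, so the steps below (lowercasing, `URL` and `USER` placeholders, collapsing repeated characters) and the names `normalize_tweet` and `ngrams` are assumptions chosen for illustration, not the paper's pipeline.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")          # a character repeated 3+ times
TOKEN_RE = re.compile(r"[A-Za-z0-9_#']+|[!?]+")


def normalize_tweet(text):
    """Lowercase, generalize URLs and mentions, squeeze repeats, then tokenize."""
    text = text.lower()
    text = URL_RE.sub("URL", text)            # assumed placeholder for links
    text = MENTION_RE.sub("USER", text)       # assumed placeholder for mentions
    text = REPEAT_RE.sub(r"\1\1", text)       # "sooooo" -> "soo", "!!!" -> "!!"
    return TOKEN_RE.findall(text)


def ngrams(tokens, max_n=3):
    """Return all word n-grams for n = 1..max_n as space-joined strings."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


tokens = normalize_tweet("@Bob I am sooooo not happy!!! http://t.co/x")
features = ngrams(tokens)
```

    Higher-order n-grams such as `not happy` keep a negation attached to the word it modifies, which unigram features alone would lose. Feature lists of this kind could then be vectorized and passed to a linear SVM.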

    Going beyond traditional QA systems: challenges and keys in opinion question answering

    The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate in the context of subjective question answering, that is, Opinion Question Answering (OQA). We evaluate the performance of an OQA system with these new components and propose methods to optimally tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving the system performance on two distinct blog datasets. The improvements obtained for the different combinations of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy to deal with opinionated questions. This paper has been partially supported by Ministerio de Ciencia e Innovación - Spanish Government (grant no. TIN2009-13391-C04-01), and Conselleria d'Educación - Generalitat Valenciana (grant no. PROMETEO/2009/119 and ACOMP/2010/286).

    Mapping Nanomedicine Terminology in the Regulatory Landscape

    A common terminology is essential in any field of science and technology for a mutual understanding among different communities of experts and regulators, harmonisation of policy actions, standardisation of quality procedures and experimental testing, and communication to the general public. It also allows effective revision of information for policy making and optimises research fund allocation. In particular, in emerging scientific fields with a high innovation potential, new terms, descriptions and definitions are quickly generated, which are then used ambiguously by stakeholders having diverse interests, coming from different scientific disciplines and/or from various regions. The application of nanotechnology in health, often called nanomedicine, is considered such an emerging and multidisciplinary field, with growing interest from various communities. In order to support a better understanding of terms used in the regulatory domain, the Nanomedicines Working Group of the International Pharmaceutical Regulators Forum (IPRF) has prioritised the need to map, compile and discuss the terminology currently used by regulatory scientists coming from different geographic areas. The JRC has taken the lead to identify and compile frequently used terms in the field by using web crawling and text mining tools as well as the manual extraction of terms. Websites of 13 regulatory authorities and clinical trial registries globally involved in regulating nanomedicines have been crawled. The compilation and analysis of extracted terms demonstrated sectorial and geographical differences in the frequency and type of nanomedicine-related terms used in a regulatory context. Finally, the 31 most relevant and most frequently used terms deriving from various agencies have been compiled, discussed and analysed for their similarities and differences. These descriptions will support the development of harmonised use of terminology in the future. The report provides necessary background information to advance the discussion among stakeholders. It will strengthen activities aiming to develop harmonised standards in the field of nanomedicine, which is an essential factor to stimulate innovation and industrial competitiveness. JRC.F.2-Consumer Products Safety

    Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts

    Sentiment analysis (SA) concerns the classification of texts according to the polarity of the opinions they express. SA systems are highly relevant to many real-world applications (e.g. marketing, eGovernance, business intelligence, behavioral sciences) and also to many tasks in Natural Language Processing (NLP): information extraction, question answering and textual entailment, to name just a few. The importance of this field has been proven by the high number of approaches proposed in research, as well as by the interest that it has raised from other disciplines and the applications that were created using its technology. In our case, the primary focus is to use sentiment analysis in the context of media monitoring, to enable tracking of global reactions to events. The main challenge that we face is that tweets are written in different languages and an unbiased system should be able to deal with all of them, in order to process all (possible) available data. Unfortunately, although many linguistic resources exist for processing texts written in English, for many other languages data and tools are scarce. Following our initial efforts described in (Balahur and Turchi, 2013), in this article we extend our study on the possibility of implementing a multilingual system that is able to a) classify sentiment expressed in tweets in various languages using training data obtained through machine translation; b) verify the extent to which the quality of the translations influences the sentiment classification performance, in this case, of highly informal texts; and c) improve multilingual sentiment classification using small amounts of data annotated in the target language. To this aim, varying sizes of target language data are tested. The languages we explore are: Arabic, Turkish, Russian, Italian, Spanish, German and French. JRC.G.2-Global security and crisis management

    Proceedings of the First Workshop on Computing News Storylines (CNewsStory 2015)

    This volume contains the proceedings of the 1st Workshop on Computing News Storylines (CNewsStory 2015) held in conjunction with the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015) at the China National Convention Center in Beijing, on July 31st 2015. Narratives are at the heart of information sharing. Ever since people began to share their experiences, they have connected them to form narratives. The study of storytelling and the field of literary theory called narratology have developed complex frameworks and models related to various aspects of narrative such as plot structures, narrative embeddings, characters' perspectives, reader response, point of view, narrative voice, narrative goals, and many others. These notions from narratology have been applied mainly in Artificial Intelligence and to model formal semantic approaches to narratives (e.g. Plot Units developed by Lehnert (1981)). In recent years, computational narratology has qualified as an autonomous field of study and research. Narrative has been the focus of a number of workshops and conferences (AAAI Symposia, Interactive Storytelling Conference (ICIDS), Computational Models of Narrative). Furthermore, reference annotation schemes for narratives have been proposed (NarrativeML by Mani (2013)). The workshop aimed at bringing together researchers from different communities working on representing and extracting narrative structures in news, a text genre which is widely used in NLP but which has received little attention with respect to narrative structure, representation and analysis. Currently, advances in NLP technology have made it feasible to look beyond scenario-driven, atomic extraction of events from single documents and work towards extracting story structures from multiple documents, while these documents are published over time as news streams. Policy makers, NGOs, information specialists (such as journalists and librarians) and others are increasingly in need of tools that support them in finding salient stories in large amounts of information to more effectively implement policies, monitor actions of "big players" in society and check facts. Their tasks often revolve around reconstructing cases either with respect to specific entities (e.g. persons or organizations) or events (e.g. Hurricane Katrina). Storylines represent explanatory schemas that enable us to make better selections of relevant information but also projections to the future. They offer valuable potential for exploiting news data in an innovative way. JRC.G.2-Global security and crisis management