169 research outputs found
Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text
Sentiment analysis is one of the recent, highly dynamic fields in Natural
Language Processing. Most existing approaches are based on word-level
analysis of texts and are mostly able to detect only explicit expressions of
sentiment. However, in many cases, emotions are not expressed by using
words with an affective meaning (e.g. happy), but by describing real-life
situations, which readers (based on their commonsense knowledge) detect
as being related to a specific emotion. Given the challenges of detecting
emotions from contexts in which no lexical clue is present, in this article we
present a comparative analysis between the performance of well-established
methods for emotion detection (supervised and lexical knowledge-based) and
a method we propose and extend, which is based on commonsense knowledge
stored in the EmotiNet knowledge base. Our extensive evaluations show
that, in the context of this task, the approach based on EmotiNet is the
most appropriate. JRC.G.2-Global security and crisis management
Comprehensive comparative content analysis of the speeches of women leaders (2009–2013)
This abstract presents a comprehensive content analysis of the political discourse of US Secretary of State Hillary Rodham Clinton, German Chancellor Angela Merkel and Australian Prime Minister Julia Eileen Gillard (2009–2013). It classifies and analyses the terminology used in their speeches and highlights the stylistic features of their political discourse.
IEST: WASSA-2018 Implicit Emotions Shared Task
Past shared tasks on emotions use data with both overt expressions of
emotions (I am so happy to see you!) as well as subtle expressions where the
emotions have to be inferred, for instance from event descriptions. Further,
most datasets do not focus on the cause or the stimulus of the emotion. Here,
for the first time, we propose a shared task where systems have to predict the
emotions in a large automatically labeled dataset of tweets without access to
words denoting emotions. Based on this intention, we call this the Implicit
Emotion Shared Task (IEST) because the systems have to infer the emotion mostly
from the context. Every tweet has an occurrence of an explicit emotion word
that is masked. The tweets are collected in a manner such that they are likely
to include a description of the cause of the emotion - the stimulus.
Altogether, 30 teams submitted results, which range from macro F1 scores of 21 %
to 71 %. The baseline (MaxEnt bag of words and bigrams), which was available to
the participants during the development phase, obtains an F1 score of 60 %. A
study with human annotators suggests that automatic methods outperform human
predictions, possibly by homing in on subtle textual clues not used by humans.
Corpora, resources, and results are available at the shared task website at
http://implicitemotions.wassa2018.com.
Comment: Accepted at Proceedings of the 9th Workshop on Computational
Approaches to Subjectivity, Sentiment and Social Media Analysis
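The task setup in the IEST abstract above (an explicit emotion word is masked, and systems are ranked by macro F1) can be illustrated with a minimal sketch. The word list, the placeholder token and the labels below are assumptions made for illustration; they are not the shared task's actual resources or evaluation script.

```python
import re
from collections import Counter

# Hypothetical emotion words and placeholder token (illustrative assumptions).
EMOTION_WORDS = {"happy", "sad", "angry", "afraid", "surprised", "disgusted"}
MASK = "[EMOTION]"

# Match any listed word as a whole token, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(EMOTION_WORDS)) + r")\b", re.IGNORECASE
)


def mask_emotion_words(text):
    """Replace each listed emotion word with the placeholder token."""
    return _PATTERN.sub(MASK, text)


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over labels seen in gold or predicted."""
    labels = set(gold) | set(predicted)
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong for this item
            fn[g] += 1  # gold label was missed for this item
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, `mask_emotion_words("I am so happy to see you!")` returns `"I am so [EMOTION] to see you!"`, so a classifier trained on the masked text has to rely on the surrounding context.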
Detecting Event-Related Links and Sentiments from Social Media Texts
Nowadays, the importance of Social Media is constantly growing, as people often use such platforms to share mainstream media news and comment on the events that they relate to. As such, people no longer remain mere spectators to the events that happen in the world, but become part of them, commenting on their developments and the entities involved, sharing their opinions and distributing related content. This paper describes a system that links the main events detected from clusters of newspaper articles to tweets related to them, detects complementary information sources from the links they contain and subsequently applies sentiment analysis to classify them into positive, negative and neutral. In this manner, readers can follow the main events happening in the world, both from the perspective of mainstream as well as social media, and the public's perception of them. This system is part of a media monitoring framework working live and it will be demonstrated using Google Earth. JRC.G.2-Global security and crisis management
Improving Sentiment Analysis over non-English Tweets using Multilingual Transformers and Automatic Translation for Data-Augmentation
Tweets are specific text data when compared to general text. Although
sentiment analysis over tweets has become very popular in the last decade for
English, it is still difficult to find huge annotated corpora for non-English
languages. The recent rise of transformer models in Natural Language
Processing makes it possible to achieve unparalleled performance in many tasks,
but these models need a substantial quantity of text to adapt to the tweet
domain. We propose the use of a multilingual transformer model, which we
pre-train over English tweets, and apply data augmentation using automatic
translation to adapt the model to non-English languages. Our experiments in
French, Spanish, German and Italian suggest that the proposed technique is an
efficient way to improve the results of the transformers over small corpora of
tweets in a non-English language.
Comment: Accepted to COLING202
Definition of a culture-associated emotion trigger and its application to the classification of valence and emotion in texts
This paper presents a method to automatically spot and classify the valence and
emotions present in written text, based on a concept we introduce: emotion triggers. The
first step consists of incrementally building a culture-dependent lexical database of emotion
triggers, drawing on three theories: the theory of relevance from pragmatics, Maslow's theory
of human needs from psychology and Neef's theory of human needs from economics. We start
from a core of terms and expand them using lexical resources such as WordNet, completed by
NomLex, with sense numbers disambiguated using the Relevant Domains concept. The mapping
among languages is accomplished using EuroWordNet, and the completion and projection to
different cultures is done through language-specific commonsense knowledge bases.
Subsequently, we show how the constructed database can be used to mine texts for valence
(polarity) and affective meaning. An evaluation is performed on the Semeval Task No. 14:
Affective Text test data and their corresponding translation to Spanish. The results and
improvements are presented together with a discussion of the strong and weak points of the
method and the directions for future work.
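The way a trigger database might be used to mine a text for valence and affective meaning can be sketched as a simple lexicon lookup. The entries, scores and averaging rule below are hypothetical and serve only as an illustration; the abstract does not specify the database contents or the scoring procedure.

```python
import re

# Hypothetical trigger entries: term -> (valence in [-1, 1], emotion label).
# These values are illustrative assumptions, not taken from the paper.
TRIGGERS = {
    "war": (-1.0, "fear"),
    "famine": (-1.0, "sadness"),
    "freedom": (1.0, "joy"),
    "victory": (0.8, "joy"),
}


def score_text(text):
    """Return (mean valence, sorted emotion labels) for triggers found in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [TRIGGERS[t] for t in tokens if t in TRIGGERS]
    if not hits:
        return 0.0, []  # no trigger found: treat the text as neutral
    valence = sum(v for v, _ in hits) / len(hits)
    emotions = sorted({e for _, e in hits})
    return valence, emotions
```

Calling `score_text("Victory brings freedom")` averages the valence of the two matched triggers and reports the single emotion label they share. A multilingual version would look terms up through a cross-lingual mapping first, which this sketch omits.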
Going beyond traditional QA systems: challenges and keys in opinion question answering
The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate in the context of subjective question answering, that is, Opinion Question Answering (OQA). We evaluate the performance of an OQA system with these new components and propose methods to optimally tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving the system performance on two distinct blog datasets. The improvements obtained for the different combinations of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy to deal with opinionated questions.
This paper has been partially supported by Ministerio de Ciencia e Innovación - Spanish Government (grant no. TIN2009-13391-C04-01), and Conselleria d'Educación - Generalitat Valenciana (grant no. PROMETEO/2009/119 and ACOMP/2010/286).
Identifying subjective statements in news titles using a personal sense annotation framework
This is the accepted version of the following article: Panicheva, P.; Cardiff, J.; Rosso, P. (2013). Identifying subjective statements in news titles using a personal sense annotation framework. Journal of the American Society for Information Science and Technology. 64(7):1411-1422, which has been published in final form at http://dx.doi.org/10.1002/asi.22841.
[EN] Subjective language contains information about private states. The goal of subjective language identification is to determine that a private state is expressed, without considering its polarity or specific emotion. A component of word meaning, "Personal Sense," has clear potential in the field of subjective language identification, as it reflects a meaning of words in terms of unique personal experience and carries personal characteristics. In this paper we investigate how Personal Sense can be harnessed for the purpose of identifying subjectivity in news titles. In the process, we develop a new Personal Sense annotation framework for annotating and classifying subjectivity, polarity, and emotion. The Personal Sense framework yields high performance in a fine-grained subsentence subjectivity classification. Our experiments demonstrate lexico-syntactic features to be useful for the identification of subjectivity indicators and the targets that receive the subjective Personal Sense.
The work of Paolo Rosso was done within the EC WIQEI IRSES project (grant no. 269180) FP 7 Marie Curie People Framework, the MICINN Text-Enterprise 2.0 project (TIN2009-13391-C04-03) Plan I+D+I, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. We are grateful to the anonymous reviewers for helpful comments.
Mapping Nanomedicine Terminology in the Regulatory Landscape
A common terminology is essential in any field of science and technology for a mutual understanding among different communities of experts and regulators, harmonisation of policy actions, standardisation of quality procedures and experimental testing, and the communication to the general public. It also allows effective revision of information for policy making and optimises research fund allocation.
In particular, in emerging scientific fields with a high innovation potential, new terms, descriptions and definitions are quickly generated, which are then ambiguously used by stakeholders having diverse interests, coming from different scientific disciplines and/or from various regions. The application of nanotechnology in health, often called nanomedicine, is considered such an emerging and multidisciplinary field, attracting growing interest from various communities.
In order to support a better understanding of terms used in the regulatory domain, the Nanomedicines Working Group of the International Pharmaceutical Regulators Forum (IPRF) has prioritised the need to map, compile and discuss the currently used terminology of regulatory scientists coming from different geographic areas. The JRC has taken the lead to identify and compile frequently used terms in the field by using web crawling and text mining tools as well as the manual extraction of terms. Websites of 13 regulatory authorities and clinical trial registries globally involved in regulating nanomedicines have been crawled. The compilation and analysis of extracted terms demonstrated sectorial and geographical differences in the frequency and type of nanomedicine related terms used in a regulatory context. Finally 31 relevant and most frequently used terms deriving from various agencies have been compiled, discussed and analysed for their similarities and differences. These descriptions will support the development of harmonised use of terminology in the future.
The report provides necessary background information to advance the discussion among stakeholders. It will strengthen activities aiming to develop harmonised standards in the field of nanomedicine, which is an essential factor to stimulate innovation and industrial competitiveness. JRC.F.2-Consumer Products Safety