36 research outputs found

    Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022): Workshop and Shared Task Report

    We provide a summary of the fifth edition of the CASE workshop, held in the scope of EMNLP 2022. The workshop consists of regular papers, two keynotes, working papers of shared task participants, and task overview papers. The workshop series has been bringing together all aspects of event information collection across technical and social science fields. In addition to progress in depth, the submission and acceptance of multimodal approaches show the widening of this interdisciplinary research topic. Comment: to appear at CASE 2022 @ EMNLP 2022

    Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report

    We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at the Language Resources and Evaluation Conference (LREC 2020). We believe that event extraction studies in computational linguistics and in the social and political sciences should further support each other in order to enable large-scale socio-political event information collection across sources, countries, and languages. The event consists of two tracks: regular research papers and a shared task on event sentence coreference identification (ESCI). All submissions were reviewed by five members of the program committee. The workshop attracted research papers related to the evaluation of machine learning methodologies, language resources, and material conflict forecasting, as well as a shared task participation report, all in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and the event information collection approaches related to socio-political events, and the need to fill the gap between automated text processing techniques and the requirements of the social and political sciences.

    Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts

    Sentiment analysis (SA) concerns the classification of texts according to the polarity of the opinions they express. SA systems are highly relevant to many real-world applications (e.g. marketing, eGovernance, business intelligence, behavioral sciences) and also to many tasks in Natural Language Processing (NLP), including information extraction, question answering, and textual entailment, to name just a few. The importance of this field is shown by the high number of approaches proposed in research, as well as by the interest it has raised in other disciplines and the applications that have been created using its technology. In our case, the primary focus is to use sentiment analysis in the context of media monitoring, to enable tracking of global reactions to events. The main challenge we face is that tweets are written in different languages, and an unbiased system should be able to deal with all of them in order to process all (possible) available data. Unfortunately, although many linguistic resources exist for processing texts written in English, data and tools are scarce for many other languages. Following our initial efforts described in (Balahur and Turchi, 2013), in this article we extend our study on the possibility of implementing a multilingual system that is able to a) classify sentiment expressed in tweets in various languages using training data obtained through machine translation; b) verify the extent to which the quality of the translations influences the sentiment classification performance, in this case on highly informal texts; and c) improve multilingual sentiment classification using small amounts of data annotated in the target language. To this end, varying sizes of target language data are tested. The languages we explore are Arabic, Turkish, Russian, Italian, Spanish, German and French. JRC.G.2-Global security and crisis management

    Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media

    In this book chapter we describe a semi-automatic method for ontology-driven lexical acquisition and ontology population. The method is based on a multilingual weakly-supervised approach for learning semantic classes and event patterns from an unannotated text corpus. To illustrate the feasibility of our approach, we apply it to a corpus of Twitter messages in two languages and populate a micro-ontology of events in these languages. Our proposal concerns multi- and cross-lingual ontology-based information extraction and ontology population. We propose it for the section Methods. The proposed method can be viewed as a step towards the development of the Social Semantic Web [3]. In the Social Semantic Web, user-generated content is labeled with ontology-based semantics, thereby linking user communities from different social networks. Our lexical learning and ontology population approach can be used to automatically annotate, or propose semantic annotations for, status messages written by social media users. This potential application is similar to the approach described in [11]. We have already experimented with learning and populating a prototype micro-ontology from Twitter. We evaluated the extracted information and found the preliminary results encouraging. JRC.E.1-Disaster Risk Management