
137 research outputs found

Toward a unifying model for Opinion, Sentiment and Emotion information extraction

Author: Fraisse Amel
Paroubek Patrick
Publication venue: European Language Resources Association (ELRA)
Publication date: 26/05/2014

This paper presents a logical formalization of a set of 20 semantic categories related to opinion, emotion and sentiment. Our formalization is based on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for subjective information extraction. The separability of the subjective classes that we propose was assessed both formally and on two subjective reference corpora.

CiteSeerX

On improving the implementation of automatic updating of systematic reviews

Author: Koroleva Anna
Olarte Parra Camila
Paroubek Patrick
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

Ghent University Academic Bibliography

The NLP4NLP Corpus (I): 50 Years of Publication, Collaboration and Citation in Speech and Language Processing

Author: Gil Francopoulo
Joseph Mariani
Patrick Paroubek
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2019

This paper introduces the NLP4NLP corpus, which contains articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years (1965–2015), comprising 65,000 documents, 50,000 authors, 325,000 references and approximately 270 million words. Most of these publications are in English; some are in French, German, or Russian. Some are open access, others have been provided by the publishers. Several tools were used or developed to build and analyze this corpus. Many of them use Natural Language Processing methods that have been published in the corpus, hence its name. The paper presents the corpus and some findings regarding its content (evolution over time of the number of articles and authors, collaborations between authors, citations between papers and authors), in the context of a global or comparative analysis between sources. Numerous manual corrections were necessary, which demonstrated the importance of establishing standards for uniquely identifying authors, articles, or publications.

HAL-CentraleSupelec

Directory of Open Access Journals

INRIA a CCSD electronic archive server

AppFM, une plate-forme de gestion de modules de TAL

Author: Bui-Quang Paul
Grau Brigitte
Paroubek Patrick
Publication venue: HAL CCSD
Publication date: 04/07/2016

AppFM is a tool halfway between an NLP pipeline framework and a system service manager. It allows applications with complex dependencies to be integrated into processing workflows that are easy to reuse through multiple interfaces.

NLP4NLP+5: The Deep (R)evolution in Speech and Language Processing

Author: Frédéric Vernier
Gil Francopoulo
Joseph Mariani
Patrick Paroubek
Publication venue: 'Frontiers Media SA'
Publication date: 01/07/2022

This paper analyzes the changes in the fields of speech and natural language processing over the past 5 years (2016–2020). It continues a series of two papers that we published in 2019 on the analysis of the NLP4NLP corpus, which contained articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years (1965–2015), and was analyzed with methods developed in the field of NLP, hence its name. The extended NLP4NLP+5 corpus now covers 55 years, comprising close to 90,000 documents [+30% compared with NLP4NLP: as many articles were published in the single year 2020 as over the first 25 years (1965–1989)], 67,000 authors (+40%), 590,000 references (+80%), and approximately 380 million words (+40%). These analyses are conducted globally or comparatively among sources and also with the general scientific literature, with a focus on the past 5 years. The paper concludes by identifying profound changes in research topics, the emergence of a new generation of authors, and the appearance of new publications around artificial intelligence, neural networks, machine learning, and word embedding.

HAL-CentraleSupelec

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

Natural Language Processing for Cognitive Analysis of Emotions

Author: Cortal Gustave
Finkel Alain
Paroubek Patrick
Ye Lina
Publication venue: HAL CCSD
Publication date: 05/09/2022


INRIA a CCSD electronic archive server

Natural Language Processing for Cognitive Analysis of Emotions

Author: Cortal Gustave
Finkel Alain
Paroubek Patrick
Ye Lina
Publication venue: HAL CCSD
Publication date: 06/09/2022

Emotion analysis in texts suffers from two major limitations: annotated gold-standard corpora are mostly small and homogeneous, and emotion identification is often simplified as a sentence-level classification problem. To address these issues, we introduce a new annotation scheme for exploring emotions and their causes, along with a new French dataset composed of autobiographical accounts of an emotional scene. The texts were collected by applying the Cognitive Analysis of Emotions developed by A. Finkel to help people improve their emotion management. The method requires the manual analysis of an emotional event by a coach trained in Cognitive Analysis. We present a rule-based approach to automatically annotate emotions and their semantic roles (e.g. emotion causes) to facilitate the identification of relevant aspects by the coach. We investigate future directions for emotion analysis using graph structures.

INRIA a CCSD electronic archive server

FRASQUES, le système du groupe LIR

Author: Grau Brigitte
Illouz Gabriel
Monceaux Laura
Paroubek Patrick
Pons Olivier
Robba Isabelle
Vilnat Anne
Publication venue: HAL CCSD
Publication date: 01/01/2005


Actes de la conférence Traitement Automatique de la Langue Naturelle, TALN 2018: Volume 2 : Démonstrations, articles des Rencontres Jeunes Chercheurs, ateliers DeFT

Author: Cellier Peggy
Claveau Vincent
Grouin Cyril
Ligozat Anne-Laure
Minard Anne-Lyse
Paroubek Patrick
Publication venue: HAL CCSD
Publication date: 14/05/2018


INRIA a CCSD electronic archive server

A Dataset for Pharmacovigilance in German, French, and Japanese: Annotating Adverse Drug Reactions across Languages

Author: Aramaki Eiji
Grouin Cyril
Lavergne Thomas
Matsumoto Yuji
Möller Sebastian
Nishiyama Tomohiro
Névéol Aurélie
Paroubek Patrick
Raithel Lisa
Roller Roland
Thomas Philippe
Yada Shuntaro
Yeh Hui-Syuan
Zweigenbaum Pierre
Publication venue
Publication date: 27/03/2024

User-generated data sources have gained significance in uncovering Adverse Drug Reactions (ADRs), with an increasing number of discussions occurring in the digital world. However, the existing clinical corpora predominantly revolve around scientific articles in English. This work presents a multilingual corpus of texts concerning ADRs gathered from diverse sources, including patient fora, social media, and clinical reports in German, French, and Japanese. Our corpus contains annotations covering 12 entity types, four attribute types, and 13 relation types. It contributes to the development of real-world multilingual language models for healthcare. We provide statistics to highlight certain challenges associated with the corpus and conduct preliminary experiments resulting in strong baselines for extracting entities and relations between these entities, both within and across languages.

Comment: Accepted at LREC-COLING 202

arXiv.org e-Print Archive