136 research outputs found

    Toward a unifying model for Opinion, Sentiment and Emotion information extraction

    International audience. This paper presents a logical formalization of a set of 20 semantic categories related to opinion, emotion and sentiment. Our formalization is based on the BDI model (Belief, Desire and Intention) and constitutes a first step toward a unifying model for subjective information extraction. The separability of the subjective classes that we propose was assessed both formally and on two subjective reference corpora.

    AppFM, une plate-forme de gestion de modules de TAL

    International audience. AppFM is a tool halfway between an NLP pipeline framework and a system service manager. It allows applications with complex dependencies to be integrated into processing chains that are easy to reuse through multiple interfaces.

    The NLP4NLP Corpus (I): 50 Years of Publication, Collaboration and Citation in Speech and Language Processing

    This paper introduces the NLP4NLP corpus, which contains articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years (1965–2015), comprising 65,000 documents, gathering 50,000 authors, including 325,000 references and representing ~270 million words. Most of these publications are in English, some are in French, German, or Russian. Some are open access, others have been provided by the publishers. In order to constitute and analyze this corpus several tools have been used or developed. Many of them use Natural Language Processing methods that have been published in the corpus, hence its name. The paper presents the corpus and some findings regarding its content (evolution over time of the number of articles and authors, collaborations between authors, citations between papers and authors), in the context of a global or comparative analysis between sources. Numerous manual corrections were necessary, which demonstrated the importance of establishing standards for uniquely identifying authors, articles, or publications

    NLP4NLP+5: The Deep (R)evolution in Speech and Language Processing

    This paper analyzes the changes in the fields of speech and natural language processing over the past 5 years (2016–2020). It continues a series of two papers that we published in 2019 on the analysis of the NLP4NLP corpus, which contained articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years (1965–2015), and which was analyzed with methods developed in the field of NLP, hence its name. The extended NLP4NLP+5 corpus now covers 55 years, comprising close to 90,000 documents [+30% compared with NLP4NLP: as many articles were published in the single year 2020 as over the first 25 years (1965–1989)], 67,000 authors (+40%), 590,000 references (+80%), and approximately 380 million words (+40%). These analyses are conducted globally or comparatively among sources and also with the general scientific literature, with a focus on the past 5 years. The paper concludes by identifying profound changes in research topics, the emergence of a new generation of authors, and the appearance of new publications around artificial intelligence, neural networks, machine learning, and word embedding.

    Natural Language Processing for Cognitive Analysis of Emotions

    International audience. Emotion analysis in texts suffers from two major limitations: annotated gold-standard corpora are mostly small and homogeneous, and emotion identification is often simplified to a sentence-level classification problem. To address these issues, we introduce a new annotation scheme for exploring emotions and their causes, along with a new French dataset composed of autobiographical accounts of an emotional scene. The texts were collected by applying the Cognitive Analysis of Emotions, developed by A. Finkel to help people improve their emotion management. The method requires the manual analysis of an emotional event by a coach trained in Cognitive Analysis. We present a rule-based approach that automatically annotates emotions and their semantic roles (e.g. emotion causes) to help the coach identify relevant aspects. We investigate future directions for emotion analysis using graph structures.

    FRASQUES, le système du groupe LIR

    National audience. No abstract

    De l'importance des synonymes pour la sélection de passages en question-réponse

    National audience. Most of the question answering systems currently developed adopt a fairly similar architecture, which can be divided into three modules: question analysis, document retrieval, and answer extraction. However, they differ in their tools (indexing engine, parsers...) and in the knowledge bases they use. Thus, for each of these systems, it is important to estimate the contribution of these tools or knowledge bases. In the context of the Equer campaign (an evaluation campaign for French question answering systems), our system FRASQUES produced two runs: one used synonyms for bi-terms only, the other for mono-terms too. The comparison of these two runs and the study of a broader corpus, in French and in English, allow us to measure the contribution of this kind of semantic knowledge.