Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases
We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) on the rhetorical role prediction task for legal cases, a task known to require textual context. Our study explores strategies such as zero- and few-shot prompting, task specification with label definitions and clarification of annotation ambiguities, textual context, and reasoning elicited by general prompts and specific questions. We show that the number of examples, the definition of labels, the presentation of the (labelled) textual context, and specific questions about this context all have a positive influence on the performance of the model. Although the test set configurations are not strictly equivalent, we observed that prompting with a few labelled examples from the direct context can lead the model to better performance than a supervised fine-tuned multi-class classifier based on the BERT encoder (weighted F1 ≈ 72%). There is still a gap to reach the performance of the best systems (≈ 86%) in the LegalEval 2023 task, which, on the other hand, require dedicated resources, architectures and training.
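As a rough illustration of what such a one-stage prompt can look like, the sketch below builds a few-shot prompt from labelled sentences of the direct context and sends it to GPT-3.5-turbo through the OpenAI chat client; the label set, example sentences and prompt wording are illustrative assumptions, not the exact protocol studied in the paper.

```python
# Minimal sketch, assuming the openai>=1.0 Python client; labels, examples and
# prompt wording are illustrative, not the paper's exact protocol.
from openai import OpenAI

LABELS = ["Facts", "Argument", "Ruling"]  # hypothetical subset of rhetorical roles

def build_prompt(context_examples, target_sentence):
    """One-stage prompt: label list, labelled direct context, then the target sentence."""
    lines = ["Assign one rhetorical role to the last sentence.",
             "Possible roles: " + ", ".join(LABELS), "", "Labelled context:"]
    for sentence, label in context_examples:
        lines.append(f'- "{sentence}" -> {label}')
    lines += ["", f'Sentence: "{target_sentence}"', "Role:"]
    return "\n".join(lines)

context = [("The appellant was arrested on 3 May.", "Facts"),
           ("Counsel argued that the search was unlawful.", "Argument")]
prompt = build_prompt(context, "The appeal is therefore dismissed.")

client = OpenAI()  # requires OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)
```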
Exploitation de lexiques pour la catégorisation fine d'émotions, de sentiments et d'opinions
In this article we present our submission to the 11th edition of the Défi Fouille de Textes (DEFT). We participate in three tasks proposed in this opinion-mining workshop. The goal of these tasks is to classify French tweets on the topic of renewable energy, respectively according to their polarity, to the general type of information stated, and finally to the fine-grained class of the sentiment, emotion or opinion expressed. To perform this categorisation, we explore and evaluate different methods for building semantically typed lexicons: in addition to manually built affective lexicons, we experiment with typed lexicons built semi-automatically from the evaluation corpus and others from a third-party corpus.
Language Model Adaptation to Specialized Domains through Selective Masking based on Genre and Topical Characteristics
Recent advances in pre-trained language modeling have facilitated significant
progress across various natural language processing (NLP) tasks. Word masking
during model training constitutes a pivotal component of language modeling in
architectures like BERT. However, the prevalent method of word masking relies
on random selection, potentially disregarding domain-specific linguistic
attributes. In this article, we introduce an innovative masking approach
leveraging genre and topicality information to tailor language models to
specialized domains. Our method incorporates a ranking process that prioritizes
words based on their significance, subsequently guiding the masking procedure.
Experiments conducted using continual pre-training within the legal domain have underscored the efficacy of our approach on the English LegalGLUE benchmark. Pre-trained language models and code are freely available for use.
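As a rough sketch of the idea of selective masking, the snippet below scores words with a toy IDF-style measure standing in for the genre and topicality ranking described above (the actual ranking criteria are an assumption here) and masks the highest-ranked positions instead of sampling them at random.

```python
import math
from collections import Counter

def significance_scores(docs):
    """Toy IDF-style proxy for the word-ranking step; the paper's genre/topicality
    criteria are not reproduced here."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    n = len(docs)
    # Words concentrated in few documents get higher scores.
    return {w: math.log(n / df[w]) for w in df}

def selective_mask(tokens, scores, mask_ratio=0.15, mask_token="[MASK]"):
    """Mask the highest-ranked tokens instead of random positions."""
    k = max(1, int(len(tokens) * mask_ratio))
    ranked = sorted(range(len(tokens)),
                    key=lambda i: scores.get(tokens[i], 0.0), reverse=True)
    to_mask = set(ranked[:k])
    return [mask_token if i in to_mask else t for i, t in enumerate(tokens)]

corpus = [["the", "court", "dismissed", "the", "appeal"],
          ["the", "defendant", "filed", "an", "appeal"]]
scores = significance_scores(corpus)
print(selective_mask(corpus[0], scores, mask_ratio=0.3))
```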
FRASQUES : A Question-Answering System in the EQueR Evaluation Campaign
Question-answering (QA) systems aim at providing either a small passage or just the answer to a question asked in natural language. We have developed several QA systems that work on both English and French. This way, we are able to answer questions given in both languages by searching documents in both languages as well. In this article, we present our French monolingual system FRASQUES, which participated in the EQueR evaluation campaign of QA systems for French in 2004. First, the QA architecture common to our systems is presented. Then, for every step of the QA process, we consider which steps are language-independent and, for those that are language-dependent, which tools or processes need to be adapted to switch from one language to another. Finally, our results at EQueR are given and commented on; an error analysis is conducted, and the kind of knowledge needed to answer a question is studied.
Comparability measurement for terminology extraction
Proceedings of the Workshop CHAT 2011: Creation, Harmonization and Application of Terminology Resources. Editors: Tatiana Gornostay and Andrejs Vasiļjevs. NEALT Proceedings Series, Vol. 12 (2011), 3-10. © 2011 The editors and contributors. Published by the Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt. Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16956
Extraction d'expressions-cibles de l'opinion : de l'anglais au français
In this paper, we present the development of an opinion target extraction system for English and its transposition to French. In addition, we analyse the features and their effectiveness in English and French, which suggests that it is possible to build an opinion target extraction system that is independent of the domain. Finally, we propose a comparative study of the errors made by our systems in both English and French and suggest several solutions to these problems.
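To make the notion of largely language-independent features concrete, here is a hypothetical token-level feature extractor of the kind typically fed to a sequence labeller for opinion target extraction; the actual feature set studied in the paper is not reproduced.

```python
# Hypothetical feature template for an opinion-target sequence labeller;
# names and features are illustrative assumptions, not the paper's feature set.
def token_features(tokens, pos_tags, i):
    """Simple surface and part-of-speech features for token i."""
    return {
        "word": tokens[i].lower(),
        "pos": pos_tags[i],
        "is_capitalized": tokens[i][0].isupper(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

tokens = ["The", "battery", "life", "is", "amazing"]
pos = ["DET", "NOUN", "NOUN", "VERB", "ADJ"]
print(token_features(tokens, pos, 1))
```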
Coping with Alternate Formulations of Questions and Answers
We present in this chapter the QALC system, which has participated in the four TREC QA evaluations. We focus here on the problem of linguistic variation in order to be able to relate questions and answers. We first present variation at the term level, which consists in retrieving question terms in document sentences even when morphological, syntactic or semantic variations alter them. Our second subject concerns variation at the sentence level, which we handle as different partial reformulations of questions. Questions are associated with extraction patterns based on the syntactic type of the question and the object that is under query. We then present the whole system, situating how QALC deals with variation, and report different evaluations.
How NLP Can Improve Question Answering
Answering open-domain factual questions requires natural language processing for refining document selection and answer identification. With our system QALC, we have participated in the Question Answering track of the TREC8, TREC9 and TREC10 evaluations. QALC performs an analysis of documents relying on multi-word term search and their linguistic variation, both to minimise the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison process also makes use of the results of a syntactic parsing of the questions and of named entity recognition functionalities. Answer extraction relies on the application of syntactic patterns chosen according to the kind of information that is sought, and categorised depending on the syntactic form of the question. These patterns allow QALC to handle linguistic variations nicely at the answer level.
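As a simplified illustration of pattern-based answer extraction, the sketch below keys a couple of surface patterns on the question type; QALC's actual patterns rely on syntactic parses and named entity tags and are only loosely approximated here.

```python
import re

# Hypothetical, simplified extraction patterns keyed by question type;
# QALC's real patterns are richer and syntax-driven.
PATTERNS = {
    "WHO_AUTHOR": re.compile(r"(?:written|composed|invented) by ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
    "WHEN_YEAR":  re.compile(r"\bin (\d{4})\b"),
}

def extract_answer(question_type, sentence):
    """Apply the pattern associated with the question type to a candidate sentence."""
    pattern = PATTERNS.get(question_type)
    if pattern is None:
        return None
    match = pattern.search(sentence)
    return match.group(1) if match else None

print(extract_answer("WHO_AUTHOR", "Hamlet was written by William Shakespeare around 1600."))
print(extract_answer("WHEN_YEAR", "The treaty was signed in 1919 in Versailles."))
```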
Evaluating Lexical Similarity to build Sentiment Similarity
In this article, we propose to evaluate the lexical similarity information provided by word representations against several opinion resources, using traditional information retrieval tools. Word representations have been used to build and to extend opinion resources such as lexicons and ontologies, and their performance has been evaluated on sentiment analysis tasks. We question this method by measuring the correlation between the sentiment proximity provided by opinion resources and the semantic similarity provided by word representations, using different correlation coefficients. We also compare the neighbours found in word representations with lists of similar opinion words. Our results show that the proximity of words in state-of-the-art word representations is not very effective for building sentiment similarity.
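A minimal sketch of such a correlation measurement, with toy vectors and toy valence scores standing in for the word representations and opinion resources evaluated in the article:

```python
import numpy as np
from scipy.stats import spearmanr

# Toy word vectors and a toy valence lexicon (illustrative values only).
vectors = {
    "good":     np.array([0.9, 0.1, 0.3]),
    "great":    np.array([0.8, 0.2, 0.4]),
    "bad":      np.array([0.7, 0.1, 0.2]),
    "terrible": np.array([0.6, 0.3, 0.1]),
}
valence = {"good": 0.8, "great": 0.9, "bad": -0.7, "terrible": -0.9}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

pairs = [("good", "great"), ("good", "bad"), ("bad", "terrible"), ("great", "terrible")]
embed_sim = [cosine(vectors[a], vectors[b]) for a, b in pairs]
# Sentiment proximity in [0, 1]: closer valence scores mean higher proximity.
senti_sim = [1.0 - abs(valence[a] - valence[b]) / 2.0 for a, b in pairs]

rho, p = spearmanr(embed_sim, senti_sim)
print(f"Spearman correlation between embedding and sentiment similarity: {rho:.2f} (p={p:.2f})")
```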
FRASQUES, le système du groupe LIR
No abstract available.