
    Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases

    We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) on the rhetorical role prediction task for legal cases. This task is known to require textual context. Our study explores strategies such as zero- and few-shot prompting, task specification with label definitions and clarification of annotation ambiguities, textual context, and reasoning with general prompts and specific questions. We show that the number of examples, the definition of labels, the presentation of the (labelled) textual context and specific questions about this context all have a positive influence on the performance of the model. Under non-equivalent test set configurations, we observe that prompting with a few labelled examples from the direct context can lead the model to outperform a supervised fine-tuned multi-class classifier based on the BERT encoder (weighted F1 score of about 72%). There is still a gap, however, to the performance of the best systems (about 86%) in the LegalEval 2023 task, which in turn require dedicated resources, architectures and training.
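
    As an illustration of the one-stage prompting setup described above, the sketch below shows a few-shot prompt sent through the openai Python client (v1.x). The label set, example sentences and prompt wording are illustrative assumptions, not the exact prompts or labels used in the paper.

    # Hedged sketch of few-shot prompting for rhetorical role prediction.
    # Assumes the `openai` package (>=1.0) and an API key in the environment.
    from openai import OpenAI

    client = OpenAI()

    LABELS = ["FACTS", "ARGUMENT", "STATUTE", "PRECEDENT", "RULING"]  # hypothetical subset

    FEW_SHOT = [
        ("The appellant filed the suit on 12 March 2001.", "FACTS"),
        ("Section 302 of the Penal Code applies here.", "STATUTE"),
    ]

    def predict_role(sentence: str, context: str = "") -> str:
        """Ask the model for the rhetorical role of one sentence from a legal case."""
        examples = "\n".join(f"Sentence: {s}\nRole: {r}" for s, r in FEW_SHOT)
        prompt = (
            "You label sentences from legal judgments with their rhetorical role.\n"
            f"Possible roles: {', '.join(LABELS)}.\n\n"
            f"{examples}\n\n"
            f"Context: {context}\n"
            f"Sentence: {sentence}\nRole:"
        )
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content.strip()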

    Exploitation de lexiques pour la catégorisation fine d'émotions, de sentiments et d'opinions

    In this article, we present our submission to the 11th edition of the Défi Fouille de Textes (DEFT). We participate in three tasks proposed in this opinion mining shared task. The goal of these tasks is to classify French tweets on the topic of renewable energy, respectively according to their polarity, the general type of information they state, and finally the fine-grained class of the sentiment, emotion or opinion they express. To perform this categorization, we propose to explore and evaluate different methods for building semantically typed lexicons: besides manually built affective lexicons, we experiment with typed lexicons built semi-automatically from the evaluation corpus and others from a third-party corpus.
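
    As a rough illustration of the lexicon-based approach (the lexicon entries below are invented, not the resources built for DEFT), a typed lexicon can be applied by counting matches per category and keeping the dominant one:

    # Toy classification with semantically typed lexicons; entries are made up.
    from collections import Counter

    TYPED_LEXICON = {
        "joy": {"heureux", "ravi", "content"},
        "anger": {"scandale", "inadmissible", "furieux"},
        "positive_opinion": {"soutien", "favorable", "bravo"},
    }

    def classify(tweet: str) -> str:
        """Return the lexicon category with the most matches in the tweet, or 'other'."""
        tokens = tweet.lower().split()
        counts = Counter({cat: sum(tok in words for tok in tokens)
                          for cat, words in TYPED_LEXICON.items()})
        category, hits = counts.most_common(1)[0]
        return category if hits > 0 else "other"

    print(classify("Bravo pour ce soutien aux énergies renouvelables !"))  # positive_opinion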

    Language Model Adaptation to Specialized Domains through Selective Masking based on Genre and Topical Characteristics

    Recent advances in pre-trained language modeling have facilitated significant progress across various natural language processing (NLP) tasks. Word masking during model training constitutes a pivotal component of language modeling in architectures like BERT. However, the prevalent method of word masking relies on random selection, potentially disregarding domain-specific linguistic attributes. In this article, we introduce an innovative masking approach leveraging genre and topicality information to tailor language models to specialized domains. Our method incorporates a ranking process that prioritizes words based on their significance, subsequently guiding the masking procedure. Experiments conducted using continual pre-training within the legal domain underscore the efficacy of our approach on the LegalGLUE benchmark for English. Pre-trained language models and code are freely available for use.
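
    The core idea of ranking-guided masking can be sketched as follows; the importance scores here are placeholders for the genre- and topic-based scores described in the paper, which are not reproduced.

    # Sketch of selective masking: instead of masking tokens purely at random,
    # rank tokens by a domain-importance score and bias the mask towards them.
    import random

    def select_mask_positions(tokens, importance, mask_ratio=0.15, top_fraction=0.8):
        """Choose positions to mask, preferring high-importance tokens."""
        n_mask = max(1, int(len(tokens) * mask_ratio))
        ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
        n_top = min(int(n_mask * top_fraction), n_mask)   # most masks go to top-ranked words
        chosen = ranked[:n_top]
        chosen += random.sample(ranked[n_top:], n_mask - n_top)  # keep some randomness
        return sorted(chosen)

    tokens = ["the", "court", "dismissed", "the", "appeal", "under", "section", "34"]
    scores = [0.01, 0.9, 0.7, 0.01, 0.85, 0.1, 0.95, 0.6]   # hypothetical importance scores
    print(select_mask_positions(tokens, scores))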

    FRASQUES : A Question-Answering System in the EQueR Evaluation Campaign

    Question-answering (QA) systems aim at providing either a small passage or just the answer to a question asked in natural language. We have developed several QA systems that work on both English and French; in this way, we are able to answer questions asked in either language by also searching documents in both languages. In this article, we present our French monolingual system FRASQUES, which participated in the EQueR evaluation campaign of QA systems for French in 2004. First, the QA architecture common to our systems is presented. Then, for every step of the QA process, we consider which steps are language-independent and, for those that are language-dependent, which tools or processes need to be adapted to switch from one language to another. Finally, our results at EQueR are given and commented on; an error analysis is conducted, and the kind of knowledge needed to answer a question is studied.

    Comparability measurement for terminology extraction

    Get PDF
    Proceedings of the Workshop CHAT 2011: Creation, Harmonization and Application of Terminology Resources. Editors: Tatiana Gornostay and Andrejs Vasiļjevs. NEALT Proceedings Series, Vol. 12 (2011), 3-10. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt. Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16956.

    Extraction d'expressions-cibles de l'opinion : de l'anglais au français

    In this paper, we present the development of an opinion target extraction system for English and its transposition to French. In addition, we analyse the features and their effectiveness in English and French, which suggests that it is possible to build an opinion target extraction system that is independent of the domain. Finally, we propose a comparative study of the errors made by our systems in both English and French and suggest several solutions to these problems.
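
    A token-level feature template in the style commonly used with CRF taggers gives a flavour of such a feature-based approach; the features below are generic examples, not the exact feature set analysed in the paper.

    # Toy token-level features for opinion target extraction, usable with a
    # sequence labeller such as sklearn-crfsuite; the feature set is illustrative.
    def token_features(tokens, i):
        """Simple lexical and contextual features for token i of a sentence."""
        word = tokens[i]
        return {
            "word.lower": word.lower(),
            "word.istitle": word.istitle(),
            "word.suffix3": word[-3:],
            "prev.lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
            "next.lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        }

    print(token_features(["The", "battery", "life", "is", "excellent"], 1))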

    Coping with Alternate Formulations of Questions and Answers

    We present in this chapter the QALC system, which has participated in the four TREC QA evaluations. We focus here on the problem of linguistic variation and on how to relate questions to answers despite it. We first present variation at the term level, which consists in retrieving question terms in document sentences even when morphological, syntactic or semantic variations alter them. Our second concern is variation at the sentence level, which we handle as different partial reformulations of questions. Questions are associated with extraction patterns based on the syntactic type of the question and the object under query. We present the whole system, thus situating how QALC deals with variation, and report different evaluations.

    How NLP Can Improve Question Answering

    Answering open-domain factual questions requires natural language processing to refine document selection and answer identification. With our system QALC, we participated in the Question Answering track of the TREC8, TREC9 and TREC10 evaluations. QALC performs an analysis of documents relying on multi-word term search and their linguistic variations, both to minimize the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison process also makes use of the results of a syntactic parsing of the questions and of named entity recognition. Answer extraction relies on the application of syntactic patterns chosen according to the kind of information that is sought, and categorized depending on the syntactic form of the question. These patterns allow QALC to handle linguistic variations at the answer level.
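
    The pattern-based extraction step can be pictured roughly as follows; the question types and regular expressions are invented for the example and are much simpler than the patterns actually used in QALC.

    # Toy answer extraction with syntactic patterns keyed to the question type.
    import re

    ANSWER_PATTERNS = {
        "WHO_AUTHOR": re.compile(r"(?:written|authored)\s+by\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)"),
        "WHEN_YEAR": re.compile(r"\bin\s+(\d{4})\b"),
    }

    def extract_answer(question_type: str, sentence: str):
        """Apply the pattern associated with the question type to a candidate sentence."""
        pattern = ANSWER_PATTERNS.get(question_type)
        if pattern is None:
            return None
        match = pattern.search(sentence)
        return match.group(1) if match else None

    print(extract_answer("WHEN_YEAR", "The novel was published in 1954 by Faber."))  # 1954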

    Evaluating Lexical Similarity to build Sentiment Similarity

    In this article, we propose to evaluate the lexical similarity information provided by word representations against several opinion resources using traditional information retrieval tools. Word representations have been used to build and to extend opinion resources such as lexicons and ontologies, and their performance has been evaluated on sentiment analysis tasks. We question this method by measuring the correlation between the sentiment proximity provided by opinion resources and the semantic similarity provided by word representations, using different correlation coefficients. We also compare the neighbours found in word representations with lists of similar opinion words. Our results show that the proximity of words in state-of-the-art word representations is not very effective for building sentiment similarity.
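
    A minimal sketch of this kind of comparison is given below, assuming a gensim KeyedVectors model and a toy sentiment lexicon; the resources, word pairs and correlation coefficients evaluated in the paper are different.

    # Correlate cosine similarity from word vectors with sentiment proximity
    # from an opinion lexicon over word pairs; lexicon and vector file are placeholders.
    from itertools import combinations
    from gensim.models import KeyedVectors
    from scipy.stats import spearmanr

    SENTIMENT = {"good": 0.8, "great": 0.9, "nice": 0.6, "bad": -0.7, "awful": -0.9}

    vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # placeholder path

    pairs = [(a, b) for a, b in combinations(SENTIMENT, 2) if a in vectors and b in vectors]
    embedding_sim = [float(vectors.similarity(a, b)) for a, b in pairs]
    sentiment_sim = [1.0 - abs(SENTIMENT[a] - SENTIMENT[b]) / 2.0 for a, b in pairs]

    rho, pval = spearmanr(embedding_sim, sentiment_sim)
    print(f"Spearman correlation between embedding and sentiment similarity: {rho:.3f}")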

    FRASQUES, le système du groupe LIR

    No abstract available.