
    Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments

    Fixed-length embeddings of words are very useful for a variety of tasks in speech and language processing. Here we systematically explore two methods of computing fixed-length embeddings for variable-length sequences. We evaluate their susceptibility to phonetic and speaker-specific variability on English, a high-resource language, and Xitsonga, a low-resource language, using two evaluation metrics: ABX word discrimination and ROC-AUC on same-different phoneme n-grams. We show that a simple downsampling method supplemented with length information can outperform the variable-length input feature representation on both evaluations. Recurrent autoencoders, trained without supervision, can yield even better results at the expense of increased computational complexity.
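
    As a rough sketch of the downsampling baseline described above, the snippet below maps a variable-length matrix of acoustic frames (e.g. MFCCs) to a fixed-size vector by keeping a fixed number of evenly spaced frames and appending the segment length as an extra feature. The function name, frame dimensionality, and number of kept frames are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def downsample_embedding(frames: np.ndarray, n_keep: int = 10) -> np.ndarray:
    """Map a variable-length sequence of acoustic frames (T x D) to a
    fixed-length vector: sample n_keep frames at evenly spaced positions,
    flatten them, and append the original length as an extra feature."""
    T, _ = frames.shape
    idx = np.linspace(0, T - 1, num=n_keep).round().astype(int)  # evenly spaced frame indices
    embedding = frames[idx].reshape(-1)                          # (n_keep * D,)
    return np.concatenate([embedding, [float(T)]])               # append length information

# Example: a 57-frame segment of 13-dimensional MFCC-like features
# becomes a 10 * 13 + 1 = 131-dimensional embedding.
segment = np.random.randn(57, 13)
print(downsample_embedding(segment).shape)  # (131,)
```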

    LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models

    The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning, which distinguish between its many forms, correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.
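
    As a minimal sketch of how a single LegalBench-style classification task might be scored with a zero-shot prompt, the snippet below loads one task and computes accuracy against gold answers. The Hugging Face dataset id, task name, column names, and label set are assumptions about the public LegalBench release and should be checked against the benchmark's repository; the `predict` callable stands in for whatever LLM is being evaluated.

```python
from datasets import load_dataset

# Dataset id, task name, and column names are assumptions -- verify them
# against the LegalBench repository before relying on this sketch.
task = load_dataset("nguha/legalbench", "abercrombie", split="test")

def build_prompt(example: dict) -> str:
    # Minimal zero-shot prompt: instruction plus the input text.
    return (
        "Classify the trademark below as one of: generic, descriptive, "
        "suggestive, arbitrary, fanciful.\n\n"
        f"Mark: {example['text']}\nAnswer:"
    )

def accuracy(predict, dataset) -> float:
    """predict: a callable mapping a prompt string to a label string (an LLM wrapper)."""
    correct = sum(
        predict(build_prompt(ex)).strip().lower() == ex["answer"].strip().lower()
        for ex in dataset
    )
    return correct / len(dataset)
```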

    Computational Statutory Reasoning

    Statutory reasoning is the task of determining how laws apply to a legal case. This is a basic skill for lawyers and, in its computational form, a fundamental task for legal Artificial Intelligence (AI) systems. In this thesis, I take a first step towards solving computational statutory reasoning. First, I define computational statutory reasoning in the context of legal practice and of AI more broadly (Chapter 1). I detail why statutory reasoning is important for legal AI, and how solving it will require addressing multiple problems in general AI research. In Chapter 2, I review research in reasoning, knowledge representation and legal AI that is relevant to the work in this thesis. My first contribution is the StAtutory Reasoning Assessment dataset (SARA), a benchmark dataset for computational statutory reasoning (Chapter 3). With the ability to measure performance on statutory reasoning, I show how a symbolic system can solve the SARA dataset, while state-of-the-art Machine Reading (MR) struggles. In Chapter 4, I connect statutory reasoning to established natural language processing tasks, in an attempt to diagnose MR errors. This yields more annotations on SARA, and a performance boost compared to my initial MR baselines. In Chapter 5, I return to the symbolic approach of Chapter 3. Revising the ontology used in Chapter 3, I introduce models for Information Extraction (IE) from SARA cases. The attained performance opens up new perspectives on how to solve statutory reasoning. I close by summarizing the contributions of this thesis, and expanding on possible future research (Chapter 6).
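
    To make the idea of a symbolic approach concrete, here is a toy Python analogue of encoding a fictional statutory provision as an executable predicate over case facts; it is not the thesis's actual system, ontology, or rule language, and the provision and facts below are invented purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Case:
    # Invented, simplified case facts for illustration only.
    is_married: bool
    spouse_died_during_year: bool
    filed_jointly: bool

def toy_joint_return_provision(case: Case) -> bool:
    """Toy statutory predicate (NOT real tax law): a joint return is valid
    if the taxpayers are married, or one spouse died during the year,
    and a joint return was in fact filed."""
    return (case.is_married or case.spouse_died_during_year) and case.filed_jointly

print(toy_joint_return_provision(Case(True, False, True)))   # True
print(toy_joint_return_provision(Case(False, False, True)))  # False
```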

    Asking the Right Questions in Low Resource Template Extraction

    Information Extraction (IE) researchers are mapping tasks to Question Answering (QA) in order to leverage existing large QA resources and thereby improve data efficiency. Especially in template extraction (TE), mapping an ontology to a set of questions can be more time-efficient than collecting labeled examples. We ask whether end users of TE systems can design these questions, and whether it is beneficial to involve an NLP practitioner in the process. We compare questions to other ways of phrasing natural language prompts for TE. We propose a novel model to perform TE with prompts, and find that it benefits from questions over other styles of prompts, and that questions do not require an NLP background to author.
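
    The sketch below illustrates the general slot-as-question idea with an off-the-shelf extractive QA model from the transformers library; the slot names, questions, model checkpoint, and confidence threshold are illustrative assumptions, not the paper's proposed TE model or ontology.

```python
from transformers import pipeline

# Hypothetical slot-to-question mapping for a toy "acquisition" template.
slot_questions = {
    "acquirer": "Which company made the acquisition?",
    "acquired": "Which company was acquired?",
    "price": "How much was paid for the acquisition?",
}

# Off-the-shelf extractive QA model; any SQuAD-style checkpoint would do.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def extract_template(document: str, threshold: float = 0.3) -> dict:
    """Fill each slot by asking its question against the document;
    leave the slot empty when the model's confidence is low."""
    filled = {}
    for slot, question in slot_questions.items():
        result = qa(question=question, context=document)
        filled[slot] = result["answer"] if result["score"] >= threshold else None
    return filled

print(extract_template("Acme Corp acquired Widget Inc. for $2 billion in March."))
```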

    Can GPT-3 Perform Statutory Reasoning?

    Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature. It is a basic legal skill. In this paper we explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA. We consider a variety of approaches, including dynamic few-shot prompting, chain-of-thought prompting, and zero-shot prompting. While we achieve results with GPT-3 that are better than the previous best published results, we also identify several types of clear errors it makes. We investigate why these errors happen. We discover that GPT-3 has imperfect prior knowledge of the actual U.S. statutes on which SARA is based. More importantly, we create simple synthetic statutes, which GPT-3 is guaranteed not to have seen during training. We find GPT-3 performs poorly at answering straightforward questions about these simple synthetic statutes.
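
    As a rough illustration of the synthetic-statute probe, the sketch below builds a chain-of-thought style prompt from an invented statute and case and sends it through the legacy (pre-1.0) openai Completions interface that matched the text-davinci-003 era; the statute, case, and prompt wording are made up, text-davinci-003 has since been deprecated, and an API key is assumed to be set in the environment.

```python
import openai  # legacy (pre-1.0) interface; assumes OPENAI_API_KEY is set

# An invented synthetic statute and case, in the spirit of the paper's probe;
# the wording is made up for illustration and is not taken from SARA.
statute = (
    "Section 101. A person is a qualifying resident if the person was "
    "present in the state for at least 200 days during the year."
)
case = "Alice was present in the state for 150 days in 2017."
question = "Is Alice a qualifying resident under Section 101? Answer yes or no."

# Chain-of-thought style prompt: ask the model to reason before answering.
prompt = f"{statute}\n\n{case}\n\n{question}\nLet's think step by step."

response = openai.Completion.create(
    model="text-davinci-003",  # deprecated; substitute a current model
    prompt=prompt,
    temperature=0,
    max_tokens=256,
)
print(response["choices"][0]["text"])
```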