
    Proceedings of the Eighth Italian Conference on Computational Linguistics CLiC-it 2021

    The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26 to 28 January 2022. After the 2020 edition, which was held in fully virtual mode due to the Covid-19 health emergency, CLiC-it 2021 was the first opportunity for the Italian Computational Linguistics research community to meet in person after more than a year of full or partial lockdown.

    Workshop Proceedings of the 12th edition of the KONVENS conference

    The 2014 edition of KONVENS is, even more than before, a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies that such interaction, cooperation, and integrated views can produce. This topic, at the crossroads of research traditions that treat natural language as a container of knowledge and develop methods to extract and manage linguistically represented knowledge, is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute's research topics and has received even more attention in recent years.

    24th Nordic Conference on Computational Linguistics (NoDaLiDa)


    Decoding Legalese Without Borders: Multilingual Evaluation of Language Models on Long Legal Texts

    Pretrained transformers have sparked an explosion of research in Natural Language Processing (NLP). Scaling up transformer-based language models in size, compute, and data has led to impressive emergent capabilities that, a mere three years earlier, before the launch of GPT-3, were considered unattainable. These advances catapulted the previously niche field of legal NLP into the mainstream, at the latest when GPT-4 passed the bar exam. Products based on GPT-4 and other large language models (LLMs) are entering the market at an increasing pace, many of them targeting the legal field. This dissertation contributes to two key areas of NLP for legal text: resource curation and detailed model analysis. First, we curate an extensive set of multilingual legal datasets, train a variety of language models on them, and establish comprehensive benchmarks for evaluating LLMs in the legal domain. Second, we conduct a multidimensional analysis of model performance, focusing on metrics such as explainability and calibration in the context of Legal Judgment Prediction. We introduce novel evaluation frameworks and find that while our trained models achieve high performance and better calibration than human experts, they do not necessarily offer improved explainability. Furthermore, we investigate the feasibility of re-identification in anonymized legal texts, concluding that large-scale re-identification using LLMs is currently infeasible. For future work, we propose exploring domain adaptation and instruction tuning to enhance language model performance on legal benchmarks, and advocate a detailed examination of dataset overlaps and model interpretability. Additionally, we emphasize the need to extend datasets to unexplored legal tasks and underrepresented jurisdictions, aiming for more comprehensive coverage of the global legal landscape in NLP resources.