63 research outputs found

Effectiveness of Data Augmentation and Pretraining for Improving Neural Headline Generation in Low-Resource Settings

Author: Martinc Matej
Montariol Syrielle
Pivovarova Lidia
Zosa Elaine
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/07/2022

We tackle the problem of neural headline generation in a low-resource setting, where only a limited amount of data is available to train a model. We compare the ideal high-resource scenario on English with results obtained on a smaller subset of the same data and also run experiments on two small news corpora covering low-resource languages, Croatian and Estonian. Two options for headline generation in a multilingual low-resource scenario are investigated: a pretrained multilingual encoder-decoder model and a combination of two pretrained language models, one used as an encoder and the other as a decoder, connected with a cross-attention layer that needs to be trained from scratch. The results show that the first approach outperforms the second one by a large margin. We explore several data augmentation and pretraining strategies in order to improve the performance of both models and show that while we can drastically improve the second approach using these strategies, they have little to no effect on the performance of the pretrained encoder-decoder model. Finally, we propose two new measures for evaluating the performance of the models besides the classic ROUGE scores. Peer reviewed

Helsingin yliopiston digitaalinen arkisto

CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks

Author: Bosselut Antoine
Geva Mor
Ismayilzada Mete
Montariol Syrielle
Paul Debjit
Publication date: 23/10/2023

Recent efforts in natural language processing (NLP) commonsense reasoning research have yielded a considerable number of new datasets and benchmarks. However, most of these datasets formulate commonsense reasoning challenges in artificial scenarios that are not reflective of the tasks which real-world NLP systems are designed to solve. In this work, we present CRoW, a manually-curated, multi-task benchmark that evaluates the ability of models to apply commonsense reasoning in the context of six real-world NLP tasks. CRoW is constructed using a multi-stage data collection pipeline that rewrites examples from existing datasets using commonsense-violating perturbations. We use CRoW to study how NLP systems perform across different dimensions of commonsense knowledge, such as physical, temporal, and social reasoning. We find a significant performance gap when NLP systems are evaluated on CRoW compared to humans, showcasing that commonsense reasoning is far from being solved in real-world task settings. We make our dataset and leaderboard available to the research community at https://github.com/mismayil/crow. Comment: 37 pages, camera-ready for EMNLP 202

arXiv.org e-Print Archive

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

Author: Bayazit Deniz
Bonnet Antoine
Bosselut Antoine
Cano Alejandro Hernández
Chen Zeming
Fan Simin
Hartley Mary-Anne
Jaggi Martin
Krawczuk Igor
Köpf Andreas
Marmet Axel
Matoba Kyle
Mohtashami Amirkeivan
Montariol Syrielle
Pagliardini Matteo
Romanou Angelika
Sakhaeirad Alireza
Sallinen Alexandre
Salvi Francesco
Swamy Vinitra
Publication date: 27/11/2023

Large language models (LLMs) can potentially democratize access to medical knowledge. While many efforts have been made to harness and improve LLMs' medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities. In this work, we improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain. MEDITRON builds on Llama-2 (through our adaptation of Nvidia's Megatron-LM distributed trainer), and extends pretraining on a comprehensively curated medical corpus, including selected PubMed articles, abstracts, and internationally-recognized medical guidelines. Evaluations using four major medical benchmarks show significant performance gains over several state-of-the-art baselines before and after task-specific finetuning. Overall, MEDITRON achieves a 6% absolute performance gain over the best public baseline in its parameter class and 3% over the strongest baseline we finetuned from Llama-2. Compared to closed-source LLMs, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of Med-PaLM-2. We release our code for curating the medical pretraining corpus and the MEDITRON model weights to drive open-source development of more capable medical LLMs.

arXiv.org e-Print Archive

Positron annihilation studies of recrystallization in the subsurface zone induced by friction in magnesium—effect of the inhomogeneity on measured positron annihilation characteristics

Author: A. Dupasquier
A.R. Barnett
B. Bergersen
B. Evans
B. Oberdorfer
C. Hübner
D.C. Connors
E. Dryzek
F. Göhler Von
F. Haessner
F. Montariol
F.J. Humphreys
G. Dlubek
G. Kögel
H. Hansen
J. Dryzek
J. Kansy
J. Rio del
J. Čižek
J.C. Nicoud
J.E. Burke
J.G. Byrne
J.M. Campillo Robles
J.P. Hirth
Jerzy Dryzek
K. Detert
K. Petersen
M. Myllylä
P. Hautojärvi
R. Würschum
S. Tanigawa
U. Holzwarth
W. Brandt
Publication venue: Springer Science and Business Media LLC

Crossref

Municipal Library of Toulouse

Author: Montariol Jean
Publication date: 01/01/1935

Detail, west side; Jean Montariol was the chief architect of the city of Toulouse from 1929 to 1949. The library collection contains historical manuscripts and other works once owned by various regional monasteries and churches, confiscated in the French Revolution, as well as modern regional works and archives. The library opened one of the first Children's Libraries in France in 1940 (a division in the building). Source: Wikipedia; http://en.wikipedia.org/wiki/Main_Page (accessed 5/18/2011)

MIT Libraries Dome