The effect of word sense disambiguation accuracy on literature based discovery
Background
The volume of research published in the biomedical domain has increasingly led researchers to focus on specific areas of interest, with the result that connections between findings are missed. Literature based discovery (LBD) attempts to address this problem by searching for previously unnoticed connections between published information (also known as "hidden knowledge"). A common approach is to identify hidden knowledge via shared linking terms. However, biomedical documents are highly ambiguous, which can lead LBD systems to over-generate hidden knowledge by hypothesising connections through different meanings of linking terms. Word Sense Disambiguation (WSD) aims to resolve ambiguities in text by identifying the meaning of ambiguous terms. This study explores the effect of WSD accuracy on LBD performance.
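The shared-linking-term idea can be sketched minimally as a set intersection; the function name and all terms below are illustrative, not the system's actual implementation.

```python
# Hypothetical sketch of the "linking term" model of literature based
# discovery: a hidden A-C connection is hypothesised when some term B
# co-occurs with A in one literature and with C in another.

def linking_terms(cooccurs_with_a, cooccurs_with_c):
    """Return candidate linking (B) terms shared by the A- and C-literatures."""
    return sorted(set(cooccurs_with_a) & set(cooccurs_with_c))

# Toy example of how ambiguity over-generates hidden knowledge: "cold"
# (the illness vs. the temperature) creates a bridge that WSD would remove.
a_terms = {"fish oil", "blood viscosity", "cold"}
c_terms = {"Raynaud's disease", "vasoconstriction", "cold"}
print(linking_terms(a_terms, c_terms))  # -> ['cold']
```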
Methods
An existing LBD system is employed, and four approaches to WSD of biomedical documents are integrated with it. The accuracy of each WSD approach is determined by comparing its output against a standard benchmark. The LBD output is evaluated using a time-slicing approach, in which hidden knowledge is generated from articles published before a cutoff date and a gold standard is extracted from publications after the cutoff date.
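The time-slicing protocol can be illustrated end-to-end on a toy corpus; the dates, terms, and corpus below are invented for illustration and are not the paper's data.

```python
from datetime import date
from itertools import combinations

# Minimal sketch of time-sliced evaluation: predict hidden-knowledge pairs
# from pre-cutoff articles, score them against pairs that first co-occur
# after the cutoff. All data here is illustrative.
cutoff = date(2000, 1, 1)
articles = [
    (date(1995, 3, 1), {"A", "B"}),  # pre-cutoff: A co-occurs with B
    (date(1998, 7, 1), {"B", "C"}),  # pre-cutoff: B co-occurs with C
    (date(2003, 5, 1), {"A", "C"}),  # post-cutoff: forms the gold standard
]
pre = [terms for d, terms in articles if d < cutoff]
post = [terms for d, terms in articles if d >= cutoff]

# Co-occurrence pairs observed before the cutoff
cooccur = {pair for terms in pre for pair in combinations(sorted(terms), 2)}
vocab = {t for terms in pre for t in terms}

# Hidden knowledge: pairs not yet co-occurring but bridged by a linking term
predicted = {
    (a, c)
    for a, c in combinations(sorted(vocab), 2)
    if (a, c) not in cooccur
    and any(tuple(sorted((a, b))) in cooccur and tuple(sorted((b, c))) in cooccur
            for b in vocab - {a, c})
}

gold = {pair for terms in post for pair in combinations(sorted(terms), 2)}
precision = len(predicted & gold) / len(predicted)
print(predicted, precision)  # -> {('A', 'C')} 1.0
```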
Results
WSD accuracy varies depending on the approach used. The connection between the performance of the LBD and WSD systems is analysed, revealing a correlation between WSD accuracy and LBD performance.
Conclusion
This study reveals that LBD performance is sensitive to WSD accuracy. It is therefore concluded that WSD has the potential to improve the output of LBD systems by reducing the amount of spurious hidden knowledge that is generated. It is also suggested that further improvements in WSD accuracy have the potential to improve LBD accuracy.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Planar Hall effect in vacuum-deposited ferromagnetic thin films
An experimental study of ferromagnetic thin films has been undertaken as a function of evaporation conditions, film thickness, composition, and electrode shape. Anisotropic Ni-Fe films with various additions of Pd, V, Co, and Mo showed a maximum planar Hall effect for the composition Ni-Fe 86/14. The optimization of the geometrical parameters of the electrodes and of the magnetic film elements is described, allowing one to design for maximum output voltage or for maximum output current in a short-circuited loop.
Joining Forces To Resolve Lexical Ambiguity: East Meets West In Barcelona
This paper describes the component models and the combination model built as a joint effort between Swarthmore College, Hong Kong PolyU, and HKUST. Though other models described elsewhere contributed to the final combination model, this paper focuses solely on the joint contributions to the "Swat-HK" effort.
UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish
The present study describes our submission to SemEval 2018 Task 1: Affect in Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) translating training data from other languages and (ii) applying a semi-supervised learning method. We find strong support for both approaches, with these models outperforming our regular models in all subtasks. However, creating a stepwise ensemble of different models, as opposed to simply averaging them, did not increase performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and fifth (V-Oc) in the four Spanish subtasks in which we participated.
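The "simply averaging" baseline the abstract compares against can be sketched as a mean over per-model predictions; the function name and the per-model scores below are invented for illustration, not the submission's actual outputs.

```python
# Hedged sketch of an averaging ensemble for emotion-intensity regression:
# the final score for each tweet is the mean of each model's prediction.

def average_ensemble(predictions):
    """Average per-tweet intensity scores across models.

    predictions: list of per-model score lists, all the same length.
    """
    n_models = len(predictions)
    return [sum(scores) / n_models for scores in zip(*predictions)]

model_a = [0.2, 0.8, 0.5]  # e.g. a model using translated training data
model_b = [0.4, 0.6, 0.7]  # e.g. a semi-supervised model
print(average_ensemble([model_a, model_b]))
```

A stepwise ensemble would instead add models one at a time, keeping each only if it improves a held-out score; the abstract reports that this extra machinery did not beat the plain average.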