A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching
Recognizing toponyms and resolving them to their real-world referents is required to provide advanced semantic access to textual data. This process is often hindered by the high degree of variation in toponyms. Candidate selection is the task of identifying the potential entities that can be referred to by a previously recognized toponym. While it has traditionally received little attention, candidate selection has a significant impact on downstream tasks (i.e. entity resolution), especially in noisy or non-standard text. In this paper, we introduce a deep learning method for candidate selection through toponym matching, using state-of-the-art neural network architectures. We perform an intrinsic toponym matching evaluation based on several datasets, which cover various challenging scenarios (cross-lingual and regional variations, as well as OCR errors), and assess the method's performance in the context of geographical candidate selection in English and Spanish.
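Candidate selection is often bootstrapped with a simple string-similarity baseline before a learned matcher is applied. The sketch below is a minimal illustration of that baseline idea only, not the paper's neural method; the gazetteer entries and threshold are invented for the example.

```python
from difflib import SequenceMatcher

# Toy gazetteer: the place names here are illustrative, not from the paper.
GAZETTEER = ["London", "Londonderry", "New London", "Paris", "Lisbon"]

def select_candidates(toponym, gazetteer, threshold=0.7):
    """Return gazetteer entries whose string similarity to the recognized
    toponym exceeds a threshold (a simple baseline; the paper replaces
    this with a trained neural matcher)."""
    scored = []
    for entry in gazetteer:
        score = SequenceMatcher(None, toponym.lower(), entry.lower()).ratio()
        if score >= threshold:
            scored.append((entry, score))
    # Highest-scoring candidates first, for downstream entity resolution.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# An OCR-garbled spelling still surfaces the right candidate.
print(select_candidates("Lond0n", GAZETTEER))
```

Even this crude measure tolerates the OCR-style noise mentioned in the abstract, which is why it serves as a useful point of comparison for learned matchers.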
Library Carpentry: software skills training for library professionals
Librarians play a crucial role in cultivating world-class research, and in most disciplinary areas today world-class research relies on the use of software. This paper describes Library Carpentry, an introductory software skills training programme with a focus on the needs and requirements of library and information professionals. Using Library Carpentry as a case study of the development and delivery of software-skills-focused professional development, this paper describes the institutional and intellectual contexts in which Library Carpentry was conceived, the syllabus used for the initial exploratory programme, the administrative apparatus through which the programme was delivered, and the analysis of data collection exercises conducted during the programme. As many university librarians already have substantial expertise working with data, it argues that adding software skills (that is, coding and data manipulation that goes beyond the use of familiar office suites) to their armoury is an effective and important use of professional development resources.
Datasheets for Digital Cultural Heritage Datasets
Sparked by issues of quality and lack of proper documentation for datasets, the machine learning community has begun developing standardised processes for establishing datasheets for machine learning datasets, with the intent to provide context and information on provenance, purposes, composition, the collection process, recommended uses and societal biases reflected in training datasets. This approach fits well with practices and procedures established in GLAM institutions, such as establishing collections' descriptions. However, digital cultural heritage datasets are marked by specific characteristics: they are often the product of multiple layers of selection; they may have been created for purposes other than establishing a statistical sample according to a specific research question; and they change over time and are heterogeneous. Punctuated by a series of recommendations to create datasheets for digital cultural heritage, the paper addresses the scope and characteristics of digital cultural heritage datasets; possible metrics and measures; and lessons from concepts similar to datasheets and/or established workflows in the cultural heritage sector. This paper includes a proposal for a datasheet template that has been adapted for use in cultural heritage institutions, and which proposes to incorporate information on the motivation and selection criteria, digitisation pipeline, data provenance, the use of linked open data, and version information.
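The template fields named in the abstract can be pictured as a lightweight structured record. The sketch below is a hypothetical rendering of those fields; the field names follow the abstract, but the layout and example values are assumptions, not the paper's actual template.

```python
from dataclasses import dataclass, field

# Hypothetical datasheet record for a digital cultural heritage dataset.
# Field names mirror the abstract; values are illustrative placeholders.
@dataclass
class HeritageDatasheet:
    motivation: str             # why the dataset was created
    selection_criteria: str     # how items were chosen (layers of selection)
    digitisation_pipeline: str  # scanning/OCR steps that produced the data
    data_provenance: str        # institutional origin of the source material
    linked_open_data: list = field(default_factory=list)  # external vocabularies linked
    version: str = "1.0"        # datasets change over time, so record a version

sheet = HeritageDatasheet(
    motivation="Illustrative placeholder only",
    selection_criteria="Items digitised during an earlier microfilming project",
    digitisation_pipeline="Microfilm scan followed by OCR",
    data_provenance="A national library's newspaper collection",
)
print(sheet.version)
```

Keeping such a record versioned alongside the dataset addresses the point that heritage datasets, unlike fixed statistical samples, continue to change over time.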
Protein status elicits compensatory changes in food intake and food preferences
Background: Protein is an indispensable component within the human diet. It is unclear, however, whether behavioral strategies exist to avoid shortages.
Long-term and large-scale multispecies dataset tracking population changes of common European breeding birds
Around fifteen thousand fieldworkers annually count breeding birds using standardized protocols in 28 European countries. The observations are collected by using country-specific and standardized protocols, validated, summarized and finally used for the production of continent-wide annual and long-term indices of population size changes of 170 species. Here, we present the database and provide a detailed summary of the methodology used for fieldwork and calculation of the relative population size change estimates. We also provide a brief overview of how the data are used in research, conservation and policy. We believe this unique database, based on decades of bird monitoring alongside the comprehensive summary of its methodology, will facilitate and encourage further use of the Pan-European Common Bird Monitoring Scheme results.
Transglutaminase 6: a protein associated with central nervous system development and motor function.
Transglutaminases (TG) form a family of enzymes that catalyse various post-translational modifications of glutamine residues in proteins and peptides, including intra- and intermolecular isopeptide bond formation, esterification and deamidation. We have characterized a novel member of the mammalian TG family, TG6, which is expressed in a human carcinoma cell line with neuronal characteristics and in mouse brain. Besides full-length protein, alternative splicing results in a short variant lacking the second β-barrel domain in man and a variant with a truncated β-sandwich domain in mouse. Biochemical data show that TG6 is allosterically regulated by Ca(2+) and guanine nucleotides. Molecular modelling indicates that TG6 could have Ca(2+) and GDP-binding sites related to those of TG3 and TG2, respectively. Localization of mRNA and protein in the mouse identified abundant expression of TG6 in the central nervous system. Analysis of its temporal and spatial pattern of induction in mouse development indicates an association with neurogenesis. Neuronal expression of TG6 was confirmed by double-labelling of mouse forebrain cells with cell type-specific markers. Induction of differentiation in mouse Neuro 2a cells with NGF or dibutyryl cAMP is associated with an upregulation of TG6 expression. Familial ataxia has recently been linked to mutations in the TGM6 gene. Autoantibodies to TG6 were identified in immune-mediated ataxia in patients with gluten sensitivity. These findings suggest a critical role for TG6 in cortical and cerebellar neurons.
Molecularly defined circuitry reveals input-output segregation in deep layers of the medial entorhinal cortex
Deep layers of the medial entorhinal cortex (MEC) are considered to relay signals from the hippocampus to other brain structures, but pathways for routing of signals to and from the deep layers are not well established. Delineating these pathways is important for a circuit-level understanding of spatial cognition and memory. We find that neurons in layers 5a and 5b have distinct molecular identities, defined by the transcription factors Etv1 and Ctip2, and divergent targets, with extensive intratelencephalic projections originating in layer 5a, but not 5b. This segregation of outputs is mirrored by the organization of glutamatergic input from stellate cells in layer 2 and from the hippocampus, with both preferentially targeting layer 5b over 5a. Our results suggest a molecular and anatomical organization of input-output computations in deep layers of the MEC, reveal precise translaminar microcircuitry, and identify molecularly defined pathways for spatial signals to influence computation in deep layers.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Large language models (LLMs) have been shown to be able to perform new tasks
based on a few demonstrations or natural language instructions. While these
capabilities have led to widespread adoption, most LLMs are developed by
resource-rich organizations and are frequently kept from the public. As a step
towards democratizing this powerful technology, we present BLOOM, a
176B-parameter open-access language model designed and built thanks to a
collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer
language model that was trained on the ROOTS corpus, a dataset comprising
hundreds of sources in 46 natural and 13 programming languages (59 in total).
We find that BLOOM achieves competitive performance on a wide variety of
benchmarks, with stronger results after undergoing multitask prompted
finetuning. To facilitate future research and applications using LLMs, we
publicly release our models and code under the Responsible AI License
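The "few demonstrations" framing in the abstract refers to in-context learning: the model is shown a handful of solved examples in its prompt before the actual query. A minimal sketch of building such a prompt follows; the task and demonstration pairs are invented for illustration, and no model is called here.

```python
def build_few_shot_prompt(demonstrations, query):
    """Format demonstration pairs and a final query into a single
    prompt string, as used for in-context learning with LLMs."""
    lines = []
    for text, label in demonstrations:
        lines.append(f"Input: {text}\nOutput: {label}")
    # The model is expected to continue the text after the final "Output:".
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# Invented sentiment-style demonstrations, purely for illustration.
demos = [("I loved this film", "positive"), ("Terribly boring", "negative")]
prompt = build_few_shot_prompt(demos, "A delightful surprise")
print(prompt)
```

The same prompt-construction pattern works with any autoregressive LLM: the completed examples condition the model to produce a label for the final, unanswered input.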