
19 research outputs found

Tracking Discourses on Public and Hidden People in Historical Newspapers

Author: Clematide Simon
Coll-Ardanuy Mariona
Maurer Yves
Publication venue: Schloss Dagstuhl — Leibniz-Zentrum für Informatik
Publication date: 01/01/2023
Field of study

ZORA

The Living Machine: A Computational Approach to the Nineteenth-Century Language of Technology

Author: Ahnert Ruth
Beelen Kaspar
Coll Ardanuy Mariona
McGillivray Barbara
Wilson Daniel
Publication venue
Publication date: 11/08/2023
Field of study

King's Research Portal

A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching

Author: Ardanuy Mariona Coll
Hosseini Kasra
Krause Amrey
McDonough Katherine
Nanni Federico
Van Strien Daniel
Publication venue
Publication date: 03/11/2020
Field of study

Recognizing toponyms and resolving them to their real-world referents is required to provide advanced semantic access to textual data. This process is often hindered by the high degree of variation in toponyms. Candidate selection is the task of identifying the potential entities that can be referred to by a previously recognized toponym. While it has traditionally received little attention, candidate selection has a significant impact on downstream tasks (i.e. entity resolution), especially in noisy or non-standard text. In this paper, we introduce a deep learning method for candidate selection through toponym matching, using state-of-the-art neural network architectures. We perform an intrinsic toponym matching evaluation based on several datasets, which cover various challenging scenarios (cross-lingual and regional variations, as well as OCR errors) and assess its performance in the context of geographical candidate selection in English and Spanish.
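To make the idea of candidate selection concrete, the following is a minimal sketch and not the paper's method: it swaps the deep learning matcher for a plain character-level similarity from Python's standard library (`difflib.SequenceMatcher`). The gazetteer entries, the OCR-style misspelling, and the 0.75 threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a learned matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_candidates(mention, gazetteer, threshold=0.75):
    """Return (name, score) pairs that meet the threshold, best match first."""
    scored = [(name, similarity(mention, name)) for name in gazetteer]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical gazetteer and a mention with an OCR-style error ("d" read as "cl").
gazetteer = ["London", "Londonderry", "Lisbon", "Lyon"]
candidates = select_candidates("Lonclon", gazetteer)
print(candidates)
```

A learned matcher would replace `similarity`; the thresholding and ranking around it would stay much the same.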

Edinburgh Research Explorer

Living Machines: A study of atypical animacy

Author: Ahnert Ruth
Beelen Kaspar
Coll Ardanuy Mariona
Hosseini Kasra
Lawrence Jon
McDonough Katherine
McGillivray Barbara
Nanni Federico
Tolfo Giorgia
Wilson Daniel CS
Publication venue: Proceedings of the 28th International Conference on Computational Linguistics
Publication date: 01/01/2020
Field of study

This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it, we have created the first dataset for atypical animacy detection, based on nineteenth-century sentences in English, with machines represented as either animate or inanimate. Our method builds on recent innovations in language modeling, specifically BERT contextualized word embeddings, to better capture fine-grained contextual properties of words. We present a fully unsupervised pipeline, which can be easily adapted to different contexts, and report its performance on an established animacy dataset and our newly introduced resource. We show that our method provides a substantially more accurate characterization of atypical animacy, especially when applied to highly complex forms of language use

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)

Queen Mary Research Online

Lancaster E-Prints

Jardins per a la salut (Gardens for Health)

Author: Albertí Sancho Blanca
Alguacil Aguilar Julia
Ardanuy Comas Albert
Arderiu Formentí Alba
Armengol Andos Rosa
Arribas Queralt Teresa
Bachiller García Mireia
Baladón Ramírez Jorge
Balcells de Martí Inés
Baldi Lartuna Judith
Bentanachs Raset Roger
Berredo Roldán Noelia
Berrio Avalos Víctor
Bolance Navarro Raquel
Bordoy Guerra Maria Milagros
Borràs Expósito Mireia
Borràs Rodrigo Marta
Caballero Prior Laura
Calafat Pla Joan Feliu
Calvo Silveria Sara
Canillas Mata Laura
Carbonell Vergés Núria
Cardoso Gasch Maria
Carrillo Ruíz Andrea
Casanovas Montasell Mireia
Castellà Soler Àngels
Cavero Garriga Eduard
Chanla Pizá Marina
Codina Jiménez Carla
Coll Satué Clara
Collado Lorenzo Jessica
Comajuan Mendoza Carla
Creixell Turón Anna
Desoi Artús Anna
Domingo Llopart Joan
Díaz Tejada Héctor
El Muhandiz Ikram
Escudero Rotger Jose María
Espinosa Busquets Martí
Estrada Nieto Lidia
Farré Altarriba Nil
Fernandez Martinez Marta
Fernández Tomás Carlos
Ferré Viña Gemma
Franco Fobe Laura
Franco Pons Clàudia
Fraschi Nieto Alex
Frigola Beván Gemma
Garcia Gonzalez Susana
Garcia Pastallé Arnau
Garcia Xipell Sandra
García Navarro Sandra
Gomez Olivella Adrià
González Cerezo Patricia
González Melarde Blanca
González Molina Paula
Grañana Castillo Sandra
Gurung Ashma
Gómez Folch Paula
Gómez i Serra Enric
Hermoso Gallego Yaiza
Hernández Hotter Elena
Hidalgo Josa Dana
Jaume Capó Marta
Jiménez Martín Paola
Jorba Cortada Cèlia
Lalueza Puértolas Jana
Lamiel Membrilla Alberto
Lasurt Barés Claudia
Llop Algueró Alba
Luque Gimeno Paula
López Sánchez Irene
Manchón Contreras Marc
Martell Alonso Clàudia
Martínez Castro Paula
Martínez Escobar Maria
Martínez Montañez Noelia
Masip Guasch Victor
Molina Pita Patricia
Mondejar Ferrer Júlia
Muro Blanc Elena
Oliva Falcó Laura
Ortuño Ruiz Yaiza
Padilla Patón Laura
Pagès Sanchis Marta
Palma Galeto Sara
Paredes García Maria Luisa
Pascual Carbonell David
Pegueroles Monllau Lluís
Picazos Muniesa Maria del Mar
Porta Magriña Maria
Pou Torres Pilar
Pradera Carazo Elena
Prat Castro Sandra
Puente Rodríguez Celia
Pérez Cao Gerard
Pérez Herrero Sara
Ramis Barceló Marina
Ramírez Martín Ana
Ramírez Rojo Paula
Redondo Vahle Ana
Rodríguez Sabates Mercè
Roig Rallo Laia
Roig Turner Gemma
Rosés Gimeno Marta
Ruyra Ripoll Jordi
Safont-Tria Jové Laura
San José Oliva Nerea
Sastre Gelabert Aina
Saurí Ramos Albert
Sempere Comet Anna
Sendín Emiliano Alejandro
Shults Vladyslav
Tarré Vandrell Gina
Tomàs Güell Núria
Torrandell i Haro Georgina
Torras Romero Mariona
Torres Solera Olga
Valero Via Eugènia
Vargas Guerras Pablo
Ventós Martí Laia
Vidosa Artigas Guillem
Villanova Errando Santiago
Álvarez Aunòs Maria
Álvarez Lorenzo Paula
Publication venue
Publication date: 01/07/2015
Field of study

Faculty of Pharmacy, Universitat de Barcelona. Degree: Bachelor's in Pharmacy. Course: Pharmaceutical Botany. Academic year: 2014-2015. Coordinators: Joan Simon, Cèsar Blanché and Maria Bosch. The materials presented here are the collected botanical fact sheets for 128 species found in the Jardí Ferran Soldevila of the UB Historic Building. Each sheet was prepared individually by students in groups M-3 and T-1 of the Pharmaceutical Botany course between February and May of the 2014-15 academic year, as the final outcome of the Teaching Innovation Project "Jardins per a la salut: aprenentatge servei a Botànica farmacèutica" (Gardens for Health: service learning in Pharmaceutical Botany; code 2014PID-UB/054). All the work was carried out on the GoogleDocs platform and supervised by the course instructors. The main aim of the activity was to foster autonomous and collaborative learning in Pharmaceutical Botany. It also sought to motivate students by returning part of their effort to society through a Service-Learning experience: the students' work is made available on a public website and can also be consulted on site in the garden itself by scanning QR codes with a smartphone.

Diposit Digital de la Universitat de Barcelona

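Del poder del llenguatge al llenguatge del poder : anàlisi del llenguatge i de la traducció de dues distopies (From the Power of Language to the Language of Power: An Analysis of the Language and Translation of Two Dystopias)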

Author: Coll Ardanuy Mariona
Publication venue
Publication date: 01/03/2010
Field of study

This work offers a critique and a proposed translation of two dystopian novels, Nineteen Eighty-Four by George Orwell and The Handmaid's Tale by Margaret Atwood, both of which treat language as one of the key elements of the totalitarian system they describe.

UPF Digital Repository

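Del poder del llenguatge al llenguatge del poder : anàlisi del llenguatge i de la traducció de dues distopies (From the Power of Language to the Language of Power: An Analysis of the Language and Translation of Two Dystopias)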

Author: Coll Ardanuy Mariona
Publication venue
Publication date
Field of study

RECERCAT

Datasets for toponym recognition and disambiguation for nineteenth-century English newspapers

Author: Coll Ardanuy Mariona
Nanni Federico
Publication venue: British Library
Publication date: 01/07/2023
Field of study

We present two datasets, one for the task of toponym recognition and one for the task of toponym disambiguation. The datasets are derived from the "Dataset for Toponym Resolution in Nineteenth-Century English Newspapers" (DOI: https://doi.org/10.23636/r7d4-kw08). The toponym recognition dataset consists of two JSON files (ner_fine_train.json and ner_fine_dev.json), whereas the toponym disambiguation dataset is provided as a TSV file (linking_df_split.tsv)
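The files named in the abstract could be read with the Python standard library along these lines. This is a sketch under assumptions: the abstract does not describe the internal schema of the JSON files, and the TSV is assumed here to carry a header row; `data_dir` is a hypothetical download location.

```python
import csv
import json
from pathlib import Path


def load_json(path):
    """Parse a JSON file and return whatever structure it contains."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def load_tsv(path):
    """Read a tab-separated file into a list of dicts (assumes a header row)."""
    with Path(path).open(encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical directory holding the downloaded files; adjust as needed.
data_dir = Path(".")
train_path = data_dir / "ner_fine_train.json"
linking_path = data_dir / "linking_df_split.tsv"

if train_path.exists():
    train = load_json(train_path)
    print(type(train).__name__, len(train))
if linking_path.exists():
    linking_rows = load_tsv(linking_path)
    print(len(linking_rows), "rows")
```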

Shared Research Repository

DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching

Author: Coll Ardanuy Mariona
Hosseini Kasra
Nanni Federico
Publication venue: Association for Computational Linguistics
Publication date: 01/10/2020
Field of study

We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach is especially useful where only limited training examples are available. The learned DeezyMatch models can be used to generate rich vector representations from string inputs. The candidate ranker component in DeezyMatch uses these vector representations to find, for a given query, the best matching candidates in a knowledge base. It uses an adaptive searching algorithm applicable to large knowledge bases and query sets. We describe DeezyMatch’s functionality, design and implementation, accompanied by a use case in toponym matching and candidate ranking in realistic noisy datasets
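The candidate-ranking idea in this abstract (turn strings into vectors, then rank knowledge-base entries by vector similarity to the query) can be sketched as follows. This does not use the DeezyMatch API: character bigram counts stand in for the learned vector representations, and an exhaustive cosine-similarity scan stands in for the adaptive search. The knowledge base and query are illustrative assumptions.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Bag of character n-grams; a stand-in for a learned string embedding."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(query, knowledge_base, top_k=3):
    """Score every entry against the query and return the top_k best."""
    query_vec = char_ngrams(query)
    scored = [(entry, cosine(query_vec, char_ngrams(entry)))
              for entry in knowledge_base]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical knowledge base; the query imitates a long-s OCR error.
knowledge_base = ["Manchester", "Winchester", "Chester", "Leicester"]
ranked = rank_candidates("Manchefter", knowledge_base, top_k=2)
print(ranked)
```

Scanning every entry is fine for a toy list like this one; for large knowledge bases the abstract describes an adaptive search instead.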