Multilingual enrichment of disease biomedical ontologies

Authors: Bonnefoy Antoine, Bouscarrat Léo, Capponi Cécile, Ramisch Carlos
Publication venue: HAL CCSD
Publication date: 07/04/2020

Translating biomedical ontologies is an important challenge, but doing it manually requires much time and money. We study the possibility of using open-source knowledge bases to translate biomedical ontologies. We focus on two aspects: coverage and quality. We measure the coverage of two disease-focused biomedical ontologies with respect to Wikidata for 9 European languages (Czech, Dutch, English, French, German, Italian, Polish, Portuguese and Spanish) for both ontologies, plus Arabic, Chinese and Russian for the second one. We first use direct links between Wikidata and the studied ontologies, and then second-order links that pass through other intermediate ontologies. We then compare the quality of the translations obtained from Wikidata with those of a commercial machine translation tool, here Google Cloud Translation.
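The coverage measure described in this abstract can be illustrated with a minimal sketch. It assumes coverage means the fraction of ontology concepts that reach at least one Wikidata item carrying a label in the target language, either through a direct link or through an intermediate ontology. The function name, the data layout, and all identifiers below are hypothetical toy values, not the authors' code or data.

```python
def coverage(concepts, direct, via_other, other_to_item, labels, lang,
             use_second_order=True):
    """Fraction of concepts with at least one label in `lang`.

    direct:        concept id -> list of knowledge-base item ids (first-order links)
    via_other:     concept id -> list of ids in an intermediate ontology
    other_to_item: intermediate id -> list of knowledge-base item ids
    labels:        item id -> {language code: label}
    """
    if not concepts:
        return 0.0
    covered = 0
    for concept in concepts:
        items = set(direct.get(concept, ()))
        if use_second_order:
            # Second-order links: concept -> intermediate ontology -> item
            for other_id in via_other.get(concept, ()):
                items.update(other_to_item.get(other_id, ()))
        if any(lang in labels.get(item, {}) for item in items):
            covered += 1
    return covered / len(concepts)


# Made-up toy data, for illustration only
concepts = ["C1", "C2", "C3", "C4"]
direct = {"C1": ["Q_a"]}
via_other = {"C2": ["X9"], "C3": ["X7"]}
other_to_item = {"X9": ["Q_b"]}
labels = {"Q_a": {"en": "label a", "fr": "libellé a"}, "Q_b": {"en": "label b"}}

first_order_en = coverage(concepts, direct, via_other, other_to_item, labels,
                          "en", use_second_order=False)
second_order_en = coverage(concepts, direct, via_other, other_to_item, labels, "en")
second_order_fr = coverage(concepts, direct, via_other, other_to_item, labels, "fr")
```

In this toy setup, adding second-order links can only keep or raise coverage, since the set of reachable items grows.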

arXiv.org e-Print Archive

HAL AMU

AMU-EURANOVA at CASE 2021 Task 1: Assessing the stability of multilingual BERT

Authors: Bonnefoy Antoine, Bouscarrat Léo, Capponi Cécile, Ramisch Carlos
Publication venue: HAL CCSD
Publication date: 10/06/2021

This paper describes our participation in Task 1 of the CASE 2021 shared task, which addresses multilingual event extraction from news. We focused on sub-task 4, event information extraction. This sub-task has a small training dataset, and we fine-tuned a multilingual BERT model to solve it. We studied the instability problem on the dataset and tried to mitigate it.

arXiv.org e-Print Archive

HAL AMU

Pruning Random Forest with Orthogonal Matching Trees

Authors: Bouscarrat Léo, Cherfaoui Farah, Giffon Luc, Koço Sokol, Lamothe Charly, Milanesi Paolo
Publication venue: HAL CCSD
Publication date: 15/04/2020

In this paper we propose a new method to reduce the size of Breiman's Random Forests. Given a Random Forest and a target size, our algorithm builds a linear combination of trees that minimizes the training error. The selected trees, as well as the weights of the linear combination, are obtained by means of the Orthogonal Matching Pursuit (OMP) algorithm. We test our method on many public benchmark datasets, on both regression and binary classification, and compare it to other pruning techniques. Experiments show that our technique performs significantly better or equally well on many datasets. We also discuss the benefits and shortcomings of learning weights for the pruned forest, which leads us to propose a non-negativity constraint on the OMP weights for better empirical results.
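The selection step described in this abstract can be sketched as plain Orthogonal Matching Pursuit over per-tree predictions. This is a minimal illustration, not the authors' implementation: it assumes the forest is given as a matrix `tree_preds` of shape (samples, trees) holding each tree's training predictions, the function name `omp_select_trees` is hypothetical, the synthetic "trees" are noisy copies of the target, and the non-negative variant mentioned in the abstract is not shown.

```python
import numpy as np


def omp_select_trees(tree_preds, y, n_trees):
    """Greedily pick n_trees columns of tree_preds and fit weights to y.

    Returns (selected column indices, least-squares weights).
    """
    y = np.asarray(y, dtype=float)
    # Normalize columns so the selection is not dominated by scale
    norms = np.linalg.norm(tree_preds, axis=0)
    norms[norms == 0] = 1.0
    normalized = tree_preds / norms

    residual = y.copy()
    selected = []
    weights = np.zeros(0)
    for _ in range(n_trees):
        # Pick the tree most correlated with the current residual
        corr = np.abs(normalized.T @ residual)
        if selected:
            corr[selected] = -np.inf  # never pick the same tree twice
        selected.append(int(np.argmax(corr)))
        # Refit the weights on all selected trees, then update the residual
        weights, *_ = np.linalg.lstsq(tree_preds[:, selected], y, rcond=None)
        residual = y - tree_preds[:, selected] @ weights
    return selected, weights


# Synthetic stand-in for a forest: 50 noisy predictors of the same target
rng = np.random.default_rng(0)
y = rng.normal(size=200)
tree_preds = y[:, None] + rng.normal(scale=1.0, size=(200, 50))

selected, weights = omp_select_trees(tree_preds, y, n_trees=5)
pruned_mse = np.mean((y - tree_preds[:, selected] @ weights) ** 2)
```

Because the weights are refit by least squares on a growing set of trees, the training error cannot increase as the target size grows; how this transfers to held-out data is what the paper's experiments assess.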

HAL AMU