138 research outputs found

HypeRS: Building a Hypergraph-driven ensemble Recommender System

Author: Gharahighehi Alireza
Pliakos Konstantinos
Vens Celine
Publication date: 22/06/2023

Recommender systems are designed to predict user preferences over collections of items. These systems process users' previous interactions to decide which items should be ranked higher to satisfy their desires. An ensemble recommender system can achieve great recommendation performance by effectively combining the decisions generated by individual models. In this paper, we propose a novel ensemble recommender system that combines predictions made by different models into a unified hypergraph ranking framework. This is the first time that hypergraph ranking has been employed to model an ensemble of recommender systems. Hypergraphs are generalizations of graphs where multiple vertices can be connected via hyperedges, efficiently modeling high-order relations. We differentiate real and predicted connections between users and items by assigning different hyperedge weights to individual recommender systems. We perform experiments using four datasets from the fields of movie, music and news media recommendation. The obtained results show that the ensemble hypergraph ranking method generates more accurate recommendations compared to the individual models and a weighted hybrid approach. The assignment of different hyperedge weights to the ensemble hypergraph further improves the performance compared to a setting with identical hyperedge weights.
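
As a rough illustration of the hypergraph idea described in this abstract (not the authors' implementation), the sketch below stores a small hypergraph as a list of weighted hyperedges and computes the weighted degree of each vertex. The vertex names, the weights, and the helper `vertex_degrees` are hypothetical; giving observed interactions a larger weight than predicted ones is only an assumed example of differentiated hyperedge weights.

```python
from collections import defaultdict

# Each hyperedge joins any number of vertices and carries a weight.
# Names and weights are hypothetical, chosen only for illustration.
hyperedges = [
    ({"user_a", "item_1", "item_2"}, 1.0),  # observed interactions
    ({"user_a", "item_3"}, 0.5),            # predicted by some recommender
    ({"user_b", "item_2", "item_3"}, 1.0),  # observed interactions
]


def vertex_degrees(edges):
    """Weighted degree of a vertex: sum of the weights of hyperedges containing it."""
    degrees = defaultdict(float)
    for vertices, weight in edges:
        for vertex in vertices:
            degrees[vertex] += weight
    return dict(degrees)


degrees = vertex_degrees(hyperedges)
```

Unlike an ordinary graph edge, a single hyperedge here can connect a user to several items at once, which is what lets one structure hold both real and predicted user-item connections.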

arXiv.org e-Print Archive

Deep tree-ensembles for multi-output prediction

Author: Nakano Felipe Kenji
Pliakos Konstantinos
Vens Celine
Publication date: 03/11/2020

Recently, deep neural networks have expanded the state of the art in various scientific fields and provided solutions to long-standing problems across multiple application domains. Nevertheless, they also suffer from weaknesses since their optimal performance depends on massive amounts of training data and the tuning of a large number of parameters. As a countermeasure, some deep-forest methods have been recently proposed, as efficient and low-scale solutions. Despite that, these approaches simply employ label classification probabilities as induced features and primarily focus on traditional classification and regression tasks, leaving multi-output prediction under-explored. Moreover, recent work has demonstrated that tree-embeddings are highly representative, especially in structured output prediction. In this direction, we propose a novel deep tree-ensemble (DTE) model, where every layer enriches the original feature set with a representation learning component based on tree-embeddings. In this paper, we specifically focus on two structured output prediction tasks, namely multi-label classification and multi-target regression. We conducted experiments using multiple benchmark datasets and the obtained results confirm that our method provides superior results to state-of-the-art methods in both tasks.

arXiv.org e-Print Archive

A machine learning based framework to identify and classify long terminal repeat retrotransposons

Author: Blockeel Hendrik
Carareto Claudia MA
Cerri Ricardo
Costa Eduardo
Fischer Carlos N
Ramon Jan
Schietgat Leander
Vens Celine
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2018

Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-LEARNER, a framework based on machine learning that automatically identifies TEs in a given genome and assigns a classification to them. We present an implementation of our framework towards LTR retrotransposons, a particular type of TEs characterized by having long terminal repeats (LTRs) at their boundaries. We evaluate the predictive performance of our framework on the well-annotated genomes of Drosophila melanogaster and Arabidopsis thaliana and we compare our results for three LTR retrotransposon superfamilies with the results of three widely used methods for TE identification or classification: REPEATMASKER, CENSOR and LTRDIGEST. In contrast to these methods, TE-LEARNER is the first to incorporate machine learning techniques, outperforming these methods in terms of predictive performance while remaining able to learn models and make predictions efficiently. Moreover, we show that our method was able to identify TEs that none of the above methods could find, and we investigated TE-LEARNER's predictions which did not correspond to an official annotation. It turns out that many of these predictions are in fact strongly homologous to a known TE.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Predicting adverse long-term neurocognitive outcomes after pediatric intensive care unit admission

Author: Dulfer Karolijn
Güiza Grandas Fabian
Joosten Koen F.
Nakano Felipe Kenji
Van den Berghe Greet
Vanhorebeek Ilse
Vens Celine
Verbruggen Sascha C.
Wouters Pieter J.
Publication date: 01/06/2024

Background and objective: Critically ill children may suffer from impaired neurocognitive functions years after ICU (intensive care unit) discharge. To assess neurocognitive functions, these children are subjected to a fixed sequence of tests. Undergoing all tests is, however, arduous for former pediatric ICU patients, resulting in interrupted evaluations where several neurocognitive deficiencies remain undetected. As a solution, we propose using machine learning to predict the optimal order of tests for each child, reducing the number of tests required to identify the most severe neurocognitive deficiencies. Methods: We have compared the current clinical approach against several machine learning methods, mainly multi-target regression and label ranking methods. We have also proposed a new method that builds several multi-target predictive models and combines the outputs into a ranking that prioritizes the worse neurocognitive outcomes. We used data available at discharge, from children who participated in the PEPaNIC-RCT trial (ClinicalTrials.gov-NCT01536275), as well as data from a 2-year follow-up study. The institutional review boards at each participating site have also approved this follow-up study (ML8052; NL49708.078; Pro00038098). Results: Our proposed method managed to outperform other machine learning methods and also the current clinical practice. Precisely, our method reaches approximately 80% precision when considering top-4 outcomes, in comparison to 65% and 78% obtained by the current clinical practice and the state-of-the-art method in label ranking, respectively. Conclusions: Our experiments demonstrated that machine learning can be competitive or even superior to the current testing order employed in clinical practice, suggesting that our model can be used to severely reduce the number of tests necessary for each child. Moreover, the results indicate that possible long-term adverse outcomes are already predictable as early as at ICU discharge. 
Thus, our work can be seen as a first step towards more personalized follow-up after ICU discharge, leading to preventive rather than curative care.

EUR Research Repository

Predicting gene function using hierarchical multi-label decision tree ensembles

Author: Celine Vens
Dragi Kocev
Hendrik Blockeel
Jan Struyf
Leander Schietgat
Sašo Džeroski
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.
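
To make the phrase "respecting a given hierarchy of gene functions" concrete, here is a minimal sketch of the usual hierarchy constraint in hierarchical multi-label classification: an instance labeled with a class is also labeled with all of that class's ancestors. It is not the tree-learning algorithm from the paper. The class codes, the `parent` map, and the helper `close_under_ancestors` are hypothetical, and the hierarchy is assumed to be a tree (one parent per class).

```python
# Hypothetical tree-shaped hierarchy: child class -> parent class.
parent = {"1.1": "1", "1.2": "1", "1.1.3": "1.1", "2.4": "2"}


def close_under_ancestors(labels, parent):
    """Return the labels plus all their ancestors, so the set respects the hierarchy."""
    closed = set()
    for label in labels:
        # Walk up towards the root, stopping early at an already-added class.
        while label is not None and label not in closed:
            closed.add(label)
            label = parent.get(label)
    return closed


consistent = close_under_ancestors({"1.1.3", "2.4"}, parent)
```

A label set such as {"1.1.3"} without "1.1" and "1" would violate the constraint; the helper repairs it by adding the missing ancestors.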

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Leiden University Scholarly Publications

First order random forests: Learning relational classifiers with complex aggregates

Author: Anneleen Van Assche
Celine Vens
Hendrik Blockeel
Sašo Džeroski
Publication venue: Springer Science and Business Media LLC

Crossref

Complex aggregates in relational learning

Author: Vens Celine
Publication venue: IOS Press
Publication date: 01/01/2008

In relational learning, one learns patterns from relational databases, which usually contain multiple tables that are interconnected via relations. Thus, an example for which a prediction is to be given may be related to a set of objects that are possibly relevant for that prediction. Relational classifiers differ with respect to how they handle these sets: some use properties of the set as a whole (using aggregation), and some refer to properties of specific individuals. Most classifiers, however, do not combine both, which imposes an undesirable bias on these learners. This dissertation describes a learning approach that avoids this bias, using complex aggregates, i.e., aggregates that impose selection conditions on the set to aggregate on.
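
The difference between a simple aggregate and a complex aggregate, as defined in this abstract, can be sketched as follows. This is only an illustration and not the learning approach from the dissertation; the `orders` data, the field names, and the helpers `simple_count` and `complex_count` are hypothetical.

```python
# Hypothetical one-to-many relation: each customer has a set of related orders.
orders = {
    "c1": [{"amount": 120, "category": "books"}, {"amount": 30, "category": "music"}],
    "c2": [{"amount": 15, "category": "books"}],
}


def simple_count(rows):
    """Simple aggregate: computed over the whole related set."""
    return len(rows)


def complex_count(rows, condition):
    """Complex aggregate: counts only the rows that satisfy a selection condition."""
    return sum(1 for row in rows if condition(row))


# Per-customer features: (number of orders, number of orders above 100).
features = {
    customer: (simple_count(rows), complex_count(rows, lambda r: r["amount"] > 100))
    for customer, rows in orders.items()
}
```

The second feature combines aggregation over the set with a condition on individual members, which is the combination the abstract says most relational classifiers lack.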

Lirias