
    Neural network ensembles: Evaluation of aggregation algorithms

    Ensembles of artificial neural networks show improved generalization capabilities that outperform those of single networks. However, for aggregation to be effective, the individual networks must be as accurate and diverse as possible. An important problem is, then, how to tune the aggregate members in order to reach an optimal compromise between these two conflicting conditions. We present here an extensive evaluation of several algorithms for ensemble construction, including new proposals, and compare them with standard methods from the literature. We also discuss a potential problem with sequential aggregation algorithms: the infrequent but damaging selection, through their heuristics, of particularly bad ensemble members. We introduce modified algorithms that cope with this problem by allowing individual weighting of aggregate members. Our algorithms and their weighted modifications compare favorably with other methods in the literature, producing a noticeable improvement in performance on most of the standard statistical databases used as benchmarks. Comment: 35 pages, 2 figures, in press, AI Journal.
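    The weighted-aggregation idea can be pictured with a generic sketch (not the paper's specific heuristics): ensemble members are trained on bootstrap resamples and receive individual weights inversely proportional to their validation error, so an occasional bad member is down-weighted rather than allowed to damage the aggregate. The weighting rule and all names below are illustrative assumptions.

    ```python
    # Minimal sketch of weighted ensemble aggregation on synthetic data (illustrative only).
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    X, y = make_regression(n_samples=400, n_features=10, noise=5.0, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    members, weights = [], []
    rng = np.random.default_rng(0)
    for m in range(10):
        idx = rng.integers(0, len(X_train), len(X_train))        # bootstrap resample
        net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=m)
        net.fit(X_train[idx], y_train[idx])
        err = np.mean((net.predict(X_val) - y_val) ** 2)          # validation MSE of this member
        members.append(net)
        weights.append(1.0 / (err + 1e-12))                       # down-weight inaccurate members

    weights = np.array(weights) / np.sum(weights)                 # normalise weights to sum to 1
    ensemble_pred = sum(w * net.predict(X_val) for w, net in zip(weights, members))
    print("weighted-ensemble MSE:", np.mean((ensemble_pred - y_val) ** 2))
    ```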

    Prediction of minimum temperatures in an alpine region by linear and non-linear post-processing of meteorological models

    Model Output Statistics (MOS) refers to a method of post-processing the direct outputs of numerical weather prediction (NWP) models in order to reduce the biases introduced by a coarse horizontal resolution. This technique is especially useful in orographically complex regions, where large differences can be found between the NWP elevation model and the true orography. This study carries out a comparison of linear and non-linear MOS methods, aimed at the prediction of minimum temperatures in a fruit-growing region of the Italian Alps, based on the output of two different NWPs (ECMWF T511-L60 and LAMI-3). Temperature, of course, is a particularly important NWP output; among other roles it drives the local frost forecast, which is of great interest to agriculture. The mechanisms of cold-air drainage, a distinctive aspect of mountain environments, are often unsatisfactorily captured by global circulation models. The simplest post-processing technique applied in this work was a correction for the mean bias, assessed at individual model grid points. We also implemented a multivariate linear regression on the output at the grid points surrounding the target area, and two non-linear models based on machine learning techniques: Neural Networks and Random Forest. We compare the performance of all these techniques on four different NWP data sets. Downscaling clearly improved the temperature forecasts with respect to the raw NWP output, and also with respect to the basic mean bias correction. Multivariate methods generally yielded better results, but the advantage of using non-linear algorithms was small, if not negligible. Random Forest, the best-performing method, was implemented on ECMWF prognostic output at 06:00 UTC over the 9 grid points surrounding the target area. Mean absolute errors in the prediction of 2 m temperature at 06:00 UTC were approximately 1.2 °C, close to the natural variability inside the area itself.
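    As a rough illustration of the two MOS flavours compared above, the sketch below contrasts a mean-bias correction at a single grid point with a Random Forest trained on the nine surrounding NWP grid points. The data are synthetic and the variable names are hypothetical, not the study's actual forecasts or stations.

    ```python
    # Illustrative comparison of mean-bias correction vs. Random Forest downscaling (synthetic data).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_days = 1000
    nwp_t2m = rng.normal(2.0, 6.0, size=(n_days, 9))                       # raw NWP 2 m temperature at 9 grid points
    obs_tmin = nwp_t2m.mean(axis=1) - 3.0 + rng.normal(0, 1.2, n_days)     # synthetic "observed" minimum temperature

    X_tr, X_te, y_tr, y_te = train_test_split(nwp_t2m, obs_tmin, random_state=0)

    # (a) mean bias correction at the central grid point only
    bias = np.mean(X_tr[:, 4] - y_tr)
    mae_bias = mean_absolute_error(y_te, X_te[:, 4] - bias)

    # (b) Random Forest regression on all 9 surrounding grid points
    rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    mae_rf = mean_absolute_error(y_te, rf.predict(X_te))

    print(f"MAE bias-corrected: {mae_bias:.2f} °C   MAE Random Forest: {mae_rf:.2f} °C")
    ```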

    Latent Patient Network Learning for Automatic Diagnosis

    Recently, Graph Convolutional Networks (GCNs) have proven to be a powerful machine learning tool for Computer Aided Diagnosis (CADx) and disease prediction. A key component in these models is to build a population graph, where the graph adjacency matrix represents pair-wise patient similarities. Until now, these similarity metrics have been defined manually, usually based on meta-features like demographics or clinical scores. The definition of the metric, however, needs careful tuning, as GCNs are very sensitive to the graph structure. In this paper, we demonstrate for the first time in the CADx domain that it is possible to learn a single, optimal graph towards the GCN's downstream task of disease classification. To this end, we propose a novel, end-to-end trainable graph learning architecture for dynamic and localized graph pruning. Unlike commonly employed spectral GCN approaches, our GCN is spatial and inductive, and can thus generalize to previously unseen patients as well. We demonstrate significant classification improvements with our learned graph on two CADx problems in medicine. We further explain and visualize this result using an artificial dataset, underlining the importance of graph learning for more accurate and robust inference with GCNs in medical applications.
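    The following PyTorch sketch conveys the general idea of learning the population graph end to end rather than hand-crafting it; it is a simplified stand-in under stated assumptions, not the authors' architecture. A small trainable network maps pairs of patient meta-features to soft edge weights, which are then used by a spatial graph-convolution layer that averages neighbour features.

    ```python
    # Simplified sketch: a learned patient graph feeding a spatial graph convolution (not the paper's model).
    import torch
    import torch.nn as nn

    class LearnedGraphConv(nn.Module):
        def __init__(self, n_meta, n_feat, n_out):
            super().__init__()
            # small network that scores each pair of patients from their meta-features
            self.edge_net = nn.Sequential(nn.Linear(2 * n_meta, 16), nn.ReLU(), nn.Linear(16, 1))
            self.lin = nn.Linear(n_feat, n_out)

        def forward(self, x, meta):
            n = meta.size(0)
            # all pairwise concatenations of meta-features -> learned adjacency (n x n)
            pairs = torch.cat([meta.unsqueeze(1).expand(n, n, -1),
                               meta.unsqueeze(0).expand(n, n, -1)], dim=-1)
            adj = torch.sigmoid(self.edge_net(pairs)).squeeze(-1)      # soft edge pruning in [0, 1]
            adj = adj / adj.sum(dim=1, keepdim=True).clamp(min=1e-6)   # row-normalise the adjacency
            return self.lin(adj @ x)                                   # aggregate neighbour features

    # toy usage: 32 patients, 5 meta-features, 20 imaging features, binary diagnosis
    x, meta = torch.randn(32, 20), torch.randn(32, 5)
    y = torch.randint(0, 2, (32,))
    gconv, head = LearnedGraphConv(n_meta=5, n_feat=20, n_out=16), nn.Linear(16, 2)
    opt = torch.optim.Adam(list(gconv.parameters()) + list(head.parameters()), lr=1e-2)
    for _ in range(50):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(head(torch.relu(gconv(x, meta))), y)
        loss.backward()
        opt.step()
    ```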

    A computer vision approach for weeds identification through Support Vector Machines

    This paper outlines an automatic computer vision system for the identification of Avena sterilis, a weed growing in cereal crops. The final goal is to reduce the quantity of herbicide to be sprayed, an important and necessary step for precision agriculture: only areas where the presence of weeds is significant should be sprayed. The main problems in identifying this kind of weed are its spectral signature, which is similar to that of the crop, and its irregular distribution in the field. A new strategy involving two processes has been designed: image segmentation and decision making. The image segmentation combines basic image processing techniques in order to extract cells from the image as the low-level units. Each cell is described by two area-based attributes measuring the relations between crop and weeds. The decision making is based on Support Vector Machines and determines whether a cell must be sprayed. The main findings of this paper lie in the combination of the segmentation and Support Vector Machine decision processes. Another important contribution of this approach is the minimal memory and computational requirements of the system compared with previous works. The performance of the method is illustrated by a comparative analysis against existing strategies.
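    The decision stage can be pictured with a toy sketch: each cell is summarised by two hypothetical area-based attributes (crop and weed cover fractions assumed to come from a preceding segmentation step), and a Support Vector Machine decides whether the cell should be sprayed. The data and threshold below are synthetic, not the paper's imagery.

    ```python
    # Toy sketch of the SVM decision stage over per-cell area attributes (synthetic data).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.metrics import classification_report

    rng = np.random.default_rng(0)
    n_cells = 600
    crop_frac = rng.uniform(0.0, 0.8, n_cells)                 # fraction of the cell covered by crop
    weed_frac = rng.uniform(0.0, 0.5, n_cells)                 # fraction of the cell covered by weed
    X = np.column_stack([crop_frac, weed_frac])
    y = (weed_frac > 0.15 + 0.1 * crop_frac).astype(int)       # synthetic "spray" label for illustration

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)             # decide spray / no spray per cell
    print(classification_report(y_te, clf.predict(X_te), target_names=["no spray", "spray"]))
    ```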

    Clustering gene expression data with a penalized graph-based metric

    Background: The search for cluster structure in microarray datasets is a base problem for the so-called "-omic sciences". A difficult problem in clustering is how to handle data with a manifold structure, i.e. data that is not shaped in the form of compact clouds of points, forming arbitrary shapes or paths embedded in a high-dimensional space, as could be the case of some gene expression datasets. Results: In this work we introduce the Penalized k-Nearest-Neighbor-Graph (PKNNG) based metric, a new tool for evaluating distances in such cases. The new metric can be used in combination with most clustering algorithms. The PKNNG metric is based on a two-step procedure: first it constructs the k-Nearest-Neighbor-Graph of the dataset of interest using a low k-value, and then it adds edges with a highly penalized weight for connecting the sub-graphs produced by the first step. We discuss several possible schemes for connecting the different sub-graphs as well as penalization functions. We show clustering results on several public gene expression datasets and simulated artificial problems to evaluate the behavior of the new metric. Conclusions: In all cases the PKNNG metric shows promising clustering results. The use of the PKNNG metric can improve the performance of commonly used pairwise-distance based clustering methods to the level of more advanced algorithms. A great advantage of the new procedure is that researchers do not need to learn a new method: they can simply compute distances with the PKNNG metric and then, for example, use hierarchical clustering to produce an accurate and highly interpretable dendrogram of their high-dimensional data.
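    A simplified sketch of a PKNNG-style construction is given below; the penalisation used here (a plain multiplicative factor) is an assumption standing in for the schemes evaluated in the paper. A low-k nearest-neighbour graph is built, disconnected sub-graphs are joined through their closest pair of points with a heavily penalised edge, and graph shortest paths serve as the distances fed to hierarchical clustering.

    ```python
    # Simplified PKNNG-style graph metric on a toy manifold dataset (penalty scheme is an assumption).
    import numpy as np
    from scipy.sparse.csgraph import connected_components, shortest_path
    from scipy.spatial.distance import cdist
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_moons
    from sklearn.neighbors import kneighbors_graph

    X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
    G = kneighbors_graph(X, n_neighbors=3, mode="distance").toarray()   # step 1: low-k kNN graph
    G = np.maximum(G, G.T)                                              # symmetrise the graph

    n_comp, labels = connected_components(G, directed=False)
    penalty = 100.0                                                     # hypothetical penalisation factor
    for a in range(n_comp):                                             # step 2: join sub-graphs
        for b in range(a + 1, n_comp):
            ia, ib = np.where(labels == a)[0], np.where(labels == b)[0]
            D = cdist(X[ia], X[ib])
            i, j = np.unravel_index(D.argmin(), D.shape)                # closest pair between sub-graphs
            G[ia[i], ib[j]] = G[ib[j], ia[i]] = penalty * D[i, j]       # heavily penalised connecting edge

    dist = shortest_path(G, directed=False)                             # PKNNG-style pairwise distances
    clusters = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                       linkage="average").fit_predict(dist)
    print("cluster sizes:", np.bincount(clusters))
    ```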

    Classification of Caesarean Section and Normal Vaginal Deliveries Using Foetal Heart Rate Signals and Advanced Machine Learning Algorithms

    ABSTRACT – Background: Visual inspection of cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high, with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and, in severe cases, death. This paper shows that using machine learning and foetal heart rate signals provides direct information about the foetal state and helps to filter the subjective opinions of medical practitioners when used as a decision support tool. The primary aim is to provide a proof of concept demonstrating how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required and help avoid preventable perinatal deaths. Methodology: This is evidenced using an open dataset that comprises 506 controls (normal vaginal deliveries) and 46 cases (caesarean due to pH ≤ 7.05 and pathological risk). Several machine learning algorithms are trained and validated using binary classifier performance measures. Results: The findings show that deep learning classification achieves Sensitivity = 94%, Specificity = 91%, Area under the Curve = 99%, F-Score = 100%, and Mean Square Error = 1%. Conclusions: The results demonstrate that machine learning significantly improves the efficiency of detecting caesarean section and normal vaginal deliveries from foetal heart rate signals compared with obstetrician and midwife predictions and systems reported in previous studies.
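    The training-and-validation loop described above can be sketched generically on synthetic, class-imbalanced data (not the actual cardiotocography recordings), computing the same binary-classifier measures quoted in the results: sensitivity, specificity and area under the ROC curve. The classifier chosen below is an assumption for illustration only.

    ```python
    # Illustrative binary-classifier validation on synthetic, imbalanced data (not the study's dataset).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import confusion_matrix, roc_auc_score
    from sklearn.model_selection import train_test_split

    # imbalanced toy data mimicking many controls vs. few caesarean cases
    X, y = make_classification(n_samples=552, n_features=20, weights=[0.92, 0.08], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
    clf.fit(X_tr, y_tr)

    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    sensitivity = tp / (tp + fn)                                # true positive rate
    specificity = tn / (tn + fp)                                # true negative rate
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])    # area under the ROC curve
    print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}  AUC={auc:.2f}")
    ```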