
7 research outputs found

Dealing with spatial autocorrelation when learning predictive clustering trees

Author: Appice Annalisa; Ceci Michelangelo; Dzeroski Saso; Malerba Donato; Stojanova Daniela
Publication date: 01/01/2013

Crossref

Archivio istituzionale della ricerca - Università di Bari

Open Access Repository

Autocart: Spatially-Aware Regression Trees for Ecological and Spatial Modeling

Author: Ancell Ethan
Publication venue: DigitalCommons@USU
Publication date: 10/12/2020

Many ecological and spatial processes are complex in nature and are not accurately modeled by linear models. Regression trees promise to handle the high-order interactions that are present in ecological and spatial datasets, but fail to produce physically realistic characterizations of the underlying landscape. The "autocart" (autocorrelative regression trees) R package extends the functionality of previously proposed spatial regression tree methods through a spatially aware splitting function and novel adaptive inverse distance weighting method in each terminal node. The efficacy of these autocart models, including an autocart extension of random forest, is demonstrated on multiple datasets. This highlights the ability of autocart to model complex interactions between spatial variables while still providing physically realistic representations of the landscape.

DigitalCommons@USU

Event detection from geotagged tweets considering spatial autocorrelation and heterogeneity

Author: Farnaghi M.; Ghaemi Zeinab
Publication date: 23/12/2021

University of Twente Research Information

DENCAST: distributed density-based clustering for multi-target regression

Author: Donato Malerba; Gianvito Pio; Michelangelo Ceci; Roberto Corizzo
Publication date: 03/06/2019

Recent developments in sensor networks and mobile computing have led to a huge increase in generated data that need to be processed and analyzed efficiently. In this context, many distributed data mining algorithms have recently been proposed. Following this line of research, we propose the DENCAST system, a novel distributed algorithm implemented in Apache Spark, which performs density-based clustering and exploits the identified clusters to solve both single- and multi-target regression tasks (and thus, solves complex tasks such as time series prediction). Contrary to existing distributed methods, DENCAST does not require a final merging step (usually performed on a single machine) and is able to handle large-scale, high-dimensional data by taking advantage of locality sensitive hashing. Experiments show that DENCAST performs clustering more efficiently than a state-of-the-art distributed clustering algorithm, especially when the number of objects increases significantly. The quality of the extracted clusters is confirmed by the predictive capabilities of DENCAST on several datasets: it is able to significantly outperform (p-value < 0.05) state-of-the-art distributed regression methods, in both single- and multi-target settings.

Open Access Repository

Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction

Author: Daniela Stojanova; Donato Malerba; Michelangelo Ceci; Saso Dzeroski
Publication venue: Springer Nature
Publication date: 26/09/2013

BACKGROUND: Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers.
RESULTS: This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function.
CONCLUSIONS: Our newly developed method for HMC takes into account network information in the learning phase: when used for gene function prediction in the context of PPI networks, the explicit consideration of network autocorrelation increases the predictive performance of the learned models. Overall, we found that this holds for different gene features/descriptions, functional annotation schemes, and PPI networks: the best results are achieved when the PPI network is dense and contains a large proportion of function-relevant interactions.

Springer - Publisher Connector

PubMed Central

Dealing with spatial autocorrelation when learning predictive clustering trees

Author: Annalisa Appice; Daniela Stojanova; Donato Malerba; Michelangelo Ceci; Sašo Džeroski
Publication venue: Elsevier BV

Crossref