
78 research outputs found

A note on an Adaptive Goodness-of-Fit test with Finite Sample Validity for Random Design Regression Models

Author: Brutti Pierpaolo
Publication date: 18/02/2015

Given an i.i.d. sample $\{(X_i,Y_i)\}_{i \in \{1, \ldots, n\}}$ from the random design regression model $Y = f(X) + \epsilon$ with $(X,Y) \in [0,1] \times [-M,M]$, in this paper we consider the problem of testing the (simple) null hypothesis $f = f_0$ against the alternative $f \neq f_0$ for a fixed $f_0 \in L^2([0,1],G_X)$, where $G_X(\cdot)$ denotes the marginal distribution of the design variable $X$. The proposed procedure adapts to the regression setting a multiple testing technique introduced by Fromont and Laurent (2005): it considers a suitable collection of unbiased estimators of the $L^2$-distance $d_2(f,f_0) = \int [f(x) - f_0(x)]^2 \, dG_X(x)$ and rejects the null hypothesis when at least one of them is greater than its $(1-u_\alpha)$ quantile, with $u_\alpha$ calibrated to obtain a level-$\alpha$ test. To build these estimators, we use the warped wavelet basis introduced by Picard and Kerkyacharian (2004). We do not assume that the errors are normally distributed, and we do not assume that $X$ and $\epsilon$ are independent but, mainly for technical reasons, we assume, as in most of the current literature in learning theory, that $|f(x) - y|$ is uniformly bounded (almost everywhere). We show that our test is adaptive over a particular collection of approximation spaces linked to the classical Besov spaces.

arXiv.org e-Print Archive

CiteSeerX

Topological summaries for Time-Varying Data

Author: Brutti Pierpaolo
Padellini Tullia
Publication venue: 'Firenze University Press'
Publication date: 01/01/2017

Topology has proven to be a useful tool in the current quest for "insights on the data", since it characterises objects through their connectivity structure, in an easy and interpretable way. More specifically, the new, but growing, field of TDA (Topological Data Analysis) deals with Persistent Homology, a multiscale version of Homology Groups summarized by the Persistence Diagram and its functional representations (Persistence Landscapes, Silhouettes, etc.). All of these objects, however, are designed and work only for static point clouds. We define a new topological summary, the Landscape Surface, that takes into account the changes in the topology of a dynamical point cloud such as a (possibly very high dimensional) time series. We prove its continuity and its stability and, finally, we sketch a simple example.

Archivio della ricerca- Università di Roma La Sapienza

Persistence Flamelets: multiscale Persistent Homology for kernel density exploration

Author: Brutti Pierpaolo
Padellini Tullia
Publication date: 01/01/2017

In recent years there has been noticeable interest in the study of the "shape of data". Among the many ways a "shape" could be defined, topology is the most general one, as it describes an object in terms of its connectivity structure: connected components (topological features of dimension 0), cycles (features of dimension 1) and so on. There is a growing number of techniques, generally denoted as Topological Data Analysis (TDA), aimed at estimating topological invariants of a fixed object; when we allow this object to change, however, little has been done to investigate the evolution of its topology. In this work we define the Persistence Flamelets, a multiscale version of one of the most popular tools in TDA, the Persistence Landscape. We examine its theoretical properties and we show how it could be used to gain insights on the bandwidth parameter of kernel density estimators (KDEs).

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Supervised Learning with Indefinite Topological Kernels

Author: Brutti Pierpaolo
Padellini Tullia
Publication date: 01/01/2017

Topological Data Analysis (TDA) is a recent and growing branch of statistics devoted to the study of the shape of the data. In this work we investigate the predictive power of TDA in the context of supervised learning. Since topological summaries, most notably the Persistence Diagram, are typically defined in complex spaces, we adopt a kernel approach to translate them into more familiar vector spaces. We define a topological exponential kernel, we characterize it, and we show that, despite not being positive semi-definite, it can be successfully used in regression and classification tasks.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Warped Wavelet and Vertical Thresholding

Author: Brutti Pierpaolo
Publication date: 01/01/2008

Let $\{(X_i,Y_i)\}_{i \in \{1, \ldots, n\}}$ be an i.i.d. sample from the random design regression model $Y = f(X) + \epsilon$ with $(X,Y) \in [0,1] \times [-M,M]$. In dealing with such a model, adaptation is naturally to be understood in terms of the $L^2([0,1],G_X)$ norm, where $G_X(\cdot)$ denotes the (known) marginal distribution of the design variable $X$. Recently much work has been devoted to the construction of estimators that adapt in this setting (see, for example, [5,24,25,32]), but only a few of them come with an easy-to-implement computational scheme. Here we propose a family of estimators based on the warped wavelet basis recently introduced by Picard and Kerkyacharian [36] and a tree-like thresholding rule that takes into account the hierarchical (across-scale) structure of the wavelet coefficients. We show that, if the regression function belongs to a certain class of approximation spaces defined in terms of $G_X(\cdot)$, then our procedure is adaptive and converges to the true regression function at an optimal rate. The results are stated in terms of excess probabilities as in [19].

Comment: Submitted to the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Archivio della ricerca- Università di Roma La Sapienza

Reference charts for fetal cerebellar vermis height: A prospective cross-sectional study of 10605 fetuses

Author: Aloisi Alessia
Brutti Pierpaolo
Cignini Pietro
Giorlandino Claudio
Giorlandino Maurizio
Mangiafico Lucia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

A prospective cross-sectional study was carried out between September 2009 and December 2014 at ALTAMEDICA Fetal–Maternal Medical Centre, Rome, Italy. Of 25203 fetal biometric measurements, 12167 (48%) measurements of the cerebellar vermis were available. After excluding 1562 (12.8%) measurements, a total of 10605 (87.2%) fetuses were considered and analyzed once only. Parametric and nonparametric quantile regression models were used for the statistical analysis. To evaluate the robustness of the proposed reference charts to various distributional assumptions on the ultrasound measurements at hand, we compared the gestational age-specific reference curves produced by the different statistical methods. Normal mean heights based on parametric and nonparametric methods were defined for each week of gestation, and the regression equation expressing the height of the cerebellar vermis as a function of gestational age was calculated. Finally, the correlation between dimension and gestation was measured.

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

Author: Christel Faes
Niel Hens
Pierpaolo Brutti
Vittoria La Serra
Publication venue: place:Milano
Publication date: 01/01/2020

When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.

Archivio della ricerca- Università di Roma La Sapienza

RODEO for Sparse Nonparametric Regression and Quantile Regression with Censored Data

Author: BRUTTI Pierpaolo
Publication venue: place:Padova
Publication date: 01/01/2007

RODEO is a recently developed general strategy for nonparametric estimation based on the regularization of the estimator derivatives with respect to the smoothing parameters. In the original nonparametric regression framework, RODEO results in a simple yet effective new algorithm for simultaneous bandwidth and variable selection with interesting theoretical properties. In this work we focus on a censored regression model in which only the response variable is (right) censored whereas the covariates, although fully observed, are supposed to live in a high dimensional space. In order to recover a sparse representation of both the regression function and the quantile regression function, we adapt RODEO to the present setting starting from the weighted local linear estimator proposed by Cai (2003). We study its theoretical properties and evaluate its performance on both real and simulated data sets

Archivio della ricerca- Università di Roma La Sapienza

CNRS 2009

Author: BRUTTI Pierpaolo
Publication venue: place:Parigi
Publication date: 01/01/2009

Archivio della ricerca- Università di Roma La Sapienza