44 research outputs found

    Statistical Computing in Functional Data Analysis: The R Package fda.usc

    This paper is devoted to the R package fda.usc, which includes some utilities for functional data analysis. The package carries out exploratory and descriptive analysis of functional data, covering its most important features such as depth measures and functional outlier detection, among others. The R package fda.usc also includes functions to fit functional regression models with a scalar response and functional explanatory data via non-parametric functional regression, basis representation or functional principal components analysis. There are natural extensions such as functional linear models and semi-functional partial linear models, which allow non-functional covariates and factors and can be used for prediction. The functions of this package complement and incorporate the two main references of functional data analysis: the R package fda and the functions implemented by Ferraty and Vieu (2006).

    The DD^G-classifier in the functional setting

    The Maximum Depth classifier was the first attempt to use data depths instead of multivariate raw data to construct a classification rule. Recently, the DD-classifier has solved several serious limitations of the Maximum Depth classifier, but some issues still remain. This paper is devoted to extending the DD-classifier in the following ways: first, to surpass the limitation of the DD-classifier when more than two groups are involved; second, to apply regular classification methods (like kNN, linear or quadratic classifiers, recursive partitioning, ...) to DD-plots to obtain useful insights through the diagnostics of these methods; and third, to integrate different sources of information (data depths or multivariate functional data) in a unified way in the classification procedure. Besides, as the DD-classifier trick is especially useful in the functional framework, an enhanced revision of several functional data depths is done in the paper. A simulation study and applications to some classical real datasets are also provided, showing the power of the new proposal. Comment: 29 pages, 6 figures, 6 tables, Supplemental R Code and Dat
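
The idea of classifying on a DD-plot with more than two groups can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: it uses the Mahalanobis depth on multivariate (not functional) data and a plain kNN vote in depth space, and the function names `mahalanobis_depth`, `dd_coordinates` and `knn_predict` are hypothetical.

```python
import numpy as np

def mahalanobis_depth(points, sample):
    """Mahalanobis depth of each row of `points` w.r.t. `sample`: 1 / (1 + d^2)."""
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    diff = points - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + d2)

def dd_coordinates(points, groups):
    """Map each point to its vector of depths w.r.t. every group (one column per group)."""
    return np.column_stack([mahalanobis_depth(points, g) for g in groups])

def knn_predict(train_dd, train_labels, test_dd, k=5):
    """Majority vote among the k nearest training points in depth space."""
    dists = np.linalg.norm(test_dd[:, None, :] - train_dd[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy data (assumption): three well-separated Gaussian groups in the plane.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=c, scale=1.0, size=(60, 2)) for c in (0.0, 4.0, 8.0)]
train = np.vstack(groups)
train_labels = np.repeat(np.arange(3), 60)
train_dd = dd_coordinates(train, groups)

test = np.array([[0.0, 0.0], [4.0, 4.0], [8.0, 8.0]])
pred = knn_predict(train_dd, train_labels, dd_coordinates(test, groups), k=5)
```

Because every observation is reduced to a vector with one depth per group, any standard classifier can be applied to these coordinates regardless of the dimension of the original data.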

    Assessing spatial dependence for clustered data

    Variogram analysis provides a useful tool for measuring the dependence between spatial locations. When the nature of the sampling process leads to clustered data, it is advisable to use a variogram estimator that adjusts for the clustering of samples. In this setting, a nonparametric weighted estimator, obtained by combining weights inversely proportional to the neighborhood density with the kernel method, appears to behave satisfactorily in practice. Thus, in this work we study this estimator theoretically, proving that it is asymptotically unbiased as well as consistent and providing criteria for the selection of the bandwidth parameter and the neighborhood radius.
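
A kernel variogram estimator with inverse-density weights might look like the sketch below. Several choices are assumptions made only for illustration: the Gaussian kernel, the pair weight taken as the product of the inverse neighbor counts of the two locations, and the name `weighted_kernel_variogram`. The exact estimator studied in the paper may differ.

```python
import numpy as np

def weighted_kernel_variogram(coords, values, lags, bandwidth, radius):
    """Kernel-smoothed variogram whose pair weights shrink in densely sampled areas."""
    n = len(values)
    full = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Neighbor count within `radius`; it includes the point itself, so it is >= 1.
    density = (full <= radius).sum(axis=1)
    i, j = np.triu_indices(n, k=1)
    dist = full[i, j]
    sq_inc = (values[i] - values[j]) ** 2
    pair_w = 1.0 / (density[i] * density[j])   # assumed form of the inverse-density weight
    gamma = np.empty(len(lags))
    for m, h in enumerate(lags):
        kern = np.exp(-0.5 * ((h - dist) / bandwidth) ** 2)   # Gaussian kernel in the lag
        w = pair_w * kern
        gamma[m] = 0.5 * np.sum(w * sq_inc) / np.sum(w)
    return gamma

# Toy clustered design (assumption): a dense patch plus a sparse background.
rng = np.random.default_rng(3)
cluster = rng.normal(loc=0.5, scale=0.03, size=(100, 2))
spread = rng.uniform(0.0, 1.0, size=(50, 2))
coords = np.vstack([cluster, spread])
values = np.sin(2 * np.pi * coords[:, 0]) + 0.1 * rng.standard_normal(len(coords))
lags = np.linspace(0.05, 0.5, 10)
gamma = weighted_kernel_variogram(coords, values, lags, bandwidth=0.05, radius=0.1)
```

Setting `radius=0` gives every point a neighbor count of one, which reduces the sketch to an unweighted kernel estimator and makes the effect of the declustering weights easy to compare.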

    A comparison of approaches for valid variogram achievement

    Variogram estimation is a major issue for statistical inference of spatially correlated random variables. Most natural empirical estimators of the variogram cannot be used for this purpose, as they do not satisfy the conditional negative-definite property. Typically, the resolution of this problem is split into three stages: empirical variogram estimation; valid model selection; and model fitting. To accomplish these tasks, there are several different approaches strongly defended by their authors. The main purpose of our work was to identify these approaches and compare them based on a numerical study covering different kinds of spatial dependence situations. The comparisons are based on the integrated squared errors of the resulting valid estimators. Additionally, we propose an easily implementable empirical method to compare the main features of the estimated variogram function.
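
The three stages can be illustrated with a minimal sketch, assuming Matheron's classical estimator for the empirical stage, an exponential model without nugget as the valid family, and ordinary least squares over a grid of range values for the fitting stage. These choices and the names `empirical_variogram` and `fit_exponential` are illustrative and are not the approaches compared in the paper.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Matheron's classical estimator: half the mean squared increment per lag bin."""
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_inc = (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    lags = np.full(n_bins, np.nan)
    gamma = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = which == b
        if mask.any():
            lags[b] = dist[mask].mean()
            gamma[b] = 0.5 * sq_inc[mask].mean()
    return lags, gamma

def fit_exponential(lags, gamma, ranges):
    """Least-squares fit of gamma(h) = sill * (1 - exp(-h / a)) over a grid of a."""
    ok = ~np.isnan(gamma)
    h, g = lags[ok], gamma[ok]
    best = None
    for a in ranges:
        f = 1.0 - np.exp(-h / a)
        sill = max(f @ g / (f @ f), 0.0)   # closed-form sill for fixed a, kept non-negative
        sse = np.sum((g - sill * f) ** 2)
        if best is None or sse < best[2]:
            best = (sill, a, sse)
    return best[0], best[1]

# Toy field (assumption): Gaussian process with exponential covariance, range 0.2.
rng = np.random.default_rng(7)
coords = rng.uniform(0.0, 1.0, size=(150, 2))
dmat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
cov = np.exp(-dmat / 0.2) + 1e-10 * np.eye(150)
values = np.linalg.cholesky(cov) @ rng.standard_normal(150)

lags, gamma = empirical_variogram(coords, values, np.linspace(0.0, 0.6, 13))
sill, a = fit_exponential(lags, gamma, ranges=np.linspace(0.02, 1.0, 50))
```

The binned estimate itself need not be conditionally negative definite; validity comes from replacing it with a member of a valid parametric family, here the exponential model with a non-negative sill.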

    Novel specification tests for synchronous additive concurrent model formulation based on martingale difference divergence

    This paper presents new specification tests for a general synchronous additive concurrent model formulation. As a novelty, our proposal requires neither a preliminary estimation of the model or error structure nor any tuning parameters. We develop a suitable test statistic using the martingale difference divergence coefficient. This statistic measures the departure from conditional mean independence in the concurrent model framework, considering the information of all observed time instants. In particular, global as well as partial dependence tests are introduced, which allow one to quantify the effect of a group of covariates or to select covariates one by one. We obtain the asymptotic distribution of the statistic under the null and propose a bootstrap algorithm to compute the p-values in practice. Through simulations, we illustrate our method and compare its performance with existing competitors. In addition, we apply it to three real datasets related to gait data, flu activity, and casual bike rentals. The research of Laura Freijeiro-González is supported by the Consellería de Cultura, Educación e Ordenación Universitaria along with the Consellería de Economía, Emprego e Industria of the Xunta de Galicia (project ED481A-2018/264). Laura Freijeiro-González, Wenceslao González-Manteiga and Manuel Febrero-Bande acknowledge the support from Project PID2020-116587GB-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe” and the Competitive Reference Groups 2021-2024 (ED431C 2021/24) from the Xunta de Galicia through the ERDF. We also acknowledge the Centro de Supercomputación de Galicia (CESGA) for computational resources. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
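
For intuition, the sample martingale difference divergence of a scalar response given a covariate can be computed as in the sketch below. It implements the standard sample form, minus the average of (y_k - ȳ)(y_l - ȳ)|x_k - x_l| over all pairs, for a single time instant. It is not the paper's concurrent-model statistic, which combines the information of all observed time instants, and the name `mdd_squared` is hypothetical.

```python
import numpy as np

def mdd_squared(x, y):
    """Sample MDD^2 of scalar y given x: -(1/n^2) sum_{k,l} (y_k - ybar)(y_l - ybar) |x_k - x_l|."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    yc = y - y.mean()
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return -(yc @ dist @ yc) / len(yc) ** 2

# Toy data (assumption): one response whose mean depends on x, one that ignores x.
rng = np.random.default_rng(1)
x = rng.standard_normal(300)
y_dep = x**2 + 0.5 * rng.standard_normal(300)
y_ind = rng.standard_normal(300)

stat_dep = mdd_squared(x, y_dep)
stat_ind = mdd_squared(x, y_ind)
```

The statistic is non-negative and close to zero when the conditional mean of the response does not depend on the covariate, so the quadratic dependence in `y_dep` is picked up even though its linear correlation with `x` is negligible.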

    Variable selection in Functional Additive Regression Models

    This is a post-peer-review, pre-copyedit version of a chapter published in Functional Statistics and Related Fields. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-55846-2_15 This paper considers the problem of variable selection when some of the variables have a functional nature and can be mixed with other types of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially selects a new variable to be incorporated into the model. For the sake of simplicity, this paper only uses additive models. However, the proposed algorithm may assess the type of contribution (linear, nonlinear, …) of each variable. The algorithm has shown quite promising results when applied to real data sets. The authors acknowledge financial support from Ministerio de Economía y Competitividad grant MTM2013-41383-

    Estimation, imputation and prediction for the functional linear model with scalar response with responses missing at random

    Two different methods for estimation, imputation and prediction for the functional linear model with scalar response when some of the responses are missing at random (MAR) are developed. The simplified method consists in estimating the model parameters using only the pairs of predictors and responses observed completely. The imputed method, in turn, consists in estimating the model parameters using both the pairs of predictors and responses observed completely and the pairs of predictors and responses imputed with the parameters estimated with the simplified method. The two methodologies are compared in an extensive simulation study and in the analysis of two real data examples. The comparison provides evidence that the imputed method might perform better than the simplified method if the numbers of functional principal components used in the former strategy are selected appropriately. The first and third authors acknowledge financial support from Ministerio de Economía y Competitividad grants MTM2013-41383-P and MTM2016-76969-P. The second author acknowledges financial support from Ministerio de Economía y Competitividad grant ECO2015-66593-

    Parametric Estimation of Diffusion Processes: A Review and Comparative Study

    This paper provides an in-depth review of parametric estimation methods for stationary stochastic differential equations (SDEs) driven by Wiener noise with discrete time observations. The short-term interest rate dynamics are commonly described by continuous-time diffusion processes, whose parameters are subject to estimation bias, as data are highly persistent, and discretization bias, as data are discretely sampled despite the continuous-time nature of the model. To assess the role of persistence and the impact of sampling frequency on the estimation, we conducted a simulation study under different settings to compare the performance of the procedures and illustrate the finite sample behavior. To complete the survey, an application of the procedures to real data is provided. The authors acknowledge support from grant MTM2016-76969-P from the Spanish Ministry of Economy and Competitiveness (cofunded with FEDER funds) and gratefully thank the Spanish National Research Council for providing the computing resources of the Supercomputing Center of Galicia (CESGA).
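
Discretization bias can be seen in a small sketch, assuming an Ornstein-Uhlenbeck (Vasicek-type) model dX = kappa (mu - X) dt + sigma dW observed monthly. The model, the parameter values and the names `simulate_ou` and `ar1_slope` are assumptions for illustration and are not taken from the paper's simulation study.

```python
import numpy as np

def simulate_ou(kappa, mu, sigma, x0, dt, n, rng):
    """Sample an Ornstein-Uhlenbeck path at step dt using its exact Gaussian transition."""
    phi = np.exp(-kappa * dt)
    sd = sigma * np.sqrt((1.0 - phi**2) / (2.0 * kappa))
    x = np.empty(n + 1)
    x[0] = x0
    for t in range(n):
        x[t + 1] = mu + (x[t] - mu) * phi + sd * rng.standard_normal()
    return x

def ar1_slope(x):
    """OLS slope of x[t+1] on x[t], with an intercept."""
    lagged, current = x[:-1], x[1:]
    return np.cov(lagged, current)[0, 1] / np.var(lagged, ddof=1)

rng = np.random.default_rng(42)
dt = 1.0 / 12.0
path = simulate_ou(kappa=0.5, mu=0.05, sigma=0.02, x0=0.05, dt=dt, n=2000, rng=rng)

b = ar1_slope(path)
kappa_exact = -np.log(b) / dt   # inverts the exact discrete-time transition
kappa_euler = (1.0 - b) / dt    # inverts the Euler approximation of the SDE
```

Since 1 - b never exceeds -log(b) for b in (0, 1), the Euler-based estimate of the mean-reversion speed is never larger than the one based on the exact transition, and the gap widens as the sampling step grows. Both estimates still inherit the finite-sample bias of the slope `b`, which is the persistence effect mentioned above.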

    Functional Regression Models with Functional Response: New Approaches and a Comparative Study

    This paper proposes three new approaches for additive functional regression models with functional responses. The first one is a reformulation of the linear regression model, and the last two address the still scarce case of additive nonlinear functional regression models. Both nonlinear proposals are based on extensions of similar models for scalar responses. One of them is based on constructing a Spectral Additive Model (the word "Spectral" refers to the representation of the covariates in an $\mathcal{L}_2$ basis), which is restricted (by construction) to Hilbertian spaces. The other one extends the kernel estimator, and it can be applied to general metric spaces since it is only based on distances. We include our new approaches as well as real datasets in an R package. The performances of the new proposals are compared with previous ones, which we review theoretically and practically in this paper. The simulation results show the advantages of the nonlinear proposals and the small loss of efficiency when the simulation scenario is truly linear. Finally, the supplementary material provides a visualization tool for checking the linearity of the relationship between a single covariate and the response. Comment: Submitte