2,202 research outputs found

    Asymptotics for high-dimensional covariance matrices and quadratic forms with applications to the trace functional and shrinkage

    We establish large sample approximations for an arbitrary number of bilinear forms of the sample variance-covariance matrix of a high-dimensional vector time series using $\ell_1$-bounded and small $\ell_2$-bounded weighting vectors. Estimation of the asymptotic covariance structure is also discussed. The results hold true without any constraint on the dimension, the number of forms and the sample size or their ratios. Concrete and potential applications are widespread and cover high-dimensional data science problems such as tests for large numbers of covariances, sparse portfolio optimization and projections onto sparse principal components or more general spanning sets as frequently considered, e.g. in classification and dictionary learning. As two specific applications of our results, we study in greater detail the asymptotics of the trace functional and shrinkage estimation of covariance matrices. In shrinkage estimation, it turns out that the asymptotics differs for weighting vectors bounded away from orthogonality and nearly orthogonal ones in the sense that their inner product converges to 0. Comment: 42 pages

    Time-frequency analysis of locally stationary Hawkes processes

    Locally stationary Hawkes processes have been introduced in order to generalise classical Hawkes processes away from stationarity by allowing for a time-varying second-order structure. This class of self-exciting point processes has recently attracted a lot of interest in applications in the life sciences (seismology, genomics, neuroscience, ...), but also in the modelling of high-frequency financial data. In this contribution we provide a fully developed nonparametric estimation theory of both local mean density and local Bartlett spectra of a locally stationary Hawkes process. In particular, we apply our kernel estimation of the spectrum, localised both in time and frequency, to two data sets of transaction times, revealing pertinent features in the data that had not been made visible by classical non-localised approaches based on models with constant fertility functions over time. Comment: Bernoulli journal, to appear

    Locally stationary long memory estimation

    There exists a wide literature on modelling strongly dependent time series using a long-memory parameter d, including more recent work on semiparametric wavelet estimation. As a generalization of these latter approaches, in this work we allow the long-memory parameter d to vary over time. We embed our approach into the framework of locally stationary processes. We show weak consistency and a central limit theorem for our log-regression wavelet estimator of the time-dependent d in a Gaussian context. Both simulations and a real data example complete our work on providing a fairly general approach.

    A Multiscale Approach for Statistical Characterization of Functional Images

    Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justifications of the methodology, together with computer code, documentation and the dataset, is available in the online supplements.

    Intrinsic data depth for Hermitian positive definite matrices

    Nondegenerate covariance, correlation and spectral density matrices are necessarily symmetric or Hermitian and positive definite. The main contribution of this paper is the development of statistical data depths for collections of Hermitian positive definite matrices by exploiting the geometric structure of the space as a Riemannian manifold. The depth functions allow one to naturally characterize most central or outlying matrices, but also provide a practical framework for inference in the context of samples of positive definite matrices. First, the desired properties of an intrinsic data depth function acting on the space of Hermitian positive definite matrices are presented. Second, we propose two computationally fast pointwise and integrated data depth functions that satisfy each of these requirements and investigate several robustness and efficiency aspects. As an application, we construct depth-based confidence regions for the intrinsic mean of a sample of positive definite matrices, which are applied to the exploratory analysis of a collection of covariance matrices associated with a multicenter research trial.

    Fitting dynamic factor models to non-stationary time series

    Factor modelling of a large time series panel has widely proven useful to reduce its cross-sectional dimensionality. This is done by explaining common co-movements in the panel through the existence of a small number of common components, up to some idiosyncratic behaviour of each individual series. To capture serial correlation in the common components, a dynamic structure is used as in traditional (uni- or multivariate) time series analysis of second order structure, i.e. allowing for infinite-length filtering of the factors via dynamic loadings. In this paper, motivated by economic data observed over long time periods which show smooth transitions over time in their covariance structure, we allow the dynamic structure of the factor model to be non-stationary over time, by proposing a deterministic time variation of its loadings. In this respect we generalise existing recent work on static factor models with time-varying loadings as well as the classical, i.e. stationary, dynamic approximate factor model. Motivated by the stationary case, we estimate the common components of our dynamic factor model by the eigenvectors of a consistent estimator of the now time-varying spectral density matrix of the underlying data-generating process. This can be seen as a time-varying principal components approach in the frequency domain. We derive consistency of this estimator in a "double-asymptotic" framework of both cross-section and time dimension tending to infinity. A simulation study illustrates the performance of our estimators. Keywords: econometrics

    Multivariate Wavelet-based shape preserving estimation for dependent observations

    We present a new approach to shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As an important application, we discuss conditional quantile estimation for financial time series data. We show, through Monte Carlo simulations, that our methodology can be easily implemented with B-splines and performs well in a finite sample situation. Keywords: conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process

    Normal frames for non-Riemannian connections

    The principal properties of geodesic normal coordinates are the vanishing of the connection components and first derivatives of the metric components at some point. It is well-known that these hold only at points where the connection has vanishing torsion and non-metricity. However, it is shown that normal frames, possessing the essential features of normal coordinates, can still be constructed when the connection is non-Riemannian. Comment: 4 pages, plain TeX. To appear in Class. Quantum Gra

    Nonparametric Transient Classification using Adaptive Wavelets

    Classifying transients based on multi-band light curves is a challenging but crucial problem in the era of GAIA and LSST since the sheer volume of transients will make spectroscopic classification unfeasible. Here we present a nonparametric classifier that uses the transient's light curve measurements to predict its class given training data. It implements two novel components: the first is the use of the BAGIDIS wavelet methodology, a characterization of functional data using hierarchical wavelet coefficients. The second novelty is the introduction of a ranked probability classifier on the wavelet coefficients that handles both the heteroscedasticity of the data and the potential non-representativity of the training set. The ranked classifier is simple and quick to implement, while a major advantage of the BAGIDIS wavelets is that they are translation invariant, hence they do not need the light curves to be aligned to extract features. Further, BAGIDIS is nonparametric so it can be used for blind searches for new objects. We demonstrate the effectiveness of our ranked wavelet classifier against the well-tested Supernova Photometric Classification Challenge dataset in which the challenge is to correctly classify light curves as Type Ia or non-Ia supernovae. We train our ranked probability classifier on the spectroscopically-confirmed subsample (which is not representative) and show that it gives good results for all supernovae with observed light curve timespans greater than 100 days (roughly 55% of the dataset). For such data, we obtain a Ia efficiency of 80.5% and a purity of 82.4%, yielding a highly competitive score of 0.49 whilst implementing a truly "model-blind" approach to supernova classification. Consequently this approach may be particularly suitable for the classification of astronomical transients in the era of large synoptic sky surveys. Comment: 14 pages, 8 figures. Published in MNRA