154 research outputs found

    A Multivariate Skew-Normal-Tukey-h Distribution

    We introduce a new family of multivariate distributions by taking the component-wise Tukey-h transformation of a random vector following a skew-normal distribution. The proposed distribution is named the skew-normal-Tukey-h distribution and is an extension of the skew-normal distribution for handling heavy-tailed data. We compare this proposed distribution to the skew-t distribution, which is another extension of the skew-normal distribution for modeling tail-thickness, and demonstrate that when there are substantial differences in marginal kurtosis, the proposed distribution is more appropriate. Moreover, we derive many appealing stochastic properties of the proposed distribution and provide a methodology for the estimation of the parameters in which the computational requirement increases linearly with the dimension. Using simulations, as well as applications to wine data and wind speed data, we illustrate how to draw inferences based on the multivariate skew-normal-Tukey-h distribution.
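
    The construction described in this abstract can be sketched in a few lines. The snippet below is only an illustration under stated assumptions, not the authors' implementation: it uses the standard Tukey-h map z * exp(h z^2 / 2) and builds the skew-normal vector from the additive representation with one shared half-normal term, which is a single convenient choice of dependence structure. The function names and the parameters `deltas` and `hs` are hypothetical.

```python
import math
import random


def tukey_h(z, h):
    """Tukey-h transformation: z * exp(h * z**2 / 2), with h >= 0."""
    if h < 0:
        raise ValueError("h must be non-negative")
    return z * math.exp(0.5 * h * z * z)


def sample_skew_normal_vector(deltas, rng):
    """Draw one vector whose components have skew-normal marginals.

    Assumed representation: Z_j = d_j * |U0| + sqrt(1 - d_j^2) * U_j,
    where U0, U_1, ... are independent standard normals and -1 < d_j < 1.
    The shared |U0| term is an illustrative choice of dependence.
    """
    u0 = abs(rng.gauss(0.0, 1.0))
    return [d * u0 + math.sqrt(1.0 - d * d) * rng.gauss(0.0, 1.0) for d in deltas]


def sample_sn_tukey_h(deltas, hs, rng):
    """Apply the Tukey-h transformation component-wise to a skew-normal draw."""
    z = sample_skew_normal_vector(deltas, rng)
    return [tukey_h(zj, hj) for zj, hj in zip(z, hs)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Second component gets a heavier tail (h = 0.2); the first stays skew-normal (h = 0).
    for _ in range(5):
        print(sample_sn_tukey_h([0.8, -0.5], [0.0, 0.2], rng))
```

    Giving each component its own h is what lets the marginal tail-thickness differ across coordinates; setting every h to zero returns the skew-normal draw unchanged.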

    Change Point Detection and Estimation in Sequences of Dependent Random Variables

    Two change point detection and estimation procedures for sequences of dependent binary random variables are proposed and their asymptotic properties are explored. The two procedures are a dependent cumulative sum statistic (DCUSUM) and a dependent likelihood ratio test (LRT) statistic, which are generalizations of the independent CUSUM and LRT statistics. A one-step Markov dependence is assumed between consecutive variables in the sequence, and the DCUSUM and dependent LRT are shown to have substantially better size and power than their independent counterparts. In most cases, a comparison of the dependent procedures via simulation shows that the dependent LRT provides a more powerful test, while the DCUSUM test has better size performance. The asymptotic distribution of the DCUSUM test is found to be a weighted sum of squared Brownian bridge processes and an approximation to calculate p-values is discussed. A Worsley-type upper bound for p-values is provided as an alternative. The asymptotic distribution of the dependent LRT is unknown, but the tail probabilities are found to be empirically bounded by chi-square random variables with 6 and 7 degrees of freedom through a simulation study. A bootstrap algorithm to estimate p-values for the dependent LRT is discussed. Extensions of these procedures to multiple sequences and multinomial random variables are discussed, and a new statistic, the maximal change count statistic, is proposed. An application of the multiple sequence procedures to clustered time series models is provided. The asymptotic properties of the generalized procedures are reserved for future research.
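
    For orientation, the independent CUSUM that this abstract takes as its baseline can be sketched as follows. This is the textbook statistic for an independent 0/1 sequence, not the DCUSUM proposed in the work, whose Markov-dependence adjustment is not given here. The function name and return convention are hypothetical.

```python
import math


def cusum_change_point(x):
    """Independent-case CUSUM for a 0/1 sequence.

    Returns (statistic, k_hat), where k_hat is the number of observations
    before the estimated change (1 <= k_hat <= n - 1), or None if the
    sequence is constant.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    p_hat = sum(x) / n
    sigma = math.sqrt(p_hat * (1.0 - p_hat))  # Bernoulli standard deviation
    if sigma == 0.0:
        return 0.0, None  # constant sequence: no evidence of a change
    best, k_hat, partial = 0.0, None, 0
    for k in range(1, n):
        partial += x[k - 1]
        # Deviation of the partial sum from its value under "no change".
        dev = abs(partial - k * p_hat)
        if dev > best:
            best, k_hat = dev, k
    return best / (sigma * math.sqrt(n)), k_hat


if __name__ == "__main__":
    stat, k_hat = cusum_change_point([0] * 30 + [1] * 30)
    print(stat, k_hat)
```

    On the toy sequence of thirty zeros followed by thirty ones, the estimated change point is k_hat = 30. With serially dependent data the standardization by the Bernoulli standard deviation is no longer appropriate, which is the gap the dependent procedures are meant to address.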

    Truncations of Haar distributed matrices, traces and bivariate Brownian bridges

    Let U be a Haar distributed unitary matrix in U(n) or O(n). We show that, after centering, the double index process $W^{(n)}(s,t) = \sum_{i \leq \lfloor ns \rfloor,\, j \leq \lfloor nt \rfloor} |U_{ij}|^2$ converges in distribution to the bivariate tied-down Brownian bridge. The proof relies on the notion of second order freeness. Comment: Random matrices: Theory and Applications (RMTA) To appear (2012) http://www.editorialmanager.com/rmta
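
    The centering step can be made explicit with a short computation. It assumes that "centering" means subtracting the expectation, which the abstract does not spell out. Every row and column of U has unit norm, and the Haar measure is invariant under permutations of rows and columns, so each |U_ij|^2 has mean 1/n:

```latex
\mathbb{E}\,|U_{ij}|^2 = \frac{1}{n}
\quad\Longrightarrow\quad
\mathbb{E}\,W^{(n)}(s,t) = \frac{\lfloor ns \rfloor \lfloor nt \rfloor}{n},
\qquad
W^{(n)}(s,t) - \mathbb{E}\,W^{(n)}(s,t)
  = \sum_{i \leq \lfloor ns \rfloor,\, j \leq \lfloor nt \rfloor}
    \Bigl( |U_{ij}|^2 - \tfrac{1}{n} \Bigr).
```

    Because full rows and full columns of |U_ij|^2 sum to exactly 1, the centered sum vanishes when s = 1 or t = 1 (and trivially when s = 0 or t = 0), which is consistent with a limit that is tied down on the boundary of the unit square.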

    Robust and sparse estimation of large precision matrices

    The thesis considers the estimation of sparse precision matrices in the high-dimensional setting. First, we introduce an integrated approach to estimate undirected graphs and to perform model selection in high-dimensional Gaussian Graphical Models (GGMs). The approach is based on a parametrization of the inverse covariance matrix in terms of the prediction errors of the best linear predictor of each node in the graph. We exploit the relationship between partial correlation coefficients and the distribution of the prediction errors to propose a novel forward-backward algorithm for detecting pairs of variables having nonzero partial correlations among a large number of random variables based on i.i.d. samples. We then establish asymptotic properties under mild conditions. Finally, numerical studies through simulation and real data examples provide evidence of the practical advantage of the procedure, where the proposed approach outperforms state-of-the-art methods such as the Graphical lasso and CLIME under different settings. Furthermore, we study the problem of robust estimation of GGMs in the high-dimensional setting when the data may contain outlying observations. We propose a robust precision matrix estimator under the cellwise contamination mechanism that is robust against structural bivariate outliers. This framework exploits robust pairwise weighted correlation coefficient estimates, where the weights are computed by the Mahalanobis distance with respect to an affine equivariant robust correlation coefficient estimator. We show that the convergence rate of the proposed estimator is the same as that of the correlation coefficient used to compute the Mahalanobis distance. We conduct numerical simulations under different contamination settings to compare the graph recovery performance of different robust estimators. The proposed method is then applied to the classification of tumors using gene expression data. We show that our procedure can effectively recover the true graph under cellwise data contamination. Doctoral program: Programa Oficial de Doctorado en Economía de la Empresa y Métodos Cuantitativos. President: José Manuel Mira Mcwilliams; Secretary: Andrés Modesto Alonso Fernández; Member: José Ramón Berrendero Día