
    Stein-rules and Testing in Generalized Mean Reverting Processes with Multiple Change-points

    In this thesis, we consider inference problems about the drift parameter vector in generalized mean reverting processes with multiple and unknown change-points. In particular, we study the case where the parameter may satisfy uncertain restrictions. Compared to the results in the literature, we generalize existing findings in five ways. First, we consider a statistical model that incorporates uncertain prior information, and the uncertain restriction includes as a special case the nonexistence of change-points. Second, we derive the unrestricted estimator (UE) and the restricted estimator (RE), and we study their asymptotic properties. Specifically, in the context of a known number of change-points, we derive the joint asymptotic normality of the UE and the RE under the set of local alternative hypotheses. Third, we derive a test for the hypothesized restriction and establish its asymptotic local power; we also prove that the proposed test is consistent. Fourth, we construct a class of shrinkage-type estimators (SEs) which includes as special cases the UE, the RE, and classical SEs. Fifth, we derive the relative risk dominance of the proposed estimators; more precisely, we prove that the SEs dominate the UE. The novelty of the derived results lies in the fact that the dimensions of the proposed estimators are random variables. Finally, we present simulation results which corroborate the established theoretical findings.
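    The abstract leaves the estimators' exact form to the thesis; as a rough illustration of the Stein-rule idea it builds on, here is a minimal sketch of a classical shrinkage estimator that pulls the unrestricted estimate toward the restricted one by a data-driven weight. All names (stein_shrinkage, theta_ue, theta_re, test_stat) are hypothetical, and unlike in the thesis, the dimension is treated as fixed rather than random.

```python
import numpy as np

def stein_shrinkage(theta_ue, theta_re, test_stat, dim, positive_part=True):
    """Classical Stein-rule combination of an unrestricted estimator
    (theta_ue) and a restricted estimator (theta_re), both 1-d arrays.

    test_stat : a test statistic for the restriction (e.g. Wald-type);
                large values mean the data contradict the restriction.
    dim       : dimension of the restriction (fixed here; in the thesis
                the estimators' dimensions are random variables).
    """
    theta_ue = np.asarray(theta_ue, dtype=float)
    theta_re = np.asarray(theta_re, dtype=float)
    c = max(dim - 2, 0)                   # classical shrinkage constant
    w = 1.0 - c / max(test_stat, 1e-12)   # weight on the unrestricted part
    if positive_part:                     # positive-part variant: never
        w = max(w, 0.0)                   # shrink past the restricted estimate
    return theta_re + w * (theta_ue - theta_re)
```

    When the test statistic is large, the weight is close to one and the estimate is essentially the UE; near the restriction it collapses toward the RE, which is the mechanism behind risk dominance results of this type.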

    Essays on structural changes in high dimensional econometric models

    This dissertation consists of three essays on estimating and testing structural changes in high-dimensional econometric models. The essays are based on three working papers written jointly with Prof. Badi Baltagi and Prof. Chihwa Kao. The first essay considers estimating the date of a single common change in the regression coefficients of a heterogeneous large-N, large-T panel data model, with or without strong cross-sectional dependence. The second essay considers estimating a high-dimensional factor model with an unknown number of latent factors and a single common change in the number of factors and/or the factor loadings. The third essay considers estimating a high-dimensional factor model with an unknown number of latent factors and multiple common changes in the number of factors and/or the factor loadings, together with testing procedures to detect the presence and number of structural changes.

    The first essay studies the asymptotic properties of the least squares estimator of the common change point in large heterogeneous panel data models under various sets of conditions on the change magnitude and the N-T ratio, allowing N and T to go to infinity jointly. Consistency and the limiting distribution are established under general conditions. A general Hájek-Rényi inequality is introduced to calculate the order of the expectation of sup-type terms. Both weak and strong cross-sectional dependence are considered. In the former case, the least squares estimator is consistent as the number of subjects tends to infinity, while in the latter case a two-step estimator is proposed and consistency is recovered once estimated factors are used to control the cross-sectional dependence. The limiting distribution is derived while allowing the error process to be serially dependent and heteroskedastic of unknown form, and inference can be based on the simulated distribution.

    The second essay tackles the identification and estimation of a high-dimensional factor model with an unknown number of latent factors and a single common break in the number of factors and/or the factor loadings. Since the factors are unobservable, the change point estimator is based on the second moments of the estimated pseudo factors. The essay shows that the estimation error of the proposed estimator is bounded in probability as N and T go to infinity jointly, and that the proposed estimator is highly robust to misspecification of the number of pseudo factors. With the estimated change point plugged in, consistency of the estimated numbers of pre- and post-break factors and the convergence rate of the estimated pre- and post-break factor spaces are then established under fairly general assumptions. Finite-sample performance of the proposed estimators is investigated using Monte Carlo experiments.

    The third essay considers high-dimensional factor models with multiple common structural changes. Based on the second moments of the estimated pseudo factors, both joint and sequential estimation of the change points are considered. The estimation error of both estimators is bounded in probability as the cross-sectional dimension N and the time dimension T go to infinity jointly. The measurement error contained in the estimated pseudo factors has no effect on the asymptotic properties of the estimated change points, and no N-T ratio condition is needed. The estimated change points are then plugged in to estimate the number of factors and the factor space in each regime; although the estimated change points are inconsistent, using them has no asymptotic effect on this subsequent estimation. The essay also proposes (i) tests for the null of no change versus the alternative of l changes, and (ii) tests for the null of l changes versus the alternative of l + 1 changes, which allow inference on the presence and number of structural changes. Simulation results show good performance of the proposed estimation and testing procedures.
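    The second and third essays define their change-point estimators through the second moments of estimated pseudo factors. A minimal sketch of that idea, under assumed simplifications (principal-components factor estimates, a plain least-squares split criterion for a single break, and the hypothetical name pseudo_factor_break), might look as follows; the essays' actual statistics and their joint/sequential multiple-break versions are more involved.

```python
import numpy as np

def pseudo_factor_break(X, k, trim=0.05):
    """Locate a single common break from the second moments of
    estimated pseudo factors (a sketch of the idea, not the essays'
    exact estimator).

    X : (T, N) balanced panel;  k : number of pseudo factors extracted
        (robustness to misspecifying k is one of the essays' findings).
    """
    T, N = X.shape
    # Pseudo factors by principal components: left singular vectors of X.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    F = np.sqrt(T) * U[:, :k]                    # (T, k) estimated factors
    # vec(F_t F_t'): the second-moment process that carries the break.
    Z = np.einsum('ti,tj->tij', F, F).reshape(T, -1)
    lo = max(int(trim * T), 2)                   # trim the sample ends
    hi = min(int((1 - trim) * T), T - 2)
    def split_ssr(t):                            # fit a constant mean on
        m1, m2 = Z[:t].mean(axis=0), Z[t:].mean(axis=0)   # each side
        return ((Z[:t] - m1) ** 2).sum() + ((Z[t:] - m2) ** 2).sum()
    return min(range(lo, hi), key=split_ssr)     # date minimizing total SSR
```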

    Statistical inference for high-dimensional data via U-statistics

    Owing to advances in science and technology, there has been a surge of interest in high-dimensional data. Many methods developed in the low- or fixed-dimensional setting may not be theoretically valid in this new setting, and some are not even applicable when the dimensionality is larger than the sample size. To circumvent the difficulties brought by high dimensionality, we use U-statistics-based methods. In this thesis, we investigate the theoretical properties of U-statistics in the high-dimensional setting and develop novel U-statistics-based methods for three problems.

    In the first chapter, we propose a new formulation of self-normalization for inference about the mean of high-dimensional stationary processes using a U-statistic-based approach. Self-normalization has attracted considerable attention in the recent time series literature, but its scope of applicability has been limited to low- or fixed-dimensional parameters of low-dimensional time series. Our original test statistic is a U-statistic with a trimming parameter to remove the bias caused by weak dependence. Under the framework of nonlinear causal processes, we show the asymptotic normality of our U-statistic, with the convergence rate depending on the order of the Frobenius norm of the long-run covariance matrix. The self-normalized test statistic is then constructed on the basis of recursively subsampled U-statistics, and its limiting null distribution is shown to be a functional of time-changed Brownian motion, which differs from the pivotal limit used in the low-dimensional setting. An interesting phenomenon associated with self-normalization is that it works in the high-dimensional context even if the convergence rate of the original test statistic is unknown. We also present applications to testing for bandedness of the covariance matrix and testing for white noise in high-dimensional stationary time series, and we compare the finite-sample performance with existing methods in simulation studies. At the root of our theoretical arguments, we extend the martingale approximation to the high-dimensional setting, which could be of independent theoretical interest.

    In the second chapter, we consider change point testing and estimation for high-dimensional data. In the case of testing for a mean shift, we propose a new test which is based on U-statistics and utilizes the self-normalization principle. Our test targets dense alternatives in the high-dimensional setting and involves no tuning parameters. The weak convergence of a sequential U-statistic-based process is shown as an important theoretical contribution. Extensions to testing for multiple unknown change points in the mean, and to testing for changes in the covariance matrix, are also presented with rigorous asymptotic theory and encouraging simulation results. Additionally, we illustrate how our approach can be combined with wild binary segmentation to estimate the number and locations of multiple unknown change points.

    In the third chapter, we consider estimation and inference for the location of a single change point in the mean of independent high-dimensional data. Our change point location estimator maximizes a new U-statistic-based objective function, and its convergence rate and asymptotic distribution after suitable centering and normalization are obtained under mild assumptions. Our estimator turns out to be more efficient than the least-squares-based counterpart in the literature. Based on the asymptotic theory, we construct a confidence interval by plugging consistent estimates of several quantities into the normalization. We also provide a bootstrap-based confidence interval and state its asymptotic validity under suitable conditions. Through simulation studies, we demonstrate favorable finite-sample performance of the new change point location estimator compared to its least-squares-based counterpart, and of our bootstrap-based confidence intervals compared to several existing competitors. The asymptotic theory for high-dimensional U-statistics is substantially different from that developed in the literature and is of independent interest.
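    As a concrete illustration of why U-statistics suit this problem, the sketch below gives an unbiased estimate of the squared mean difference between the two sides of a candidate split, using only inner products of distinct observations so that no trace-of-covariance bias term appears. The chapter's actual tests aggregate such terms over all splits and self-normalize them; the name ustat_mean_shift is hypothetical.

```python
import numpy as np

def ustat_mean_shift(X, t):
    """Unbiased U-statistic estimate of ||mu_1 - mu_2||^2 for a mean
    that possibly shifts at time t (a building block of such tests,
    not the chapter's full self-normalized statistic).

    X : (n, p) independent observations;  t : candidate split,
        with 2 <= t <= n - 2.
    """
    n = X.shape[0]
    G = X @ X.T                                  # all pairwise inner products
    # Off-diagonal averages are unbiased for the squared mean norms ...
    pre = (G[:t, :t].sum() - np.trace(G[:t, :t])) / (t * (t - 1))
    post = (G[t:, t:].sum() - np.trace(G[t:, t:])) / ((n - t) * (n - t - 1))
    # ... and the cross block is unbiased for mu_1' mu_2.
    cross = G[:t, t:].sum() / (t * (n - t))
    return pre + post - 2.0 * cross              # estimates ||mu_1 - mu_2||^2
```

    Excluding the diagonal of the Gram matrix is what removes the trace-of-covariance bias that affects plug-in statistics when the dimension p is large relative to n.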

    Identification of gene expression patterns using planned linear contrasts

    BACKGROUND: In gene networks, the timing of significant changes in the expression level of each gene may be the most critical information in time course expression profiles. With the same timing of the initial change, genes which share similar patterns of expression for any number of sampling intervals from the beginning should be considered co-expressed at certain level(s) in the gene networks. In addition, multiple testing problems are complicated in experiments with multi-level treatments when thousands of genes are involved. RESULTS: To address these issues, we first performed an ANOVA F test to identify significantly regulated genes. The Benjamini and Hochberg (BH) procedure for controlling the false discovery rate (FDR) at 5% was applied to the P values of the F test. We then categorized the genes with a significant F test into four classes based on the timing of their initial responses by sequentially testing a complete set of orthogonal contrasts, the reverse Helmert series. For genes within each class, specific sequences of contrasts were performed to characterize their general 'fluctuation' shapes of expression along the subsequent sampling time points. To be consistent with the BH procedure, each contrast was examined using a stepwise Studentized Maximum Modulus test to control the gene-based maximum family-wise error rate (MFWER) at the level α_new determined by the BH procedure. We demonstrated our method on the analysis of microarray data from murine olfactory sensory epithelia at five different time points after target ablation. CONCLUSION: In this manuscript, we used planned linear contrasts to analyze time-course microarray experiments. This analysis allowed us to characterize gene expression patterns based on the temporal order in the data, the timing of a gene's initial response, and the general shapes of gene expression patterns along the subsequent sampling time points. Our method is particularly suitable for the analysis of microarray experiments in which it is often difficult to take sufficiently frequent measurements and/or the sampling intervals are non-uniform.
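    For concreteness, here is a minimal sketch of two ingredients of the pipeline: the Benjamini-Hochberg step-up rule applied to a vector of F-test p-values, and a reverse Helmert contrast matrix in which each contrast compares a time point with the mean of the preceding ones. The contrast construction follows the standard definition, and the function names are placeholders rather than the paper's code.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """BH step-up procedure: boolean mask of rejections controlling
    the false discovery rate at level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()      # largest i with p_(i) <= i*q/m
        reject[order[:k + 1]] = True
    return reject

def reverse_helmert(levels):
    """Reverse Helmert contrasts for `levels` ordered time points:
    row i compares level i+1 with the mean of levels 1..i, which is
    what lets the timing of a gene's initial response be classified."""
    C = np.zeros((levels - 1, levels))
    for i in range(1, levels):
        C[i - 1, :i] = -1.0 / i             # mean of all earlier levels
        C[i - 1, i] = 1.0                   # versus the current level
    return C
```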

    Change Point Estimation in Panel Data with Time-Varying Individual Effects

    This paper proposes a method for estimating multiple change points in panel data models with unobserved individual effects via ordinary least squares (OLS). Typically, in this setting, the OLS slope estimators are inconsistent due to the unobserved individual effects bias. As a consequence, existing methods remove the individual effects before change point estimation through data transformations such as first-differencing. We prove that, under reasonable assumptions, the unobserved individual effects bias has no impact on the consistent estimation of the change points. Our simulations show that, because our method does not remove any variation from the dataset before change point estimation, it performs better in small samples than first-differencing methods. We focus on short panels because they are commonly used in practice, and we allow the unobserved individual effects to vary over time. Our method is illustrated via two applications: the environmental Kuznets curve and U.S. house price expectations after the financial crisis.
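    The abstract does not spell out the estimator; a minimal sketch of the underlying idea for a single break, assuming pooled OLS on each side of a candidate date with the individual effects deliberately left in the error term, might look as follows (the array shapes and the name ols_break_date are assumptions; the paper itself handles multiple breaks and time-varying effects).

```python
import numpy as np

def ols_break_date(y, X, trim=0.1):
    """Estimate a common break date by pooled OLS: fit separate slopes
    before and after each candidate date and keep the date with the
    smallest total sum of squared residuals. Individual effects are
    NOT removed first; the paper's point is that their bias does not
    affect consistent estimation of the break date.

    y : (T, N) outcomes;  X : (T, N, k) regressors.
    """
    T, N, k = X.shape
    def ssr(ys, Xs):
        Z = Xs.reshape(-1, k)                    # pool (time, unit) pairs
        r = ys.reshape(-1)
        beta, *_ = np.linalg.lstsq(Z, r, rcond=None)
        e = r - Z @ beta
        return e @ e
    lo = max(int(trim * T), 1)                   # trim the sample ends
    hi = min(int((1 - trim) * T), T - 1)
    return min(range(lo, hi),
               key=lambda t: ssr(y[:t], X[:t]) + ssr(y[t:], X[t:]))
```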

    Change-Point Testing and Estimation for Risk Measures in Time Series

    We investigate methods of change-point testing and confidence interval construction for nonparametric estimators of expected shortfall and related risk measures in weakly dependent time series. A key aspect of our work is the ability to detect general multiple structural changes in the tails of time series marginal distributions. Unlike extant approaches that detect tail structural changes through quantities such as the tail index, our approach does not require parametric modeling of the tail and detects more general changes. Additionally, our methods are based on the recently introduced self-normalization technique for time series, allowing for statistical analysis without the issues of consistent standard error estimation. The theoretical foundation for our methods is a set of functional central limit theorems, which we develop under weak assumptions. An empirical study of S&P 500 returns and US 30-Year Treasury bonds illustrates the practical use of our methods in detecting and quantifying market instability via the tails of financial time series during times of financial crisis.
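    As a small illustration of the tail functional involved, here is a minimal sketch of the standard nonparametric expected-shortfall estimator, the kind of risk-measure estimate whose structural changes the proposed tests target. The paper's change-point statistics and self-normalizers are built on top of such estimates and are not reproduced here.

```python
import numpy as np

def expected_shortfall(returns, alpha=0.05):
    """Nonparametric expected shortfall at level alpha: the average of
    the returns at or below the empirical alpha-quantile (losses are
    the left tail of the return distribution)."""
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * r.size)))     # number of tail observations
    return r[:k].mean()
```

    Computed over rolling or split samples, sequences of such estimates are what self-normalized change-point statistics compare in order to detect tail instability.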