
    Cluster-Robust Bootstrap Inference in Quantile Regression Models

    In this paper I develop a wild bootstrap procedure for cluster-robust inference in linear quantile regression models. I show that the bootstrap leads to asymptotically valid inference on the entire quantile regression process in a setting with a large number of small, heterogeneous clusters and provides consistent estimates of the asymptotic covariance function of that process. The proposed bootstrap procedure is easy to implement and performs well even when the number of clusters is much smaller than the sample size. An application to Project STAR data is provided. Comment: 46 pages, 4 figures

    Vol. 13, No. 1 (Full Issue)


    Techniques for handling clustered binary data

    Bibliography: leaves 143-153. Over the past few decades there has been increasing interest in clustered studies, and hence much research has gone into the analysis of data arising from these studies. It is erroneous to treat clustered data, where observations within a cluster are correlated with each other, as one would treat independent data. It has been found that point estimates are not as greatly affected by clustering as are the standard deviations of the estimates. As a consequence, confidence intervals and hypothesis tests are severely affected. Therefore one has to approach the analysis of clustered data with caution. Methods that specifically deal with correlated data have been developed. Analysis may be further complicated when the outcome variable of interest is binary rather than continuous. Methods for estimation of proportions, their variances, calculation of confidence intervals and a variety of techniques for testing the homogeneity of proportions have been developed over the years (Donner and Klar, 1993; Donner, 1989; Rao and Scott, 1992). The methods developed within the context of experimental design generally involve incorporating the effect of clustering in the analysis. This cluster effect is quantified by the intracluster correlation and needs to be taken into account when estimating proportions, comparing proportions and in sample size calculations. In the context of observational studies, the effect of clustering is expressed by the design effect, which is the inflation in the variance of an estimate that is due to selecting a cluster sample rather than an independent sample. Another important aspect of the analysis of complex sample data that is often neglected is sampling weights. One needs to recognise that each individual may not have the same probability of being selected. These weights adjust for this fact (Little et al., 1997). Methods for modelling correlated binary data have also been discussed quite extensively.
Among the many models which have been proposed for analyzing binary clustered data are two approaches which have been studied and compared: the population-averaged and cluster-specific approaches. The population-averaged model focuses on estimating the effect of a set of covariates on the marginal expectation of the response. One example of the population-averaged approach for parameter estimation is known as generalized estimating equations, proposed by Liang and Zeger (1986). It involves assuming that elements within a cluster are independent and then imposing a correlation structure on the set of responses. This is a useful application in longitudinal studies where a subject is regarded as a cluster. Then the parameters describe how the population-averaged response, rather than a specific subject's response, depends on the covariates of interest. On the other hand, cluster-specific models introduce cluster-to-cluster variability in the model by including random effects terms, which are specific to the cluster, as linear predictors in the regression model (Neuhaus et al., 1991). Unlike the special case of correlated Gaussian responses, the parameters of the cluster-specific model for binary data describe different effects on the responses than those obtained from the population-averaged model. For longitudinal data, the parameters of a cluster-specific model describe how a specific individual's probability of a response depends on the covariates. The decision to use either of these modelling methods depends on the questions of interest. Cluster-specific models are useful for studying the effects of cluster-varying covariates and when an individual's response rather than an average population's response is the focus. The population-averaged model is useful when interest lies in how the average response across clusters changes with covariates.
A criticism of this approach is that there may be no individual with the characteristics of the population-averaged model.
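The relationship between the intracluster correlation and the design effect mentioned in this abstract can be illustrated with a small sketch. It assumes the common textbook approximation for equal-sized clusters, design effect = 1 + (m - 1) * rho, which the abstract itself does not state; the function names and the numbers in the usage note are hypothetical and chosen only for illustration.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from cluster sampling, assuming equal cluster sizes.

    Uses the standard approximation 1 + (m - 1) * rho, where m is the
    cluster size and rho is the intracluster correlation (an assumption,
    not a formula taken from the abstract).
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1.0) * icc


def clustered_se_of_proportion(p: float, n_total: int,
                               cluster_size: float, icc: float) -> float:
    """Standard error of a proportion, inflated by the design effect."""
    # Variance under simple random sampling of n_total individuals.
    srs_variance = p * (1.0 - p) / n_total
    # Clustering multiplies the variance by the design effect.
    return math.sqrt(srs_variance * design_effect(cluster_size, icc))


def effective_sample_size(n_total: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)
```

For example, with hypothetical clusters of 20 individuals and an intracluster correlation of 0.05, the design effect is about 1.95, so 1000 clustered observations carry roughly the information of 513 independent ones. The point estimate of the proportion is unchanged; only its standard error widens, which matches the abstract's remark that clustering mainly affects variability rather than point estimates.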

    EEG connectivity in infants at risk for autism spectrum disorder

    Autism Spectrum Disorder (ASD) is characterized by social and communication difficulties, and restricted and repetitive behaviours, and is typically diagnosed during toddlerhood. Electroencephalographic (EEG) connectivity during infancy may predict later diagnostic outcome, and dimensional traits, although results vary with differences in methods. The aim of this thesis is to examine how infant EEG connectivity relates to familial risk, and later categorical and dimensional outcomes of ASD. A previous study found alpha band hyperconnectivity in 14-month-old infants who developed ASD compared to infants who did not develop ASD at 36 months. Chapter 3 shows that methods used in this previous study indeed provide reliable results. Chapter 4 describes the replication study using identical methods to the previous study. Although the difference between groups was not replicated, the association between alpha connectivity and restricted and repetitive behaviours during toddlerhood was replicated. Chapter 5 tested the hypothesis that social and communication difficulties relate to theta connectivity in response to social and non-social stimuli. Theta connectivity was increased during social compared to non-social stimuli. Network topologies differed between groups with high and low familial risk, but not between categorical outcome groups. Theta connectivity was not associated with dimensional traits at toddlerhood. Chapter 6 showed that graph organisation was not related to familial risk, or diagnostic or dimensional outcomes at toddlerhood. Finally, Chapter 7 combined measures from previous chapters and examined how these relate to dimensional outcomes at childhood. Graph organisation at infancy showed a stronger association with dimensional outcomes at childhood than other connectivity measures. 
Overall, the results in this thesis illustrate the variability in developmental trajectories in ASD, while emphasizing the complexity of the disorder and the use of a dimensional approach to ASD. Chapter 8 further discusses contributions and implications for research on EEG connectivity as an early predictive marker for ASD.

    High-dimensional asymptotic expansion of LR statistic for testing intraclass correlation structure and its error bound

    This paper deals with the null distribution of a likelihood ratio (LR) statistic for testing the intraclass correlation structure. We derive an asymptotic expansion of the null distribution of the LR statistic when the number of variables p and the sample size N approach infinity together, while the ratio p/N converges to a finite nonzero limit c ∈ (0,1). Numerical simulations reveal that our approximation is more accurate than the classical χ²-type and F-type approximations as p increases in value. Furthermore, we derive a computable error bound for its asymptotic expansion. Keywords: Asymptotic expansion; Error bound; High-dimensional approximation; Intraclass correlation structure; Likelihood ratio statistic
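For orientation, the intraclass correlation structure being tested is usually written as follows. This is the standard textbook definition, not notation taken from the paper; the symbols σ², ρ, I_p and 1_p are assumptions introduced here.

```latex
% Null hypothesis: equal variances and a common correlation among all p variables.
H_0:\quad \Sigma \;=\; \sigma^{2}\left[(1-\rho)\, I_p + \rho\, \mathbf{1}_p \mathbf{1}_p^{\top}\right],
\qquad \sigma^{2} > 0,\quad -\frac{1}{p-1} < \rho < 1 .
```

Here I_p is the p × p identity matrix and 1_p is the p-vector of ones; the range of ρ is the condition under which Σ is positive definite.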

    Vol. 8, No. 2 (Full Issue)
