
    An Exact Test of Fit for the Gaussian Linear Model using Optimal Nonbipartite Matching

    Fisher tested the fit of Gaussian linear models using replicated observations. We refine this method by (1) constructing near-replicates using an optimal nonbipartite matching and (2) defining a distance that focuses on predictors important to the model’s predictions. Near-replicates may not exist unless the predictor set is low-dimensional; the test addresses dimensionality by betting that model failures involve a subset of predictors important in the old fit. Despite using the old fit to pair observations, the test has exactly its stated level under the null hypothesis. Simulations show the test has reasonable power even when many spurious predictors are present.
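    As a concrete illustration of the construction, here is a minimal Python sketch (not the authors' code) that pairs observations by a minimum-distance nonbipartite matching and computes the classical lack-of-fit F statistic with the pairs standing in for true replicates. The importance weights and the exactness guarantee of the actual test are assumptions not reproduced here.

        # Minimal sketch: classical lack-of-fit F-test on near-replicate pairs.
        # Assumptions: "importance" holds per-predictor weights (e.g. absolute
        # standardized coefficients from the old fit); the F reference
        # distribution is the classical one, not the paper's exact construction.
        import numpy as np
        import networkx as nx
        from scipy import stats

        def near_replicate_lack_of_fit(X, y, residuals, importance, n_params):
            n = len(y)
            G = nx.Graph()
            for i in range(n):
                for j in range(i + 1, n):
                    d = float(np.sum(importance * (X[i] - X[j]) ** 2))
                    G.add_edge(i, j, weight=-d)  # negate: max weight == min distance
            pairs = nx.max_weight_matching(G, maxcardinality=True)
            sse_pure = sum((y[i] - y[j]) ** 2 / 2.0 for i, j in pairs)  # "pure error"
            df_pure = len(pairs)
            sse_resid = float(np.sum(residuals ** 2))  # residual SS, df = n - n_params
            df_lof = (n - n_params) - df_pure          # lack-of-fit df
            F = ((sse_resid - sse_pure) / df_lof) / (sse_pure / df_pure)
            return F, stats.f.sf(F, df_lof, df_pure)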

    Balancing Multiple Goals In Observational Study Design

    This thesis unites three papers discussing new strategies for matched pair designs using observational data, developed to balance the demands of disparate design goals. The first chapter introduces a new matching algorithm for large-scale treated-control comparisons when many categorical covariates are present. The algorithm balances covariates and their interactions in a prioritized manner by solving a combinatorial optimization problem, and guarantees computational efficiency through the use of a sparse network representation. The second chapter defines a class of variables called prods which can be ignored when matching in order to strictly attenuate unmeasured bias, if it is present. These variables can be difficult to identify with confidence, so a multiple-control-group strategy is proposed in which investigators match once on all variables, and once ignoring prods; the two treated-control comparisons together give stronger evidence about treatment effects than either one individually. The final paper considers a new version of Fisher's classical lack-of-fit test for regression models, appropriate for data that lack replicated observations. The test uses matched pairs formed by optimal nonbipartite matching as near-replicates, and the model fit is used in constructing the matching distance in order to focus attention on variables that are predictive in the null model.
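    The balance criterion the first chapter optimizes is not spelled out in the abstract; a common device in this literature is marginal ("fine") balance of nominal covariates, checked in priority order. A hypothetical diagnostic along those lines, in Python:

        # Hypothetical sketch: report marginal-count discrepancies for nominal
        # covariates, most important first. The chapter's algorithm enforces such
        # balance inside a network optimization; this snippet only diagnoses it.
        from collections import Counter

        def balance_report(treated, controls, covariates_by_priority):
            # treated/controls: lists of dicts mapping covariate name -> category
            for cov in covariates_by_priority:
                t = Counter(unit[cov] for unit in treated)
                c = Counter(unit[cov] for unit in controls)
                gap = sum(abs(t[k] - c[k]) for k in t.keys() | c.keys())
                print(f"{cov}: total count discrepancy = {gap}")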

    New Instrumental Variable Methods for Causal Inference.

    In observational studies, unmeasured differences between treatment groups often confound the relationship of interest. Instrumental variable (IV) methods can give consistent effect estimates in the presence of this unmeasured confounding, and are becoming increasingly popular in health and medical research. In this dissertation, we develop new IV methods and apply them in studies comparing mortality among patients receiving dialysis as treatment for end stage renal disease. In the first project, we develop a weighted IV estimator that adjusts for instrument-outcome confounders through the IV propensity score. The weights are designed to approximate the probability of being selected into a one-to-one match, though the extension to many-to-one designs is also presented. Advantages of weighting over matching include increased efficiency, straightforward variance estimation, and ease of computation. The estimator is shown to be more efficient than alternatives. Its use is illustrated in a study comparing the relationship between mortality and dialysis session length among hemodialysis patients. While the estimator was developed for binary outcomes, future work on applying the method to survival data is presented as well. In the second project, we develop a weighting procedure for increasing the strength of the instrument when matching. Compared with existing methods, the proposed weighting procedure strengthens the instrument without compromising match quality. This is a major advantage of the proposed method, as poor match quality can bias estimation. Methods are illustrated with a study comparing early mortality in hemodialysis and peritoneal dialysis patients. In the third project, we compare estimation with strengthened instruments to estimation with instruments that are naturally stronger. Methods for strengthening the instrument are motivated by the benefits of using stronger instruments, including decreased finite-sample bias, increased efficiency, and results that are more robust to unmeasured instrument-outcome confounders. It has not been shown, however, that strengthened instruments provide these same benefits. Results indicate that while they provide for more efficient estimation, they do not decrease finite-sample bias or improve the robustness to unmeasured instrument-outcome confounders. We highlight an important issue that has thus far been overlooked in the literature, and give guidance for future research related to strengthening the instrument.
    PhD thesis, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133465/1/lehmannd_1.pd
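    As an illustration of the weighting idea, the sketch below computes a weighted Wald-type IV estimate with weights built from an estimated instrument propensity score. The specific weight form, min(e, 1-e) divided by the probability of the observed instrument value, is an assumption chosen to mimic selection into a one-to-one match; the dissertation's exact estimator and its variance formula are not reproduced.

        # Hedged sketch: weighted Wald ratio with matching-style IV weights.
        # y: outcome, d: treatment received, z: binary instrument (0/1 ints),
        # X: instrument-outcome confounders. The weight form is an assumption.
        import numpy as np
        import statsmodels.api as sm

        def weighted_wald_iv(y, d, z, X):
            Xc = sm.add_constant(X)
            e = sm.Logit(z, Xc).fit(disp=0).predict(Xc)  # IV propensity score
            w = np.minimum(e, 1 - e) / np.where(z == 1, e, 1 - e)
            num = (np.average(y[z == 1], weights=w[z == 1])
                   - np.average(y[z == 0], weights=w[z == 0]))
            den = (np.average(d[z == 1], weights=w[z == 1])
                   - np.average(d[z == 0], weights=w[z == 0]))
            return num / den  # effect of Z on Y scaled by effect of Z on D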

    Valence Bonds in Random Quantum Magnets: Theory and Application to YbMgGaO4

    We analyze the effect of quenched disorder on spin-1/2 quantum magnets in which magnetic frustration promotes the formation of local singlets. Our results include a theory for 2d valence-bond solids subject to weak bond randomness, as well as extensions to stronger disorder regimes where we make connections with quantum spin liquids. We find, on various lattices, that the destruction of a valence-bond solid phase by weak quenched disorder leads inevitably to the nucleation of topological defects carrying spin-1/2 moments. This renormalizes the lattice into a strongly random spin network with interesting low-energy excitations. Similarly, when short-ranged valence bonds would be pinned by stronger disorder, we find that this putative glass is unstable to defects that carry spin-1/2 magnetic moments and whose residual interactions decide the ultimate low-energy fate. Motivated by these results, we conjecture Lieb-Schultz-Mattis-like restrictions on ground states for disordered magnets with spin-1/2 per statistical unit cell. These conjectures are supported by an argument for 1d spin chains. We apply insights from this study to the phenomenology of YbMgGaO4, a recently discovered triangular lattice spin-1/2 insulator which was proposed to be a quantum spin liquid. We instead explore a description based on the present theory. Experimental signatures, including unusual specific heat, thermal conductivity, and dynamical structure factor, and their behavior in a magnetic field, are predicted from the theory and compare favorably with existing measurements on YbMgGaO4 and related materials.
    Comment: v2: stylistic revisions to improve clarity. 22 pages, 8 figures, 2 tables main text; 13 pages, 3 figures appendices.

    Modeling and prediction of advanced prostate cancer

    Background: Prostate cancer (PCa) is the most commonly diagnosed cancer and the second leading cause of cancer-related deaths for men in Western countries. The advanced form of the disease is life-threatening, with few options for curative therapies. The development of novel therapeutic alternatives would greatly benefit from a more comprehensive and tailored mathematical and statistical methodology. In particular, statistical inference of treatment effects and the prediction of time-dependent effects in both preclinical and clinical studies remain a challenging yet interesting opportunity for applied mathematicians. Such methods are likely to improve the reproducibility and translatability of results and offer the possibility of novel holistic insights into disease progression, diagnosis, and prognosis.
    Methods: Several novel statistical and mathematical techniques were developed over the course of this thesis work for the in vivo modeling of PCa treatment responses. A matching-based, blinded randomized allocation procedure for preclinical experiments was developed that assists the statistical design of animal intervention studies, e.g., through power analysis and accounting for the stratification of individuals. For the post-intervention testing of treatment effects, two novel mixed-effects models were developed that aim to address the characteristic challenges of preclinical longitudinal experiments, including the heterogeneous response profiles observed in animal studies. Subsequently, a Finnish clinical PCa hospital registry cohort was inspected with a strong emphasis on prostate-specific antigen (PSA), the most commonly used PCa marker. After exploring the PSA trends using penalized splines, a generalized mixed-effects prediction model was implemented with a focus on the ultra-sensitive range of the PSA assay. Finally, for metastatic, aggressive PCa, an ensemble Cox regression methodology was developed for overall survival prediction in the DREAM 9.5 mCRPC Challenge based on open datasets from controlled clinical trials.
    Results: The advantages of the improved experimental design and the two proposed statistical models were demonstrated in terms of both increased statistical power and accuracy in simulated and real preclinical testing settings. Penalized regression models applied to the clinical patient datasets support the use of PSA in the ultra-sensitive range together with a model for relapse prediction. Furthermore, the novel ensemble-based Cox regression model developed for overall survival prediction in advanced PCa outperformed the state-of-the-art benchmark and all other models submitted to the Challenge, and provided novel predictors of disease progression and treatment responses.
    Conclusions: The methods and results provide preclinical researchers and clinicians with novel tools for comprehensive modeling and prediction of PCa. All methodology is available as open-source R statistical software packages and/or web-based graphical user interfaces.
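    The ensemble Cox idea admits a brief illustration: fit penalized Cox models on bootstrap resamples and average their risk scores. The Python sketch below (using the lifelines package) is a generic version under assumed defaults, not the Challenge-winning model.

        # Hedged sketch of an ensemble Cox risk score via bootstrap aggregation.
        # df: pandas DataFrame with covariates plus duration and event columns.
        # The penalizer value and ensemble size are illustrative assumptions.
        import numpy as np
        from lifelines import CoxPHFitter

        def ensemble_cox_risk(df, duration_col, event_col, n_models=50, seed=0):
            rng = np.random.default_rng(seed)
            scores = np.zeros(len(df))
            for _ in range(n_models):
                boot = df.sample(frac=1.0, replace=True,
                                 random_state=int(rng.integers(10**9)))
                cph = CoxPHFitter(penalizer=0.1).fit(boot, duration_col, event_col)
                scores += cph.predict_partial_hazard(df).to_numpy()
            return scores / n_models  # higher score = higher predicted risk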

    Dulmage-Mendelsohn percolation: Geometry of maximally-packed dimer models and topologically-protected zero modes on diluted bipartite lattices

    The classic combinatorial construct of maximum matchings probes the random geometry of regions with local sublattice imbalance in a site-diluted bipartite lattice. We demonstrate that these regions, which host the monomers of any maximum matching of the lattice, control the localization properties of a zero-energy quantum particle hopping on this lattice. The structure theory of Dulmage and Mendelsohn provides a way of identifying a complete and non-overlapping set of such regions. This motivates our large-scale computational study of the Dulmage-Mendelsohn decomposition of site-diluted bipartite lattices in two and three dimensions. Our computations uncover an interesting universality class of percolation associated with the end-to-end connectivity of such monomer-carrying regions with local sublattice imbalance, which we dub Dulmage-Mendelsohn percolation. Our results imply the existence of a monomer percolation transition in the classical statistical mechanics of the associated maximally-packed dimer model and the existence of a phase with area-law entanglement entropy of arbitrary many-body eigenstates of the corresponding quantum dimer model. They also have striking implications for the nature of collective zero-energy Majorana fermion excitations of bipartite networks of Majorana modes localized on sites of diluted lattices, for the character of topologically-protected zero-energy wavefunctions of the bipartite random hopping problem on such lattices (and thence for the corresponding quantum percolation problem), and for the nature of low-energy magnetic excitations in bipartite quantum antiferromagnets diluted by a small density of nonmagnetic impurities.
    Comment: minor typos and errors fixed; further clarifications added. No substantive changes in results.
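    The basic object here, the monomer count of a maximum matching on a diluted bipartite lattice, is easy to compute for small systems. A minimal Python sketch follows; the paper's Dulmage-Mendelsohn decomposition goes further and resolves the monomer-carrying regions themselves.

        # Sketch: monomers (unmatched sites) of a maximum matching on a
        # site-diluted L x L square lattice; sublattices set by (x + y) parity.
        import networkx as nx
        import numpy as np

        def monomer_count(L=64, p=0.1, seed=0):
            rng = np.random.default_rng(seed)
            G = nx.grid_2d_graph(L, L)
            G.remove_nodes_from([v for v in G if rng.random() < p])  # dilution
            even = {v for v in G if sum(v) % 2 == 0}
            m = nx.bipartite.hopcroft_karp_matching(G, top_nodes=even)
            return G.number_of_nodes() - len(m)  # m records both matched endpoints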

    Effects of rescheduling on patient no-show behavior in outpatient clinics

    This version includes the Appendices with the article.

    Contributions To Multivariate Matching In Observational Studies

    Matching is a common approach for reducing bias due to observed covariates so that reliable causal inferences can be drawn in observational studies. This thesis consists of three papers discussing new methods for conducting, evaluating, and improving matching designs in observational studies. The first paper presents new optimal matching techniques for large-scale observational data. This new method reduces computational complexity while preserving appealing properties in terms of balancing covariates. After constructing a matched sample, it is essential to assess the covariate balance of the matched data, since lack of balance in covariates can bias the estimated treatment effect. The second paper discusses a formal evaluation of covariate balance. This new assessment evaluates whether the match is adequate compared to a randomized experiment and identifies the major problems, guiding how to improve the covariate balance. If diagnostics suggest that the current match is not satisfactory, how can we improve the quality of the matched samples? The final paper utilizes the idea of directional penalties, which can effectively improve covariate balance in a matched sample, even for a large observational study.
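    The directional-penalty device admits a compact illustration: add a signed linear term to the matching distance so that pairs which worsen the current mean imbalance on a covariate become more expensive. A hedged Python sketch follows; the paper's exact formulation and choice of penalty weight may differ.

        # Sketch: nudge mean balance on covariate x via a directional penalty.
        # D: (treated x control) base distance matrix; theta: penalty weight
        # (an illustrative assumption). Penalized entries may go negative,
        # which the assignment solver handles without trouble.
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def pair_match_directional(D, x_t, x_c, theta=1.0):
            sign = np.sign(x_t.mean() - x_c.mean())  # direction of imbalance
            P = D + theta * sign * (x_t[:, None] - x_c[None, :])
            rows, cols = linear_sum_assignment(P)    # optimal pairing under penalty
            return list(zip(rows, cols))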

    Better predictions when models are wrong or underspecified

    Many statistical methods rely on models of reality in order to learn from data and to make predictions about future data. By necessity, these models usually do not match reality exactly, but are either wrong (none of the hypotheses in the model provides an accurate description of reality) or underspecified (the hypotheses in the model describe only part of the data). In this thesis, we discuss three scenarios involving models that are wrong or underspecified. In each case, we find that standard statistical methods may fail, sometimes dramatically, and present different methods that continue to perform well even if the models are wrong or underspecified. The first two of these scenarios involve regression problems and investigate AIC (Akaike's Information Criterion) and Bayesian statistics. The third scenario has the famous Monty Hall problem as a special case, and considers the question of how we can update our belief about an unknown outcome given new evidence when the precise relation between outcome and evidence is unknown.
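    Since the Monty Hall problem anchors the third scenario, a short simulation makes the baseline concrete. It assumes the standard protocol (the host always opens a door hiding a goat), which is exactly the kind of assumption about the evidence mechanism whose failure the thesis studies.

        # Monty Hall under the standard host protocol: switching wins ~2/3.
        import random

        def monty_hall(trials=100_000, switch=True, seed=0):
            rng = random.Random(seed)
            wins = 0
            for _ in range(trials):
                car, pick = rng.randrange(3), rng.randrange(3)
                opened = rng.choice([d for d in range(3) if d != car and d != pick])
                if switch:
                    pick = next(d for d in range(3) if d != pick and d != opened)
                wins += pick == car
            return wins / trials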