
    Covariance Estimation: The GLM and Regularization Perspectives

    Finding an unconstrained and statistically interpretable reparameterization of a covariance matrix is still an open problem in statistics. Its solution is of central importance in covariance estimation, particularly in the recent high-dimensional data environment where enforcing the positive-definiteness constraint could be computationally expensive. We provide a survey of the progress made in modeling covariance matrices from two relatively complementary perspectives: (1) generalized linear models (GLM) or parsimony and use of covariates in low dimensions, and (2) regularization or sparsity for high-dimensional data. An emerging, unifying and powerful trend in both perspectives is that of reducing a covariance estimation problem to that of estimating a sequence of regression problems. We point out several instances of the regression-based formulation. A notable case is in sparse estimation of a precision matrix or a Gaussian graphical model leading to the fast graphical LASSO algorithm. Some advantages and limitations of the regression-based Cholesky decomposition relative to the classical spectral (eigenvalue) and variance-correlation decompositions are highlighted. The former provides an unconstrained and statistically interpretable reparameterization, and guarantees the positive-definiteness of the estimated covariance matrix. It reduces the unintuitive task of covariance estimation to that of modeling a sequence of regressions at the cost of imposing an a priori order among the variables. Elementwise regularization of the sample covariance matrix such as banding, tapering and thresholding has desirable asymptotic properties, and the sparse estimated covariance matrix is positive definite with probability tending to one for large samples and dimensions. Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-STS358.
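    As a concrete illustration of the regression-based Cholesky idea, the sketch below (a minimal example, not code from the survey) regresses each variable on its predecessors under an assumed a priori ordering. The resulting unit lower triangular matrix T and residual variances D satisfy T Sigma T' = D, so the implied estimate Sigma^{-1} = T' D^{-1} T is positive definite by construction. The function name and the simulated data are invented for this example.

        import numpy as np

        def cholesky_regressions(X):
            """Modified Cholesky factorization of a covariance matrix via
            sequential least-squares regressions (illustrative sketch)."""
            n, p = X.shape
            Xc = X - X.mean(axis=0)          # center each variable
            T = np.eye(p)                    # unit lower triangular factor
            d = np.empty(p)                  # innovation (residual) variances
            d[0] = Xc[:, 0].var(ddof=1)
            for j in range(1, p):
                Z, y = Xc[:, :j], Xc[:, j]   # regress X_j on its predecessors
                phi, *_ = np.linalg.lstsq(Z, y, rcond=None)
                T[j, :j] = -phi              # negative regression coefficients
                d[j] = (y - Z @ phi).var(ddof=1)
            # Positive-definite precision estimate: Sigma^{-1} = T' D^{-1} T
            prec = T.T @ np.diag(1.0 / d) @ T
            return T, d, prec

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            cov = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.0]]
            X = rng.multivariate_normal(np.zeros(3), cov, size=500)
            T, d, prec = cholesky_regressions(X)
            print(np.round(np.linalg.inv(prec), 2))  # approximately the sample covariance

    Sparsity can then be imposed directly on the regression coefficients in T, for example by penalizing each regression, which is one natural regression-based route to the high-dimensional setting described in the abstract.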

    Robustness of Homogeneity Tests in Parallel-Piped Contingency Tables


    Adaptive treatment allocation and selection in multi-arm clinical trials : a Bayesian perspective

    Background: Adaptive designs offer added flexibility in the execution of clinical trials, including the possibilities of allocating more patients to the treatments that turned out more successful, and of early stopping due to either declared success or futility. Commonly applied adaptive designs, such as group sequential methods, are based on the frequentist paradigm and on ideas from statistical significance testing. Interim checks during the trial will have the effect of inflating the Type 1 error rate or, if this rate is controlled and kept fixed, of lowering the power. Results: The purpose of the paper is to demonstrate the usefulness of the Bayesian approach in the design and in the actual running of randomized clinical trials during phase II and III. This approach is based on comparing the performance of the different treatment arms in terms of the respective joint posterior probabilities evaluated sequentially from the accruing outcome data, and then taking a control action if such posterior probabilities fall below a pre-specified critical threshold value. Two types of actions are considered: treatment allocation, which puts further accrual of patients to a treatment arm on hold, at least temporarily, and treatment selection, which removes an arm from the trial permanently. The main development in the paper is in terms of binary outcomes, but extensions for handling time-to-event data, including data from vaccine trials, are also discussed. The performance of the proposed methodology is tested in extensive simulation experiments, with numerical results and graphical illustrations documented in a Supplement to the main text. As a companion to this paper, an implementation of the methods is provided in the form of a freely available R package 'barts'. Conclusion: The proposed methods for trial design provide an attractive alternative to their frequentist counterparts.
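    The monitoring rule can be made concrete with a short sketch (an illustration only, not the authors' 'barts' package): each arm is given a Beta(1, 1) prior on its response probability, posterior draws estimate the probability that each arm is currently the best, and arms whose probability falls below an assumed critical threshold are flagged for holding or removal. The threshold, priors and interim data below are all hypothetical.

        import numpy as np

        def posterior_prob_best(successes, failures, n_draws=20000, seed=None):
            """Monte Carlo estimate of P(arm j has the highest response rate),
            using independent Beta(1, 1) priors and binary outcome data."""
            rng = np.random.default_rng(seed)
            draws = rng.beta(1 + np.asarray(successes)[:, None],
                             1 + np.asarray(failures)[:, None],
                             size=(len(successes), n_draws))
            best = np.argmax(draws, axis=0)          # index of best arm per draw
            return np.bincount(best, minlength=len(successes)) / n_draws

        successes = [12, 18, 9]      # hypothetical interim data for three arms
        failures = [28, 22, 31]
        probs = posterior_prob_best(successes, failures, seed=1)
        threshold = 0.10             # assumed pre-specified critical value
        for arm, p in enumerate(probs):
            action = "continue" if p >= threshold else "hold allocation / consider dropping"
            print(f"arm {arm}: P(best) = {p:.2f} -> {action}")

    Re-running the check at each interim look, with the hold decision reversible and the drop decision permanent, mirrors the allocation/selection distinction drawn in the abstract.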

    Recent developments in the econometrics of program evaluation

    Many empirical questions in economics and other social sciences depend on causal effects of programs or policies. In the last two decades much research has been done on the econometric and statistical analysis of the effects of such programs or treatments. This recent theoretical literature has built on, and combined features of, earlier work in both the statistics and econometrics literatures. It has by now reached a level of maturity that makes it an important tool in many areas of empirical research in economics, including labor economics, public finance, development economics, industrial organization and other areas of empirical micro-economics. In this review we discuss some of the recent developments. We focus primarily on practical issues for empirical researchers, provide a historical overview of the area, and give references to more technical research.

    Issues in the use of adaptive clinical trial designs

    Sequential sampling plans are often used in the monitoring of clinical trials in order to address the ethical and efficiency issues inherent in human testing of a new treatment or preventive agent for disease. Group sequential stopping rules are perhaps the most commonly used approaches, but in recent years a number of authors have proposed adaptive methods of choosing a stopping rule. In general, such adaptive approaches come at a price of inefficiency (almost always) and of clouding the scientific question (sometimes). In this paper, I review the degree of adaptation possible within largely prespecified group sequential stopping rules, and discuss the operating characteristics that can be characterized fully prior to collection of the data. I then discuss the greater flexibility possible when using several of the adaptive approaches receiving the greatest attention in the statistical literature, and conclude with a discussion of the scientific and statistical issues raised by their use.
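    The operating characteristics referred to above can be evaluated by simulation before any data are collected. The sketch below is an illustration rather than a design from the paper: a two-arm trial with one interim and one final analysis stops early for efficacy when the cumulative z-statistic exceeds a Pocock-style constant boundary (c = 2.178 for two equally spaced looks at two-sided 0.05), and simulation estimates the one-sided type I error and the power under an assumed effect size. The boundary, number of looks and sample sizes are hypothetical.

        import numpy as np

        def group_sequential_sim(delta, n_per_look=100, c=2.178, n_sim=20000, seed=0):
            """Estimate the rejection probability of a two-look group sequential
            design with a constant (Pocock-style) efficacy boundary."""
            rng = np.random.default_rng(seed)
            rejections = 0
            for _ in range(n_sim):
                cum_diff, n_total = 0.0, 0
                for look in (1, 2):
                    # accrue one more group per arm (unit-variance outcomes)
                    treat = rng.normal(delta, 1.0, n_per_look)
                    ctrl = rng.normal(0.0, 1.0, n_per_look)
                    cum_diff += treat.sum() - ctrl.sum()
                    n_total += n_per_look
                    z = cum_diff / np.sqrt(2 * n_total)   # cumulative z-statistic
                    if z > c:                             # cross the efficacy boundary
                        rejections += 1
                        break
            return rejections / n_sim

        print("type I error:", group_sequential_sim(delta=0.0))    # roughly 0.025 one-sided
        print("power:", group_sequential_sim(delta=0.25))          # under an assumed effect

    Because the boundary and analysis times are fixed in advance, these error rates can be reported before the trial starts, which is the sense in which group sequential designs are largely prespecified.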

    Information Theory and Machine Learning

    The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computation resources efficiently, converge quickly in online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction, reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems.

    Endogenous Lysine Strategy Profile and Cartel Duration:An Instrumental Variables Approach

    Colluding firms often exchange private information and make transfers within the cartels based on that information. Estimating the impact of such collusive practices, known as the “lysine strategy profile” (LSP), on cartel duration is difficult because of endogeneity and omitted variable bias. I use firms' linguistic differences as an instrumental variable for the LSP in 135 cartels discovered by the European Commission since 1980. The incidence of the LSP is not significantly related to cartel duration. After correction for selectivity in the decision to use the LSP, statistical tests are consistent with a theoretical prediction that the LSP increases cartel duration.
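    The identification strategy is standard instrumental-variables estimation, which the following two-stage least squares (2SLS) sketch illustrates on purely synthetic data: a binary endogenous regressor stands in for LSP adoption, a continuous instrument stands in for linguistic difference, and every coefficient in the data-generating process is invented rather than taken from the paper.

        import numpy as np

        rng = np.random.default_rng(42)
        n = 135                                    # number of cartels in the abstract
        ling_diff = rng.normal(size=n)             # instrument: linguistic differences
        u = rng.normal(size=n)                     # unobserved confounder
        lsp = (0.8 * ling_diff - 0.5 * u + rng.normal(size=n) > 0).astype(float)
        duration = 5.0 + 2.0 * lsp + 1.5 * u + rng.normal(size=n)   # true LSP effect = 2

        X = np.column_stack([np.ones(n), lsp])          # endogenous design matrix
        Z = np.column_stack([np.ones(n), ling_diff])    # instrument matrix

        # Stage 1: project the endogenous regressor on the instruments.
        X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
        # Stage 2: regress duration on the first-stage fitted values.
        beta_iv = np.linalg.lstsq(X_hat, duration, rcond=None)[0]
        beta_ols = np.linalg.lstsq(X, duration, rcond=None)[0]

        print("OLS estimate of LSP effect: ", round(beta_ols[1], 2))   # biased by the confounder
        print("2SLS estimate of LSP effect:", round(beta_iv[1], 2))    # consistent for the true value 2

    In the paper the design is complicated further by selectivity in the decision to adopt the LSP, which the abstract addresses with a separate selection correction rather than this plain 2SLS setup.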