
    Block-regularized 5×2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms

    In the task of comparing two classification algorithms, the widely used McNemar's test aims to infer whether there is a significant difference between the error rates of the two algorithms. However, the power of the conventional McNemar's test is usually unpromising because the hold-out (HO) method in the test uses only a single train-validation split, which usually produces a highly variable estimate of the error rates. In contrast, a cross-validation (CV) method repeats the HO method multiple times and produces a stable estimate, so a CV method offers a clear advantage for improving the power of McNemar's test. Among all types of CV methods, block-regularized 5×2 CV (BCV) has been shown in many previous studies to be superior to the other CV methods for algorithm comparison because the 5×2 BCV produces a high-quality estimator of the error rate by regularizing the numbers of overlapping records between all training sets. In this study, we compress the 10 correlated contingency tables in the 5×2 BCV to form an effective contingency table. We then define a 5×2 BCV McNemar's test on the basis of the effective contingency table. We demonstrate the reasonable type I error and the promising power of the proposed 5×2 BCV McNemar's test on multiple simulated and real-world data sets. Comment: 12 pages, 6 figures, and 5 tables
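
    For orientation, the sketch below (Python) shows the conventional continuity-corrected McNemar statistic computed from a single 2×2 disagreement table; the paper's contribution is to replace that single table with an effective table built from the ten correlated 5×2 BCV tables. The pooling by simple averaging shown here is only an illustrative assumption, not the paper's construction.

        import numpy as np
        from scipy.stats import chi2

        def mcnemar(b, c):
            """Continuity-corrected McNemar statistic from the discordant counts
            b (algorithm A wrong, B right) and c (A right, B wrong)."""
            stat = (abs(b - c) - 1) ** 2 / (b + c)
            return stat, chi2.sf(stat, df=1)

        # Illustrative stand-in for the effective contingency table: average the
        # discordant counts over the ten folds of a 5x2 cross-validation.
        b_folds = np.array([12, 9, 11, 10, 13, 8, 12, 10, 9, 11])
        c_folds = np.array([5, 7, 6, 4, 6, 5, 7, 6, 5, 6])
        stat, p = mcnemar(b_folds.mean(), c_folds.mean())
        print(f"statistic = {stat:.3f}, p-value = {p:.3f}")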

    New Results in ℓ1 Penalized Regression

    Here we consider penalized regression methods and extend the results surrounding the ℓ1 norm penalty. We address a more recent development that generalizes previous methods by penalizing a linear transformation of the coefficients of interest instead of penalizing the coefficients themselves. We introduce an approximate algorithm to fit this generalization and a fully Bayesian hierarchical model that is a direct analogue of the frequentist version. A number of benefits are derived from the Bayesian perspective, most notably the choice of the tuning parameter and a natural means to estimate the variation of the estimates, a notoriously difficult task in the frequentist formulation. We then introduce Bayesian trend filtering, which exemplifies the benefits of our Bayesian version. Bayesian trend filtering is shown to be an empirically strong technique for fitting univariate, nonparametric regression. Through a simulation study, we show that Bayesian trend filtering reduces prediction error and attains more accurate coverage probabilities than the frequentist method. We then apply Bayesian trend filtering to real data sets, where our method is quite competitive against a number of other popular nonparametric methods.
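
    For context, the generalization referred to above penalizes a linear transform of the coefficients rather than the coefficients themselves; in the notation commonly used for the generalized lasso (the symbols here are generic, not necessarily the dissertation's):

        \hat{\beta} = \arg\min_{\beta} \tfrac{1}{2}\,\|y - X\beta\|_2^2 + \lambda\,\|D\beta\|_1

    Choosing D = I recovers the ordinary lasso, while choosing D as a discrete difference operator of order k+1 yields k-th order trend filtering, the univariate nonparametric setting evaluated above.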

    Group Spike and Slab Variational Bayes

    We introduce Group Spike-and-Slab Variational Bayes (GSVB), a scalable method for group sparse regression. A fast coordinate ascent variational inference (CAVI) algorithm is developed for several common model families, including Gaussian, Binomial and Poisson. Theoretical guarantees for our proposed approach are provided by deriving contraction rates for the variational posterior in grouped linear regression. Through extensive numerical studies, we demonstrate that GSVB provides state-of-the-art performance, offering a computationally inexpensive substitute for MCMC whilst performing comparably to or better than existing MAP methods. Additionally, we analyze three real-world datasets in which we highlight the practical utility of our method, demonstrating that GSVB provides parsimonious models with excellent predictive performance, variable selection and uncertainty quantification. Comment: 66 pages, 5 figures, 7 tables
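
    For readers unfamiliar with the prior family involved, group spike-and-slab methods of this kind typically place, for each group g of coefficients, a mixture of a point mass at zero and a continuous slab (generic notation, not necessarily the paper's exact parameterisation):

        \beta_g \mid z_g \sim (1 - z_g)\,\delta_0 + z_g\,\mathrm{N}\!\left(0,\, \tau^2 I_{|g|}\right), \qquad z_g \sim \mathrm{Bernoulli}(\pi)

    CAVI then cycles through the groups, updating each factor of a factorised approximation q(\beta, z) = \prod_g q(\beta_g, z_g) in turn until the evidence lower bound converges.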

    Robust and efficient projection predictive inference

    The concepts of Bayesian prediction, model comparison, and model selection have developed significantly over the last decade. As a result, the Bayesian community has witnessed rapid growth in theoretical and applied contributions to building and selecting predictive models. Projection predictive inference in particular has shown promise to this end, finding application across a broad range of fields. It is less prone to over-fitting than naïve selection based purely on cross-validation or information-criteria performance metrics, and has been known to outperform other methods in terms of predictive performance. We survey the core concept and contemporary contributions to projection predictive inference, and present a safe, efficient, and modular workflow for prediction-oriented model selection therein. We also provide an interpretation of the projected posteriors achieved by projection predictive inference in terms of their limitations in causal settings.
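
    As a rough statement of the core idea surveyed here, projection predictive inference replaces each submodel's own posterior with a projection of a rich reference model's predictive distribution onto the submodel, typically by minimising a Kullback-Leibler divergence (generic notation, not the paper's):

        \beta^{\perp} = \arg\min_{\beta \in \Theta_{\text{sub}}} \; \mathrm{KL}\!\left( p(\tilde{y} \mid \theta^{*}) \,\|\, p(\tilde{y} \mid \beta) \right)

    where \theta^{*} denotes a draw from the reference model's posterior and the projection is carried out draw by draw; candidate submodels are then compared by their projected predictive performance rather than refitted from scratch.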

    Towards a data-driven personalised management of Atopic Dermatitis severity

    Atopic Dermatitis (AD, eczema) is a common inflammatory skin disease characterised by dry and itchy skin. AD cannot be cured, but its long-term outcomes can be managed with treatments. Given the heterogeneity in patients' responses to treatment, designing personalised rather than "one-size-fits-all" treatment strategies is of high clinical relevance. In this thesis, we aim to pave the way towards a data-driven personalised management of AD severity, whereby severity data would be collected automatically from photographs without the need for patients to visit a clinic, used to predict the evolution of AD severity, and used to generate personalised treatment recommendations. First, we developed EczemaNet, a computer vision pipeline using convolutional neural networks that detects areas of AD from photographs and then makes probabilistic assessments of AD severity. EczemaNet was internally validated with a medium-sized dataset of images collected in a published clinical trial and demonstrated fair performance. Then, we developed models predicting the daily to weekly evolution of AD severity. We highlighted the challenges of extracting signals from noisy severity data, with small and practically non-significant effects of environmental factors and biomarkers on prediction. We showed the importance of using high-quality measurements of validated and objective (vs subjective) severity scores. We also stressed the importance of modelling individual severity items rather than aggregate scores, and introduced EczemaPred, a principled approach to predicting AD severity using Bayesian state-space models. Our models are flexible by design, interpretable, and can quantify uncertainty in measurements, parameters and predictions. The models demonstrated good performance in predicting the Patient-Oriented SCOring AD (PO-SCORAD). Finally, we generated personalised treatment recommendations using Bayesian decision analysis. We observed that treatment effects and recommendations could be confounded by the clinical phenotype of patients. We also pretrained our model using historical data and combined clinical and self-assessments. In conclusion, we have demonstrated the feasibility and the challenges of a data-driven personalised management of AD severity.

    Sparse summaries of complex covariance structures: a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics, School of Natural & Computational Sciences, Massey University, Auckland, New Zealand

    A matrix that has most of its elements equal to zero is called a sparse matrix. The zero elements in a sparse matrix reduce the number of parameters, which aids interpretability. Bayesians desiring a sparse model frequently formulate priors that enhance sparsity. However, in most settings this leads to sparse posterior samples, not to a sparse posterior mean. A decoupled shrinkage and selection (DSS) posterior variable-selection approach was proposed by Hahn & Carvalho (2015) to address this problem in a regression setting by setting some elements of the regression coefficient matrix to exact zeros. Hahn & Carvalho (2015) suggested developing a decoupled shrinkage and selection approach in a Gaussian graphical models setting to set some elements of a precision matrix (graph) to exact zeros. In this thesis, I fill this gap and propose decoupled shrinkage and selection approaches to sparsify the precision matrix and the factor loading matrix, extending Hahn & Carvalho's (2015) approach.

    My proposed decoupled shrinkage and selection approach uses samples from the posterior over the parameter, sets a penalization criterion to produce progressively sparser estimates of the desired parameter, and then sets a rule to pick the final parameter from the generated candidates based on the posterior distribution of fit. This approach generally produced sparser graphs than a range of existing sparsification strategies, such as thresholding the partial correlations, credible intervals, the adaptive graphical lasso, and ratio selection, while maintaining a good fit based on the log-likelihood. In simulation studies, my approach had better sensitivity and specificity than the other strategies as the dimension p and sample size n grew; for low-dimensional data it was comparable with the other strategies.

    Further, I extended the proposed approach from one population to two populations by modifying the ADMM (alternating direction method of multipliers) algorithm in the JGL (joint graphical lasso) R package (Danaher et al., 2013) to find sparse sets of differences between two inverse covariance matrices. Simulation studies showed that, in the sparse case, my two-population approach had better sensitivity and specificity than JGL. However, sparse sets of differences were challenging in the dense case and at moderate sample sizes. The two-population approach was also applied to find sparse sets of differences between the precision matrices for cases and controls in a metabolomics dataset.

    Finally, decoupled shrinkage and selection is used to post-process the posterior mean covariance matrix to produce a factor model with a sparse factor loading matrix whose expected fit lies within the upper 95% of the posterior over fits. In the Gaussian setting, simulation studies showed that my proposed DSS sparse factor model approach performed better than fanc (factor analysis using non-convex penalties; Hirose and Yamamoto, 2015) in terms of sensitivity, specificity, and picking the correct number of factors. Decoupled shrinkage and selection is also easily applied to models where a latent multivariate normal underlies non-Gaussian marginals, e.g., multivariate probit models. I illustrate my findings with moderate-dimensional data examples from modelling of food frequency questionnaires and fish abundance.
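
    The general recipe described above (posterior samples, a path of progressively sparser candidates, and a fit-based acceptance rule) can be sketched as follows for the precision-matrix case. This is a minimal illustration under my own assumptions about the penalty (a graphical lasso path applied to the posterior mean covariance) and the acceptance rule (keep the sparsest candidate whose Gaussian log-likelihood stays within the bulk of the posterior distribution of fits); it is not the thesis' exact criterion.

        import numpy as np
        from sklearn.covariance import graphical_lasso

        def gauss_fit(precision, cov):
            """Gaussian log-likelihood (up to constants) of a precision matrix."""
            _, logdet = np.linalg.slogdet(precision)
            return logdet - np.trace(cov @ precision)

        def dss_precision(posterior_precisions, posterior_mean_cov,
                          alphas, fit_quantile=0.05):
            """Pick the sparsest graphical-lasso estimate whose fit stays within
            the posterior distribution of fits (illustrative rule only)."""
            fits = [gauss_fit(P, posterior_mean_cov) for P in posterior_precisions]
            cutoff = np.quantile(fits, fit_quantile)
            chosen = None                         # None if no candidate fits well
            for alpha in sorted(alphas):          # larger alpha => sparser graph
                _, prec = graphical_lasso(posterior_mean_cov, alpha=alpha)
                if gauss_fit(prec, posterior_mean_cov) >= cutoff:
                    chosen = prec                 # still fits well enough; keep going
            return chosen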

    Lecture notes on ridge regression

    The linear regression model cannot be fitted to high-dimensional data, as high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty, i.e. a function of the regression coefficients. The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. moments, mean squared error, its equivalence to constrained estimation, and its relation to Bayesian regression. Its behaviour and use are then illustrated in simulation and on omics data. Subsequently, ridge regression is generalized to allow for a more general penalty. The ridge penalization framework is then translated to logistic regression, and its properties are shown to carry over. To contrast ridge penalized estimation, the final chapter introduces its lasso counterpart.
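
    In the generic notation usually used for these lecture-note results, the ridge estimator augments the least-squares loss with the squared-norm penalty and, unlike the unpenalized estimator, exists even when the number of covariates exceeds the sample size:

        \hat{\beta}(\lambda) = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2
                             = (X^\top X + \lambda I_p)^{-1} X^\top y, \qquad \lambda > 0

    The Bayesian connection mentioned above is that this estimator coincides with the posterior mean of \beta under the prior \beta \sim \mathrm{N}(0, (\sigma^2/\lambda) I_p), and replacing the squared penalty by \lambda \|\beta\|_1 gives the lasso counterpart introduced in the final chapter.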

    Methodological Reform in Quantitative Second Language Research: Effect Sizes, Bayesian Hypothesis Testing, and Bayesian Estimation of Effect Sizes

    This dissertation consists of three manuscripts. The manuscripts contribute to a budding “methodological reform” currently taking place in quantitative second-language (L2) research. In the first manuscript, the researcher describes an empirical investigation of the application of two well-known effect size estimators, eta-squared (η²) and partial eta-squared (ηp²), in the previously published literature (2005-2015) in four premier L2 journals. These two effect size estimators express the amount of variance accounted for by one or more independent variables. However, despite their widespread reporting, often in conjunction with ANOVAs, these estimators are rarely accompanied by much in the way of interpretation. The study shows that ηp² values are frequently misreported as representing η². The researcher interprets and discusses potential consequences of the long-standing confusion surrounding these related but distinct estimators. In the second manuscript, the researcher discusses a Bayesian alternative to p-values in t-test designs known as a “Bayes Factor”. This approach responds to pointed calls questioning why null hypothesis testing is still the go-to analytic approach in L2 research. Adopting an open-science framework, the researcher (a) re-analyzes the empirical findings of 418 L2 t-tests using Bayesian hypothesis testing, and (b) compares the Bayesian results with their conventional, null hypothesis testing counterparts. The results show considerable differences in the rejection of the null hypothesis in certain cases in the previously published literature. The study provides field-wide recommendations for improved use of null hypothesis testing, and introduces a free, online software package developed to promote Bayesian hypothesis testing in the field. In the third manuscript, the researcher provides an applied, non-technical rationale for using Bayesian estimation in L2 research. Specifically, the researcher takes three steps to achieve this goal. First, the researcher compares the conceptual underpinnings of the Bayesian and frequentist methods. Second, using real as well as carefully simulated data, the researcher introduces and applies a Bayesian method to the estimation of the standardized mean difference effect size (i.e., Cohen’s d) from t-test designs. Third, to promote the use of Bayesian estimation of Cohen’s d effect size in L2 research, the researcher introduces a free, web-accessed, point-and-click software package as well as a suite of highly flexible R functions.
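
    For reference, the quantities these manuscripts revolve around have standard definitions; in the usual notation,

        \eta^2 = \frac{SS_{\text{effect}}}{SS_{\text{total}}}, \qquad
        \eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}, \qquad
        d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}}, \qquad
        \mathrm{BF}_{10} = \frac{p(\text{data} \mid H_1)}{p(\text{data} \mid H_0)}

    In a single-factor design η² and ηp² coincide, while with additional factors ηp² can only be larger, which is part of why the two are so easily conflated when reported without interpretation.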