Block-regularized 5×2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms
In the task of comparing two classification algorithms, the widely-used
McNemar's test aims to infer the presence of a significant difference between
the error rates of the two classification algorithms. However, the power of the
conventional McNemar's test is usually unpromising because the hold-out (HO)
method in the test merely uses a single train-validation split that usually
produces a highly varied estimation of the error rates. In contrast, a
cross-validation (CV) method repeats the HO method multiple times and
produces a stable estimation. Therefore, a CV method offers a great advantage
in improving the power of McNemar's test. Among all types of CV methods, a
block-regularized 5×2 CV (BCV) has been shown in many previous studies
to be superior to the other CV methods in the algorithm-comparison task,
because the 5×2 BCV produces a high-quality estimator of the error
rate by regularizing the numbers of overlapping records between all training
sets. In this study, we compress the 10 correlated contingency tables in the
5×2 BCV to form an effective contingency table. Then, we define a
5×2 BCV McNemar's test on the basis of the effective contingency table.
We demonstrate the reasonable type I error and the promising power of the
proposed 5×2 BCV McNemar's test on multiple simulated and real-world
data sets. Comment: 12 pages, 6 figures, and 5 tables
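The conventional McNemar's test that this work improves upon uses only the discordant cells of the 2×2 contingency table (records misclassified by exactly one of the two algorithms). A minimal sketch of that baseline test, assuming the continuity-corrected statistic and hypothetical discordant counts:

```python
import math

def mcnemar(b, c):
    """Continuity-corrected McNemar statistic and p-value.

    b: cases where classifier A errs and B is correct
    c: cases where B errs and A is correct
    (the concordant cells of the 2x2 contingency table cancel out).
    """
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# hypothetical discordant counts from a single train/validation split
stat, p = mcnemar(10, 20)
```

The 5×2 BCV variant proposed in the paper replaces these single-split counts with an effective contingency table compressed from the 10 cross-validation tables.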
New Results in ℓ1 Penalized Regression
Here we consider penalized regression methods, and extend the results surrounding the ℓ1 norm penalty. We address a more recent development that generalizes previous methods by penalizing a linear transformation of the coefficients of interest instead of penalizing just the coefficients themselves. We introduce an approximate algorithm to fit this generalization and a fully Bayesian hierarchical model that is a direct analogue of the frequentist version. A number of benefits are derived from the Bayesian perspective; most notably choice of the tuning parameter and natural means to estimate the variation of estimates – a notoriously difficult task for the frequentist formulation. We then introduce Bayesian trend filtering, which exemplifies the benefits of our Bayesian version. Bayesian trend filtering is shown to be an empirically strong technique for fitting univariate, nonparametric regression. Through a simulation study, we show that Bayesian trend filtering reduces prediction error and attains more accurate coverage probabilities than the frequentist method. We then apply Bayesian trend filtering to real data sets, where our method is quite competitive against a number of other popular nonparametric methods.
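The ℓ1 penalty enters fitting algorithms through its proximal operator, soft thresholding. A minimal proximal-gradient (ISTA) sketch on a hypothetical toy design, not the generalized algorithm of the paper:

```python
def soft(z, t):
    """Soft-thresholding: the proximal operator of the l1 penalty."""
    return max(z - t, 0.0) if z > 0 else min(z + t, 0.0)

def ista(X, y, lam, step=1.0, iters=200):
    """Proximal gradient (ISTA) for min 0.5*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        resid = [sum(X[i][j] * b[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        b = [soft(b[j] - step * grad[j], step * lam) for j in range(p)]
    return b

# toy orthogonal design: the solution is soft(y, lam) coordinate-wise
b_hat = ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0)
```

On an orthogonal design the fixed point is the coordinate-wise soft threshold of the data, which makes the example easy to verify by hand: the second coefficient is shrunk exactly to zero.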
Group Spike and Slab Variational Bayes
We introduce Group Spike-and-slab Variational Bayes (GSVB), a scalable method
for group sparse regression. A fast co-ordinate ascent variational inference
(CAVI) algorithm is developed for several common model families including
Gaussian, Binomial and Poisson. Theoretical guarantees for our proposed
approach are provided by deriving contraction rates for the variational
posterior in grouped linear regression. Through extensive numerical studies, we
demonstrate that GSVB provides state-of-the-art performance, offering a
computationally inexpensive substitute to MCMC, whilst performing comparably or
better than existing MAP methods. Additionally, we analyze three real-world
datasets wherein we highlight the practical utility of our method,
demonstrating that GSVB provides parsimonious models with excellent predictive
performance, variable selection and uncertainty quantification. Comment: 66 pages, 5 figures, 7 tables
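The spike-and-slab prior at the heart of GSVB mixes a point mass at zero with a diffuse slab. A single-coefficient conjugate sketch of the resulting posterior inclusion probability, with hypothetical variances (GSVB itself runs CAVI over grouped coefficients, which this does not attempt):

```python
import math

def norm_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def inclusion_prob(y, prior_pi=0.5, slab_var=4.0, noise_var=1.0):
    """Posterior inclusion probability for y = beta + eps, eps ~ N(0, noise_var),
    with beta ~ pi * N(0, slab_var) + (1 - pi) * delta_0 (spike-and-slab).

    Marginally, y ~ N(0, slab_var + noise_var) under the slab and
    y ~ N(0, noise_var) under the spike; Bayes' rule gives the PIP.
    """
    slab = prior_pi * norm_pdf(y, slab_var + noise_var)
    spike = (1 - prior_pi) * norm_pdf(y, noise_var)
    return slab / (slab + spike)

# a hypothetical observation three noise-sd's from zero
pip = inclusion_prob(3.0)
```

An observation far from zero makes the slab's wider marginal much more likely, so the PIP is close to one.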
Robust and efficient projection predictive inference
The concepts of Bayesian prediction, model comparison, and model selection
have developed significantly over the last decade. As a result, the Bayesian
community has witnessed a rapid growth in theoretical and applied contributions
to building and selecting predictive models. Projection predictive inference in
particular has shown promise to this end, finding application across a broad
range of fields. It is less prone to over-fitting than naïve selection based
purely on cross-validation or information criteria performance metrics, and has
been known to out-perform other methods in terms of predictive performance. We
survey the core concept and contemporary contributions to projection predictive
inference, and present a safe, efficient, and modular workflow for
prediction-oriented model selection therein. We also provide an interpretation
of the projected posteriors achieved by projection predictive inference in
terms of their limitations in causal settings.
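In the Gaussian linear case, the KL projection behind projection predictive inference reduces to least squares of the reference model's fit onto a submodel's predictors. A toy single-predictor sketch with hypothetical data, not the full workflow surveyed here:

```python
def project(ref_fit, x):
    """Project a reference model's fitted values onto a one-predictor
    submodel (Gaussian case: the KL projection is least squares on ref_fit)."""
    slope = sum(a * b for a, b in zip(x, ref_fit)) / sum(a * a for a in x)
    proj = [slope * a for a in x]
    # projection loss: how much of the reference fit the submodel loses
    loss = sum((f - p) ** 2 for f, p in zip(ref_fit, proj))
    return slope, loss

# hypothetical reference fit built from two predictors x1, x2
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, 1.0, -1.0]
ref_fit = [2.0 * a + 0.1 * b for a, b in zip(x1, x2)]  # x1 dominates

_, loss1 = project(ref_fit, x1)
_, loss2 = project(ref_fit, x2)
```

Ranking submodels by projection loss rather than by raw cross-validated fit is what makes the approach less prone to over-fitting: the submodel on x1 retains nearly all of the reference fit, the one on x2 almost none.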
Towards a data-driven personalised management of Atopic Dermatitis severity
Atopic Dermatitis (AD, eczema) is a common inflammatory skin disease, characterised by dry and itchy skin.
AD cannot be cured, but its long-term outcomes can be managed with treatments.
Given the heterogeneity in patients' responses to treatment, designing personalised rather than "one-size-fits-all" treatment strategies is of high clinical relevance.
In this thesis, we aim to pave the way towards a data-driven personalised management of AD severity, whereby severity data would be collected automatically from photographs without the need for patients to visit a clinic, be used to predict the evolution of AD severity, and generate personalised treatment recommendations.
First, we developed EczemaNet, a computer vision pipeline using convolutional neural networks that detects areas of AD from photographs and then makes probabilistic assessments of AD severity.
EczemaNet was internally validated with a medium-size dataset of images collected in a published clinical trial and demonstrated fair performance.
Then, we developed models predicting the daily to weekly evolution of AD severity.
We highlighted the challenges of extracting signals from noisy severity data, with small and practically insignificant effects of environmental factors and biomarkers on prediction.
We showed the importance of using high-quality measurements of validated and objective (vs subjective) severity scores.
We also stressed the importance of modelling individual severity items rather than aggregate scores, and introduced EczemaPred, a principled approach to predict AD severity using Bayesian state-space models.
Our models are flexible by design, interpretable and can quantify uncertainty in measurements, parameters and predictions.
The models demonstrated good performance to predict the Patient-Oriented SCOring AD (PO-SCORAD).
Finally, we generated personalised treatment recommendations using Bayesian decision analysis.
We observed that treatment effects and recommendations could be confounded by the clinical phenotype of patients.
We also pretrained our model using historical data and combined clinical and self-assessments.
In conclusion, we have demonstrated the feasibility and the challenges of a data-driven personalised management of AD severity.
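The Bayesian state-space idea behind severity prediction can be sketched as a local-level model: latent severity follows a random walk and is observed with noise. A one-dimensional Kalman-filter sketch with hypothetical variances and scores (the thesis's EczemaPred models are considerably richer):

```python
def kalman_step(mean, var, obs, proc_var=1.0, obs_var=4.0):
    """One predict/update cycle of a local-level (random-walk) state-space
    model: latent severity evolves as a random walk, observed with noise."""
    # predict: random-walk dynamics inflate uncertainty
    pred_mean, pred_var = mean, var + proc_var
    # update: blend prediction and noisy measurement by their precisions
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (obs - pred_mean)
    new_var = (1 - gain) * pred_var
    return new_mean, new_var

# hypothetical daily severity scores filtered in sequence
mean, var = 5.0, 10.0  # vague initial belief about latent severity
for y in [6.0, 7.0, 6.5]:
    mean, var = kalman_step(mean, var, y)
```

The posterior variance carried along at each step is what lets such models quantify uncertainty in measurements and predictions, as described above.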
Sparse summaries of complex covariance structures : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics, School of Natural & Computational Sciences, Massey University, Auckland, New Zealand
A matrix that has most of its elements equal to zero is called a sparse matrix. The zero elements in a sparse matrix reduce the number of parameters, aiding its potential interpretability. Bayesians desiring a sparse model frequently formulate priors that enhance sparsity. However, in most settings, this leads to sparse posterior samples, not to a sparse posterior mean. A decoupled shrinkage and selection (DSS) posterior variable-selection approach was proposed by Hahn & Carvalho (2015) to address this problem in a regression setting by setting some of the elements of the regression coefficients matrix to exact zeros. Hahn & Carvalho (2015) suggested developing a decoupled shrinkage and selection approach in a Gaussian graphical models setting to set some of the elements of a precision matrix (graph) to exact zeros. In this thesis, I have filled this gap and proposed decoupled shrinkage and selection approaches to sparsify the precision matrix and the factor loading matrix, extending Hahn & Carvalho's (2015) decoupled shrinkage and selection approach. My proposed decoupled shrinkage and selection approach uses samples from the posterior over the parameter, sets a penalization criterion to produce progressively sparser estimates of the desired parameter, and then sets a rule to pick the final desired parameter from the generated candidates, based on the posterior distribution of fit. My proposed decoupled approach generally produced sparser graphs than a range of existing sparsification strategies, such as thresholding the partial correlations, credible intervals, the adaptive graphical Lasso, and ratio selection, while maintaining a good fit based on the log-likelihood. In simulation studies, my decoupled shrinkage and selection approach had better sensitivity and specificity than the other strategies as the dimension p and sample size n grew. For low-dimensional data, my decoupled shrinkage and selection approach was comparable with the other strategies.
Further, I have extended my proposed decoupled shrinkage and selection approach from one population to two populations by modifying the ADMM (alternating direction method of multipliers) algorithm in the JGL (joint graphical Lasso) R package (Danaher et al., 2013) to find sparse sets of differences between two inverse covariance matrices. The simulation studies showed that, in the sparse case, my decoupled shrinkage and selection approach for two populations had better sensitivity and specificity than JGL. However, sparse sets of differences were challenging for the dense case and moderate sample sizes. My decoupled shrinkage and selection approach for two populations was also applied to find sparse sets of differences between the precision matrices for cases and controls in a metabolomics dataset.
Finally, decoupled shrinkage and selection is used to post-process the posterior mean covariance matrix to produce a factor model with a sparse factor loading matrix whose expected fit lies within the upper 95% of the posterior over fits. In the Gaussian setting, simulation studies showed that my proposed DSS sparse factor model approach performed better than fanc (factor analysis using non-convex penalties) (Hirose and Yamamoto, 2015) in terms of sensitivity, specificity, and picking the correct number of factors. Decoupled shrinkage and selection is also easily applied to models where a latent multivariate normal underlies non-Gaussian marginals, e.g., multivariate probit models. I illustrate my findings with moderate-dimensional data examples from modelling of food frequency questionnaires and fish abundance.
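The DSS recipe described above (posterior samples, progressively sparser estimates, a final pick based on the posterior distribution of fit) can be sketched on a toy vector parameter; here simple thresholding stands in for the penalized sparsifier, squared distance to the posterior mean stands in for the fit measure, and all data are hypothetical:

```python
import random

random.seed(0)

# hypothetical posterior draws for three coefficients: one near zero
truth = [2.0, 0.05, -1.5]
draws = [[t + random.gauss(0, 0.2) for t in truth] for _ in range(500)]
post_mean = [sum(d[j] for d in draws) / len(draws) for j in range(3)]

def fit(b):
    """Squared distance to the posterior mean: a stand-in fit measure."""
    return sum((x - m) ** 2 for x, m in zip(b, post_mean))

# fit of each posterior draw; accept sparsified estimates whose fit stays
# within the upper 95% of this posterior distribution of fit
cutoff = sorted(fit(d) for d in draws)[int(0.95 * len(draws))]

best = post_mean
for thresh in [0.0, 0.1, 0.5, 1.0, 2.5]:  # progressively sparser
    cand = [m if abs(m) > thresh else 0.0 for m in post_mean]
    if fit(cand) <= cutoff:
        best = cand  # sparsest estimate that still fits well
```

The near-zero coefficient is set to an exact zero while the two strong coefficients survive; zeroing everything degrades fit past the cutoff and is rejected.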
Lecture notes on ridge regression
The linear regression model cannot be fitted to high-dimensional data, as the
high-dimensionality brings about empirical non-identifiability. Penalized
regression overcomes this non-identifiability by augmentation of the loss
function by a penalty (i.e. a function of regression coefficients). The ridge
penalty is the sum of squared regression coefficients, giving rise to ridge
regression. Here many aspects of ridge regression are reviewed, e.g. moments,
mean squared error, its equivalence to constrained estimation, and its relation
to Bayesian regression. Finally, its behaviour and use are illustrated in
simulation and on omics data. Subsequently, ridge regression is generalized to
allow for a more general penalty. The ridge penalization framework is then
translated to logistic regression and its properties are shown to carry over.
To contrast ridge penalized estimation, the final chapter introduces its lasso
counterpart.
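The ridge estimator discussed in these notes has the closed form (XᵀX + λI)⁻¹Xᵀy, which exists even when XᵀX is singular. A small sketch on a hypothetical perfectly collinear design, where OLS is non-identifiable but ridge still returns an estimate:

```python
def ridge_2d(X, y, lam):
    """Closed-form ridge estimator (X'X + lam*I)^{-1} X'y for p = 2."""
    n = len(X)
    # Gram matrix X'X with the ridge term added on the diagonal
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    v = [sum(X[i][j] * y[i] for i in range(n)) for j in range(2)]
    det = a * d - b * b  # positive for any lam > 0, even if X'X is singular
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - b * v[0]) / det]

# collinear design (second column duplicates the first): OLS cannot
# separate the two coefficients, ridge splits the effect evenly
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]
beta = ridge_2d(X, y, lam=1.0)
```

The penalty resolves the non-identifiability by symmetry: both coefficients get the same shrunken value, illustrating the empirical non-identifiability point made above.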
Methodological Reform in Quantitative Second Language Research: Effect Sizes, Bayesian Hypothesis Testing, and Bayesian Estimation of Effect Sizes
This dissertation consists of three manuscripts. The manuscripts contribute to a budding “methodological reform” currently taking place in quantitative second-language (L2) research.
In the first manuscript, the researcher describes an empirical investigation on the application of two well-known effect size estimators, eta-squared (η2) and partial eta-squared (ηp2), from the previously published literature (2005 - 2015) in four premier L2 journals. These two effect size estimators express the amount of variance accounted for by one or more independent variables. However, despite their widespread reporting, often in conjunction with ANOVAs, these estimators are rarely accompanied by much in the way of interpretation. The study shows that ηp2 values are frequently being misreported as representing η2. The researcher interprets and discusses potential consequences related to the long-standing confusion surrounding these related but distinct estimators.
In the second manuscript, the researcher discusses a Bayesian alternative to p-values in t-test designs known as a “Bayes factor”. This approach responds to pointed calls questioning why null hypothesis testing is still the go-to analytic approach in L2 research. Adopting an open-science framework, the researcher (a) re-analyzes the empirical findings of 418 L2 t-tests using Bayesian hypothesis testing, and (b) compares the Bayesian results with their conventional, null hypothesis testing counterparts. The results show considerable differences arising in the rejections of the null hypothesis in certain cases of previously published literature. The study provides field-wide recommendations for improved use of null hypothesis testing, and introduces a free, online software package developed to promote Bayesian hypothesis testing in the field.
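A Bayes factor for a t-test design weighs the evidence for a group difference against the null. This is not the dissertation's software; as an illustrative sketch with hypothetical data, the simple BIC approximation BF10 ≈ exp((BIC0 − BIC1)/2) can be computed with the standard library:

```python
import math

def bic_bayes_factor(g1, g2):
    """Approximate BF10 for a two-sample design via BIC:
    H0 is a common mean, H1 is separate means (equal-variance normal
    likelihood; the shared variance parameter cancels in the difference)."""
    n = len(g1) + len(g2)

    def rss(groups):
        # residual sum of squares around each group's own mean
        return sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    # profile log-likelihood terms; shared constants cancel between BICs
    bic0 = n * math.log(rss([g1 + g2]) / n) + 1 * math.log(n)
    bic1 = n * math.log(rss([g1, g2]) / n) + 2 * math.log(n)
    return math.exp((bic0 - bic1) / 2)

# hypothetical test scores from two learner groups
bf10 = bic_bayes_factor([5.1, 4.8, 6.0, 5.5, 5.2], [6.9, 7.4, 6.5, 7.1, 7.8])
```

A BF10 well above 1 favors the alternative; packages in common use compute Bayes factors under default priors such as JZS rather than this rough approximation.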
In the third manuscript, the researcher provides an applied, non-technical rationale for using Bayesian estimation in L2 research. Specifically, the researcher takes three steps to achieve this goal. First, the researcher compares the conceptual underpinnings of the Bayesian and the frequentist methods. Second, using real as well as carefully simulated data, the researcher introduces and applies a Bayesian method to the estimation of the standardized mean difference effect size (i.e., Cohen's d) from t-test designs. Third, to promote the use of Bayesian estimation of Cohen's d effect size in L2 research, the researcher introduces a free, web-accessed, point-and-click software package as well as a suite of highly flexible R functions.
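Bayesian estimation of Cohen's d can be sketched in the simplest conjugate setting: assuming a known common sigma and normal priors on the group means, the posterior for d = (mu1 − mu2) / sigma is itself normal. A hypothetical-data sketch (practical models also treat sigma as unknown, which this omits):

```python
import math

def posterior_d(g1, g2, sigma=1.0, prior_var=100.0):
    """Posterior mean and 95% credible interval for Cohen's d,
    assuming a known common sigma and vague normal priors on the means
    (a conjugate sketch, not a full Bayesian t-model)."""
    def post_mean_var(g):
        n = len(g)
        like_prec, prior_prec = n / sigma ** 2, 1.0 / prior_var
        var = 1.0 / (like_prec + prior_prec)
        # precision-weighted blend of the prior (at zero) and the data mean
        return var * like_prec * (sum(g) / n), var

    m1, v1 = post_mean_var(g1)
    m2, v2 = post_mean_var(g2)
    diff_mean, diff_sd = m1 - m2, math.sqrt(v1 + v2)
    d_mean = diff_mean / sigma
    lo = d_mean - 1.96 * diff_sd / sigma
    hi = d_mean + 1.96 * diff_sd / sigma
    return d_mean, (lo, hi)

# hypothetical scores from two groups, with an assumed known sigma
d_mean, ci = posterior_d([5.2, 5.8, 6.1, 5.5], [4.1, 4.6, 4.3, 4.9], sigma=0.5)
```

Unlike a frequentist point estimate of d, the output is a full posterior summary: a credible interval that excludes zero directly quantifies the evidence for a non-null effect.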