Doubly Optimized Calibrated Support Vector Machine (DOC-SVM): an algorithm for joint optimization of discrimination and calibration.
Historically, probabilistic models for decision support have focused on discrimination, e.g., minimizing the ranking error of predicted outcomes. Unfortunately, these models ignore another important aspect, calibration, which indicates the magnitude of correctness of model predictions. Using discrimination and calibration simultaneously can be helpful for many clinical decisions. We investigated tradeoffs between these goals and developed a unified maximum-margin method to handle them jointly. Our approach, called Doubly Optimized Calibrated Support Vector Machine (DOC-SVM), concurrently optimizes two loss functions: the ridge regression loss and the hinge loss. Experiments using three breast cancer gene-expression datasets (GSE2034, GSE2990, and Chanrion's dataset) showed that our model generated better-calibrated outputs than other state-of-the-art models such as Support Vector Machine (p=0.03, p=0.13, and p<0.001) and Logistic Regression (p=0.006, p=0.008, and p<0.001). DOC-SVM also demonstrated better discrimination (i.e., higher AUCs) than Support Vector Machine (p=0.38, p=0.29, and p=0.047) and Logistic Regression (p=0.38, p=0.04, and p<0.0001). DOC-SVM produced a model that was better calibrated without sacrificing discrimination, and hence may be helpful in clinical decision making.
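The abstract states only that DOC-SVM optimizes a ridge regression loss and a hinge loss concurrently; it does not give the objective. As a rough illustration of what blending the two losses could look like, here is a minimal pure-Python sketch for a linear model trained by subgradient descent. The blending weight `alpha`, the penalty `lam`, the step size, the function names, and the toy data are all assumptions made for this example, and the code should not be read as the authors' formulation.

```python
import random


def scores(w, b, X):
    """Linear scores w.x + b for every row of X."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]


def joint_loss(w, b, X, y, lam=0.01, alpha=0.5):
    """Hypothetical blended objective: alpha * hinge + (1 - alpha) * squared error + L2 penalty."""
    s = scores(w, b, X)
    n = len(y)
    hinge = sum(max(0.0, 1.0 - yi * si) for yi, si in zip(y, s)) / n
    squared = sum((yi - si) ** 2 for yi, si in zip(y, s)) / n
    penalty = 0.5 * lam * sum(wj * wj for wj in w)
    return alpha * hinge + (1.0 - alpha) * squared + penalty


def fit_joint(X, y, lam=0.01, alpha=0.5, lr=0.05, steps=300):
    """Plain subgradient descent on joint_loss (an assumed optimizer, chosen for brevity)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        s = scores(w, b, X)
        # Per-sample (sub)gradient of the blended loss with respect to the score.
        g = []
        for yi, si in zip(y, s):
            g_hinge = -yi if yi * si < 1.0 else 0.0
            g_squared = -2.0 * (yi - si)
            g.append((alpha * g_hinge + (1.0 - alpha) * g_squared) / n)
        grad_w = [sum(gi * x[j] for gi, x in zip(g, X)) + lam * w[j] for j in range(d)]
        grad_b = sum(g)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def make_toy_data(n_per_class=50, seed=0):
    """Two Gaussian blobs with labels +1 / -1 (synthetic, for illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1.0, -1.0):
        for _ in range(n_per_class):
            X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    w, b = fit_joint(X, y)
    print("blended loss after training:", joint_loss(w, b, X, y))
```

In this sketch the hinge term pushes examples to the correct side of the margin (discrimination), while the squared-error term pulls each score toward its label value, which is one plausible way to encourage outputs whose magnitude is meaningful. Setting `alpha=1.0` reduces it to a linear SVM-style objective and `alpha=0.0` to ridge regression on the labels.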
Learning salience amoung [sic] features through contingency in the CEL framework
Determining which features in an environment are salient for a given task (salience assignment) is a central problem in Machine Learning. A related phenomenon, contingency (the conditions under which relative salience among environmental features is acquired), is central to learning and memory in animal psychology. This paper presents an analysis of a set of empirical data on contingency and an algorithm for the salience assignment problem. The algorithm is implemented in a working computer program that interacts with a simulated environment to produce contingent associative learning corresponding to relevant behavioral data. The model also makes specific empirical predictions that can be tested experimentally.
Multivariate Bayesian semiparametric models for authentication of food and beverages
Food and beverage authentication is the process by which foods or beverages are verified as complying with their label descriptions, for example, verifying whether the denomination of origin of an olive oil bottle is correct or whether the variety of a certain bottle of wine matches its label description. The common way to deal with an authentication process is to measure a number of attributes on samples of food and then use these as input for a classification problem. Our motivation stems from data consisting of measurements of nine chemical compounds called anthocyanins, obtained from samples of Chilean red wines of grape varieties Cabernet Sauvignon, Merlot and Carménère. We consider a model-based approach to authentication through a semiparametric multivariate hierarchical linear mixed model for the mean responses, and covariance matrices that are specific to the classification categories. Specifically, we propose a model of the ANOVA-DDP type, which takes advantage of the fact that the available covariates are discrete in nature. The results suggest that the model performs well compared to other parametric alternatives. This is also corroborated by application to simulated data.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS492 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org