An experimental investigation of calibration techniques for imbalanced data
Calibration is a technique used to obtain accurate probability estimates for classification problems in real applications. Class imbalance can pose considerable challenges to calibration methods in obtaining accurate probabilities, yet previous research has paid little attention to this issue. In this paper, we present an experimental investigation of several prevailing calibration methods under different imbalance scenarios. Several performance metrics are used to evaluate different aspects of calibration performance. The experimental results show that the performance of different calibration techniques depends on the metric and on the degree of imbalance. Isotonic Regression has better overall performance on imbalanced datasets than parametric and other, more complex non-parametric methods; however, its performance is unstable in highly imbalanced scenarios. This study provides insights into calibration methods on imbalanced datasets and can serve as a reference for the future development of calibration methods in class-imbalance scenarios.
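The comparison the abstract describes can be illustrated with a minimal sketch, assuming scikit-learn and a synthetic dataset with roughly 10% positives; the imbalance ratio, base model, and data are illustrative assumptions, not the paper's setup:

```python
# Post-hoc calibration on an imbalanced binary problem: isotonic regression
# vs. Platt scaling ("sigmoid"), compared via the Brier score.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Imbalanced data: ~90% negatives, ~10% positives (an assumed ratio).
X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base = LogisticRegression(max_iter=1000)
iso = CalibratedClassifierCV(base, method="isotonic", cv=5).fit(X_tr, y_tr)
sig = CalibratedClassifierCV(base, method="sigmoid", cv=5).fit(X_tr, y_tr)

# Lower Brier score = better-calibrated probabilities.
for name, model in [("isotonic", iso), ("sigmoid", sig)]:
    p = model.predict_proba(X_te)[:, 1]
    print(f"{name}: Brier = {brier_score_loss(y_te, p):.4f}")
```

The Brier score is only one of the metrics such a study would consider; which method wins can vary with the metric and the imbalance ratio, as the abstract notes.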
Classifier Calibration: A survey on how to assess and improve predicted class probabilities
This paper provides both an introduction to and a detailed overview of the
principles and practice of classifier calibration. A well-calibrated classifier
correctly quantifies the level of uncertainty or confidence associated with its
instance-wise predictions. This is essential for critical applications, optimal
decision making, cost-sensitive classification, and for some types of context
change. Calibration research has a rich history which predates the birth of
machine learning as an academic field by decades. However, a recent increase in
interest in calibration has led to new methods and the extension from
binary to the multiclass setting. The space of options and issues to consider
is large, and navigating it requires the right set of concepts and tools. We
provide both introductory material and up-to-date technical details of the main
concepts and methods, including proper scoring rules and other evaluation
metrics, visualisation approaches, a comprehensive account of post-hoc
calibration methods for binary and multiclass classification, and several
advanced topics.
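Two of the evaluation tools such a survey covers, a proper scoring rule and a binned calibration-error estimate, can be sketched as follows; the bin count and the synthetic data are illustrative assumptions:

```python
# Brier score (a proper scoring rule) and a binned expected calibration
# error (ECE) for binary probability predictions.
import numpy as np


def brier_score(probs, labels):
    # Mean squared difference between predicted probability and outcome.
    return float(np.mean((probs - labels) ** 2))


def expected_calibration_error(probs, labels, n_bins=10):
    # Partition [0, 1] into equal-width confidence bins, then take the
    # bin-weighted average gap between mean confidence and empirical accuracy.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            ece += mask.mean() * abs(probs[mask].mean() - labels[mask].mean())
    return float(ece)


rng = np.random.default_rng(0)
probs = rng.uniform(size=1000)
# Labels drawn with probability equal to the prediction: calibrated by construction.
labels = (rng.uniform(size=1000) < probs).astype(float)
print(brier_score(probs, labels), expected_calibration_error(probs, labels))
```

Because the labels are sampled to match the predicted probabilities, the ECE here stays close to zero; a miscalibrated model would push it up.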
Bootstrap Confidence Regions for Optimal Operating Conditions in Response Surface Methodology
This article concerns the application of bootstrap methodology to construct a likelihood-based confidence region for operating conditions associated with the maximum of a response surface constrained to a specified region. Unlike classical methods based on the stationary point, proper interpretation of this confidence region does not depend on unknown model parameters. In addition, the methodology does not require the assumption of normally distributed errors. The approach is demonstrated for concave-down and saddle system cases in two dimensions. Simulation studies were performed to assess the coverage probability of these regions.
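The general bootstrap idea behind the article can be sketched in one dimension: resample residuals, refit a quadratic response surface, and collect the constrained maximiser. This uses a simple percentile interval as a stand-in for the article's likelihood-based region, and the model, noise level, and constraint region are all illustrative assumptions:

```python
# Residual bootstrap for the location of the constrained maximum of a
# fitted quadratic response surface (concave-down, 1-D case).
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 60)
y = 3.0 - (x - 0.5) ** 2 + rng.normal(scale=0.3, size=x.size)  # true optimum at 0.5


def fit_optimum(xv, yv, lo=-2.0, hi=2.0):
    # np.polyfit returns coefficients highest degree first: [c2, c1, c0].
    c2, c1, _ = np.polyfit(xv, yv, 2)
    return float(np.clip(-c1 / (2.0 * c2), lo, hi))  # vertex, clipped to region


coef = np.polyfit(x, y, 2)
fitted = np.polyval(coef, x)
resid = y - fitted

# Resample residuals, refit, and record the optimiser each time.
opts = [
    fit_optimum(x, fitted + rng.choice(resid, size=resid.size, replace=True))
    for _ in range(1000)
]
lo_ci, hi_ci = np.percentile(opts, [2.5, 97.5])
print(f"95% percentile bootstrap interval for the optimum: [{lo_ci:.2f}, {hi_ci:.2f}]")
```

The interval is read directly from the bootstrap distribution of the optimiser, so its interpretation does not hinge on unknown model parameters, mirroring the motivation stated above.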
AMS 2000 subj Classification: 62F25, 62F40, 62F30, 62J05.
Key words: Stationary point; Kernel density estimator; Boundary kernel
On The Rise of Health Spending and Longevity
We use a calibrated stochastic life-cycle model of endogenous health spending, asset accumulation and retirement to investigate the causes behind the increase in health spending and life expectancy over the period 1965-2005. We estimate that technological change, along with the increase in the generosity of health insurance, may independently explain 53% of the rise in health spending (insurance 29% and technology 24%), while income explains less than 10%. Because these changes occurred simultaneously over this period, they may have led to a "synergy" or interaction effect which helps explain an additional 37% increase in health spending. We estimate that technological change, taking the form of increased productivity at an annual rate of 1.8%, explains 59% of the rise in life expectancy at age 50 over this period, while insurance and income explain less than 10%.
Keywords: demand for health, health spending, insurance, technological change, longevity
The SuperCOSMOS Sky Survey. Paper II: Image detection, parameterisation, classification and photometry
In this, the second in a series of three papers concerning the SuperCOSMOS
Sky Survey, we describe the methods for image detection, parameterisation,
classification and photometry. We demonstrate the internal and external
accuracy of our object parameters. Using examples from the first release of
data, the South Galactic Cap survey, we show that our image detection
completeness is close to 100% to within 1.5 mag of the nominal plate limits. We
show that for the Bj survey data, the image classification is externally > 99%
reliable to Bj = 19.5. Internally, the image classification is reliable at a
level of > 90% to Bj=21, R=19. The photometric accuracy of our data is
typically 0.3 mag with respect to external data for m > 15. Internally, the
relative photometric accuracy in restricted position and magnitude ranges can
be as accurate as 5% for well exposed stellar images. Colours (B-R or R-I) are
externally accurate to 0.07 mag at Bj = 16.5, rising to 0.16 mag at Bj = 20.
Comment: 22 pages, 16 figures; accepted for publication in MNRAS
Improving acoustic vehicle classification by information fusion
We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicle's resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicle's exterior parts. In correspondence with this structure, we further put forward a modified Bayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual feature set alone. The Bayesian-based decision-level fusion is found to improve on a feature-level fusion approach.
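The decision-level fusion step can be sketched with the standard product rule, a common Bayesian fusion baseline; the paper's modified algorithm additionally matches each feature set with its favored classifier, which is omitted here, and the class priors and posteriors below are invented for illustration:

```python
# Decision-level fusion of two classifiers' posteriors by the product rule,
# which assumes the two feature sets are conditionally independent given the class.
import numpy as np


def product_rule_fusion(post_a, post_b, priors):
    # post_a, post_b: per-classifier posteriors P(class | features), shape (n_classes,)
    # Fused posterior is proportional to P(c|x_a) * P(c|x_b) / P(c).
    fused = post_a * post_b / priors
    return fused / fused.sum()


priors = np.array([0.5, 0.3, 0.2])          # three vehicle classes (illustrative)
harmonic_post = np.array([0.6, 0.3, 0.1])   # classifier on harmonic features
spectral_post = np.array([0.5, 0.1, 0.4])   # classifier on key-frequency features

fused = product_rule_fusion(harmonic_post, spectral_post, priors)
print(fused, int(fused.argmax()))  # class 0 wins: both classifiers favor it
```

Here the two classifiers agree on class 0, so fusion sharpens that decision; when they disagree, the division by the prior keeps a rare class from being unfairly penalized twice.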