1,733 research outputs found

    An experimental investigation of calibration techniques for imbalanced data

    Calibration is a technique used to obtain accurate probability estimates for classification problems in real applications. Class imbalance can make it considerably harder for calibration methods to produce accurate probabilities. However, previous research has paid little attention to this issue. In this paper, we present an experimental investigation of some prevailing calibration methods in different imbalance scenarios. Several performance metrics are considered to evaluate different aspects of calibration performance. The experimental results show that the performance of different calibration techniques depends on the metrics and on the imbalance ratio. Isotonic Regression has better overall performance on imbalanced datasets than parametric and other complex non-parametric methods. However, its performance is unstable in highly imbalanced scenarios. This study provides some insights into calibration methods on imbalanced datasets, and it can be a reference for the future development of calibration methods in class imbalance scenarios.
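
The abstract above singles out Isotonic Regression as a post-hoc calibrator. As a rough illustration only, and not the paper's experimental code, the following sketch fits a non-decreasing step function from classifier scores to binary labels using the pool-adjacent-violators algorithm. The function names `fit_isotonic` and `predict_isotonic` and the toy data are assumptions made for this example.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    Returns (uppers, values): block i covers scores up to uppers[i]
    and maps them to the calibrated probability values[i].
    """
    # Pool tied scores first so that each distinct score is one weighted point.
    grouped = {}
    for s, y in zip(scores, labels):
        total, count = grouped.get(s, (0.0, 0))
        grouped[s] = (total + y, count + 1)

    # Each block holds [sum of labels, number of points, largest score covered].
    blocks = []
    for s in sorted(grouped):
        total, count = grouped[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top

    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def predict_isotonic(model, score):
    """Map a raw score to the value of the first block whose upper bound covers it."""
    uppers, values = model
    idx = bisect_left(uppers, score)
    return values[min(idx, len(values) - 1)]


# Hypothetical toy data: raw classifier scores and true binary labels.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_labels = [0, 1, 0, 0, 1, 1]
model = fit_isotonic(toy_scores, toy_labels)
calibrated = [predict_isotonic(model, s) for s in toy_scores]
```

With few positive examples, each block's mean rests on very few labels, which is one plausible reading of the instability the abstract reports for highly imbalanced data.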

    Classifier Calibration: A survey on how to assess and improve predicted class probabilities

    This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and for some types of context change. Calibration research has a rich history which predates the birth of machine learning as an academic field by decades. However, a recent increase in interest in calibration has led to new methods and to the extension from the binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main concepts and methods, including proper scoring rules and other evaluation metrics, visualisation approaches, a comprehensive account of post-hoc calibration methods for binary and multiclass classification, and several advanced topics.
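
The abstract above mentions proper scoring rules and other evaluation metrics without naming any. As a hedged illustration, the sketch below computes two commonly used quantities for the binary case: the Brier score, a well-known proper scoring rule, and a binned expected calibration error (ECE) on the positive-class probability. The choice of these two metrics, the equal-width binning and the function names are assumptions for this example, not taken from the survey.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for the positive class using equal-width bins on [0, 1].

    Each bin contributes |mean predicted probability - observed positive rate|,
    weighted by the fraction of instances that fall into the bin.
    """
    # Per bin: [sum of predicted probabilities, sum of labels, count].
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[i][0] += p
        bins[i][1] += y
        bins[i][2] += 1

    n = len(probs)
    # (count / n) * |sum_p / count - sum_y / count| simplifies to |sum_p - sum_y| / n.
    return sum(abs(sum_p - sum_y) / n for sum_p, sum_y, count in bins if count)


# Hypothetical predictions: the classifier says 0.8 for two instances, only one is positive.
example_probs = [0.8, 0.8]
example_labels = [1, 0]
example_brier = brier_score(example_probs, example_labels)
example_ece = expected_calibration_error(example_probs, example_labels)
```

Lower is better for both quantities. The ECE value depends on the number of bins, so any comparison between calibrators should hold `n_bins` fixed.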

    Bootstrap Confidence Regions for Optimal Operating Conditions in Response Surface Methodology

    This article concerns the application of bootstrap methodology to construct a likelihood-based confidence region for operating conditions associated with the maximum of a response surface constrained to a specified region. Unlike classical methods based on the stationary point, proper interpretation of this confidence region does not depend on unknown model parameters. In addition, the methodology does not require the assumption of normally distributed errors. The approach is demonstrated for concave-down and saddle system cases in two dimensions. Simulation studies were performed to assess the coverage probability of these regions. AMS 2000 subj Classification: 62F25, 62F40, 62F30, 62J05. Key words: Stationary point; Kernel density estimator; Boundary kernel

    On The Rise of Health Spending and Longevity

    We use a calibrated stochastic life-cycle model of endogenous health spending, asset accumulation and retirement to investigate the causes behind the increase in health spending and life expectancy over the period 1965-2005. We estimate that technological change along with the increase in the generosity of health insurance may explain independently 53% of the rise in health spending (insurance 29% and technology 24%), while income explains less than 10%. By occurring simultaneously over this period, these changes may have led to a "synergy" or interaction effect which helps explain an additional 37% increase in health spending. We estimate that technological change, taking the form of increased productivity at an annual rate of 1.8%, explains 59% of the rise in life expectancy at age 50 over this period, while insurance and income explain less than 10%. Keywords: demand for health, health spending, insurance, technological change, longevity

    The SuperCOSMOS Sky Survey. Paper II: Image detection, parameterisation, classification and photometry

    In this, the second in a series of three papers concerning the SuperCOSMOS Sky Survey, we describe the methods for image detection, parameterisation, classification and photometry. We demonstrate the internal and external accuracy of our object parameters. Using examples from the first release of data, the South Galactic Cap survey, we show that our image detection completeness is close to 100% to within 1.5 mag of the nominal plate limits. We show that for the Bj survey data, the image classification is externally > 99% reliable to Bj = 19.5. Internally, the image classification is reliable at a level of > 90% to Bj = 21, R = 19. The photometric accuracy of our data is typically 0.3 mag with respect to external data for m > 15. Internally, the relative photometric accuracy in restricted position and magnitude ranges can be as accurate as 5% for well exposed stellar images. Colours (B-R or R-I) are externally accurate to 0.07 mag at Bj = 16.5 rising to 0.16 mag at Bj = 20. Comment: 22 pages, 16 figures; accepted for publication in MNRAS

    Improving acoustic vehicle classification by information fusion

    We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose some useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicle’s resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicle’s exterior parts. In correspondence with this structure, we further put forward a modified Bayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out based on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual feature set alone. The Bayesian-based decision-level fusion is found to perform better than a feature-level fusion approach.