
    Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option

    Data extracted from microarrays are now considered an important source of knowledge about various diseases. Several studies based on microarray data and receiver operating characteristic (ROC) graphs have compared supervised machine learning approaches. These comparisons rely on classification schemes in which all samples are classified, regardless of the degree of confidence a given classifier assigns to a particular sample. In healthcare, it is safer to refrain from classifying a sample when the confidence in its classification is not high enough than to classify every sample even at low confidence. We describe an approach for comparing the performance of different classifiers when rejection is possible, based on several reject areas. Using the tradeoff between accuracy and rejection, we propose accuracy-rejection curves (ARCs) and three types of relationship between the ARCs of two classifiers for such comparisons. Empirical results on purely synthetic data, semi-synthetic data (generated from real patient data) and public microarray data for binary classification problems demonstrate the efficacy of this method.
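
    As a rough illustration of the idea (not the authors' implementation), the sketch below traces an accuracy-rejection curve from a classifier's confidence scores; the synthetic dataset, the logistic regression model, and the rejection-rate grid are all placeholder assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary problem standing in for a microarray dataset (assumption).
X, y = make_classification(n_samples=600, n_features=50, n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)
conf = proba.max(axis=1)               # confidence of the predicted class
pred = proba.argmax(axis=1)

# Sweep rejection rates: reject the least confident fraction r of samples
# and record the accuracy on the remaining (accepted) samples.
rejection_rates = np.linspace(0.0, 0.9, 19)
order = np.argsort(conf)               # least confident first
for r in rejection_rates:
    accepted = order[int(r * len(conf)):]
    acc = (pred[accepted] == y_te[accepted]).mean()
    print(f"rejection rate {r:.2f} -> accuracy {acc:.3f}")
```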

    Precision and Recall Reject Curves for Classification

    For some classification scenarios, it is desirable to use only those classification instances that a trained model associates with high certainty. To obtain such high-certainty instances, previous work has proposed accuracy-reject curves. Reject curves make it possible to evaluate and compare the performance of different certainty measures over a range of thresholds for accepting or rejecting classifications. However, accuracy may not be the most suitable evaluation metric for all applications; instead, precision or recall may be preferable, for example for data with imbalanced class distributions. We therefore propose reject curves that evaluate precision and recall: the recall-reject curve and the precision-reject curve. Using prototype-based classifiers from learning vector quantization, we first validate the proposed curves on artificial benchmark data against the accuracy-reject curve as a baseline. We then show on imbalanced benchmarks and real-world medical data that, for these scenarios, the proposed precision- and recall-reject curves yield more accurate insights into classifier performance than accuracy-reject curves.
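
    A minimal sketch of how such curves could be computed from any probabilistic classifier; the paper itself uses learning vector quantization models and their certainty measures, so the generic confidence score and the choice to compute precision and recall on the accepted subset are assumptions here.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

def reject_curves(y_true, y_pred, certainty, rejection_rates):
    """Precision and recall computed on the accepted samples only,
    for a range of rejection rates (least-certain samples rejected first)."""
    order = np.argsort(certainty)                # least certain first
    points = []
    for r in rejection_rates:
        keep = order[int(r * len(y_true)):]      # indices of accepted samples
        points.append((
            r,
            precision_score(y_true[keep], y_pred[keep], zero_division=0),
            recall_score(y_true[keep], y_pred[keep], zero_division=0),
        ))
    return points
```

    Here `certainty` could be, for instance, the maximum predicted class probability, and the returned points can be plotted against the rejection rate in the same way as an accuracy-reject curve.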

    Uncertainty-Based Rejection in Machine Learning: Implications for Model Development and Interpretability

    Uncertainty is present in every single prediction of Machine Learning (ML) models. Uncertainty Quantification (UQ) is particularly relevant for safety-critical applications. Prior research has focused on developing methods to quantify uncertainty; less attention has been given to how knowledge of uncertainty can be leveraged during model development. This work focuses on putting UQ into practice, closing the gap between UQ and its utility in the ML pipeline, and giving insights into how UQ can be used to improve model development and interpretability. We identified three main research questions: (1) How can UQ contribute to choosing the most suitable model for a given classification task? (2) Can UQ be used to combine different models in a principled manner? (3) Can visualization techniques improve UQ's interpretability? These questions are answered by applying several methods to quantify uncertainty in both a simulated dataset and a real-world Human Activity Recognition (HAR) dataset. Our results show that uncertainty quantification can increase model robustness and interpretability.
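
    As one concrete, purely illustrative way to act on quantified uncertainty, the following sketch rejects predictions whose predictive entropy exceeds a threshold; the entropy measure and the threshold are assumptions, not the paper's specific pipeline.

```python
import numpy as np

def predictive_entropy(proba):
    """Shannon entropy of each sample's predicted class distribution."""
    eps = 1e-12                                    # avoid log(0)
    return -np.sum(proba * np.log(proba + eps), axis=1)

def predict_with_rejection(proba, threshold):
    """Return class labels, with -1 marking samples rejected as too uncertain."""
    labels = proba.argmax(axis=1)
    labels[predictive_entropy(proba) > threshold] = -1
    return labels
```

    Here `proba` would be the output of any model's `predict_proba`, and the threshold would typically be tuned on a validation set against the cost of abstaining.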

    IDPS Signature Classification with a Reject Option and the Incorporation of Expert Knowledge

    As the importance of intrusion detection and prevention systems (IDPSs) increases, great costs are incurred to manage the signatures generated from malicious communication pattern files. Experts in network security need to classify signatures by importance for an IDPS to work. We propose and evaluate a machine learning signature classification model with a reject option (RO) to reduce the cost of setting up an IDPS. To train the proposed model, it is essential to design features that are effective for signature classification. Experts classify signatures with predefined if-then rules. An if-then rule returns a label of low, medium, high, or unknown importance based on keyword matching of the elements in the signature. We therefore first design two types of features, symbolic features (SFs) and keyword features (KFs), which are used in the keyword matching of the if-then rules. Next, we design web information and message features (WMFs) to capture the properties of signatures that do not match the if-then rules. The WMFs are extracted as term frequency-inverse document frequency (TF-IDF) features of the message text in the signatures; further features are obtained by web scraping the external attack identification systems referenced in the signatures. Because failure must be minimized in the classification of IDPS signatures, as in the medical field, we introduce an RO into the proposed model. The effectiveness of the proposed classification model is evaluated in experiments with two real datasets of signatures labeled by experts: one that can be classified with if-then rules and one containing elements that do not match any if-then rule. In both cases, the combined SFs and WMFs performed better than the combined SFs and KFs. We also performed a feature analysis.
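
    The sketch below illustrates the general idea of TF-IDF features over signature message text combined with a confidence-based reject option; the example signatures, the logistic regression classifier, and the 0.7 threshold are hypothetical placeholders, not the paper's configuration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical signature message texts and expert importance labels.
messages = [
    "ET POLICY suspicious outbound connection to known host",
    "GPL ICMP echo request ping sweep detected",
    "ET MALWARE ransomware payload download attempt",
    "ET INFO observed DNS query to newly registered domain",
]
labels = ["high", "low", "high", "medium"]

# Message features: TF-IDF over the signature message text.
vectorizer = TfidfVectorizer(lowercase=True, ngram_range=(1, 2))
X = vectorizer.fit_transform(messages)
clf = LogisticRegression(max_iter=1000).fit(X, labels)

def classify_with_reject(texts, threshold=0.7):
    """Predict signature importance; return 'unknown' when confidence is low."""
    proba = clf.predict_proba(vectorizer.transform(texts))
    preds = clf.classes_[proba.argmax(axis=1)]
    return np.where(proba.max(axis=1) >= threshold, preds, "unknown")
```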

    On the Feature Selection Methods and Reject Option Classifiers for Robust Cancer Prediction

    Cancer is the second leading cause of mortality across the globe; approximately 9.6 million people are estimated to have died of cancer in 2019. Accurate and early prediction of cancer can help healthcare professionals devise timely therapeutic interventions to reduce suffering and the risk of mortality. Generally, a machine learning (ML) based predictive system in healthcare uses data (genetic profiles or clinical parameters) and learning algorithms to predict target values for cancer detection. Optimizing predictive accuracy is therefore an important endeavor for accurate decision making. Reject Option (RO) classifiers have been used to improve the predictive accuracy of classifiers for complex problems such as cancer. In a gene profile, not all features are important, and the irrelevant ones should be removed. ML offers different techniques, each with its own methodology, for feature selection (FS), and classification results depend on the dataset, each having its own distribution and features. Therefore, both FS methods and ML algorithms with an RO need to be considered for robust classification. The main objective of this study is to optimize three parameters (learning algorithm, FS method and rejection rate) for robust cancer prediction, rather than the two traditional parameters (learning algorithm and rejection rate). Different FS methods (including t-test, Las Vegas Filter (LVF), Relief, and Information Gain (IG)) and RO classifiers at different rejection thresholds are analyzed to investigate the robust predictability of cancer. Three cancer datasets (colon cancer, leukemia and breast cancer) were reduced using different FS methods, and each reduced dataset was used to analyze the predictability of cancer with different RO classifiers. The results reveal that, for each dataset, the predictive accuracies of RO classifiers differed across FS methods. The findings based on the proposed scheme indicate that ML algorithms, along with their dependence on suitable FS methods, need to be taken into consideration for accurate prediction.
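
    A minimal sketch of the two-stage idea: one of the listed FS methods followed by a reject-option classifier swept over rejection thresholds. Mutual information is used here as a stand-in for Information Gain, and the synthetic dataset, the SVM, and the thresholds are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in for a high-dimensional gene-expression dataset (assumption).
X, y = make_classification(n_samples=200, n_features=500, n_informative=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Feature selection (mutual information as an Information-Gain-style filter),
# then an SVM with probability estimates so low-confidence samples can be rejected.
selector = SelectKBest(mutual_info_classif, k=50).fit(X_tr, y_tr)
clf = SVC(probability=True, random_state=1).fit(selector.transform(X_tr), y_tr)

proba = clf.predict_proba(selector.transform(X_te))
conf, pred = proba.max(axis=1), proba.argmax(axis=1)

for threshold in (0.5, 0.7, 0.9):               # rejection thresholds to compare
    accepted = conf >= threshold
    acc = (pred[accepted] == y_te[accepted]).mean() if accepted.any() else float("nan")
    print(f"threshold {threshold}: accuracy {acc:.3f} on {accepted.mean():.0%} accepted")
```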

    Towards Knowledge Uncertainty Estimation for Open Set Recognition

    Uncertainty is ubiquitous and present in every single prediction of Machine Learning models. The ability to estimate and quantify the uncertainty of individual predictions is particularly relevant in safety-critical applications. Real-world recognition poses multiple challenges, since a model's knowledge of the underlying physical phenomenon is incomplete and observations are incomplete by definition. Machine Learning algorithms, however, often assume that the training and test data distributions are the same and that all test classes are present during training. A more realistic scenario is Open Set Recognition, where unknown classes can be submitted to an algorithm during testing. In this paper, we propose a Knowledge Uncertainty Estimation (KUE) method to quantify knowledge uncertainty and reject out-of-distribution inputs. Additionally, we quantify and distinguish aleatoric and epistemic uncertainty with classical information-theoretic entropy measures by means of ensemble techniques. We performed experiments on four datasets with different data modalities and compared our results with distance-based classifiers, SVM-based approaches and ensemble techniques using entropy measures. Overall, KUE distinguished in-distribution from out-of-distribution inputs better in most cases and was at least comparable in the others. Furthermore, classification with a rejection option based on a proposed strategy for combining different measures of uncertainty demonstrates a practical application of the approach.
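
    The KUE method itself is not reproduced here, but the classical entropy-based decomposition the abstract refers to can be sketched for an ensemble as follows (the array layout is an assumption).

```python
import numpy as np

def decompose_uncertainty(member_probs):
    """member_probs: array of shape (n_members, n_samples, n_classes)
    holding each ensemble member's predicted class probabilities.

    Returns total, aleatoric, and epistemic uncertainty per sample using the
    classical entropy-based decomposition: total = aleatoric + epistemic.
    """
    eps = 1e-12
    mean_p = member_probs.mean(axis=0)                               # ensemble mean
    total = -np.sum(mean_p * np.log(mean_p + eps), axis=1)           # entropy of the mean
    member_entropy = -np.sum(member_probs * np.log(member_probs + eps), axis=2)
    aleatoric = member_entropy.mean(axis=0)                          # expected entropy
    epistemic = total - aleatoric                                    # mutual information
    return total, aleatoric, epistemic
```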

    Focusing on the Big Picture: Insights into a Systems Approach to Deep Learning for Satellite Imagery

    Deep learning tasks are often complicated and require a variety of components working together efficiently to perform well. Because of the often large scale of these tasks, it is necessary to iterate quickly in order to try a variety of methods and to find and fix bugs. While participating in IARPA's Functional Map of the World challenge, we identified challenges along the entire deep learning pipeline and found various solutions to them. In this paper, we present the performance, engineering, and deep learning considerations involved in processing and modeling the data, as well as the underlying infrastructure considerations that support large-scale deep learning tasks. We also discuss insights and observations regarding satellite imagery and deep learning for image classification.

    Consistency of plug-in confidence sets for classification in semi-supervised learning

    Confident prediction is highly relevant in machine learning; in applications such as medical diagnosis, for example, a wrong prediction can be fatal. For classification, procedures already exist that allow one to refrain from classifying data when the confidence in the prediction is weak. This approach is known as classification with a reject option. In the present paper, we provide new methodology for this approach. By predicting a new instance via a confidence set, we ensure exact control of the probability of classification. Moreover, we show that this methodology is easily implementable and has attractive theoretical and numerical properties.
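
    As a loose, simplified sketch of the plug-in flavor of this approach (not the paper's exact procedure or its guarantees): estimate posterior probabilities, calibrate a confidence cutoff on an unlabeled pool so that roughly a target fraction of instances gets classified, and reject the rest.

```python
import numpy as np

def calibrate_cutoff(scores_unlabeled, target_coverage):
    """Pick a confidence cutoff so that roughly `target_coverage` of the
    unlabeled pool would be classified and the rest rejected."""
    return np.quantile(scores_unlabeled, 1.0 - target_coverage)

def classify_with_confidence_cutoff(proba, cutoff):
    """Return predicted labels; -1 marks samples whose estimated posterior
    confidence falls below the cutoff (no confident classification)."""
    scores = proba.max(axis=1)
    labels = proba.argmax(axis=1)
    labels[scores < cutoff] = -1
    return labels
```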

    How do you feel? Measuring User-Perceived Value for Rejecting Machine Decisions in Hate Speech Detection

    Hate speech moderation remains a challenging task for social media platforms. Human-AI collaborative systems offer the potential to combine the reliability of humans with the scalability of machine learning to tackle this issue effectively. While methods for task handover in human-AI collaboration exist that consider the costs of incorrect predictions, insufficient attention has been paid to accurately estimating these costs. In this work, we propose a value-sensitive rejection mechanism that automatically rejects machine decisions for human moderation based on users' value perceptions of those decisions. We conducted a crowdsourced survey study with 160 participants to evaluate their perception of correct and incorrect machine decisions in the domain of hate speech detection, as well as of cases where the system rejects making a prediction. Here, we introduce Magnitude Estimation, an unbounded scale, as the preferred method for measuring user (dis)agreement with machine decisions. Our results show that Magnitude Estimation can provide a reliable measurement of participants' perception of machine decisions. By integrating user-perceived value into human-AI collaboration, we further show that it can guide us in 1) determining when to accept or reject machine decisions so as to obtain the optimal total value a model can deliver and 2) selecting better classification models compared with the more widely used target of model accuracy.
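
    A toy sketch of the value-sensitive rejection idea: given user-perceived values for correct, incorrect, and rejected decisions (the numbers below are placeholders, not the study's measurements), accept the machine decision only when its expected value beats the value of deferring to a human moderator.

```python
def expected_decision_value(p_correct, v_correct, v_incorrect):
    """Expected user-perceived value of letting the machine decide."""
    return p_correct * v_correct + (1.0 - p_correct) * v_incorrect

def should_reject(p_correct, v_correct=1.0, v_incorrect=-4.0, v_reject=-0.5):
    """Reject (defer to a human moderator) when the expected value of the
    machine decision falls below the value users assign to rejection."""
    return expected_decision_value(p_correct, v_correct, v_incorrect) < v_reject
```

    Here `p_correct` would come from a calibrated confidence estimate; the distinctive point of the paper is that the value parameters are elicited from users via Magnitude Estimation rather than set by hand.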