
    Variable Selection Bias in Classification Trees Based on Imprecise Probabilities

    Classification trees based on imprecise probabilities provide an advancement of classical classification trees. The Gini Index is the default splitting criterion in classical classification trees, while in classification trees based on imprecise probabilities, an extension of the Shannon entropy has been introduced as the splitting criterion. However, the use of these empirical entropy measures as split selection criteria can lead to a bias in variable selection, such that variables are preferred for reasons other than their information content. This bias is not eliminated by the imprecise probability approach. The source of variable selection bias for the estimated Shannon entropy, as well as possible corrections, are outlined. The variable selection performance of the biased and corrected estimators is evaluated in a simulation study. Additional results from research on variable selection bias in classical classification trees are incorporated, suggesting further investigation of alternative split selection criteria in classification trees based on imprecise probabilities.
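    A minimal sketch of the plug-in (maximum-likelihood) entropy estimator behind such split criteria, with a toy demonstration of the bias the abstract describes: both candidate features below are pure noise, yet the one with more categories tends to receive a higher estimated information gain. Function names and demo data are illustrative assumptions, not the paper's code.

    import numpy as np

    def empirical_entropy(labels):
        # Plug-in (maximum-likelihood) estimate of Shannon entropy, in bits.
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def info_gain(labels, feature):
        # Estimated reduction in entropy of `labels` after splitting on `feature`.
        total = empirical_entropy(labels)
        cond = 0.0
        for v in np.unique(feature):
            mask = feature == v
            cond += mask.mean() * empirical_entropy(labels[mask])
        return total - cond

    # Bias demo: both features are independent of y, but the feature with
    # more categories usually gets the larger estimated gain.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=100)
    x_binary = rng.integers(0, 2, size=100)
    x_many = rng.integers(0, 20, size=100)
    print(info_gain(y, x_binary), info_gain(y, x_many))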

    Completing an uncertainty criterion of classification

    We present a variation of a classification method based on uncertainty measured on credal sets. As in the original method, it uses the imprecise Dirichlet model to build the credal set and employs the same uncertainty measures. It considers pairs of variables in order to reduce the uncertainty and to find direct relations between the variables in the dataset and the variable to be classified. The success rates are equivalent to those of the first method, except on datasets in which some variables have a direct relation that determines the value of the variable to be classified, where we obtain a notable improvement.
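    The credal set in question can be built from observed counts with the imprecise Dirichlet model. A minimal sketch of the standard IDM probability intervals; the hyperparameter s and the sample counts are assumptions for illustration.

    import numpy as np

    def idm_credal_set(counts, s=1.0):
        # Lower/upper probability bounds for each state under the imprecise
        # Dirichlet model with hyperparameter s (s=1 is a common choice).
        counts = np.asarray(counts, dtype=float)
        n = counts.sum()
        lower = counts / (n + s)
        upper = (counts + s) / (n + s)
        return lower, upper

    lo, up = idm_credal_set([12, 5, 3])
    print(lo, up)   # every distribution with lo <= p <= up is in the credal set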

    CreINNs: Credal-Set Interval Neural Networks for Uncertainty Estimation in Classification Tasks

    Uncertainty estimation is increasingly attractive for improving the reliability of neural networks. In this work, we present novel credal-set interval neural networks (CreINNs) designed for classification tasks. CreINNs preserve the traditional interval neural network structure, capturing weight uncertainty through deterministic intervals, while forecasting credal sets using the mathematical framework of probability intervals. Experimental validations on an out-of-distribution detection benchmark (CIFAR10 vs SVHN) show that CreINNs provide better epistemic uncertainty estimation than variational Bayesian neural networks (BNNs) and deep ensembles (DEs). Furthermore, CreINNs exhibit a notable reduction in computational complexity compared to variational BNNs and demonstrate smaller model sizes than DEs.
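    A generic sketch of how a dense layer propagates interval-valued weights and inputs using plain interval arithmetic over the four "corner" products. This illustrates interval neural networks in general, not the authors' exact CreINN architecture; all names and the toy weights are assumptions.

    import numpy as np

    def interval_linear(x_lo, x_hi, W_lo, W_hi, b_lo, b_hi):
        # Dense layer with interval inputs and interval weights: each pairwise
        # product is bounded by the min/max of the four corner products.
        cands = np.stack([x_lo[:, None] * W_lo, x_lo[:, None] * W_hi,
                          x_hi[:, None] * W_lo, x_hi[:, None] * W_hi])
        y_lo = cands.min(axis=0).sum(axis=0) + b_lo
        y_hi = cands.max(axis=0).sum(axis=0) + b_hi
        return y_lo, y_hi

    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 3)); r = 0.05      # midpoint weights, interval radius
    x = rng.normal(size=4)                     # a point input: lo == hi
    y_lo, y_hi = interval_linear(x, x, W - r, W + r,
                                 -r * np.ones(3), r * np.ones(3))
    print(y_lo, y_hi)                          # y_lo <= y_hi elementwise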

    Improving the Naive Bayes Classifier via a Quick Variable Selection Method Using Maximum of Entropy

    Variable selection methods play an important role in the field of attribute mining. The Naive Bayes (NB) classifier is a very simple and popular classification method that yields good results in a short processing time. Hence, it is a very appropriate classifier for very large datasets. The method has a high dependence on the relationships between the variables. The Info-Gain (IG) measure, which is based on general entropy, can be used as a quick variable selection method. This measure ranks the importance of the attribute variables on a variable under study via the information obtained from a dataset. Its main drawback is that it is always non-negative, so an information threshold must be set for each dataset in order to select the set of most important variables. We introduce here a new quick variable selection method that generalizes the one based on the Info-Gain measure. It uses imprecise probabilities and the maximum entropy measure to select the most informative variables without setting a threshold. This new variable selection method, combined with the Naive Bayes classifier, improves the original method and provides a valuable tool for handling datasets with a very large number of features and a huge amount of data, where more complex methods are not computationally feasible. This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.
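    A minimal sketch of the classical IG filter that the abstract generalizes, including the per-dataset threshold that the max-entropy variant is designed to remove. The threshold value and toy data are assumptions.

    import numpy as np

    def info_gain(y, x):
        # H(Y) - H(Y|X), estimated from counts.
        def H(a):
            _, c = np.unique(a, return_counts=True)
            p = c / c.sum()
            return -np.sum(p * np.log2(p))
        return H(y) - sum((x == v).mean() * H(y[x == v]) for v in np.unique(x))

    def select_by_threshold(X, y, threshold=0.01):
        # Classical IG filter: keep columns whose gain exceeds a hand-tuned,
        # per-dataset threshold -- the step the max-entropy variant removes.
        gains = np.array([info_gain(y, X[:, j]) for j in range(X.shape[1])])
        return np.where(gains > threshold)[0], gains

    rng = np.random.default_rng(1)
    X = rng.integers(0, 3, size=(200, 5))
    y = (X[:, 0] + rng.integers(0, 2, size=200)) % 3   # y depends on column 0 only
    keep, gains = select_by_threshold(X, y)
    print(keep, np.round(gains, 3))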

    Maximum of entropy for belief intervals under Evidence Theory

    The Dempster-Shafer Theory (DST), or Evidence Theory, has been commonly used to deal with uncertainty. It is based on the concept of a basic probability assignment (BPA). The upper entropy on the credal set associated with a BPA is the only uncertainty measure in DST that verifies all the necessary mathematical properties and behaviors. Nonetheless, its computation is notably complex. For this reason, many alternatives to this measure have been proposed recently, but they do not satisfy most of the mathematical requirements and present some undesirable behaviors. Belief intervals have frequently been employed to quantify uncertainty in DST in recent years, and they can represent the uncertainty-based information better than a BPA. In this research, we develop a new uncertainty measure that consists of the maximum of entropy on the credal set corresponding to the belief intervals for singletons. It verifies all the crucial mathematical requirements and presents good behavior, solving most of the shortcomings found in recently proposed uncertainty measures. Moreover, its calculation is notably easier than that of the upper entropy on the credal set associated with the BPA. Therefore, our proposed uncertainty measure is more suitable for practical applications. Spanish Ministerio de Economía y Competitividad TIN2016-77902-C3-2-P. European Union (EU) TEC2015-69496-
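    A minimal sketch of the singleton belief intervals [Bel({x}), Pl({x})] computed from a BPA; these intervals define the credal set over which the proposed measure maximizes entropy. The toy mass function is an assumption.

    def belief_intervals(bpa, frame):
        # [Bel({x}), Pl({x})] for each singleton, from a basic probability
        # assignment given as a dict {frozenset_of_states: mass}.
        intervals = {}
        for x in frame:
            bel = sum(m for A, m in bpa.items() if A == frozenset([x]))
            pl = sum(m for A, m in bpa.items() if x in A)
            intervals[x] = (bel, pl)
        return intervals

    bpa = {frozenset('a'): 0.5, frozenset('ab'): 0.3, frozenset('abc'): 0.2}
    print(belief_intervals(bpa, 'abc'))
    # {'a': (0.5, 1.0), 'b': (0.0, 0.5), 'c': (0.0, 0.2)}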

    Upgrading the Fusion of Imprecise Classifiers

    Imprecise classification is a relatively new task within Machine Learning. The difference with standard classification is that not only is one state of the variable under study determined; a set of states that do not have enough information against them, and so cannot be ruled out, is determined as well. For imprecise classification, a model called the Imprecise Credal Decision Tree (ICDT), which uses imprecise probabilities and maximum of entropy as the information measure, has been presented. A difficult and interesting task is to show how to combine this type of imprecise classifiers. A procedure based on the minimum level of dominance has been presented; although it is a very strong combination method, it has the drawback of a considerable risk of erroneous predictions. In this research, we use the second-best theory to argue that this type of combination can be improved through a new procedure built by relaxing the constraints. The new procedure is compared with the original one in an experimental study on a large set of datasets, and shows improvement. UGR-FEDER funds under Project A-TIC-344-UGR20. FEDER/Junta de Andalucía-Consejería de Transformación Económica, Industria, Conocimiento y Universidades under Project P20_0015.
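    An illustrative sketch of set-valued combination by exclusion counts: the strict rule keeps only the states excluded least often across the ensemble, while a slack parameter relaxes the constraint in the spirit of the abstract. The exact counting rule here is an assumption, not the paper's procedure.

    from collections import Counter

    def combine_imprecise(predictions, states, slack=0):
        # Each base classifier returns a *set* of non-dominated states.
        # Keep states whose exclusion count is within `slack` of the minimum;
        # slack=0 is the strict rule, slack>0 a relaxed variant.
        excluded = Counter({s: 0 for s in states})
        for pred in predictions:
            for s in states:
                if s not in pred:
                    excluded[s] += 1
        best = min(excluded.values())
        return {s for s in states if excluded[s] <= best + slack}

    preds = [{'a', 'b'}, {'a'}, {'a', 'c'}, {'b'}]
    print(combine_imprecise(preds, {'a', 'b', 'c'}))            # {'a'}
    print(combine_imprecise(preds, {'a', 'b', 'c'}, slack=1))   # {'a', 'b'}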

    Conformalized Credal Set Predictors

    Credal sets are sets of probability distributions that are considered as candidates for an imprecisely known ground-truth distribution. In machine learning, they have recently attracted attention as an appealing formalism for uncertainty representation, in particular due to their ability to represent both the aleatoric and epistemic uncertainty in a prediction. However, the design of methods for learning credal set predictors remains a challenging problem. In this paper, we make use of conformal prediction for this purpose. More specifically, we propose a method for predicting credal sets in the classification task, given training data labeled by probability distributions. Since our method inherits the coverage guarantees of conformal prediction, our conformal credal sets are guaranteed to be valid with high probability (without any assumptions on model or distribution). We demonstrate the applicability of our method to natural language inference, a highly ambiguous natural language task where it is common to obtain multiple annotations per example.
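    A generic split-conformal sketch of the idea: calibrate a nonconformity score between predicted and annotated label distributions, then return as the credal set all distributions within the calibrated radius. The total variation distance, the stand-in model outputs, and alpha=0.1 are assumptions; the paper's construction may differ.

    import numpy as np

    def conformal_quantile(scores, alpha):
        # Finite-sample-adjusted (1 - alpha) quantile used by split conformal.
        n = len(scores)
        k = int(np.ceil((n + 1) * (1 - alpha)))
        return np.sort(scores)[min(k, n) - 1]

    def tv(p, q):
        # Total variation distance between two discrete distributions.
        return 0.5 * np.abs(p - q).sum()

    rng = np.random.default_rng(0)
    cal_pred = rng.dirichlet(np.ones(3), size=200)   # stand-in model outputs
    cal_true = rng.dirichlet(np.ones(3), size=200)   # annotated distributions
    q_hat = conformal_quantile([tv(p, t) for p, t in zip(cal_pred, cal_true)],
                               alpha=0.1)

    def in_credal_set(candidate, prediction):
        # The conformal credal set is the TV ball of radius q_hat around the
        # new prediction; it covers the true distribution w.p. >= 1 - alpha
        # under exchangeability.
        return tv(candidate, prediction) <= q_hat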

    Bagging of Credal Decision Trees for Imprecise Classification

    Credal Decision Trees (CDTs) have been adapted for Imprecise Classification (ICDTs). However, no ensembles of imprecise classifiers have been proposed so far. The reason might be that combining the predictions made by multiple imprecise classifiers is not a trivial question. In fact, if the combination method used is not appropriate, the ensemble method could even worsen the performance of a single classifier. On the other hand, the Bagging scheme has been shown to provide satisfactory results in precise classification, especially when it is used with CDTs, which are known to be very weak and unstable classifiers. For these reasons, in this research a new Bagging scheme with ICDTs is proposed, along with a new technique for combining predictions made by imprecise classifiers that tries to maximize the precision of the bagged classifier. If the procedure for such a combination is too conservative, it is easy to obtain little information and to worsen the results of a single classifier. Our proposal considers only the states with the minimum level of non-dominance. An exhaustive experimental study carried out in this work shows that Bagging of ICDTs, with our proposed combination technique, performs clearly better than a single ICDT. This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.
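    A minimal sketch of the bagging side: bootstrap resampling plus a minimum-exclusion combination of set-valued predictions, in the spirit of the minimum level of non-dominance the abstract mentions. The toy ensemble members are stand-ins for trained ICDTs; all names are assumptions.

    import numpy as np

    def bootstrap_indices(n, n_estimators, rng):
        # Standard bagging resamples: n draws with replacement per estimator.
        return [rng.integers(0, n, size=n) for _ in range(n_estimators)]

    def bagging_predict(classifiers, x, states):
        # Combine set-valued predictions by keeping only the states excluded
        # least often across the ensemble.
        excluded = {s: 0 for s in states}
        for clf in classifiers:
            pred = clf(x)
            for s in states:
                if s not in pred:
                    excluded[s] += 1
        best = min(excluded.values())
        return {s for s in states if excluded[s] == best}

    # Toy ensemble: stand-ins for ICDTs trained on different bootstrap samples.
    ensemble = [lambda x: {'a', 'b'}, lambda x: {'a'}, lambda x: {'a', 'c'}]
    print(bagging_predict(ensemble, None, {'a', 'b', 'c'}))   # {'a'}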
    • …