12 research outputs found

    Maximum of entropy for belief intervals under Evidence Theory

    The Dempster-Shafer Theory (DST), or Evidence Theory, has been commonly used to deal with uncertainty. It is based on the concept of basic probability assignment (BPA). The upper entropy on the credal set associated with a BPA is the only uncertainty measure in DST that verifies all the necessary mathematical properties and behaviors. Nonetheless, its computation is notably complex. For this reason, many alternatives to this measure have been proposed recently, but they do not satisfy most of the mathematical requirements and present some undesirable behaviors. Belief intervals have been frequently employed to quantify uncertainty in DST in recent years, and they can represent uncertainty-based information better than a BPA. In this research, we develop a new uncertainty measure that consists of the maximum of entropy on the credal set corresponding to the belief intervals for singletons. It verifies all the crucial mathematical requirements and presents good behavior, solving most of the shortcomings found in recently proposed uncertainty measures. Moreover, its calculation is notably easier than that of the upper entropy on the credal set associated with the BPA. Therefore, our proposed uncertainty measure is more suitable for use in practical applications. Spanish Ministerio de Economía y Competitividad TIN2016-77902-C3-2-P. European Union (EU) TEC2015-69496-R.
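    The measure described above can be stated concretely: compute the singleton belief intervals [Bel({x}), Pl({x})] from the BPA and then maximize Shannon entropy over the probability distributions bounded by those intervals. The sketch below is only an illustration under stated assumptions (NumPy/SciPy available; function names are ours, and a generic numerical optimizer stands in for whatever dedicated procedure the authors use).

        # Illustrative sketch, not the authors' implementation.
        import numpy as np
        from scipy.optimize import minimize

        def singleton_intervals(frame, bpa):
            # bpa maps frozensets over `frame` to masses that sum to 1.
            bel = {x: bpa.get(frozenset([x]), 0.0) for x in frame}          # Bel({x}) = m({x})
            pl = {x: sum(m for A, m in bpa.items() if x in A) for x in frame}  # Pl({x})
            return bel, pl

        def max_entropy_on_intervals(frame, bel, pl):
            # Maximize Shannon entropy subject to bel[x] <= p(x) <= pl[x] and sum(p) = 1.
            lo = np.array([bel[x] for x in frame], dtype=float)
            hi = np.array([pl[x] for x in frame], dtype=float)
            # A feasible starting point: spread the remaining mass proportionally.
            p0 = lo + (hi - lo) * (1.0 - lo.sum()) / max(hi.sum() - lo.sum(), 1e-12)

            def neg_entropy(p):
                p = np.clip(p, 1e-12, 1.0)
                return float(np.sum(p * np.log2(p)))

            res = minimize(neg_entropy, p0, method="SLSQP",
                           bounds=list(zip(lo, hi)),
                           constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}])
            return dict(zip(frame, res.x)), -res.fun

        # Example: frame {a, b, c} with mass 0.5 on {a} and 0.5 on the whole frame.
        frame = ["a", "b", "c"]
        bpa = {frozenset(["a"]): 0.5, frozenset(frame): 0.5}
        probs, h = max_entropy_on_intervals(frame, *singleton_intervals(frame, bpa))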

    Required mathematical properties and behaviors of uncertainty measures on belief intervals

    The Dempster–Shafer theory of evidence (DST) has been widely used to handle uncertainty-based information. It is based on the concept of basic probability assignment (BPA). Belief intervals are easier to manage than a BPA for representing uncertainty-based information. For this reason, several recently proposed uncertainty measures for DST are based on belief intervals. In this study, we examine the crucial mathematical properties and behavioral requirements that must be verified by every uncertainty measure on belief intervals, building on the study previously carried out for uncertainty measures on BPAs. Furthermore, we analyze which of these properties are satisfied by each of the uncertainty measures on belief intervals proposed so far. This comparative analysis shows that, among these measures, the maximum of entropy on the belief intervals is the most suitable one for practical applications, since it is the only one that satisfies all the required mathematical properties and behaviors.

    Upgrading the Fusion of Imprecise Classifiers

    Imprecise classification is a relatively new task within Machine Learning. The difference from standard classification is that, instead of determining a single state of the variable under study, the classifier also determines the set of states that the available information cannot rule out. For imprecise classification, a model called the Imprecise Credal Decision Tree (ICDT), which uses imprecise probabilities and the maximum of entropy as the information measure, has been presented. A difficult and interesting task is to show how to combine this type of imprecise classifier. A procedure based on the minimum level of dominance has been presented; although it is a very strong combination method, it has the drawback of a considerable risk of erroneous predictions. In this research, we use the second-best theory to argue that the aforementioned type of combination can be improved through a new procedure built by relaxing the constraints. The new procedure is compared with the original one in an experimental study on a large set of datasets, and shows an improvement. UGR-FEDER funds under Project A-TIC-344-UGR20. FEDER/Junta de Andalucía-Consejería de Transformación Económica, Industria, Conocimiento y Universidades under Project P20_00159.
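    As a rough illustration of how set-valued predictions from several imprecise classifiers might be fused with a relaxation parameter, the hypothetical sketch below counts, for each class value, how many base classifiers keep it and then retains every value whose count is within `relax` votes of the maximum. This is only a plausible reading of a relaxed, vote-based combination, not the exact dominance-based rule of the paper.

        # Illustrative sketch of a relaxed fusion of set-valued predictions.
        from collections import Counter

        def combine_imprecise_predictions(predicted_sets, relax=0):
            # predicted_sets: list of sets of class values, one per base classifier.
            # relax = 0 keeps only the most supported classes; larger values
            # relax the constraint and return larger (more cautious) sets.
            votes = Counter(c for s in predicted_sets for c in s)
            max_votes = max(votes.values())
            return {c for c, v in votes.items() if v >= max_votes - relax}

        # Example with three base classifiers over classes {A, B, C}:
        sets = [{"A"}, {"A", "B"}, {"B", "C"}]
        print(combine_imprecise_predictions(sets, relax=0))  # {'A', 'B'}
        print(combine_imprecise_predictions(sets, relax=1))  # {'A', 'B', 'C'}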

    Using extreme prior probabilities on the Naive Credal Classifier

    The Naive Credal Classifier (NCC) was the first method proposed for Imprecise Classification. It starts from the well-known Naive Bayes algorithm (NB), which assumes that the attributes are independent given the class variable. Despite this unrealistic assumption, NB and NCC have been successfully used in practical applications. In this work, we propose a new version of NCC, called the Extreme Prior Naive Credal Classifier (EP-NCC). Unlike NCC, EP-NCC takes into consideration the lower and upper prior probabilities of the class variable in the estimation of the lower and upper conditional probabilities. We demonstrate that, with our proposed EP-NCC, the predictions are more informative than with NCC, without increasing the risk of making erroneous predictions. An experimental analysis carried out in this work shows that EP-NCC significantly outperforms NCC and obtains statistically equivalent results to the algorithm proposed so far for Imprecise Classification based on decision trees, even though EP-NCC is computationally simpler. Therefore, EP-NCC is more suitable than the methods proposed so far in this field for application to large datasets for Imprecise Classification. This is an important point in favor of our proposal, given the increasing amount of data in every area. This work has been supported by UGR-FEDER funds under Project A-TIC-344-UGR20, by the “FEDER/Junta de Andalucía-Consejería de Transformación Económica, Industria, Conocimiento y Universidades” under Project P20_00159, and by research scholarship FPU17/02685.
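    To give an idea of how a credal naive classifier can produce interval-valued class scores, the sketch below uses the imprecise Dirichlet model to turn counts into lower and upper probabilities and then keeps the classes that are not interval-dominated. It is a simplified illustration under our own assumptions; the exact NCC dominance criterion and the EP-NCC treatment of the prior intervals are those defined in the paper, not necessarily what is shown here.

        # Simplified, illustrative sketch of naive-credal-style interval scores.

        def idm_interval(count, total, s=1.0):
            # Lower/upper probability of an event seen `count` times out of `total` (IDM).
            return count / (total + s), (count + s) / (total + s)

        def class_score_intervals(class_counts, cond_counts, instance, s=1.0):
            # class_counts: {class: n(class)}; cond_counts: {(attr, value, class): n}.
            # instance: {attr: value}. Returns {class: (lower_score, upper_score)}.
            n_total = sum(class_counts.values())
            scores = {}
            for c, n_c in class_counts.items():
                lo, up = idm_interval(n_c, n_total, s)          # prior interval
                for a, v in instance.items():
                    l, u = idm_interval(cond_counts.get((a, v, c), 0), n_c, s)
                    lo, up = lo * l, up * u                     # combine conditional intervals
                scores[c] = (lo, up)
            return scores

        def non_dominated(scores):
            # Interval dominance: drop c if some other class has a lower bound above c's upper bound.
            return {c for c, (lo, up) in scores.items()
                    if not any(lo2 > up for c2, (lo2, _) in scores.items() if c2 != c)}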

    Decision Tree Ensemble Method for Analyzing Traffic Accidents of Novice Drivers in Urban Areas

    Presently, there is a critical need to analyze traffic accidents in order to mitigate their terrible economic and human impact. Most accidents occur in urban areas. Furthermore, driving experience has an important effect on accident analysis, since inexperienced drivers are more likely to suffer fatal injuries. This work studies the injury severity produced by accidents that involve inexperienced drivers in urban areas. The analysis was based on data provided by the Spanish General Traffic Directorate. The Information Root Node Variation (IRNV) method (based on decision trees) was used to obtain a rule set that provides useful information about the most probable causes of fatalities in accidents involving inexperienced drivers in urban areas. This may prove useful knowledge for preventing this kind of accident and/or mitigating its consequences. This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.

    Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

    Ministerio de Economía y Competitividad and Fondo Europeo de Desarrollo Regional (FEDER), Projects TEC2015-69496-R and TIN2016-77902-C3-2-P.

    Value‐based potentials: Exploiting quantitative information regularity patterns in probabilistic graphical models

    This study was jointly supported by the Spanish Ministry of Education and Science under projects PID2019-106758GB-C31 and TIN2016-77902-C3-2-P, and the European Regional Development Fund (FEDER). Funding for open access charge from Universidad de Granada/CBUA. When dealing with complex models (i.e., models with many variables, a high degree of dependency between variables, or many states per variable), the efficient representation of quantitative information in probabilistic graphical models (PGMs) is a challenging task. To address this problem, this study introduces several new structures, aptly named value-based potentials (VBPs), which are based exclusively on the values. VBPs leverage repeated values to reduce memory requirements. In the present paper, they are compared with some common structures, such as standard tables or unidimensional arrays, and probability trees (PTs). Like VBPs, PTs are designed to reduce memory space, but this is achieved only if value repetitions correspond to context-specific independence patterns (i.e., repeated values are related to consecutive indices or configurations). VBPs are devised to overcome this limitation. The goal of this study is to analyze the properties of VBPs. We provide a theoretical analysis of VBPs and use them to encode the quantitative information of a set of well-known Bayesian networks, measuring the access time to their content and the computational time required to perform some inference tasks. Spanish Government PID2019-106758GB-C31, TIN2016-77902-C3-2-P. European Commission.
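    The core idea, storing each distinct value once and indexing it from the configurations, can be illustrated with a toy structure like the one below. This is only a conceptual sketch of the value-based principle; the VBP structures analyzed in the paper are more elaborate, and the class and attribute names here are ours.

        # Minimal illustration of the value-based idea, not the paper's VBP structures.
        class ValueIndexedPotential:
            def __init__(self, table):
                # table: flat list of probability values, one per configuration.
                self.values = sorted(set(table))                 # distinct values, stored once
                index_of = {v: i for i, v in enumerate(self.values)}
                self.indices = [index_of[v] for v in table]      # one small index per configuration

            def __getitem__(self, config_index):
                return self.values[self.indices[config_index]]

        # A potential with many repeated values compresses well:
        flat = [0.0, 0.0, 0.0, 0.2, 0.2, 0.8, 0.8, 0.8]
        vip = ValueIndexedPotential(flat)
        assert vip[5] == 0.8 and len(vip.values) == 3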

    Bagging of Credal Decision Trees for Imprecise Classification

    Credal Decision Trees (CDTs) have been adapted for Imprecise Classification (ICDT). However, no ensembles of imprecise classifiers have been proposed so far. The reason might be that combining the predictions made by multiple imprecise classifiers is not a trivial question. In fact, if the combination method used is not appropriate, the ensemble could even worsen the performance of a single classifier. On the other hand, the Bagging scheme has been shown to provide satisfactory results in precise classification, especially when it is used with CDTs, which are known to be very weak and unstable classifiers. For these reasons, in this research, we propose a new Bagging scheme with ICDTs. We present a new technique for combining predictions made by imprecise classifiers that tries to maximize the precision of the bagging classifier. If the procedure for such a combination is too conservative, it is easy to obtain little information and worsen the results of a single classifier. Our proposal considers only the states with the minimum level of non-dominance. An exhaustive experimentation carried out in this work has shown that the Bagging of ICDTs, with our proposed combination technique, performs clearly better than a single ICDT. This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.
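    The overall scheme can be pictured as an ordinary bagging loop whose base learners return sets of class values. The sketch below is illustrative only: `train_icdt` stands in for the ICDT learner, each trained tree is assumed to expose a `predict_set` method, and the final vote-counting step is merely an approximation of the non-dominance-based combination proposed in the paper.

        # Illustrative sketch of bagging with set-valued (imprecise) base classifiers.
        import random
        from collections import Counter

        def bagging_imprecise(train_icdt, X, y, n_trees=100, seed=0):
            rng = random.Random(seed)
            n = len(X)
            trees = []
            for _ in range(n_trees):
                idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample with replacement
                trees.append(train_icdt([X[i] for i in idx], [y[i] for i in idx]))
            return trees

        def predict_set(trees, x):
            # Each tree returns a set of class values; keep the most supported ones.
            votes = Counter(c for t in trees for c in t.predict_set(x))
            top = max(votes.values())
            return {c for c, v in votes.items() if v == top}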

    Nuevas aplicaciones de modelos basados en probabilidades imprecisas dentro de la minería de datos [New applications of models based on imprecise probabilities within data mining]

    When we have information about a finite set of possible alternatives provided by an expert or a dataset, a mathematical model is needed to represent such information. In some cases, a unique probability distribution is not appropriate for this purpose because the available information is not sufficient. For this reason, several mathematical theories and models based on imprecise probabilities have been developed in the literature. In this thesis work, we analyze the relations between some imprecise probability theories and study the properties of some models based on imprecise probabilities. When imprecise probability theories and models arise, tools for quantifying the uncertainty-based information in such theories and models, usually called uncertainty measures, are needed. In this thesis work, we analyze the properties of some existing uncertainty measures in theories based on imprecise probabilities and propose uncertainty measures in imprecise probability theories and models that present some advantages over the existing ones.
    Situations in which it is necessary to represent the information provided by a dataset about a finite set of possible alternatives arise in classification, an essential task within Data Mining. This well-known task consists of predicting, for a given instance described via a set of attributes, the value of a variable under study, known as the class variable. In classification, it is often necessary to quantify the uncertainty-based information about the class variable. For this purpose, classical probability theory (PT) has been employed for many years. In recent years, classification algorithms that represent the information about the class variable via imprecise probability models have been developed. Experimental studies have shown that classification methods based on imprecise probabilities significantly outperform those that utilize PT when data contain errors.
    When classifying an instance, classifiers tend to predict a single value of the class variable. Nonetheless, in some cases, there is not enough information available for a classifier to point out a single class value. In these situations, it is more logical for classifiers to predict a set of class values instead of a single value of the class variable. This is known as Imprecise Classification. Classification algorithms (including those for Imprecise Classification) often aim to minimize the number of instances erroneously classified. This would be optimal if all classification errors had the same importance. Nevertheless, in practical applications, different classification errors usually lead to different costs. For this reason, classifiers that take misclassification costs into account, also known as cost-sensitive classifiers, have been developed in the literature.
    Traditional classification (including Imprecise Classification) assumes that each instance has a single value of the class variable. However, in some domains, this task does not fit well because an instance may belong to multiple labels simultaneously. In these domains, the Multi-Label Classification (MLC) task is more suitable than traditional classification. MLC aims to predict the set of labels associated with a given instance described via an attribute set. Most of the MLC methods proposed so far represent the information provided by an MLC dataset about the set of labels via classical PT.
    In this thesis work, we develop new classification algorithms based on imprecise probability models, including algorithms for Imprecise Classification, cost-sensitive Imprecise Classification, and MLC, that present some advantages and obtain better experimental results than those of the state of the art. In this thesis we follow the research line of imprecise probability theories and models and of uncertainty measures with imprecise probabilities. We also propose new classification methods based on imprecise probabilities that obtain better performance than the state of the art. Thesis, Univ. Granada.

    A Variation of the Algorithm to Achieve the Maximum Entropy for Belief Functions

    Evidence theory (ET), based on imprecise probabilities, is often more appropriate than classical probability theory (PT) in situations with inaccurate or incomplete information. Quantifying the information that a piece of evidence involves is a key issue in ET. Shannon's entropy is an excellent measure for such purposes in PT, being easy to calculate and fulfilling a wide set of properties that make it axiomatically the best one in PT. In ET, a similar role is played by the maximum of entropy (ME), which verifies a similar set of properties. The ME is the unique measure in ET that has such axiomatic behavior. The problem with the ME in ET is its computationally complex calculation, which makes its use problematic in some situations. Only one algorithm exists for calculating the ME in ET, and its high computational cost has been the principal drawback found with this measure. In this work, a variation of the original algorithm is presented. It is shown that, with this modification, the number of steps needed to attain the ME can be reduced because, at each step, the power set of possibilities is smaller than in the original algorithm, which is the key source of the complexity. This solution can provide greater applicability of this measure.
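    For reference, the existing algorithm the abstract alludes to is usually described as a greedy procedure: at each step, pick the subset of the remaining frame with the largest gain in belief per element (breaking ties by larger cardinality), spread that gain uniformly over its elements, and repeat. The sketch below follows that classical description under our own naming; it does not reproduce the variation proposed in this work, whose point is precisely to shrink the power set explored at each step.

        # Sketch of the classical greedy max-entropy algorithm for belief functions.
        from itertools import combinations

        def bel(A, bpa):
            # Belief of set A: total mass of focal elements contained in A.
            return sum(m for B, m in bpa.items() if B <= A)

        def nonempty_subsets(elements):
            s = list(elements)
            return (frozenset(c) for r in range(1, len(s) + 1) for c in combinations(s, r))

        def max_entropy_distribution(frame, bpa):
            # Greedily assign the largest possible uniform blocks of probability.
            frame = frozenset(frame)
            assigned = frozenset()
            p = {}
            while assigned != frame:
                remaining = frame - assigned
                # Pick A maximizing the extra belief per element; break ties by larger |A|.
                best = max(nonempty_subsets(remaining),
                           key=lambda A: ((bel(A | assigned, bpa) - bel(assigned, bpa)) / len(A),
                                          len(A)))
                share = (bel(best | assigned, bpa) - bel(assigned, bpa)) / len(best)
                for x in best:
                    p[x] = share
                assigned |= best
            return p

        # Example: m({a}) = 0.6, m({a, b}) = 0.4 yields p(a) = 0.6, p(b) = 0.4.
        p = max_entropy_distribution(["a", "b"],
                                     {frozenset(["a"]): 0.6, frozenset(["a", "b"]): 0.4})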