
    Median evidential c-means algorithm and its application to community detection

    Median clustering is of great value for partitioning relational data. In this paper, a new prototype-based clustering method called Median Evidential C-Means (MECM) is proposed; it extends median c-means and median fuzzy c-means to the theoretical framework of belief functions. The median variant relaxes the restriction of a metric-space embedding for the objects but constrains the prototypes to be members of the original data set. Owing to these properties, MECM can be applied to graph clustering problems. A community detection scheme for social networks based on MECM is investigated, and the resulting credal partitions of graphs, which are more refined than crisp and fuzzy ones, enable a better understanding of graph structure. An initial prototype-selection scheme based on evidential semi-centrality is presented to avoid premature local convergence, and an evidential modularity function is defined to choose the optimal number of communities. Finally, experiments on synthetic and real data sets illustrate the performance of MECM and show how it differs from other methods.
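
    As background for the median variant described above, the following is a minimal Python sketch of the plain median c-means loop that MECM generalises: prototypes are restricted to objects of the data set, so only a pairwise dissimilarity matrix is needed. The function name median_c_means and its interface are illustrative assumptions; the credal-partition machinery of MECM itself is not reproduced here.

        import numpy as np

        def median_c_means(D, c, n_iter=50, seed=0):
            """Minimal median c-means on a dissimilarity matrix D (n x n).

            Prototypes are constrained to be objects of the data set, so only
            pairwise dissimilarities are needed (no metric embedding).
            """
            rng = np.random.default_rng(seed)
            n = D.shape[0]
            prototypes = rng.choice(n, size=c, replace=False)
            for _ in range(n_iter):
                # assign each object to its closest prototype
                labels = np.argmin(D[:, prototypes], axis=1)
                new_prototypes = prototypes.copy()
                for k in range(c):
                    members = np.where(labels == k)[0]
                    if len(members) == 0:
                        continue
                    # the new prototype is the member minimising within-cluster dissimilarity
                    costs = D[np.ix_(members, members)].sum(axis=0)
                    new_prototypes[k] = members[np.argmin(costs)]
                if np.array_equal(new_prototypes, prototypes):
                    break
                prototypes = new_prototypes
            return prototypes, labels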

    Probabilistic Logic Programming with Beta-Distributed Random Variables

    We enable aProbLog, a probabilistic logical programming approach, to reason in the presence of uncertain probabilities represented as Beta-distributed random variables. We achieve the same performance as state-of-the-art algorithms for highly specified and engineered domains, while maintaining the flexibility offered by aProbLog in handling complex relational domains. Our motivation is that faithfully capturing the distribution of probabilities is necessary to compute an expected utility for effective decision making under uncertainty; unfortunately, these probability distributions can be highly uncertain due to sparse data. To understand and accurately manipulate such probability distributions we need a well-defined theoretical framework, which is provided by the Beta distribution: it specifies a distribution over the possible values of a probability when the exact value is unknown. Comment: Accepted for presentation at AAAI 201
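
    The role of the Beta distribution can be illustrated with a small, self-contained Python sketch (using scipy.stats; the helper beta_from_counts and the utility values are illustrative assumptions, not part of aProbLog): with sparse data the posterior over the probability stays wide, and that spread matters when computing an expected utility.

        from scipy.stats import beta

        def beta_from_counts(successes, failures, prior_a=1.0, prior_b=1.0):
            """Beta posterior over an unknown probability given sparse counts."""
            return beta(prior_a + successes, prior_b + failures)

        # Example: only 3 positive and 1 negative observation -> a highly uncertain probability
        p = beta_from_counts(3, 1)
        print(p.mean(), p.var())   # expected probability and a measure of its uncertainty

        # Expected utility of an action that pays u_win if the event occurs, u_lose otherwise
        u_win, u_lose = 10.0, -4.0
        expected_utility = p.mean() * u_win + (1 - p.mean()) * u_lose
        print(expected_utility)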

    Beyond tree-shaped credal sum-product networks


    Learning Credal Sum-Product Networks

    Probabilistic representations, such as Bayesian and Markov networks, are fundamental to much of statistical machine learning. Thus, learning probabilistic representations directly from data is a deep challenge, the main computational bottleneck being intractable inference. Tractable learning is a powerful new paradigm that attempts to learn distributions that support efficient probabilistic querying. By leveraging local structure, representations such as sum-product networks (SPNs) can capture high tree-width models with many hidden layers, essentially a deep architecture, while still allowing a range of probabilistic queries to be answered in time polynomial in the network size. While this progress is impressive, numerous data sources are incomplete, and in the presence of missing data, structure learning methods nonetheless revert to single distributions without characterizing the loss in confidence. In recent work, credal sum-product networks, an imprecise extension of sum-product networks, were proposed to capture this robustness angle. In this work, we are interested in how such representations can be learnt, and thus study how the computational machinery underlying tractable learning and inference can be generalized for imprecise probabilities. Comment: Accepted to AKBC 202
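
    To make the "imprecise extension" concrete, here is a minimal, hypothetical Python sketch of bound propagation in a toy credal SPN: sum nodes carry an interval weight instead of a point weight, so a query returns lower and upper bounds rather than a single value. The node functions and the two-child parametrisation are simplifying assumptions for illustration, not the paper's learning algorithm.

        def leaf(v):
            # exact leaf value, represented as a degenerate interval
            return (v, v)

        def product(a, b):
            # values are probabilities in [0, 1], so lower*lower and upper*upper bound the product
            return (a[0] * b[0], a[1] * b[1])

        def credal_sum(a, b, lo, hi):
            # first child gets weight w in [lo, hi], second child gets 1 - w;
            # the value is linear in w, so the extremes occur at the endpoints
            candidates = [w * x + (1 - w) * y
                          for w in (lo, hi)
                          for x, y in ((a[0], b[0]), (a[1], b[1]))]
            return (min(candidates), max(candidates))

        # bottom-up evaluation of a toy network for one assignment of the evidence
        root = credal_sum(product(leaf(0.8), leaf(0.3)),
                          product(leaf(0.5), leaf(0.9)),
                          lo=0.4, hi=0.7)
        print(root)   # (lower bound, upper bound)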

    The belief noisy-or model applied to network reliability analysis

    One difficulty faced in knowledge engineering for Bayesian Networks (BNs) is the quantification step, where the Conditional Probability Tables (CPTs) are determined. The number of parameters in a CPT increases exponentially with the number of parent variables. The most common solution is the application of so-called canonical gates. The Noisy-OR (NOR) gate, which takes advantage of the independence of causal interactions, provides a logarithmic reduction in the number of parameters required to specify a CPT. In this paper, an extension of the NOR model based on the theory of belief functions, named Belief Noisy-OR (BNOR), is proposed. BNOR is capable of dealing with both aleatory and epistemic uncertainty in the network. Compared with NOR, it yields richer information, which is of great value for decision making when the available knowledge is uncertain. In particular, when there is no epistemic uncertainty, BNOR reduces to NOR. Additionally, different structures of BNOR are presented in this paper in order to meet the various needs of engineers. The application of the BNOR model to the reliability evaluation of networked systems demonstrates its effectiveness.
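
    For context, the classic Noisy-OR combination that BNOR extends can be sketched in a few lines of Python; the interval variant below is only an illustration of how epistemic uncertainty widens the result into bounds, not the paper's belief-function BNOR model.

        def noisy_or(p_active):
            """Classic Noisy-OR: P(Y=1 | active causes) = 1 - prod(1 - p_i)."""
            prob = 1.0
            for p in p_active:
                prob *= (1.0 - p)
            return 1.0 - prob

        def interval_noisy_or(intervals):
            """Interval version: the Noisy-OR output is increasing in every p_i, so
            plugging in the lower (resp. upper) bounds of the active causes gives the
            lower (resp. upper) bound on P(Y=1). Illustrative only."""
            lower = noisy_or([lo for lo, _ in intervals])
            upper = noisy_or([hi for _, hi in intervals])
            return lower, upper

        # two active causes whose link probabilities are only known up to intervals
        print(interval_noisy_or([(0.6, 0.8), (0.2, 0.5)]))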

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications


    Credal Fusion of Classifications for Noisy and Uncertain Data

    This paper reports on an investigation of classification techniques for noisy and uncertain data. Classification is not an easy task, and discovering knowledge from uncertain data is a significant challenge, for several reasons. Often there is no good or sufficiently large training database for supervised classification. When training data contain noise or missing values, classification accuracy is affected dramatically. Extracting groups from such data is therefore difficult: the groups overlap and are not well separated from each other. A further problem is the uncertainty introduced by measuring devices. Consequently, the resulting classification model is not robust enough to classify new objects. In this work, we present a novel classification algorithm that addresses these problems. We realize our main idea by using belief function theory to combine classification and clustering; this theory handles the imprecision and uncertainty involved in classification well. Experimental results show that our approach can significantly improve the quality of classification on generic databases.
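
    Belief function theory combines pieces of evidence with Dempster's rule; the Python sketch below shows that rule as an illustration of how a classifier's output and a clustering's output could be fused over a common frame of discernment. The paper's specific fusion scheme may differ, and the mass assignments in the example are made-up values.

        from itertools import product

        def dempster_combine(m1, m2):
            """Dempster's rule of combination for two mass functions.

            Masses are dicts mapping frozensets (focal elements of the frame of
            discernment) to masses summing to 1.
            """
            combined = {}
            conflict = 0.0
            for (a, wa), (b, wb) in product(m1.items(), m2.items()):
                inter = a & b
                if inter:
                    combined[inter] = combined.get(inter, 0.0) + wa * wb
                else:
                    conflict += wa * wb
            if conflict >= 1.0:
                raise ValueError("total conflict: sources cannot be combined")
            return {s: v / (1.0 - conflict) for s, v in combined.items()}

        # fusing two uncertain opinions over the classes {a, b}
        m_clf = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}   # classifier output
        m_clu = {frozenset({"b"}): 0.3, frozenset({"a", "b"}): 0.7}   # clustering output
        print(dempster_combine(m_clf, m_clu))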

    Extraction of decision rules via imprecise probabilities

    "This is an Accepted Manuscript of an article published by Taylor & Francis in International Journal of General Systems on 2017, available online: https://www.tandfonline.com/doi/full/10.1080/03081079.2017.1312359"Data analysis techniques can be applied to discover important relations among features. This is the main objective of the Information Root Node Variation (IRNV) technique, a new method to extract knowledge from data via decision trees. The decision trees used by the original method were built using classic split criteria. The performance of new split criteria based on imprecise probabilities and uncertainty measures, called credal split criteria, differs significantly from the performance obtained using the classic criteria. This paper extends the IRNV method using two credal split criteria: one based on a mathematical parametric model, and other one based on a non-parametric model. The performance of the method is analyzed using a case study of traffic accident data to identify patterns related to the severity of an accident. We found that a larger number of rules is generated, significantly supplementing the information obtained using the classic split criteria.This work has been supported by the Spanish "Ministerio de Economia y Competitividad" [Project number TEC2015-69496-R] and FEDER funds.Abellán, J.; López-Maldonado, G.; Garach, L.; Castellano, JG. (2017). Extraction of decision rules via imprecise probabilities. International Journal of General Systems. 46(4):313-331. https://doi.org/10.1080/03081079.2017.1312359S313331464Abellan, J., & Bosse, E. (2018). Drawbacks of Uncertainty Measures Based on the Pignistic Transformation. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 48(3), 382-388. doi:10.1109/tsmc.2016.2597267Abellán, J., & Klir, G. J. (2005). Additivity of uncertainty measures on credal sets. International Journal of General Systems, 34(6), 691-713. doi:10.1080/03081070500396915Abellán, J., & Masegosa, A. R. (2010). An ensemble method using credal decision trees. European Journal of Operational Research, 205(1), 218-226. doi:10.1016/j.ejor.2009.12.003(2003). International Journal of Intelligent Systems, 18(12). doi:10.1002/int.v18:12Abellán, J., Klir, G. J., & Moral, S. (2006). Disaggregated total uncertainty measure for credal sets. International Journal of General Systems, 35(1), 29-44. doi:10.1080/03081070500473490Abellán, J., Baker, R. M., & Coolen, F. P. A. (2011). Maximising entropy on the nonparametric predictive inference model for multinomial data. European Journal of Operational Research, 212(1), 112-122. doi:10.1016/j.ejor.2011.01.020Abellán, J., López, G., & de Oña, J. (2013). Analysis of traffic accident severity using Decision Rules via Decision Trees. Expert Systems with Applications, 40(15), 6047-6054. doi:10.1016/j.eswa.2013.05.027Abellán, J., Baker, R. M., Coolen, F. P. A., Crossman, R. J., & Masegosa, A. R. (2014). Classification with decision trees from a nonparametric predictive inference perspective. Computational Statistics & Data Analysis, 71, 789-802. doi:10.1016/j.csda.2013.02.009Alkhalid, A., Amin, T., Chikalov, I., Hussain, S., Moshkov, M., & Zielosko, B. (2013). Optimization and analysis of decision trees and rules: dynamic programming approach. International Journal of General Systems, 42(6), 614-634. doi:10.1080/03081079.2013.798902Chang, L.-Y., & Chien, J.-T. (2013). Analysis of driver injury severity in truck-involved accidents using a non-parametric classification tree model. Safety Science, 51(1), 17-22. 
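
    One common way to build a credal split criterion is to score a candidate split by the maximum (upper) entropy over the credal set induced by Walley's imprecise Dirichlet model from the class counts. The Python sketch below implements that idea with a simple water-filling allocation of the extra mass s; it is an assumed illustration of credal split criteria in general, not necessarily the exact parametric and non-parametric criteria used in the paper.

        import math

        def max_entropy_idm(counts, s=1.0):
            """Upper (maximum) entropy over the credal set of the imprecise
            Dirichlet model: each p_i lies in [n_i/(N+s), (n_i+s)/(N+s)].

            The extra mass s/(N+s) is spread over the smallest probabilities
            (water-filling), which maximises Shannon entropy over the credal set."""
            N = sum(counts)
            denom = N + s
            p = sorted(c / denom for c in counts)
            mass = s / denom
            i = 0
            while mass > 1e-12 and i < len(p) - 1:
                need = (p[i + 1] - p[i]) * (i + 1)   # mass to lift the i+1 smallest up to p[i+1]
                if need <= mass:
                    mass -= need
                    for j in range(i + 1):
                        p[j] = p[i + 1]
                    i += 1
                else:
                    lift = mass / (i + 1)
                    for j in range(i + 1):
                        p[j] += lift
                    mass = 0.0
            if mass > 1e-12:                          # all categories already level
                p = [q + mass / len(p) for q in p]
            return -sum(q * math.log2(q) for q in p if q > 0)

        # a credal split criterion prefers the attribute whose split minimises the
        # weighted upper entropy of the class counts in the resulting child nodes
        print(max_entropy_idm([8, 2, 0]))   # e.g. class counts in one candidate child node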