Conformal Credal Self-Supervised Learning
In semi-supervised learning, the paradigm of self-training refers to the idea
of learning from pseudo-labels suggested by the learner itself. Across various
domains, corresponding methods have proven effective and achieve
state-of-the-art performance. However, pseudo-labels typically stem from ad-hoc
heuristics, relying on the quality of the predictions though without
guaranteeing their validity. One such method, so-called credal self-supervised
learning, maintains pseudo-supervision in the form of sets of (instead of
single) probability distributions over labels, thereby allowing for a flexible
yet uncertainty-aware labeling. Again, however, there is no justification
beyond empirical effectiveness. To address this deficiency, we make use of
conformal prediction, an approach that comes with guarantees on the validity of
set-valued predictions. As a result, the construction of credal sets of labels
is supported by a rigorous theoretical foundation, leading to better calibrated
and less error-prone supervision for unlabeled data. Along with this, we
present effective algorithms for learning from credal self-supervision. An
empirical study demonstrates excellent calibration properties of the
pseudo-supervision, as well as the competitiveness of our method on several
benchmark datasets.
Comment: 26 pages, 5 figures, 10 tables, to be published at the 12th Symposium
on Conformal and Probabilistic Prediction with Applications (COPA 2023)
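To illustrate the kind of validity guarantee mentioned in the abstract above, the following is a minimal sketch of standard split conformal prediction for classification. It is not the paper's credal-set construction; the calibration values, the label names, and the helper names conformal_quantile and prediction_set are assumptions made up for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every label.
        return float("inf")
    return sorted(scores)[k - 1]

def prediction_set(probs, qhat):
    """Labels whose nonconformity score 1 - p does not exceed the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}

# Hypothetical calibration data: model probability assigned to the TRUE label.
cal_true_probs = [0.9, 0.8, 0.35, 0.6, 0.95, 0.7, 0.85, 0.3, 0.45, 0.5]
cal_scores = [1.0 - p for p in cal_true_probs]
qhat = conformal_quantile(cal_scores, alpha=0.2)

# Hypothetical predicted distribution for one unlabeled instance.
unlabeled_probs = {"cat": 0.5, "dog": 0.4, "bird": 0.1}
pseudo_label_set = prediction_set(unlabeled_probs, qhat)
print(qhat, sorted(pseudo_label_set))
```

Under exchangeability of calibration and test data, a set built this way contains the true label with probability at least 1 - alpha, which is the sense in which set-valued pseudo-supervision can be made valid rather than heuristic.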
Extraction of decision rules via imprecise probabilities
"This is an Accepted Manuscript of an article published by Taylor & Francis in International Journal of General Systems on 2017, available online: https://www.tandfonline.com/doi/full/10.1080/03081079.2017.1312359"
Data analysis techniques can be applied to discover important relations among features. This is the main objective of the Information Root Node Variation (IRNV) technique, a new method to extract knowledge from data via decision trees. The decision trees used by the original method were built using classic split criteria. The performance of new split criteria based on imprecise probabilities and uncertainty measures, called credal split criteria, differs significantly from the performance obtained using the classic criteria. This paper extends the IRNV method using two credal split criteria: one based on a mathematical parametric model, and the other based on a non-parametric model. The performance of the method is analyzed using a case study of traffic accident data to identify patterns related to the severity of an accident. We found that a larger number of rules is generated, significantly supplementing the information obtained using the classic split criteria.
This work has been supported by the Spanish "Ministerio de Economia y Competitividad" [Project number TEC2015-69496-R] and FEDER funds.
Abellán, J.; López-Maldonado, G.; Garach, L.; Castellano, JG. (2017). Extraction of decision rules via imprecise probabilities. International Journal of General Systems. 46(4):313-331. https://doi.org/10.1080/03081079.2017.1312359
High Dimensional Statistical Modelling with Limited Information
Modern scientific experiments often rely on different statistical tools, regularisation
being one of them. Regularisation methods are usually used to avoid overfitting, but
we may also want to use them for variable selection,
especially when the number of modelling parameters is higher than the total number
of observations. However, performing variable selection can often be difficult under limited
information and we may get a misspecified model. To overcome this issue, we propose a robust
variable selection routine using a Bayesian hierarchical model.
We adapt the framework of Narisetty and He to propose a novel spike and slab prior specification
for the regression coefficients. We take inspiration from the imprecise beta model and
use a set of beta distributions to specify the prior expectation of the selection probability.
We perform a robust Bayesian analysis over this set of distributions in order to
incorporate expert opinion in an efficient manner.
We also discuss novel results on likelihood-based approaches for variable selection.
We exploit the framework of the adaptive LASSO to propose sensitivity analyses
of LASSO-type problems. The sensitivity analysis also gives us a novel non-deterministic classifier
for high dimensional problems, which we illustrate using real datasets.
Finally, we illustrate our novel robust Bayesian variable selection using synthetic and real-world data.
We show the importance of prior elicitation in variable selection as well as model fitting and compare
our method with other Bayesian approaches for variable selection.
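As a rough illustration of the imprecise beta model that the abstract above takes inspiration from, the sketch below computes bounds on a posterior mean over a set of beta priors Beta(s*t, s*(1-t)) with t ranging over an interval. It covers only the basic model with Bernoulli observations, not the thesis's spike and slab prior; the function name posterior_mean_bounds and all numbers are assumptions chosen for illustration.

```python
def posterior_mean_bounds(successes, trials, s, t_lower, t_upper):
    """Bounds on the posterior mean over priors Beta(s*t, s*(1-t)),
    with prior expectation t anywhere in [t_lower, t_upper]."""
    if s <= 0:
        raise ValueError("prior strength s must be positive")
    if not 0.0 <= t_lower <= t_upper <= 1.0:
        raise ValueError("need 0 <= t_lower <= t_upper <= 1")
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    # For a single prior the posterior mean is (s*t + successes) / (s + trials).
    # It is increasing in t, so the bounds are attained at the interval ends.
    lower = (s * t_lower + successes) / (s + trials)
    upper = (s * t_upper + successes) / (s + trials)
    return lower, upper

# Hypothetical example: 3 successes in 10 trials, prior strength s = 2,
# prior expectation only known to lie between 0.25 and 0.75.
lower, upper = posterior_mean_bounds(3, 10, s=2.0, t_lower=0.25, t_upper=0.75)
print(lower, upper)
```

The width of the resulting interval, s * (t_upper - t_lower) / (s + trials), shrinks as more data arrive, which is one way to see how a set of priors expresses limited prior information without committing to a single expert opinion.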
Classification using set-valued Kalman filtering and Levi's decision theory
We consider the problem of using Levi's expected epistemic decision theory for classification when the hypotheses are of different informational values, conditioned on convex sets obtained from a set-valued Kalman filter. The background of epistemic utility decision theory with convex probabilities is outlined and a brief introduction to set-valued estimation is given. The decision theory is applied to a classifier in a multiple-target tracking scenario. A new probability density, appropriate for classification using the ratio of intensities, is introduced.