923 research outputs found

    Extraction of decision rules via imprecise probabilities

    "This is an Accepted Manuscript of an article published by Taylor & Francis in International Journal of General Systems on 2017, available online: https://www.tandfonline.com/doi/full/10.1080/03081079.2017.1312359" Data analysis techniques can be applied to discover important relations among features. This is the main objective of the Information Root Node Variation (IRNV) technique, a new method to extract knowledge from data via decision trees. The decision trees used by the original method were built using classic split criteria. The performance of new split criteria based on imprecise probabilities and uncertainty measures, called credal split criteria, differs significantly from the performance obtained using the classic criteria. This paper extends the IRNV method using two credal split criteria: one based on a mathematical parametric model, and the other based on a non-parametric model. The performance of the method is analyzed using a case study of traffic accident data to identify patterns related to the severity of an accident. We found that a larger number of rules is generated, significantly supplementing the information obtained using the classic split criteria. This work has been supported by the Spanish "Ministerio de Economia y Competitividad" [Project number TEC2015-69496-R] and FEDER funds. Abellán, J.; López-Maldonado, G.; Garach, L.; Castellano, JG. (2017). Extraction of decision rules via imprecise probabilities. International Journal of General Systems. 46(4):313-331. https://doi.org/10.1080/03081079.2017.1312359
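
    To make the idea of a credal split criterion concrete, here is a minimal sketch. It assumes that the parametric model is Walley's imprecise Dirichlet model (IDM) with hyperparameter s = 1, which the abstract does not state, and it scores a split by the drop in maximum entropy over the resulting credal sets. The function names `idm_max_entropy` and `credal_gain` and the toy counts are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, Sequence


def idm_max_entropy(counts: Sequence[float], s: float = 1.0) -> float:
    """Maximum Shannon entropy (bits) over the IDM credal set for `counts`.

    Under the assumed IDM, class c has probability at least n_c / (N + s).
    The extra mass s is poured onto the smallest counts until they reach a
    common level, which gives the most uniform admissible distribution.
    """
    total = sum(counts) + s
    ordered = sorted(counts)
    level = 0.0
    for k in range(1, len(ordered) + 1):
        # Level reached if the mass s is shared among the k smallest counts.
        level = (s + sum(ordered[:k])) / k
        if k == len(ordered) or level <= ordered[k]:
            break
    probs = [max(c, level) / total for c in counts]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def credal_gain(class_counts_by_value: Dict[str, Sequence[int]], s: float = 1.0) -> float:
    """Parent max entropy minus the weighted max entropy of each branch."""
    children = list(class_counts_by_value.values())
    parent = [sum(column) for column in zip(*children)]
    n = sum(parent)
    weighted = sum(sum(c) / n * idm_max_entropy(c, s) for c in children)
    return idm_max_entropy(parent, s) - weighted


# Toy class counts [slight, severe] for each value of a candidate attribute.
informative = {"night": [4, 0], "day": [0, 4]}
uninformative = {"night": [2, 2], "day": [2, 2]}
print(credal_gain(informative), credal_gain(uninformative))
```

    Unlike classic information gain, a score of this kind can be zero or negative for an attribute with little support in the data, so such an attribute would not be chosen for a split.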

    Decision Tree Ensemble Method for Analyzing Traffic Accidents of Novice Drivers in Urban Areas

    Presently, there is a critical need to analyze traffic accidents in order to mitigate their terrible economic and human impact. Most accidents occur in urban areas. Furthermore, driving experience has an important effect on accident analysis, since inexperienced drivers are more likely to suffer fatal injuries. This work studies the injury severity produced by accidents that involve inexperienced drivers in urban areas. The analysis was based on data provided by the Spanish General Traffic Directorate. The information root node variation (IRNV) method (based on decision trees) was used to obtain a rule set that provides useful information about the most probable causes of fatalities in accidents involving inexperienced drivers in urban areas. This knowledge may prove useful in preventing this kind of accident and/or mitigating its consequences. This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.

    Analysis of traffic accident severity using Decision Rules via Decision Trees

    A Decision Tree (DT) is a potential method for studying traffic accident severity. One of its main advantages is that Decision Rules can be extracted from its structure and used to identify safety problems and establish certain measures of performance. However, when only one DT is used, rule extraction is limited to the structure of that DT, and some important relationships between variables cannot be extracted. This paper presents a method for extracting rules from a DT more effectively. The method's effectiveness when applied to a particular traffic accident dataset is shown. Specifically, our study focuses on traffic accident data from rural roads in Granada (Spain) from 2003 to 2009 (both included). The results show that we can obtain more than 70 relevant rules from our data using the new method, whereas with only one DT we would have extracted only 5 rules from the same dataset. Abellán, J.; López-Maldonado, G.; De Oña, J. (2013). Analysis of traffic accident severity using Decision Rules via Decision Trees. Expert Systems with Applications. 40(15):6047-6054. doi:10.1016/j.eswa.2013.05.027
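
    To illustrate why a single tree limits what can be extracted, the sketch below reads one rule off each root-to-leaf path of a toy tree, so the rule set can never be larger than the number of leaves. The nested-dict tree format, the attribute names, and the helpers `extract_rules` and `format_rule` are hypothetical; the paper's own procedure for obtaining additional rules is not reproduced here.

```python
from typing import Dict, List, Tuple, Union

# A leaf is a class label (str). An internal node maps one attribute name
# to a dict of {attribute value: subtree}. This format is hypothetical.
Tree = Union[str, Dict[str, Dict[str, "Tree"]]]
Conditions = Tuple[Tuple[str, str], ...]
Rule = Tuple[Conditions, str]


def extract_rules(tree: Tree, path: Conditions = ()) -> List[Rule]:
    """Return one (conditions, label) rule per root-to-leaf path."""
    if isinstance(tree, str):  # reached a leaf: the path so far is a rule
        return [(path, tree)]
    rules: List[Rule] = []
    for attribute, branches in tree.items():
        for value, subtree in branches.items():
            rules.extend(extract_rules(subtree, path + ((attribute, value),)))
    return rules


def format_rule(rule: Rule) -> str:
    """Render a rule as readable IF ... THEN ... text."""
    conditions, label = rule
    body = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
    return f"IF {body} THEN severity = {label}"


# Toy tree with invented attributes and labels.
toy_tree: Tree = {
    "lighting": {
        "daylight": "slight",
        "night": {"road_type": {"rural": "severe", "urban": "slight"}},
    }
}

rules = extract_rules(toy_tree)
for rule in rules:
    print(format_rule(rule))
```

    Every rule from this tree starts with a condition on `lighting`, the root attribute, which is the kind of structural restriction the abstract refers to.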

    Fuzzy Logic

    This book introduces the capability of Fuzzy Logic in the development of emerging technologies. It consists of sixteen chapters showing various applications in the fields of Bioinformatics, Health, Security, Communications, Transportation, Financial Management, Energy and Environment Systems. The book is a major reference source for all those concerned with applied intelligent systems. The intended readers are researchers, engineers, medical practitioners, and graduate students interested in fuzzy logic systems.

    Road Traffic Congestion Analysis Via Connected Vehicles

    Road traffic congestion is a particular state of mobility in which travel times increase and more and more time is spent in vehicles. Apart from being a stressful experience for drivers, congestion also has a negative impact on the environment and the economy. In this context, there is pressure on the authorities to take decisive action to improve traffic flow on the road network. Improving network flow reduces congestion and decreases the total travel time of vehicles. Congestion can be classified as recurrent or non-recurrent (NRC). Recurrent congestion happens on a regular basis. Non-recurrent congestion in an urban network is mainly caused by incidents, work zones, special events and adverse weather. Infrastructure operators monitor traffic on the network while using the fewest possible resources, so the traffic state cannot be measured directly everywhere on the road network: the locations where traffic flow needs to be improved vary widely, and deploying highly sophisticated equipment to ensure accurate estimation of traffic flows and timely detection of events everywhere on the network is not feasible. Also, many studies have been devoted to highways rather than highly congested urban regions, which are intricate, complex networks and far more likely to be monitored by the traffic authorities. Moreover, current traffic data collection systems cannot register detailed information on the events happening on the road, such as vehicle crashes and adverse weather. Operators require external data sources to retrieve this information in real time. Current methods only detect congestion, which is not enough: we should be able to better characterize the event causing it. Agencies need to understand what is causing variability on their facilities, and to what degree, so that they can take appropriate action to mitigate congestion.

    Safety and Reliability - Safe Societies in a Changing World

    The contributions cover a wide range of methodologies and application areas for safety and reliability that contribute to safe societies in a changing world. These methodologies and applications include: foundations of risk and reliability assessment and management; mathematical methods in reliability and safety; risk assessment; risk management; system reliability; uncertainty analysis; digitalization and big data; prognostics and system health management; occupational safety; accident and incident modeling; maintenance modeling and applications; simulation for safety and reliability analysis; dynamic risk and barrier management; organizational factors and safety culture; human factors and human reliability; resilience engineering; structural reliability; natural hazards; security; economic analysis in risk management.


    There are two key concerns in the development of aviation: safety and cost. An airline that operates with high safety and low cost should be the most competitive in the market. This work presents two research efforts, one for each concern. First, a sequential multi-stage approach is developed to support a real-time Flight Risk Assessment and Mitigation System (FRAMS). The whole risk management process is considered in order to improve the safety of each flight, integrating the AHP and FTA techniques to describe all levels of risk through a risk score. Unlike traditional fault tree analysis, severity level, time level and synergy effect are taken into account when calculating the risk score for each flight. A risk tree is designed for risk data with a flat structure, and a time-sensitive optimization model is developed to support decisions on how to mitigate risk at as little cost as possible. A case study is solved in reasonable time, showing that the model is practical for a real-time system. Second, an intensely competitive environment makes cost control more and more important for airlines. An integrated approach is developed to improve the efficiency of reserve crew scheduling, which can help decrease cost. Unlike other techniques, this approach integrates demand forecasting, reserve pattern generation and optimization. A reserve forecasting tool is developed based on a large database. The output of this tool is the expected value of each type of dropped trip, based on the predicted dropping rate and the total scheduled trips. The rounding step used in currently applied methods is avoided in order to keep as much information as possible. The forecasting stage feeds the optimization stage through these expected values. A novel optimization model with a column generation algorithm is developed to generate patterns that cover these expected reserve demands while minimizing total cost. The many-to-many covering mode limits, as far as possible, the influence of forecasting errors caused by high uncertainty.

    Innovative Two-Stage Fuzzy Classification for Unknown Intrusion Detection

    Intrusion detection is an essential part of network security in combating illegal network access and malicious cyberattacks. Because cyberattacks evolve constantly, it is technically challenging for an intrusion detection system (IDS) to recognize unknown attacks, or known attacks with inadequate training data. In this dissertation, an innovative two-stage classifier is therefore developed to detect, accurately and efficiently, both unknown attacks and known attacks with insufficient or inaccurate training information. The novel two-stage fuzzy classification scheme is based on advanced machine learning techniques designed specifically to handle the ambiguity of traffic connections and network data. In the first stage of the classification, a fuzzy C-means (FCM) algorithm softly computes and optimizes the cluster centers of the training datasets, with a degree of fuzziness that accounts for feature inaccuracy and ambiguity in the training data. Subsequently, a distance-weighted k-NN (k-nearest neighbors) classifier, combined with Dempster-Shafer Theory (DST), is introduced to assess the belief functions and pignistic probabilities of the incoming data for each known class, further addressing data uncertainty in the cyberattack data. In the second stage, a subsequent classification scheme uses the obtained pignistic probabilities and their entropy functions to determine whether the input data are normal, one of the known attacks, or an unknown attack. In addition, to strengthen robustness to attacks, a three-layer hierarchical ensemble classifier is formed from the FCM weighted k-NN DST classifier to make more precise inferences than a single classifier. The proposed intrusion detection algorithm is evaluated on the KDD’99 datasets and their variants containing known and unknown attacks. The experimental results show that the new two-stage fuzzy KNN-DST classifier outperforms other well-known classifiers in intrusion detection and is especially effective in detecting unknown attacks.
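
    The second-stage decision rests on pignistic probabilities and their entropy. A minimal sketch follows, assuming the standard pignistic transformation (each focal set's mass is shared equally among its elements, with no mass on the empty set) and Shannon entropy. The class names, the example masses, and the rule "flag as unknown when entropy exceeds 0.9 bits" are invented for illustration and are not taken from the dissertation.

```python
import math
from typing import Dict, FrozenSet

MassFunction = Dict[FrozenSet[str], float]


def pignistic(masses: MassFunction) -> Dict[str, float]:
    """Share each focal set's mass equally among its elements."""
    betp: Dict[str, float] = {}
    for focal_set, mass in masses.items():
        if not focal_set:
            raise ValueError("mass on the empty set is not supported here")
        for element in focal_set:
            betp[element] = betp.get(element, 0.0) + mass / len(focal_set)
    return betp


def shannon_entropy(probabilities: Dict[str, float]) -> float:
    """Entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


def decide(masses: MassFunction, threshold: float = 0.9) -> str:
    """Hypothetical rule: high entropy means no known class fits well."""
    betp = pignistic(masses)
    if shannon_entropy(betp) > threshold:
        return "unknown"
    return max(betp, key=betp.get)


# Invented belief masses over two known classes.
confident: MassFunction = {
    frozenset({"normal"}): 0.6,
    frozenset({"dos"}): 0.1,
    frozenset({"normal", "dos"}): 0.3,
}
ambiguous: MassFunction = {frozenset({"normal", "dos"}): 1.0}

print(pignistic(confident), decide(confident), decide(ambiguous))
```

    In the first example the mass on the two-element set is split evenly, so the entropy stays below the threshold and the most probable known class is returned; in the second, all mass is uncommitted and the input is flagged as unknown.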
