
    Extraction of decision rules using genetic algorithms and simulated annealing for prediction of severity of traffic accidents by motorcyclists

    The objective of this study is to analyze motorcyclist accidents on the roads of Bogotá, Colombia. To detect the conditions related to crashes and their severity, the proposed model develops strategies to enhance road safety. In this context, data mining and machine learning techniques are used to investigate 34,232 motorcyclist accidents that occurred between January 2013 and February 2018. A genetic algorithm and simulated annealing are applied in conjunction with rule-mining measures (support, confidence, lift, and comprehensibility), in line with the objectives of the problem. The application of a hybrid algorithm allows for the creation and definition of optimal hierarchical decision rules for the prediction of the severity of motorcycle traffic accidents. The proposed method yields good results in recall (90.07%), precision (89.87%), and accuracy (90.06%) on the data set. These results improve prediction by 20–21% in comparison with the following methods: Decision Trees (CART, ID3, and C4.5), Support Vector Machines (SVMs), K-Nearest Neighbor (KNN), Naive Bayes, Neural Networks, Random Forest, and Random Tree. The proposed method defines 11 rules for the prediction of accidents with material damage, 24 rules for accidents with injuries, and 12 rules for accidents with fatalities. The variables that recur most often in the rule definitions are time, weather and road conditions, and the number of victims involved in the accidents. Finally, the interactions of the conditions and characteristics present in motorcycle accidents are analyzed, which contributes to the definition of countermeasures for road safety. © 2021, Springer-Verlag GmbH Germany, part of Springer Nature
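    The abstract above names support, confidence, and lift as rule-quality measures. As a minimal sketch of their standard definitions (not the authors' implementation), the following computes all three for a single rule. The function name `rule_metrics` and the toy records in `RECORDS` are hypothetical and are not drawn from the Bogotá data set.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy records; each record is a set of attribute values.
RECORDS: List[FrozenSet[str]] = [
    frozenset({"night", "rain", "injury"}),
    frozenset({"night", "rain", "injury"}),
    frozenset({"day", "dry", "damage"}),
    frozenset({"night", "dry", "damage"}),
]


def rule_metrics(
    records: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Dict[str, float]:
    """Return support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(records)
    # Count records that contain the antecedent, the consequent, and both.
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)

    support = n_both / n                        # P(A and C)
    confidence = n_both / n_a if n_a else 0.0   # P(C | A)
    # Lift compares the confidence with the base rate of the consequent.
    lift = confidence / (n_c / n) if n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


metrics = rule_metrics(RECORDS, frozenset({"night", "rain"}), frozenset({"injury"}))
print(metrics)
```

    A lift above 1 indicates that the antecedent makes the consequent more likely than its base rate. Comprehensibility, the fourth measure named in the abstract, is not defined there and is therefore left out of this sketch.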

    Survey of data mining approaches to user modeling for adaptive hypermedia

    The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

    An enhanced intelligent database engine by neural network and data mining

    An Intelligent Database Engine (IDE) is developed to solve any classification problem by providing two integrated features: decision-making by a backpropagation (BP) neural network (NN) and decision support by Apriori, a data mining (DM) algorithm. Previous experimental results show that the accuracy of NN (90%) and that of DM (60%) differ drastically. Thus, efforts to improve DM accuracy are crucial to ensure a well-balanced hybrid architecture. The poor DM performance is caused either by too few rules or by too many poor rules being generated in the classifier. The first problem is curbed by generating multiple-level rules, incorporating multiple attribute support and level confidence into the initial Apriori. The second problem is tackled by implementing two strengthening procedures, confidence and Bayes verification, to filter out the unpredictive rules. Experiments with more datasets are carried out to compare the performance of the initial and improved Apriori. Great improvement is obtained for the latter.
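    For readers unfamiliar with Apriori, the sketch below shows its textbook frequent-itemset search (level-wise candidate generation followed by subset pruning). It is only an illustration under stated assumptions: the function name `apriori` and the toy transactions are hypothetical, and the paper's extensions (multiple attribute support, level confidence, and confidence and Bayes verification) are not reproduced here.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(
    transactions: List[Set[str]], min_support: float
) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        # Measure the support of each candidate at this level.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)

        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            cand
            for cand in joined
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1))
        }
    return frequent


# Hypothetical toy transactions.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
for itemset, support in sorted(apriori(toy, 0.5).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

    Classification rules would then be derived from the frequent itemsets that contain a class label, which is the stage where the abstract's filtering procedures would apply.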

    Towards a Comprehensible and Accurate Credit Management Model: Application of four Computational Intelligence Methodologies

    The paper presents methods for classifying applicants into different categories of credit risk using four different computational intelligence techniques. The methodologies selected for the rule-based categorization task are (1) feedforward neural networks trained with second-order methods, (2) inductive machine learning, (3) hierarchical decision trees produced by grammar-guided genetic programming, and (4) fuzzy rule-based systems produced by grammar-guided genetic programming. The data used are both numerical and linguistic in nature and represent a real-world problem: deciding whether a loan should be granted, based on the financial details of customers applying for that loan at a specific private EU bank. We examine the proposed classification models with a sample of enterprises that applied for a loan, each of which is described by financial decision variables (ratios) and classified into one of four predetermined classes. Attention is given to the comprehensibility and ease of use of the acquired decision models. Results show that the application of the proposed methods can make the classification task easier and, in some cases, may significantly reduce the amount of required credit data. We consider that these methodologies may also enable the extraction of a comprehensible credit management model, or even the incorporation of a related decision support system in banking.