7,798 research outputs found

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, in developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular, biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

    Data mining as a tool for environmental scientists

    Over recent years, a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques, such as Artificial Neural Networks, Clustering, Case-Based Reasoning and, more recently, Bayesian Decision Networks, have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.

    The use of data-mining for the automatic formation of tactics

    This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques.

    Automated design of genetic programming of classification algorithms.

    Doctoral Degree. University of KwaZulu-Natal, Pietermaritzburg. Over the past decades, there has been an increase in the use of evolutionary algorithms (EAs) for data mining and knowledge discovery in a wide range of application domains. Data classification, a real-world application problem, is one of the areas in which EAs have been widely applied. Data classification has been extensively researched, resulting in the development of a number of EA-based classification algorithms. Genetic programming (GP) in particular has been shown to be one of the most effective EAs at inducing classifiers. It is widely accepted that the effectiveness of a parameterised algorithm like GP depends on its configuration. Currently, the design of GP classification algorithms is predominantly performed manually. Manual design follows an iterative trial-and-error approach, which has been shown to be a menial, non-trivial, time-consuming task that has a number of vulnerabilities. The research presented in this thesis is part of a large-scale initiative by the machine learning community to automate the design of machine learning techniques. The study investigates the hypothesis that automating the design of GP classification algorithms for data classification can still lead to the induction of effective classifiers. This research proposes using two evolutionary algorithms, namely a genetic algorithm (GA) and grammatical evolution (GE), to automate the design of GP classification algorithms. The proof-by-demonstration research methodology is used in the study to achieve the set objectives. To that end, two systems, namely a genetic algorithm system and a grammatical evolution system, were implemented for automating the design of GP classification algorithms. The classification performance of the automatically designed GP classifiers, i.e., GA-designed and GE-designed GP classifiers, was compared to that of manually designed GP classifiers on real-world binary-class and multiclass classification problems.
    The evaluation was performed on multiple domain problems obtained from the UCI machine learning repository and on two specific domains, cybersecurity and financial forecasting. The automatically designed classifiers were found to outperform the manually designed GP classifiers on all the problems considered in this study. GP classifiers evolved by GE were found to be suitable for binary classification problems, while those evolved by a GA were found to be suitable for multiclass classification problems. Furthermore, the automated design time was found to be less than the manual design time. Fitness landscape analysis of the design spaces searched by a GA and GE was carried out on all the classes of problems considered in this study. Grammatical evolution found the search to be smoother on binary classification problems, while the GA found multiclass problems to be less rugged than binary-class problems.

    A data-driven intelligent decision support system that combines predictive and prescriptive analytics for the design of new textile fabrics

    In this paper, we propose an Intelligent Decision Support System (IDSS) for the design of new textile fabrics. The IDSS uses predictive analytics to estimate fabric properties (e.g., elasticity) and composition values (% cotton) and then prescriptive techniques to optimize the fabric design inputs that feed the predictive models (e.g., types of yarns used). Using thousands of data records from a Portuguese textile company, we compared two distinct Machine Learning (ML) predictive approaches: Single-Target Regression (STR), via an Automated ML (AutoML) tool, and Multi-target Regression, via a deep learning Artificial Neural Network. For the prescriptive analytics, we compared two Evolutionary Multi-objective Optimization (EMO) methods (NSGA-II and R-NSGA-II) when optimizing 100 new fabrics, aiming to simultaneously minimize the physical property predictive error and the distance of the optimized values when compared with the learned input space. The two EMO methods were applied to the design of 100 new fabrics. Overall, the STR approach provided the best results for both prediction tasks, with Normalized Mean Absolute Error values that range from 4% (weft elasticity) to 11% (pilling) in terms of the fabric properties and a textile composition classification accuracy of 87% when adopting a small tolerance of 0.01 for predicting the percentages of six types of fibers (e.g., cotton). As for the prescriptive results, they favored the R-NSGA-II EMO method, which tends to select Pareto curves that are associated with an average 11% predictive error and 16% distance. This work was carried out within the project "TexBoost: less Commodities more Specialities", reference POCI-01-0247-FEDER-024523, co-funded by Fundo Europeu de Desenvolvimento Regional (FEDER), through Portugal 2020 (P2020).