98 research outputs found

    Credit Card Fraud Detection Using Asexual Reproduction Optimization

    As the number of credit card users has increased, detecting fraud in this domain has become a vital issue. Previous literature has applied various supervised and unsupervised machine learning methods to find an effective fraud detection system. However, some of these methods require an enormous amount of time to achieve reasonable accuracy. In this paper, an Asexual Reproduction Optimization (ARO) approach, which is a supervised method, was employed to detect credit card fraud. ARO is based on a kind of reproduction in which one parent produces some offspring. By applying this method and sampling just from the majority class, the effectiveness of the classification is increased. A comparison to Artificial Immune Systems (AIS), which is one of the best methods implemented on current datasets, has shown that the proposed method remarkably reduces the required training time and at the same time increases recall, which is important in fraud detection problems. The obtained results show that ARO achieves the best cost in a short time, and consequently, it can be considered a real-time fraud detection system.
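
The two ideas this abstract leans on, sampling from the majority class and measuring recall, can be illustrated with a short sketch. It assumes that "sampling just from the majority class" means random undersampling of the legitimate transactions, which is one common reading and not something the abstract spells out. The function names, the `ratio` parameter, and the toy labels are hypothetical.

```python
import random


def undersample_majority(rows, labels, majority_label=0, ratio=1.0, seed=0):
    """Keep every minority row and a random subset of majority rows.

    Assumption: `ratio` is the number of majority rows kept per minority row.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    keep = min(len(majority), int(ratio * len(minority)))
    chosen = sorted(minority + rng.sample(majority, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN): the share of real frauds that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy data: 95 legitimate (0) and 5 fraudulent (1) transactions.
toy_labels = [0] * 95 + [1] * 5
toy_rows = list(range(100))
balanced_rows, balanced_labels = undersample_majority(toy_rows, toy_labels)

# Two of the three real frauds are flagged in this made-up prediction.
example_recall = recall([1, 1, 0, 1], [1, 0, 0, 1])
```

Recall is the natural headline metric here because a missed fraud (a false negative) is usually costlier than a false alarm.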

    A privacy-preserving, distributed and cooperative FCM-based learning approach for Cancer Research

    Distributed Artificial Intelligence is attracting interest day by day. In this paper, the authors introduce an innovative methodology for distributed learning of Particle Swarm Optimization-based Fuzzy Cognitive Maps in a privacy-preserving way. The authors design a training scheme for collaborative FCM learning that offers data privacy compliant with the current regulation. This method is applied to a cancer detection problem, proving that the performance of the model is improved by the Federated Learning process, and obtaining similar results to the ones that can be found in the literature. Comment: Rough Sets: International Joint Conference, IJCRS 202

    A Learning Fuzzy Cognitive Map (LFCM) approach to predict student performance

    Aim/Purpose: This research aims to present a brand-new approach for student performance prediction using the Learning Fuzzy Cognitive Map (LFCM) approach. Background: Predicting student academic performance has long been an important research topic in many academic disciplines. Different mathematical models have been employed to predict student performance. Although the available sets of common prediction approaches, such as Artificial Neural Networks (ANN) and regression, work well with large datasets, they face challenges dealing with small sample sizes, limiting their practical applications in real practices. Methodology: Six distinct categories of performance antecedents are adopted here as course characteristics, LMS characteristics, student characteristics, student engagement, student support, and institutional factors, along with measurement items within each category. Furthermore, we assessed the student’s overall performance using three items of student satisfaction score, knowledge construction level, and student GPA. We have collected longitudinal data from 30 postgraduates in four subsequent semesters and analyzed data using the Learning Fuzzy Cognitive Map (LFCM) technique. Contribution: This research proposes a brand new approach, Learning Fuzzy Cognitive Map (LFCM), to predict student performance. Using this approach, we identified the most influential determinants of student performance, such as student engagement. Besides, this research depicts a model of interrelations among the student performance determinants. Findings: The results suggest that the model reasonably predicts the incoming sequence when there is a limited sample size. The results also reveal that students’ total online time and the regularity of learning interval in LMS have the largest effect on overall performance. The student engagement category also has the highest direct effect on student’s overall performance. 
Recommendations for Practitioners: Academic institutions can use the results and approach developed in this paper to identify students’ performance antecedents, predict the performance, and establish action plans to resolve the shortcomings in the long term. Instructors can adjust their learning methods based on the feedback from students in the short run on the operational level. Recommendation for Researchers: Researchers can use the proposed approach in this research to deal with the problems in other domains, such as using LMS for organizational/institutional education. Besides, they can focus on specific dimensions of the proposed model, such as exploring ways to boost student engagement in the learning process. Impact on Society: Our results revealed that students are at the center of the learning process. The degree to which they are dedicated to learning is the most crucial determinant of the learning outcome. Therefore, learners should consider this finding in order to gain value from the learning process. Future Research: As a potential for future works, the proposed approach could be used in other contexts to test its applicability. Future studies could also improve the performance level of the proposed LFCM model by tuning the model’s elements.

    Credit card fraud detection using asexual reproduction optimization

    Purpose – The best algorithm previously implemented on this Brazilian dataset was the artificial immune system (AIS) algorithm, but its time and cost are high. Using the asexual reproduction optimization (ARO) algorithm, the authors achieved better results, and therefore lower cost, in less time. Their framework addresses problems such as high costs and training time in credit card fraud detection. This simple and effective approach has achieved better results than the best techniques implemented on the dataset so far. The purpose of this paper is to detect credit card fraud using ARO. Design/methodology/approach – In this paper, the authors used the ARO algorithm to classify bank transactions as fraudulent or legitimate. ARO is inspired by asexual reproduction, a kind of reproduction in which one parent produces offspring identical to itself. In the ARO algorithm, an individual is represented by a vector of variables. Each variable is considered a chromosome, and each chromosome is represented by a binary string of genes. It is assumed that every generated answer exists in the environment and that, because of limited resources, only the best solution can remain alive. The algorithm starts with a random individual in the answer scope. This parent produces an offspring called a bud. Either the parent or the offspring can survive; in this competition, the one with the better fitness value remains alive. If the offspring performs better, it becomes the next parent, and the current parent becomes obsolete. Otherwise, the offspring perishes, and the present parent survives. The algorithm repeats until the stop condition is met. Findings – Results showed that ARO increased the AUC (i.e. area under the receiver operating characteristic (ROC) curve), sensitivity, precision, specificity and accuracy by 13%, 25%, 56%, 3% and 3%, respectively, in comparison with AIS.
The authors achieved a high precision value, indicating that if ARO flags a record as fraud, it is fraudulent with high probability. Supporting a real-time fraud detection system is another vital issue. ARO not only outperforms AIS on the mentioned criteria but also decreases the training time by 75% in comparison with AIS, which is a significant figure. Originality/value – In this paper, the authors implemented ARO for credit card fraud detection. The authors compared the results with those of AIS, which was one of the best methods ever implemented on the benchmark dataset. The chief focus of fraud detection studies is finding algorithms that can distinguish legal transactions from fraudulent ones with high detection accuracy in the shortest time and at a low cost. ARO meets all these demands.
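
The parent-versus-bud loop described in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The abstract does not say how the bud is produced, so the sketch assumes it is a copy of the parent with one random contiguous substring of bits redrawn. The `one_max` fitness function (count of 1-bits) is a hypothetical stand-in for a fraud-classification cost.

```python
import random


def aro_maximize(fitness, n_bits, max_iters=1000, seed=0):
    """Single-parent search: a bud replaces the parent only if it is fitter."""
    rng = random.Random(seed)
    # Start from one random individual encoded as a binary string.
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    parent_fit = fitness(parent)
    for _ in range(max_iters):
        # Assumed budding step: redraw a random contiguous substring.
        start = rng.randrange(n_bits)
        length = rng.randint(1, n_bits - start)
        bud = parent[:]
        for i in range(start, start + length):
            bud[i] = rng.randint(0, 1)
        bud_fit = fitness(bud)
        # Competition: the fitter individual survives as the next parent.
        if bud_fit > parent_fit:
            parent, parent_fit = bud, bud_fit
    return parent, parent_fit


def one_max(bits):
    """Hypothetical toy fitness: the number of 1-bits in the string."""
    return sum(bits)


best, best_fit = aro_maximize(one_max, n_bits=16, max_iters=2000)
```

Because only one individual is kept at a time, each iteration costs a single fitness evaluation, which is consistent with the short training times the abstract reports.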

    Hybrid approaches for mobile robot navigation

    The work described in this thesis contributes to the efficient solution of mobile robot navigation problems. A series of new evolutionary approaches is presented. Two novel evolutionary planners have been developed that reduce the computational overhead in generating plans of mobile robot movements. In comparison with the best-performing evolutionary scheme reported in the literature, the first of the planners significantly reduces the plan calculation time in static environments. The second planner was able to generate avoidance strategies in response to unexpected events arising from the presence of moving obstacles. To overcome limitations in responsiveness and the unrealistic assumptions regarding a priori knowledge that are inherent in planner-based navigation systems, subsequent work concentrated on hybrid approaches. These included a reactive component to identify rapidly and autonomously environmental features that were represented by a small number of critical waypoints. Not only is memory usage dramatically reduced by such a simplified representation, but the calculation time to determine new plans is also significantly reduced. Further significant enhancements of this work were, firstly, dynamic avoidance to limit the likelihood of potential collisions with moving obstacles and, secondly, exploration to identify statistically the dynamic characteristics of the environment. Finally, by retaining more extensive environmental knowledge gained during previous navigation activities, the capability of the hybrid navigation system was enhanced to allow planning to be performed for any start point and goal point.

    Time series data mining: preprocessing, analysis, segmentation and prediction. Applications

    Currently, the amount of data which is produced by any information system is increasing exponentially. This motivates the development of automatic techniques to process and mine these data correctly. Specifically, in this Thesis, we tackled these problems for time series data, that is, temporal data which is collected chronologically. This kind of data can be found in many fields of science, such as palaeoclimatology, hydrology, financial problems, etc. Time series data mining (TSDM) consists of several tasks which try to achieve different objectives, such as classification, segmentation, clustering, prediction, analysis, etc. However, in this Thesis, we focus on time series preprocessing, segmentation and prediction. Time series preprocessing is a prerequisite for other posterior tasks: for example, the reconstruction of missing values in incomplete parts of time series can be essential for clustering them. In this Thesis, we tackled the problem of massive missing data reconstruction in significant wave height (SWH) time series from the Gulf of Alaska. It is very common for buoys to stop working for different periods, which is usually related to malfunctioning or bad weather conditions. The relation between the time series of the buoys is analysed and exploited to reconstruct the whole missing time series. In this context, EANNs with PUs are trained, showing that the resulting models are simple and able to recover these values with high precision. In the case of time series segmentation, the procedure consists of dividing the time series into different subsequences to achieve different purposes. This segmentation can be done trying to find useful patterns in the time series. In this Thesis, we have developed novel bioinspired algorithms in this context. For instance, for paleoclimate data, an initial genetic algorithm was proposed to discover early warning signals of TPs, whose detection was supported by expert opinions.
However, given that the expert had to individually evaluate every solution given by the algorithm, the evaluation of the results was very tedious. This led to an improvement in the body of the GA to evaluate the procedure automatically. For significant wave height time series, the objective was the detection of groups which contain extreme waves, i.e. those which are relatively large with respect to other waves close in time. The main motivation is to design alert systems. This was done using an HA, where an LS process was included by using a likelihood-based segmentation, assuming that the points follow a beta distribution. Finally, the analysis of similarities in different periods of European stock markets was also tackled with the aim of evaluating the influence of different markets in Europe. When segmenting time series with the aim of reducing the number of points, different techniques have been proposed. However, it remains an open challenge given the difficulty of operating with large amounts of data in different applications. In this work, we propose a novel statistically-driven CRO algorithm (SCRO), which automatically adapts its parameters during the evolution, taking into account the statistical distribution of the population fitness. This algorithm improves the state-of-the-art with respect to accuracy and robustness. Also, this problem has been tackled using an improvement of the BBPSO algorithm, which includes a dynamical update of the cognitive and social components in the evolution, combined with mathematical tricks to obtain the fitness of the solutions, which significantly reduces the computational cost of previously proposed coral reef methods. Also, the optimisation of both objectives (clustering quality and approximation quality), which are in conflict, could be an interesting open challenge, which will be tackled in this Thesis.
For that, an MOEA for time series segmentation is developed, improving the clustering quality of the solutions and their approximation. The prediction in time series is the estimation of future values by observing and studying the previous ones. In this context, we solve this task by applying prediction over high-order representations of the elements of the time series, i.e. the segments obtained by time series segmentation. This is applied to two challenging problems, i.e. the prediction of extreme wave height and fog prediction. On the one hand, the number of extreme values in SWH time series is small relative to the number of standard values. Hence, the prediction of these values cannot be done using standard algorithms without taking into account the imbalanced ratio of the dataset. For that, an algorithm that automatically finds the set of segments and then applies EANNs is developed, showing the high ability of the algorithm to detect and predict these special events. On the other hand, fog prediction is affected by the same problem, that is, the number of fog events is much lower than that of non-fog events, requiring a special treatment too. A preprocessing of different data coming from sensors situated in different parts of the Valladolid airport is used for making a simple ANN model, which is physically corroborated and discussed. The last challenge which opens new horizons is the estimation of the statistical distribution of time series to guide different methodologies. For this, the estimation of a mixed distribution for SWH time series is then used for fixing the threshold of POT approaches. Also, the determination of the fittest distribution for the time series is used for discretising it and making a prediction which treats the problem as ordinal classification. The work developed in this Thesis is supported by twelve papers in international journals, seven papers in international conferences, and four papers in national conferences.

    Optimisation of a weightless neural network using particle swarms

    Among numerous pattern recognition methods, the neural network approach has been the subject of much research due to its ability to learn from a given collection of representative examples. This thesis is concerned with the design of weightless neural networks, which decompose a given pattern into several sets of n points, termed n-tuples. Considerable research has shown that by optimising the input connection mapping of such n-tuple networks, classification performance can be improved significantly. In this thesis, the application of a population-based stochastic optimisation technique, known as Particle Swarm Optimisation (PSO), to the optimisation of the connectivity pattern of such “n-tuple” classifiers is explored. The research was aimed at improving the discriminating power of the classifier in recognising handwritten characters by exploiting more efficient learning strategies. The proposed “learning” scheme searches for ‘good’ input connections of the n-tuples in the solution space and shrinks the search area step by step. It refines its search by attracting the particles to positions with good solutions in an iterative manner. At every iteration, the performance or fitness of each input connection is evaluated, so a reward and punishment based fitness function was modelled for the task. The original PSO was refined by combining it with other bio-inspired approaches like Self-Organized Criticality and Nearest Neighbour Interactions. The hybrid algorithms were adapted for the n-tuple system and the performance was measured in selecting better connectivity patterns. The Genetic Algorithm (GA) has been shown to accomplish the same goals as the PSO, so the performance and convergence properties of the GA were compared against those of the PSO in optimising input connections. Experiments were conducted to evaluate the proposed methods by applying the trained classifiers to recognise handprinted digits from a widely used database.
Results revealed the superiority of the particle swarm optimised training for the n-tuples over other algorithms including the GA. Low particle velocity in PSO was favourable for exploring more areas in the solution space and resulted in better recognition rates. Use of hybridisation was helpful and one of the versions of the hybrid PSO was found to be the best performing algorithm in finding the optimum set of input maps for the n-tuple network.
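
The attraction of particles toward good positions that this abstract describes can be illustrated with a generic global-best PSO. This sketch is not the thesis's n-tuple optimiser: it searches a continuous space, whereas the thesis optimises discrete input connections, and it uses a hypothetical `sphere` objective instead of the reward-and-punishment fitness function. The inertia weight `w` and acceleration coefficients `c1` and `c2` are conventional illustrative values, not values taken from the thesis.

```python
import random


def pso_minimize(objective, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Global-best PSO over the continuous box [-bound, bound]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # best position seen per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen overall
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward personal and global best positions.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Hypothetical toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, dim=3)
```

In this formulation, lowering `w` or the acceleration coefficients reduces particle velocity, which is the kind of setting the abstract reports as favourable for its recognition task.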

    Evolving machine learning and deep learning models using evolutionary algorithms

    Despite their great success in data mining, machine learning and deep learning models are still subject to material obstacles when tackling real-life challenges, such as feature selection, initialization sensitivity, and hyperparameter optimization. The prevalence of these obstacles has severely constrained conventional machine learning and deep learning methods from fulfilling their potential. In this research, three evolving machine learning models and one evolving deep learning model are proposed to eliminate the above bottlenecks, i.e. improving model initialization, enhancing feature representation, and optimizing model configuration, respectively, through hybridization between the advanced evolutionary algorithms and the conventional ML and DL methods. Specifically, two Firefly Algorithm based evolutionary clustering models are proposed to optimize cluster centroids in K-means and overcome initialization sensitivity as well as local stagnation. Secondly, a Particle Swarm Optimization based evolving feature selection model is developed for automatic identification of the most effective feature subset and reduction of feature dimensionality for tackling classification problems. Lastly, a Grey Wolf Optimizer based evolving Convolutional Neural Network-Long Short-Term Memory method is devised for automatic generation of the optimal topological and learning configurations for Convolutional Neural Network-Long Short-Term Memory networks to undertake multivariate time series prediction problems. Moreover, a variety of tailored search strategies are proposed to eliminate the intrinsic limitations embedded in the search mechanisms of the three employed evolutionary algorithms, i.e. the dictation of the global best signal in Particle Swarm Optimization, the constraint of the diagonal movement in Firefly Algorithm, and the acute contraction of search territory in Grey Wolf Optimizer, respectively.
The remedy strategies include the diversification of guiding signals, the adaptive nonlinear search parameters, the hybrid position updating mechanisms, as well as the enhancement of population leaders. As such, the enhanced Particle Swarm Optimization, Firefly Algorithm, and Grey Wolf Optimizer variants are more likely to attain global optimality on complex search landscapes embedded in data mining problems, owing to the elevated search diversity as well as the achievement of advanced trade-offs between exploration and exploitation.