5,577 research outputs found
Towards a framework for designing full model selection and optimization systems
People from a variety of industrial domains are beginning to realise that appropriate use of machine learning techniques for their data mining projects could bring great benefits. End-users now have to face the new problem of how to choose a combination of data processing tools and algorithms for a given dataset. This problem is usually termed the Full Model Selection (FMS) problem. Extended from our previous work [10], in this paper, we introduce a framework for designing FMS algorithms. Under this framework, we propose a novel algorithm combining both genetic algorithms (GA) and particle swarm optimization (PSO) named GPS (which stands for GA-PSO-FMS), in which a GA is used for searching the optimal structure for a data mining solution, and PSO is used for searching optimal parameters for a particular structure instance. Given a classification dataset, GPS outputs a FMS solution as a directed acyclic graph consisting of diverse data mining operators that are available to the problem. Experimental results demonstrate the benefit of the algorithm. We also present, with detailed analysis, two model-tree-based variants for speeding up the GPS algorithm
An Optimisation-Driven Prediction Method for Automated Diagnosis and Prognosis
open access articleThis article presents a novel hybrid classification paradigm for medical diagnoses and prognoses prediction. The core mechanism of the proposed method relies on a centroid classification algorithm whose logic is exploited to formulate the classification task as a real-valued optimisation problem. A novel metaheuristic combining the algorithmic structure of Swarm Intelligence optimisers with the probabilistic search models of Estimation of Distribution Algorithms is designed to optimise such a problem, thus leading to high-accuracy predictions. This method is tested over 11 medical datasets and compared against 14 cherry-picked classification algorithms. Results show that the proposed approach is competitive and superior to the state-of-the-art on several occasions
Elephant Search with Deep Learning for Microarray Data Analysis
Even though there is a plethora of research in Microarray gene expression
data analysis, still, it poses challenges for researchers to effectively and
efficiently analyze the large yet complex expression of genes. The feature
(gene) selection method is of paramount importance for understanding the
differences in biological and non-biological variation between samples. In
order to address this problem, a novel elephant search (ES) based optimization
is proposed to select best gene expressions from the large volume of microarray
data. Further, a promising machine learning method is envisioned to leverage
such high dimensional and complex microarray dataset for extracting hidden
patterns inside to make a meaningful prediction and most accurate
classification. In particular, stochastic gradient descent based Deep learning
(DL) with softmax activation function is then used on the reduced features
(genes) for better classification of different samples according to their gene
expression levels. The experiments are carried out on nine most popular Cancer
microarray gene selection datasets, obtained from UCI machine learning
repository. The empirical results obtained by the proposed elephant search
based deep learning (ESDL) approach are compared with most recent published
article for its suitability in future Bioinformatics research.Comment: 12 pages, 5 Tabl
Metaheuristic design of feedforward neural networks: a review of two decades of research
Over the past two decades, the feedforward neural network (FNN) optimization has been a key interest among the researchers and practitioners of multiple disciplines. The FNN optimization is often viewed from the various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers adopted such different viewpoints mainly to improve the FNN's generalization ability. The gradient-descent algorithm such as backpropagation has been widely applied to optimize the FNNs. Its success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of the gradient-based optimization methods, the metaheuristic algorithms including the evolutionary algorithms, swarm intelligence, etc., are still being widely explored by the researchers aiming to obtain generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies including conventional and metaheuristic approaches. This article also tries to connect various research directions emerged out of the FNN optimization practices, such as evolving neural network (NN), cooperative coevolution NN, complex-valued NN, deep learning, extreme learning machine, quantum NN, etc. Additionally, it provides interesting research challenges for future research to cope-up with the present information processing era
Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded
Decision trees usefully represent sparse, high dimensional and noisy data.
Having learned a function from this data, we may want to thereafter integrate
the function into a larger decision-making problem, e.g., for picking the best
chemical process catalyst. We study a large-scale, industrially-relevant
mixed-integer nonlinear nonconvex optimization problem involving both
gradient-boosted trees and penalty functions mitigating risk. This
mixed-integer optimization problem with convex penalty terms broadly applies to
optimizing pre-trained regression tree models. Decision makers may wish to
optimize discrete models to repurpose legacy predictive models, or they may
wish to optimize a discrete model that particularly well-represents a data set.
We develop several heuristic methods to find feasible solutions, and an exact,
branch-and-bound algorithm leveraging structural properties of the
gradient-boosted trees and penalty functions. We computationally test our
methods on concrete mixture design instance and a chemical catalysis industrial
instance
Batch Reinforcement Learning on the Industrial Benchmark: First Experiences
The Particle Swarm Optimization Policy (PSO-P) has been recently introduced
and proven to produce remarkable results on interacting with academic
reinforcement learning benchmarks in an off-policy, batch-based setting. To
further investigate the properties and feasibility on real-world applications,
this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a
novel reinforcement learning (RL) benchmark that aims at being realistic by
including a variety of aspects found in industrial applications, like
continuous state and action spaces, a high dimensional, partially observable
state space, delayed effects, and complex stochasticity. The experimental
results of PSO-P on IB are compared to results of closed-form control policies
derived from the model-based Recurrent Control Neural Network (RCNN) and the
model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not
only of interest for academic benchmarks, but also for real-world industrial
applications, since it also yielded the best performing policy in our IB
setting. Compared to other well established RL techniques, PSO-P produced
outstanding results in performance and robustness, requiring only a relatively
low amount of effort in finding adequate parameters or making complex design
decisions
Intelligent feature selection using particle swarm optimization algorithm with a decision tree for DDoS attack detection
The explosive development of information technology is increasingly rising cyber-attacks. Distributed denial of service (DDoS) attack is a malicious threat to the modern cyber-security world, which causes performance disruption to the network servers. It is a pernicious type of attack that can forward a large amount of traffic to damage one or all target’s resources simultaneously and prevents authenticated users from accessing network services. The paper aims to select the least number of relevant DDoS attack detection features by designing an intelligent wrapper feature selection model that utilizes a binary-particle swarm optimization algorithm with a decision tree classifier. In this paper, the Binary-particle swarm optimization algorithm is used to resolve discrete optimization problems such as feature selection and decision tree classifier as a performance evaluator to evaluate the wrapper model’s accuracy using the selected features from the network traffic flows. The model’s intelligence is indicated by selecting 19 convenient features out of 76 features of the dataset. The experiments were accomplished on a large DDoS dataset. The optimal selected features were evaluated with different machine learning algorithms by performance measurement metrics regarding the accuracy, Recall, Precision, and F1-score to detect DDoS attacks. The proposed model showed a high accuracy rate by decision tree classifier 99.52%, random forest 96.94%, and multi-layer perceptron 90.06 %. Also, the paper compares the outcome of the proposed model with previous feature selection models in terms of performance measurement metrics. This outcome will be useful for improving DDoS attack detection systems based on machine learning algorithms. It is also probably applied to other research topics such as DDoS attack detection in the cloud environment and DDoS attack mitigation systems
Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review
The influence of machine learning technologies is rapidly increasing and penetrating almost in every field, and air pollution prediction is not being excluded from those fields. This paper covers the revision of the studies related to air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. Using the most popular databases and executing the corresponding filtration, the most relevant papers were selected. After thorough reviewing those papers, the main features were extracted, which served as a base to link and compare them to each other. As a result, we can conclude that: (1) instead of using simple machine learning techniques, currently, the authors apply advanced and sophisticated techniques, (2) China was the leading country in terms of a case study, (3) Particulate matter with diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data had an hourly rate, (6) 49% of the papers used open data and since 2016 it had a tendency to increase, and (7) for efficient air quality prediction it is important to consider the external factors such as weather conditions, spatial characteristics, and temporal features
The SOS Platform: Designing, Tuning and Statistically Benchmarking Optimisation Algorithms
open access articleWe present Stochastic Optimisation Software (SOS), a Java platform facilitating the algorithmic design process and the evaluation of metaheuristic optimisation algorithms. SOS reduces the burden of coding miscellaneous methods for dealing with several bothersome and time-demanding tasks such as parameter tuning, implementation of comparison algorithms and testbed problems, collecting and processing data to display results, measuring algorithmic overhead, etc. SOS provides numerous off-the-shelf methods including: (1) customised implementations of statistical tests, such as the Wilcoxon rank-sum test and the Holm–Bonferroni procedure, for comparing the performances of optimisation algorithms and automatically generating result tables in PDF and formats; (2) the implementation of an original advanced statistical routine for accurately comparing couples of stochastic optimisation algorithms; (3) the implementation of a novel testbed suite for continuous optimisation, derived from the IEEE CEC 2014 benchmark, allowing for controlled activation of the rotation on each testbed function. Moreover, we briefly comment on the current state of the literature in stochastic optimisation and highlight similarities shared by modern metaheuristics inspired by nature. We argue that the vast majority of these algorithms are simply a reformulation of the same methods and that metaheuristics for optimisation should be simply treated as stochastic processes with less emphasis on the inspiring metaphor behind them
- …