Optimising Multilayer Perceptron weights and biases through a Cellular Genetic Algorithm for medical data classification
In recent years, technology in medicine has advanced significantly as artificial intelligence has become a framework for making accurate medical diagnoses. Models like Multilayer Perceptrons (MLPs) can detect implicit patterns in data, allowing the identification of patient conditions that cannot be seen easily. MLPs consist of biased neurons arranged in layers and connected by weighted connections. Their effectiveness depends on finding the optimal weights and biases that reduce the classification error, which is usually done using the Back Propagation (BP) algorithm. However, BP has several disadvantages that can prevent the MLP from learning. Metaheuristics are alternatives to BP that reach high-quality solutions without using many computational resources. In this work, the Cellular Genetic Algorithm (CGA), with a specially designed crossover operator called Damped Crossover (DX), is proposed to optimise the weights and biases of an MLP for classifying medical data. When compared against state-of-the-art algorithms, the CGA configured with DX obtained the minimal Mean Square Error value on three of the five considered medical datasets and was the quickest algorithm on four, showing a better balance between time consumed and optimisation performance. Additionally, it is competitive in enhancing classification quality, reaching the best accuracy on two datasets and the second-best accuracy on two of the remaining ones.
Fil: Rojas, Matias Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina. Universidad Nacional de Cuyo. Instituto para las Tecnologías de la Información y las Comunicaciones; Argentina.
Fil: Olivera, Ana Carolina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina. Universidad Nacional de Cuyo. Instituto para las Tecnologías de la Información y las Comunicaciones; Argentina. Universidad Nacional de Cuyo. Facultad de Ingeniería; Argentina.
Fil: Vidal, Pablo Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina. Universidad Nacional de Cuyo. Instituto para las Tecnologías de la Información y las Comunicaciones; Argentina. Universidad Nacional de Cuyo. Facultad de Ingeniería; Argentina.
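The cellular layout described in the abstract can be sketched in a few lines. The toy example below (an illustration under stated assumptions, not the authors' DX operator or their medical datasets) evolves the nine weights and biases of a 2-2-1 MLP on XOR; each individual sits on a toroidal grid and mates only with its best grid neighbour, and a plain arithmetic blend stands in for Damped Crossover:

```python
# Minimal cellular GA sketch: individuals live on a toroidal grid and mate
# only with grid neighbours. The paper's Damped Crossover is NOT reproduced;
# a simple arithmetic blend stands in for it, and XOR stands in for the
# medical classification tasks.
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def mlp_mse(w):
    """MSE of a 2-2-1 MLP whose weights/biases are packed into vector w."""
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    W2, b2 = w[6:8], w[8]
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))    # sigmoid output
    return np.mean((out - y) ** 2)

GRID = 4                                      # 4x4 toroidal grid of individuals
pop = rng.normal(0, 1, size=(GRID, GRID, 9))
fit = np.array([[mlp_mse(pop[i, j]) for j in range(GRID)] for i in range(GRID)])

for gen in range(300):
    for i in range(GRID):
        for j in range(GRID):
            # von Neumann neighbourhood on the torus
            nbrs = [((i - 1) % GRID, j), ((i + 1) % GRID, j),
                    (i, (j - 1) % GRID), (i, (j + 1) % GRID)]
            ni, nj = min(nbrs, key=lambda p: fit[p])
            a = rng.random()                  # arithmetic blend (DX placeholder)
            child = a * pop[i, j] + (1 - a) * pop[ni, nj]
            child += rng.normal(0, 0.1, size=9)
            cf = mlp_mse(child)
            if cf < fit[i, j]:                # replace-if-better
                pop[i, j], fit[i, j] = child, cf

best = fit.min()
print(best)
```

The replace-if-better rule keeps each cell's fitness monotone; the local mating restriction is what gives cellular GAs their slow, diffusion-like spread of good solutions.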
Predictive modeling of die filling of pharmaceutical granules using the flexible neural tree
In this work, a computational intelligence (CI) technique named the flexible neural tree (FNT) was developed to predict the die filling performance of pharmaceutical granules and to identify significant die filling process variables. The FNT resembles a feedforward neural network, creating a tree-like structure by means of genetic programming. To improve accuracy, the FNT parameters were optimized using the differential evolution algorithm. The performance of the FNT-based CI model was evaluated and compared with other CI techniques: multilayer perceptron, Gaussian process regression, and reduced error pruning tree. The accuracy of the CI model was evaluated experimentally using die filling as a case study. The die filling experiments were performed using a model shoe system and three different grades of microcrystalline cellulose (MCC) powders (MCC PH 101, MCC PH 102, and MCC DG). The feed powders were roll-compacted and milled into granules, which were then sieved into samples of various size classes. The mass of granules deposited into the die at different shoe speeds was measured. From these experiments, a dataset was generated consisting of true density, mean diameter (d50), granule size, and shoe speed as the inputs and the deposited mass as the output. Cross-validation (CV) methods, namely 10FCV and 5x2FCV, were applied to develop and validate the predictive models. It was found that the FNT-based CI model (for both CV methods) performed much better than the other CI models. Additionally, it was observed that process variables such as granule size and shoe speed had a higher impact on predictability than powder properties such as d50. Furthermore, validation of the model prediction against experimental data showed that the die filling behavior of coarse granules could be better predicted than that of fine granules.
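The parameter-optimisation step mentioned in the abstract can be illustrated with a bare-bones DE/rand/1/bin loop. The flexible-neural-tree structure itself is omitted here, and the quadratic target below is an arbitrary stand-in for the FNT parameters, so this is a sketch of the optimiser only:

```python
# Bare-bones DE/rand/1/bin, illustrating the parameter-optimisation step.
# The FNT model from the paper is omitted; DE here tunes the three
# coefficients of a quadratic fitted to noiseless synthetic data.
import numpy as np

rng = np.random.default_rng(1)
xs = np.linspace(-1, 1, 40)
target = 2.0 * xs**2 - 0.5 * xs + 0.3        # ground-truth coefficients

def loss(p):
    pred = p[0] * xs**2 + p[1] * xs + p[2]
    return np.mean((pred - target) ** 2)

NP, D, F, CR = 20, 3, 0.7, 0.9               # population, dim, scale, crossover
pop = rng.uniform(-2, 2, size=(NP, D))
fit = np.array([loss(p) for p in pop])

for _ in range(200):
    for i in range(NP):
        others = [k for k in range(NP) if k != i]
        a, b, c = pop[rng.choice(others, 3, replace=False)]
        mutant = a + F * (b - c)             # rand/1 mutation
        cross = rng.random(D) < CR
        cross[rng.integers(D)] = True        # ensure at least one gene crosses
        trial = np.where(cross, mutant, pop[i])
        tf = loss(trial)
        if tf <= fit[i]:                     # greedy one-to-one selection
            pop[i], fit[i] = trial, tf

best = pop[fit.argmin()]
print(best, fit.min())
```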
Personalized Health Monitoring Using Evolvable Block-based Neural Networks
This dissertation presents personalized health monitoring using evolvable block-based neural networks. Personalized health monitoring plays an increasingly important role in modern society as the population enjoys longer life. Personalization in health monitoring considers physiological variations brought about by temporal, personal, or environmental differences, and demands solutions capable of reconfiguring and adapting to specific requirements. Block-based neural networks (BbNNs) consist of 2-D arrays of modular basic blocks that can be easily implemented using reconfigurable digital hardware, such as field programmable gate arrays (FPGAs), that allows on-line partial reconfiguration. The modular structure of BbNNs enables easy expansion in size by adding more blocks. A computationally efficient evolutionary algorithm is developed that simultaneously optimizes the structure and weights of BbNNs. This evolutionary algorithm increases optimization speed by integrating a local search operator. An adaptive rate update scheme that removes manual tuning of operator rates improves the fitness trend compared to pre-determined fixed rates, and fitness scaling with generalized disruptive pressure reduces the possibility of premature convergence. The BbNN platform promises an evolvable solution that changes structures and parameters for personalized health monitoring. A BbNN evolved with the proposed evolutionary algorithm, using the Hermite transform coefficients and the time interval between two neighboring R peaks of the ECG signal, provides a patient-specific ECG heartbeat classification system. Experimental results using the MIT-BIH Arrhythmia database demonstrate the potential for significant performance enhancements over other major techniques.
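The general idea of adapting operator rates on-line, instead of hand-tuning them, can be illustrated with the classic 1/5-success rule (a generic textbook scheme, not the dissertation's specific update): the mutation step widens when improvements are frequent and shrinks when they are rare.

```python
# Sketch of adaptive operator-rate control in an evolutionary loop, using
# the classic 1/5-success rule (NOT the dissertation's scheme): widen the
# mutation step when improvements are frequent, shrink it when rare.
import numpy as np

rng = np.random.default_rng(2)

def sphere(x):                     # toy fitness: minimise the sum of squares
    return float(np.sum(x * x))

x = rng.normal(0, 1, size=8)
fx = sphere(x)
sigma = 0.5                        # mutation step size, adapted on-line

for epoch in range(50):
    successes = 0
    for _ in range(20):            # (1+1)-style trials per adaptation window
        trial = x + rng.normal(0, sigma, size=8)
        ft = sphere(trial)
        if ft < fx:
            x, fx = trial, ft
            successes += 1
    rate = successes / 20
    sigma *= 1.5 if rate > 0.2 else 0.6   # 1/5 rule: expand or contract

print(fx, sigma)
```

The step size tracks the distance to the optimum automatically, which is exactly the kind of manual tuning an adaptive rate scheme removes.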
A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification
Electrocardiogram (ECG) signals, which capture the heart's electrical
activity, are used to diagnose and monitor cardiac problems. The accurate
classification of ECG signals, particularly for distinguishing among various
types of arrhythmias and myocardial infarctions, is crucial for the early
detection and treatment of heart-related diseases. This paper proposes a novel
approach, based on an improved differential evolution (DE) algorithm, to enhance
the performance of ECG signal classification. In the initial stages of
our approach, the preprocessing step is followed by the extraction of several
significant features from the ECG signals. These extracted features are then
provided as inputs to an enhanced multi-layer perceptron (MLP). Although MLPs
are still widely used for ECG signal classification, gradient-based training
methods, the most common choice for the training process, have significant
disadvantages, such as the possibility of becoming stuck in local optima.
This paper employs an enhanced differential evolution (DE) algorithm
for the training process as one of the most effective population-based
algorithms. To this end, we improved DE based on a clustering-based strategy,
opposition-based learning, and a local search. Clustering-based strategies can
act as crossover operators, while the goal of the opposition operator is to
improve the exploration of the DE algorithm. The weights and biases found by
the improved DE algorithm are then fed into six gradient-based local search
algorithms. In other words, the weights found by the DE are employed as an
initialization point. Therefore, we introduced six different algorithms for the
training process (in terms of different local search algorithms). In an
extensive set of experiments, we showed that our proposed training algorithm
could provide better results than the conventional training algorithms.
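Two of the ingredients above, opposition-based initialisation and handing the evolutionary result to a gradient-based local search, can be sketched on a toy objective. The quadratic, bounds, and step size below are illustrative choices, not the paper's configuration:

```python
# Sketch of two ingredients from the abstract: opposition-based
# initialisation, and using the evolutionary result as the starting
# point of a gradient-based local search. Objective and steps are toys.
import numpy as np

rng = np.random.default_rng(3)
LO, HI, D = -5.0, 5.0, 4

def f(x):                               # smooth toy objective, minimum at x = 1
    return float(np.sum((x - 1.0) ** 2))

def grad(x):
    return 2.0 * (x - 1.0)

# Opposition-based initialisation: for each random point x, also evaluate
# its "opposite" LO + HI - x, and keep the best half of the combined pool.
pop = rng.uniform(LO, HI, size=(20, D))
opp = LO + HI - pop
both = np.vstack([pop, opp])
scores = np.array([f(p) for p in both])
pop = both[np.argsort(scores)[:20]]

# (The DE refinement loop would run here; skipped for brevity.)

# Hand the best individual to a plain gradient-descent local search.
x = pop[0].copy()
for _ in range(100):
    x -= 0.1 * grad(x)

print(f(x))
```

In the paper's memetic setting the local search would be one of the six gradient-based trainers; plain gradient descent stands in for all of them here.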
Adaptive nonlinear system identification and channel equalization using functional link artificial neural networks
In system theory, characterization and identification are fundamental problems. When the plant behavior is completely unknown, it may be characterized using a certain model, and its identification may then be carried out with artificial neural networks (ANNs) such as the multilayer perceptron (MLP) or the functional link artificial neural network (FLANN), using learning rules such as the back propagation (BP) algorithm. These networks offer flexibility, adaptability, and versatility, so that a variety of approaches may be used to meet a specific goal, depending upon the circumstances and the requirements of the design specifications. The primary aim of the present thesis is to provide a framework for the systematic design of adaptation laws for nonlinear system identification and channel equalization. While constructing an artificial neural network, the designer is often faced with the problem of choosing a network of the right size for the task. The advantages of using a smaller neural network are a cheaper cost of computation and better generalization ability. However, a network that is too small may never solve the problem, while a larger network may have the advantage of a faster learning rate. It therefore makes sense to start with a large network and then reduce its size. For this reason, a Genetic Algorithm (GA) based pruning strategy is reported. A GA is based upon the process of natural selection and does not require error gradient statistics; as a consequence, it is able to find a global error minimum. Transmission bandwidth is one of the most precious resources in digital communication systems. Communication channels are usually modeled as band-limited linear finite impulse response (FIR) filters with a low-pass frequency response. When the amplitude and the envelope delay response are not constant within the bandwidth of the filter, the channel distorts the transmitted signal, causing intersymbol interference (ISI).
The addition of noise during propagation also degrades the quality of the received signal. All the signal processing methods used at the receiver's end to compensate for the introduced channel distortion and recover the transmitted symbols are referred to as channel equalization techniques. When the nonlinearity associated with the system or the channel is strong, the number of branches in the FLANN increases, and in some cases the network still gives poor performance. To decrease the number of branches and increase performance, a two-stage FLANN called the cascaded FLANN (CFLANN) is proposed. This thesis presents a comprehensive study covering artificial neural network (ANN) implementations for nonlinear system identification and channel equalization. Three ANN structures, the MLP, FLANN, and CFLANN, and their conventional gradient-descent training methods are extensively studied. Simulation results demonstrate that the FLANN and CFLANN methods are directly applicable to a large class of nonlinear control systems and communication problems.
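The FLANN idea itself, a functional expansion of the input feeding a single adaptive linear layer, fits in a few lines. The sketch below identifies a toy cubic plant using a trigonometric expansion trained by LMS; the expansion order, plant, and step size are illustrative choices, not the thesis settings:

```python
# Minimal FLANN sketch: a trigonometric functional expansion of the input
# feeds a single linear layer trained with LMS. The plant, expansion order,
# and step size are toy choices for illustration.
import numpy as np

rng = np.random.default_rng(4)

def expand(u):
    """Trigonometric functional expansion of a scalar input u."""
    return np.array([1.0, u, np.sin(np.pi * u), np.cos(np.pi * u),
                     np.sin(2 * np.pi * u), np.cos(2 * np.pi * u)])

def plant(u):                       # "unknown" nonlinear plant to identify
    return 0.6 * u + 0.3 * u**2 - 0.1 * u**3

w = np.zeros(6)                     # weights of the single linear layer
mu = 0.05                           # LMS step size
for _ in range(5000):
    u = rng.uniform(-1, 1)
    phi = expand(u)
    e = plant(u) - w @ phi          # output error
    w += mu * e * phi               # LMS update

test_u = np.linspace(-1, 1, 50)
mse = np.mean([(plant(u) - w @ expand(u)) ** 2 for u in test_u])
print(mse)
```

Because all the nonlinearity lives in the fixed expansion, the trainable part stays linear, which is what makes FLANN much cheaper than an MLP with hidden layers.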
Training Multilayer Perceptron with Genetic Algorithms and Particle Swarm Optimization for Modeling Stock Price Index Prediction
Performance modelling for scalable deep learning
Performance modelling for scalable deep learning is very important to quantify the
efficiency of large parallel workloads. Performance models are used to obtain run-time
estimates by modelling various aspects of an application on a target system. Designing
performance models requires comprehensive analysis in order to build accurate models.
Limitations of current performance models include poor explainability of the
computation time of a neural network model's internal processes and
applicability restricted to particular architectures.
Existing performance models in deep learning fall broadly into two
methodologies: analytical modelling and empirical modelling. Analytical
modelling utilizes a transparent approach that involves converting the internal
mechanisms of the model or applications into a mathematical model that corresponds to
the goals of the system. Empirical modelling predicts outcomes based on observation and
experimentation, characterizes algorithm performance using sample data, and is a good alternative
to analytical modelling. However, both these approaches have limitations, such
as poor explainability in the computation time of the internal processes of a neural network
model and poor generalisation. To address these issues, hybridization of the analytical and
empirical approaches has been applied, leading to the development of a novel generic performance
model that provides a general expression of a deep neural network framework
in a distributed environment, allowing for accurate performance analysis and prediction.
The contributions can be summarized as follows:
In the initial study, a comprehensive literature review led to the development of a performance
model based on synchronous stochastic gradient descent (S-SGD) for analysing
the execution time performance of deep learning frameworks in a multi-GPU environment.
This model’s evaluation involved three deep learning models (Convolutional Neural Networks (CNN), Autoencoder (AE), and Multilayer Perceptron (MLP)), implemented in three popular deep learning frameworks (MXNet, Chainer, and TensorFlow) respectively, with a focus on following an analytical approach. Additionally, a generic expression for the performance model was formulated, considering intrinsic parameters and extrinsic scaling factors that impact computing time in a distributed environment. This formulation involved a global optimization problem with a cost function dependent on unknown constants within the generic expression. Differential evolution was utilized to identify the best-fitting values, matching experimentally determined computation times. Furthermore, to enhance the accuracy and stability of the performance model, regularization techniques were applied. Lastly, the proposed generic performance model underwent experimental evaluation in a real-world application. The results of this evaluation provided valuable insights into the influence of hyperparameters on performance, demonstrating the robustness and applicability of the performance model in understanding and optimizing model behavior.
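The fitting step can be illustrated as follows. The toy run-time expression below, T(n, p) = a*n/p + b*log2(p) + c, is linear in its unknown constants, so ordinary least squares stands in for the differential-evolution fit described in the thesis; the expression, measurements, and constants are all invented for illustration:

```python
# Sketch of fitting a generic run-time model T(n, p) = a*n/p + b*log2(p) + c
# to measured step times. The expression and the synthetic "measurements"
# are illustrative; least squares stands in for the DE fit.
import numpy as np

rng = np.random.default_rng(5)

# Synthetic measurements: batch size n, worker count p, noisy step time t
n = np.array([256, 256, 512, 512, 1024, 1024, 2048, 2048], dtype=float)
p = np.array([1, 4, 1, 4, 2, 8, 2, 8], dtype=float)
true_a, true_b, true_c = 3e-3, 0.05, 0.2
t = true_a * n / p + true_b * np.log2(p) + true_c + rng.normal(0, 1e-3, n.size)

# Design matrix for the three unknown constants (a, b, c)
A = np.column_stack([n / p, np.log2(p), np.ones_like(n)])
(a, b, c), *_ = np.linalg.lstsq(A, t, rcond=None)

pred = a * 4096 / 8 + b * np.log2(8) + c   # predict an unseen configuration
print(a, b, c, pred)
```

When the generic expression is nonlinear in its constants, as in the thesis, a global optimiser such as differential evolution replaces the least-squares solve, but the fit-then-predict workflow is the same.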
Efficacy of Neural Prediction-Based NAS for Zero-Shot NAS Paradigm
In prediction-based Neural Architecture Search (NAS), performance indicators
derived from graph convolutional networks have shown significant success. These
indicators, achieved by representing feed-forward structures as component
graphs through one-hot encoding, face a limitation: their inability to evaluate
architecture performance across varying search spaces. In contrast, handcrafted
performance indicators (zero-shot NAS), which use the same architecture with
random initialization, can generalize across multiple search spaces. Addressing
this limitation, we propose a novel approach for zero-shot NAS using deep
learning. Our method employs Fourier sum of sines encoding for convolutional
kernels, enabling the construction of a computational feed-forward graph with a
structure similar to the architecture under evaluation. These encodings are
learnable and offer a comprehensive view of the architecture's topological
information. An accompanying multi-layer perceptron (MLP) then ranks these
architectures based on their encodings. Experimental results show that our
approach surpasses previous methods using graph convolutional networks in terms
of correlation on the NAS-Bench-201 dataset and exhibits a higher convergence
rate. Moreover, our extracted feature representation trained on each
NAS-Benchmark is transferable to other NAS-Benchmarks, showing promising
generalizability across multiple search spaces. The code is available at:
https://github.com/minh1409/DFT-NPZS-NAS