459 research outputs found

    Regularised feed forward neural networks for streamed data classification problems

    DATA AVAILABILITY: Data will be made available on request.
    SUPPLEMENTARY MATERIAL: MMC S1. The supplementary material contains empirical results, performance graphs, illustrations, pseudocode and equations.
    Streamed data classification problems (SDCPs) require classifiers not only to find the optimal decision boundaries that describe the relationships within a data stream, but also to adapt to changes in those decision boundaries in real time. The requirement is due to concept drift, i.e., incorrect classifications caused by decision boundaries changing over time; changes include decision boundaries disappearing, appearing or shifting. This article proposes an online learning approach for feed forward neural networks (FFNNs) that meets the requirements of SDCPs. The approach uses regularisation to dynamically optimise the architecture, and quantum particle swarm optimisation (QPSO) to dynamically adjust the weights. The learning approach is applied to an FFNN with rectified linear activation functions to form a novel SDCP classifier. The classifier is empirically investigated on several SDCPs, with both weight decay (WD) and weight elimination (WE) investigated as regularisers. Empirical results show that using QPSO with no regularisation causes the classifier to saturate completely, whereas QPSO with regularisation makes the classifier efficient at dynamically adapting both its architecture and weights as decision boundaries change. Furthermore, the results favour WE over WD as a regulariser for QPSO.
    Funding: The National Research Foundation (NRF) of South Africa.
    Journal: https://www.elsevier.com/locate/engappai
    Subjects: Computer Science; SDG-09: Industry, innovation and infrastructure
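    The distinction between the two regularisers is worth making concrete. Below is a minimal sketch of the two penalty terms, assuming the standard formulations (weight decay as a sum of squared weights; weight elimination in the Weigend et al. form). The names lam and w0 are illustrative hyper-parameter names; the article's exact objective and QPSO encoding are not given in the abstract.

```python
import numpy as np

def weight_decay_penalty(weights, lam=1e-3):
    # Weight decay (WD): penalises every weight in proportion to its
    # squared magnitude, shrinking all weights uniformly.
    return lam * np.sum(weights ** 2)

def weight_elimination_penalty(weights, lam=1e-3, w0=1.0):
    # Weight elimination (WE), standard Weigend et al. form: the penalty
    # saturates for |w| >> w0, so large weights are tolerated while small
    # weights are pushed towards zero and can be eliminated.
    sq = (weights / w0) ** 2
    return lam * np.sum(sq / (1.0 + sq))

# A swarm particle would then minimise a regularised objective of the
# form: f(w) = classification_error(w) + penalty(w).
```

    WE's saturating penalty prunes small weights, and hence redundant hidden units, without dragging down the large weights the classifier still needs, which is consistent with the reported preference for WE over WD.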

    Medical Image Classification Using Transfer Learning and Network Pruning Algorithms

    Deep neural networks have shown great advances in recent decades in classifying medical images (such as CT scans) with high precision to aid disease diagnosis. However, training deep neural networks requires significant sample sizes for learning enriched discriminative spatial features. Building a high-quality dataset large enough to satisfy model training requirements is a challenging task owing to limited disease sample cases and various data privacy constraints. Therefore, in this research, we perform medical image classification using transfer learning based on several well-known deep networks, i.e. GoogLeNet, ResNet and EfficientNet. To tackle data sparsity issues, a Wasserstein Generative Adversarial Network (WGAN) is used to generate new medical image samples to increase the number of training instances of the minority classes. The transfer learning process itself also allows strong classifiers to be built by transferring knowledge from the pre-trained image domain to a new medical domain using a small sample size. Moreover, the lottery ticket hypothesis is used to prune each transfer learning network trained on the new target image datasets. Specifically, the L1-norm unstructured pruning technique is used for network reduction. Hyper-parameter fine-tuning is also performed to identify optimal settings of key network hyper-parameters such as learning rate, batch size and weight decay. A total of 20 trials are used for optimal hyper-parameter selection. Evaluated using multi-class lung X-ray images for pneumonia conditions and brain tumor CT scans, the fine-tuned EfficientNet model obtains the best brain tumor classification accuracy rate of 96%, and a fine-tuned GoogLeNet model with pruning has the highest pneumonia classification accuracy rate of 81.5%.
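    As a concrete illustration of the pruning step, the following sketch applies PyTorch's built-in L1-norm unstructured pruning to a pre-trained GoogLeNet whose classifier head has been replaced for a new target task. The class count (4) and pruning ratio (0.3) are illustrative assumptions; the paper's actual values are not stated in the abstract.

```python
import torch
import torch.nn.utils.prune as prune
from torchvision import models

# Transfer learning: load an ImageNet-pre-trained GoogLeNet and replace
# its final layer for a hypothetical 4-class medical imaging task.
model = models.googlenet(weights=models.GoogLeNet_Weights.DEFAULT)
model.fc = torch.nn.Linear(model.fc.in_features, 4)

# ... fine-tune on the target medical image dataset here ...

# L1-norm unstructured pruning: zero out the 30% of weights with the
# smallest absolute value in each convolutional layer.
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the mask into the weights
```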

    Advances in optimisation algorithms and techniques for deep learning

    In the last decade, deep learning (DL) has achieved excellent performance on a variety of problems, including speech recognition, object recognition and detection, and natural language processing (NLP), among many others. Across these applications, one common challenge is obtaining ideal parameters during the training of deep neural networks (DNNs). These parameters are obtained by optimisation techniques which have been studied extensively, and this research has produced state-of-the-art (SOTA) results on speed and memory improvements for deep neural network architectures. However, SOTA optimisers remain an active research area, with no compilation of the existing optimisers reported in the literature. This paper provides an overview of recent advances in the optimisation algorithms and techniques used in DNNs, highlighting the current SOTA optimisers, the improvements made to these optimisation algorithms and techniques, and trends in the development of optimisers used in training DL-based models. A search of the Scopus database for optimisers in DL provides the articles summarised here. As far as we can tell, there is no comprehensive compilation of the optimisation algorithms and techniques developed and used in DL research and applications, and this paper summarises them.
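    As an example of the kind of optimiser such a survey covers, here is a minimal sketch of a single Adam update step (Kingma & Ba, 2015), one of the most widely used DL optimisers; the variable names are illustrative.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam combines momentum (first moment) with per-parameter step
    # scaling by the gradient's second moment.
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias corrections for the
    v_hat = v / (1 - beta2 ** t)              # zero-initialised moments
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```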

    Evolving and Ensembling Deep CNN Architectures for Image Classification

    Deep convolutional neural networks (CNNs) have traditionally been hand-designed owing to the complexity of their construction and the computational requirements of their training. Recently, however, there has been increased research interest in automatically designing deep CNNs for specific tasks. Ensembling has been shown to effectively increase the performance of deep CNNs, although usually with a duplication of work and therefore a large increase in the computational resources required. In this paper we present a method for automatically designing and ensembling deep CNN models with a central weight repository to avoid work duplication. The models are trained and optimised together using particle swarm optimisation (PSO), with architecture convergence encouraged. At the conclusion of the joint optimisation and training process, a base model nomination method is used to determine the best candidates for the ensemble. Two base model nomination methods are proposed: one using the local best particle positions from the PSO process, and one using the contents of the central weight repository. Once the base model pool has been created, the individual models inherit their parameters from the central weight repository and are then fine-tuned and ensembled to create a final system. We evaluate our system on the CIFAR-10 classification dataset and demonstrate improved results over the single global best model suggested by the optimisation process, with a minor increase in the resources required by the fine-tuning process. Our system achieves an error rate of 4.27% on the CIFAR-10 image classification task with only 36 hours of combined optimisation and training on a single NVIDIA GTX 1080Ti GPU.
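    For reference, the canonical PSO velocity and position update that drives this kind of joint optimisation looks as follows. The inertia and acceleration coefficients shown are common defaults, not the paper's settings, and the paper's particle encoding of CNN architectures is not described in the abstract.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.72, c1=1.49, c2=1.49, rng=None):
    # Canonical PSO update: each particle keeps some of its velocity
    # (inertia w), is pulled towards its own best position (cognitive
    # term c1) and towards the swarm's best position (social term c2).
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v, v
```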

    Predicting HIV Status Using Neural Networks and Demographic Factors

    Student Number: 0006036T - MSc(Eng) project report - School of Electrical and Information Engineering - Faculty of Engineering and the Built Environment
    Demographic and medical history information obtained from annual South African antenatal surveys is used to estimate the risk of acquiring HIV. The estimation system consists of a classifier: a neural network trained to perform binary classification, using supervised learning with the survey data. The survey information contains discrete variables such as age, gravidity and parity, as well as the qualitative variables race and location, making up the input to the neural network. HIV status is the output. A multilayer perceptron with a logistic output function is trained with a cross-entropy error function, providing a probabilistic interpretation of the output. Predictive and classification performance is measured, and the sensitivity and specificity are illustrated on the receiver operating characteristic (ROC) curve. An auto-associative neural network is trained on complete datasets, and when presented with partial data, global optimisation methods are used to approximate the missing entries. The effect of the imputed data on the network prediction is investigated.
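    A minimal sketch of the classifier described here: a one-hidden-layer perceptron with a logistic output trained under binary cross-entropy, so the output is readable as a probability of HIV-positive status. The tanh hidden activation is an assumption; the report's exact architecture is not given in the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # One hidden layer (tanh assumed) with a logistic output unit, so the
    # network output can be read as P(HIV positive | demographic inputs).
    h = np.tanh(x @ W1 + b1)
    return sigmoid(h @ W2 + b2)

def cross_entropy(y_true, y_pred, eps=1e-12):
    # The cross-entropy error function named in the abstract; clipping
    # avoids log(0) for saturated outputs.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))
```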