
    Review of Deep Learning Algorithms and Architectures

    Deep learning (DL) is playing an increasingly important role in our lives. It has already made a huge impact in areas such as cancer diagnosis, precision medicine, self-driving cars, predictive forecasting, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems are not scalable for large-sized data sets. In many cases, depending on the problem complexity, DL can also overcome the limitations of earlier shallow networks that prevented efficient training and abstraction of hierarchical representations of multi-dimensional training data. A deep neural network (DNN) uses multiple (deep) layers of units with highly optimized algorithms and architectures. This paper reviews several optimization methods that improve training accuracy and reduce training time. We delve into the math behind the training algorithms used in recent deep networks, and describe current shortcomings, enhancements, and implementations. The review also covers different types of deep architectures, such as deep convolutional networks, deep residual networks, recurrent neural networks, reinforcement learning, variational autoencoders, and others.
    https://doi.org/10.1109/ACCESS.2019.291220

    Enhanced Deep Network Designs Using Mitochondrial DNA Based Genetic Algorithm And Importance Sampling

    Machine learning (ML) is playing an increasingly important role in our lives. It has already made a huge impact in areas such as cancer diagnosis, precision medicine, self-driving cars, natural disaster prediction, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems are not scalable for large-sized datasets or adaptable to different classes of problems or domains. Machine learning's resurgence in the form of deep learning (DL) in the last decade, after multiple AI (artificial intelligence) winters and hype cycles, is a result of the convergence of advances in training algorithms, the availability of massive data (big data), and innovation in compute resources (GPUs and the cloud). If we want to solve more complex problems with machine learning, we need to optimize all three of these areas: algorithms, datasets, and compute. Our dissertation research presents an original application of the nature-inspired idea of mitochondrial DNA (mtDNA) to improve deep learning network design. Additional fine-tuning is provided with a Monte Carlo based method called importance sampling (IS). The primary performance indicators for machine learning are model accuracy, loss, and training time. The goal of our dissertation is to provide a framework that addresses all of these areas by optimizing network designs (in the form of hyperparameter optimization) and the dataset using an enhanced genetic algorithm (GA) and importance sampling. Algorithms are by far the most important aspect of machine learning. We demonstrate the application of mitochondrial DNA to complement the standard genetic algorithm for architecture optimization of a deep convolutional neural network (CNN). We use importance sampling to reduce dataset variance and to sample more often from the instances that add greater value from the training-outcome perspective. Finally, we leverage the massively parallel and distributed processing of GPUs in the cloud to speed up training. Thus, our multi-approach method for enhancing deep learning combines architecture optimization, dataset optimization, and the power of the cloud to drive better model accuracy and reduce training time.
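    The mtDNA idea in the abstract above can be sketched in a few lines: hyperparameters are split into "nuclear" genes that recombine by crossover, and mtDNA-like genes inherited from a single parent and changed only by rare mutation. All names, value ranges, and the toy fitness surrogate below are illustrative assumptions, not the dissertation's actual search space (a real run would train and validate a CNN per genome):

    ```python
    import random

    random.seed(0)

    # "Nuclear" genes recombine via crossover; the mtDNA-inspired genes are
    # inherited from one parent only and change solely by mutation.
    NUCLEAR = {"layers": [2, 3, 4, 5], "filters": [16, 32, 64]}
    MTDNA = {"lr": [1e-1, 1e-2, 1e-3], "batch": [32, 64, 128]}

    def random_genome():
        g = {k: random.choice(v) for k, v in NUCLEAR.items()}
        g.update({k: random.choice(v) for k, v in MTDNA.items()})
        return g

    def fitness(g):
        # Toy surrogate for validation accuracy (a real run trains a CNN here).
        return (-abs(g["layers"] - 4) - abs(g["filters"] - 32) / 32
                - abs(g["lr"] - 1e-2) * 10)

    def crossover(mother, father):
        child = {k: random.choice([mother[k], father[k]]) for k in NUCLEAR}
        child.update({k: mother[k] for k in MTDNA})  # maternal inheritance
        if random.random() < 0.2:                    # rare mtDNA mutation
            k = random.choice(list(MTDNA))
            child[k] = random.choice(MTDNA[k])
        return child

    pop = [random_genome() for _ in range(12)]
    for gen in range(15):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:6]
        pop = elite + [crossover(random.choice(elite), random.choice(elite))
                       for _ in range(6)]

    best = max(pop, key=fitness)
    print(best)
    ```

    The maternal-only inheritance keeps the mtDNA genes on a slower, mutation-driven search path than the freely recombining nuclear genes, which mirrors the nature-inspired mechanism the dissertation describes.
    
    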

    Neural-Kalman Schemes for Non-Stationary Channel Tracking and Learning

    This Thesis focuses on channel tracking in Orthogonal Frequency-Division Multiplexing (OFDM), a widely used method of data transmission in wireless communications, when abrupt changes occur in the channel. In highly mobile applications, new dynamics appear that might make channel tracking non-stationary: channels might vary with location, and location rapidly varies with time. Simple examples are the different channel dynamics a train receiver faces when it is close to a station vs. crossing a bridge vs. entering a tunnel, or a car receiver on a route that grows more traffic-dense. Some of these dynamics can be modelled as channel taps dying or being reborn, and so tap birth-death detection is of the essence. In order to improve the quality of communications, we delved into mathematical methods to detect such abrupt changes in the channel, drawing on Sequential Analysis/Abrupt Change Detection and Random Set Theory (RST), as well as engineering advances in neural network schemes. This knowledge helped us find a solution to the problem of abrupt change detection by informing and inspiring the creation of low-complexity implementations for real-world channel tracking. In particular, two such novel trackers were created: the Simplified Maximum A Posteriori (SMAP) and the Neural-Network-switched Kalman Filtering (NNKF) schemes. The SMAP is a computationally inexpensive, threshold-based abrupt-change detector. It applies the three following heuristics for tap birth-death detection: a) detect death if the tap gain jumps to approximately zero (memoryless detection); b) detect death if the tap gain has slowly converged to approximately zero (memory detection); c) detect birth if the tap gain is far from zero. The precise parameters for these three simple rules can be approximated with simple theoretical derivations and then fine-tuned through extensive simulations.
    The status detector for each tap, using only these three computationally inexpensive threshold comparisons, achieves an error reduction matching that of a close-to-perfect path death/birth detector, as shown in simulations. This estimator was shown to greatly reduce channel tracking error in the target Signal-to-Noise Ratio (SNR) range at a very small computational cost, thus outperforming previously known systems. The underlying RST framework for the SMAP was then extended to combined death/birth and SNR detection when the SNR is dynamical and may drift. We analyzed how different quasi-ideal SNR detectors affect the SMAP-enhanced Kalman tracker's performance. Simulations showed the SMAP is robust to SNR drift, although it was also shown to benefit from accurate SNR detection. The core idea behind the second novel tracker, the NNKF, is similar to the SMAP, but the tap birth/death detection is performed by an artificial neural network (NN). Simulations show that the proposed NNKF estimator provides extremely good performance, practically identical to a detector with 100% accuracy. These proposed Neural-Kalman schemes can work as novel trackers for multipath channels, since they are robust to wide variations in the probabilities of tap birth and death. Such robustness suggests a single, low-complexity NNKF could be reusable over different tap indices and communication environments. Furthermore, a different kind of abrupt change was proposed and analyzed: energy shifts from one channel tap to adjacent taps (partial tap lateral hops). This Thesis also discusses how to model, detect and track such changes, providing a geometric justification for this and additional non-stationary dynamics in vehicular situations, such as road scenarios where reflections on trucks and vans are involved, or the visual appearance/disappearance of drone swarms.
    An extensive literature review of empirically backed abrupt-change dynamics in channel modelling/measuring campaigns is included. For this generalized framework of abrupt channel changes, which includes partial tap lateral hopping, a neural detector for lateral hops with large energy transfers is introduced. Simulation results suggest the proposed NN architecture might be a feasible lateral hop detector, suitable for integration in NNKF schemes. Finally, the newly found understanding of abrupt changes and of the interactions between Kalman filters and neural networks is leveraged to analyze the neural consequences of abrupt changes, briefly sketch a novel, abrupt-change-derived stochastic model for neural intelligence, extract some neurofinancial consequences of unstereotyped abrupt dynamics, and propose a new portfolio-building mechanism in finance: Highly Leveraged Abrupt Bets Against Failing Experts (HLABAFEOs). Some communication-engineering-relevant topics, such as a Bayesian stochastic stereotyper for hopping Linear Gauss-Markov (LGM) models, are discussed in the process. The forecasting problem in the presence of expert disagreements is illustrated with a hopping LGM model, and a novel structure for a Bayesian stereotyper is introduced that might eventually solve such problems through bio-inspired, neuroscientifically backed mechanisms, like dreaming and surprise (biological Neural-Kalman). A generalized framework for abrupt changes and expert disagreements was introduced with the novel concept of Neural-Kalman Phenomena. This Thesis suggests mathematical (Neural-Kalman Problem Category Conjecture), neuro-evolutionary and social reasons why Neural-Kalman Phenomena might exist, and found significant evidence for their existence in the areas of neuroscience and finance.
    Apart from providing specific examples, practical guidelines and historical (out)performance for some HLABAFEO investing portfolios, this multidisciplinary research suggests that a Neural-Kalman architecture for ever more granular stereotyping, providing a practical solution for continual learning in the presence of unstereotyped abrupt dynamics, would be extremely useful in communications and other continual learning tasks.
    Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Committee: Chair Luis Castedo Ribas; Secretary Ana García Armada; Member José Antonio Portilla Figuera.
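    The three SMAP heuristics described in the abstract above can be sketched as a per-tap status update. The threshold names and values (EPS_JUMP, EPS_CONV, EPS_BIRTH, MEMORY) are assumptions for illustration, not the thesis's derived and fine-tuned parameters:

    ```python
    # Illustrative thresholds; the thesis approximates these theoretically
    # and then fine-tunes them through extensive simulations.
    EPS_JUMP, EPS_CONV, EPS_BIRTH = 0.05, 0.02, 0.15
    MEMORY = 5  # samples used by the "slow convergence" rule

    def smap_status(gains, alive):
        """Update one tap's alive/dead status from its gain history."""
        g = abs(gains[-1])
        if alive:
            if g < EPS_JUMP and abs(gains[-2]) > EPS_BIRTH:
                return False           # rule a) gain jumped to ~0 (memoryless)
            if len(gains) >= MEMORY and max(abs(x) for x in gains[-MEMORY:]) < EPS_CONV:
                return False           # rule b) gain slowly converged to ~0
            return True
        return g > EPS_BIRTH           # rule c) birth: gain far from zero

    history = [0.5, 0.48, 0.46, 0.01]  # tap dies abruptly at the last sample
    alive = True
    for t in range(2, len(history) + 1):
        alive = smap_status(history[:t], alive)
    print(alive)  # → False (rule a fires on the final sample)
    ```

    Each update is just a handful of comparisons, which is what makes the SMAP so computationally inexpensive compared to the neural detector in the NNKF.
    
    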

    Towards Deeper Understanding in Neuroimaging

    Neuroimaging is a growing domain of research, with advances in machine learning having tremendous potential to expand understanding in neuroscience and improve public health. Deep neural networks have recently and rapidly achieved historic success in numerous domains, and as a consequence have completely redefined the landscape of automated learners, promising significant advances in numerous fields of research. Despite these recent advances and their advantages over traditional machine learning methods, deep neural networks have yet to permeate significantly into neuroscience studies, particularly as a tool for discovery. This dissertation presents well-established and novel tools for unsupervised learning which aid in feature discovery, with relevant applications to neuroimaging. Through the works within, this dissertation presents strong evidence that deep learning is a viable and important tool for neuroimaging studies.

    Trust Region Methods for Training Neural Networks

    Artificial feed-forward neural networks (ff-ANNs) serve as powerful machine learning models for supervised classification problems. They have been used to solve problems stretching from natural language processing to computer vision. ff-ANNs are typically trained using gradient-based approaches, which only require the computation of first-order derivatives. In this thesis we explore the benefits and drawbacks of training an ff-ANN with a method which requires the computation of second-order derivatives of the objective function. We also explore whether stochastic approximations can be used to decrease the computation time of such a method. A numerical investigation was performed into the behaviour of trust region methods, a type of second-order numerical optimization method, when used to train ff-ANNs on several datasets. Our study compares a classical trust region approach and evaluates the effect of adapting this method using stochastic variations. The exploration includes three approaches to reducing the computations required to perform the classical method: stochastic subsampling of training examples, stochastic subsampling of parameters, and using a gradient-based approach in combination with the classical trust region method. We found that stochastic subsampling methods can, in some cases, reduce the CPU time required to reach a reasonable solution when compared to the classical trust region method, but this was not consistent across all datasets. We also found that using the classical trust region method in combination with mini-batch gradient descent either matched (within 0.1 s) or decreased the CPU time required to reach a reasonable solution for all datasets. This was achieved by only computing the trust region step when training progress using the gradient approach had stalled.
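    The hybrid scheme described above (cheap gradient steps, falling back to a second-order trust-region step only when progress stalls) can be sketched on a toy objective. The Rosenbrock function, the learning rate, the stall threshold, and the Cauchy-point model minimizer below are illustrative assumptions, not the thesis's experimental setup:

    ```python
    import numpy as np

    def f(x):                            # Rosenbrock test objective
        return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2

    def grad(x):
        return np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                         200*(x[1] - x[0]**2)])

    def hess(x):
        return np.array([[2 - 400*x[1] + 1200*x[0]**2, -400*x[0]],
                         [-400*x[0], 200.0]])

    def cauchy_step(g, H, delta):
        """Minimize the quadratic model along -g inside the trust region."""
        gHg = g @ H @ g
        tau = 1.0 if gHg <= 0 else min(np.linalg.norm(g)**3 / (delta*gHg), 1.0)
        return -tau * delta / np.linalg.norm(g) * g

    x, delta, lr = np.array([-1.2, 1.0]), 1.0, 1e-3
    prev = f(x)
    for _ in range(5000):
        x_gd = x - lr*grad(x)
        if prev - f(x_gd) > 1e-9:        # cheap first-order step still helps
            x, prev = x_gd, f(x_gd)
            continue
        g, H = grad(x), hess(x)          # progress stalled: second-order step
        if np.linalg.norm(g) < 1e-12:
            break
        p = cauchy_step(g, H, delta)
        pred = -(g @ p + 0.5 * p @ H @ p)        # predicted model decrease
        rho = (f(x) - f(x + p)) / pred           # actual vs predicted decrease
        if rho > 0.75:
            delta = min(2*delta, 10.0)           # good model: grow the region
        elif rho < 0.25:
            delta *= 0.25                        # poor model: shrink the region
        if rho > 0:
            x, prev = x + p, f(x + p)
    print(f(x))
    ```

    The expensive Hessian is only formed when the first-order step no longer decreases the objective, which is the cost-saving idea the thesis reports for the mini-batch-plus-trust-region combination.
    
    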

    Classification of Frequency and Phase Encoded Steady State Visual Evoked Potentials for Brain Computer Interface Speller Applications using Convolutional Neural Networks

    Over the past decade there have been substantial improvements in vision-based Brain-Computer Interface (BCI) spellers for quadriplegic patient populations. This thesis contains a review of the numerous bio-signals available to BCI researchers, as well as a brief chronology of the foremost decoding methodologies used to date. Recent advances in classification accuracy and information transfer rate can be primarily attributed to time-consuming, patient-specific parameter optimization procedures. The aim of the current study was to develop analysis software with potential ‘plug-in-and-play’ functionality. To this end, convolutional neural networks, presently established as state-of-the-art analytical techniques for image processing, were utilized. The thesis herein defines a deep convolutional neural network architecture for the offline classification of phase- and frequency-encoded SSVEP bio-signals. Networks were trained using an extensive 35-participant open-source electroencephalographic (EEG) benchmark dataset (Department of Bio-medical Engineering, Tsinghua University, Beijing). Average classification accuracies of 82.24% and information transfer rates of 22.22 bpm were achieved on a BCI-naïve participant dataset for a 40-target alphanumeric display, in the absence of any patient-specific parameter optimization.
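    As a toy illustration of convolution-based frequency discrimination in SSVEP decoding (not the thesis's trained network), one can hand-craft kernels matched to each stimulus frequency and apply convolution plus max-pooling to synthetic single-channel epochs. The sampling rate, target frequencies, and noise model below are all assumptions; a real CNN would learn band-selective kernels from EEG data:

    ```python
    import numpy as np

    fs, T = 250, 1.0                     # assumed sampling rate (Hz), epoch (s)
    t = np.arange(0, T, 1/fs)
    freqs = [8.0, 10.0, 12.0, 15.0]      # illustrative SSVEP target frequencies

    def epoch(f, phase=0.0, snr=1.0, rng=np.random.default_rng(0)):
        """Synthetic single-channel SSVEP epoch: sinusoid plus Gaussian noise."""
        return snr*np.sin(2*np.pi*f*t + phase) + rng.normal(0, 1, t.size)

    # Convolutional kernels matched to each target frequency; a trained CNN
    # would learn comparable band-selective filters from data.
    kernels = [np.sin(2*np.pi*f*t) for f in freqs]

    def classify(x):
        energies = [np.abs(np.convolve(x, k, mode='same')).max()  # conv + max-pool
                    for k in kernels]
        return int(np.argmax(energies))

    x = epoch(12.0, snr=2.0)
    print(freqs[classify(x)])
    ```

    Because the four frequencies complete an integer number of cycles per one-second epoch, the mismatched kernels correlate weakly with the signal, so the matched kernel's pooled energy dominates even at modest SNR.
    
    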

    Perspectives on adaptive dynamical systems

    Adaptivity is a dynamical feature that is omnipresent in nature, socio-economics, and technology. For example, adaptive couplings appear in various real-world systems, such as power grids, social networks, and neural networks, and they form the backbone of closed-loop control strategies and machine learning algorithms. In this article, we provide an interdisciplinary perspective on adaptive systems. We reflect on the notion and terminology of adaptivity in different disciplines and discuss the role adaptivity plays in various fields. We highlight common open challenges and give perspectives on future research directions, looking to inspire interdisciplinary approaches.
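    As one concrete instance of the adaptive couplings mentioned above, consider phase oscillators whose coupling weights themselves evolve. The Hebbian-like adaptation rule and all parameter values below are illustrative assumptions, not drawn from the article:

    ```python
    import numpy as np

    # Kuramoto phase oscillators with adaptive coupling: the weights K[i, j]
    # evolve toward cos(theta_j - theta_i), strengthening links between
    # oscillators that are already in phase (a Hebbian-like rule).
    rng = np.random.default_rng(1)
    N, dt, steps = 10, 0.01, 5000
    omega = rng.normal(0, 0.5, N)        # natural frequencies
    theta = rng.uniform(0, 2*np.pi, N)   # initial phases
    K = rng.uniform(0, 1, (N, N))        # adaptive coupling weights
    eps = 0.1                            # adaptation slower than phase dynamics

    def order_parameter(th):
        return abs(np.exp(1j*th).mean())  # 1 = full synchrony

    for _ in range(steps):
        diff = theta[None, :] - theta[:, None]            # theta_j - theta_i
        theta = theta + dt*(omega + (K*np.sin(diff)).mean(axis=1))
        K = np.clip(K + dt*eps*(np.cos(diff) - K), 0, 1)  # weight adaptation

    print(order_parameter(theta))
    ```

    Separating the fast phase dynamics from the slow weight dynamics via `eps` is the closed-loop structure such systems share with control strategies and learning algorithms.
    
    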