207 research outputs found

    Expanding the theoretical framework of reservoir computing

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages
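
To make the storage argument behind the tensor train (TT) format concrete, the following is a minimal sketch rather than anything taken from the monograph. It assumes the usual TT layout in which the n-th core has shape (R_{n-1}, I_n, R_n) with boundary ranks equal to 1. The function names and the example sizes (ten modes of size 10, all internal ranks equal to 5) are illustrative assumptions.

```python
from math import prod


def full_storage(mode_sizes):
    """Number of entries in the dense tensor: I_1 * I_2 * ... * I_N."""
    return prod(mode_sizes)


def tt_storage(mode_sizes, ranks):
    """Entries stored by a tensor train whose n-th core has shape (R_{n-1}, I_n, R_n).

    `ranks` lists the N-1 internal TT ranks; the boundary ranks R_0 and R_N are 1.
    """
    if len(ranks) != len(mode_sizes) - 1:
        raise ValueError("need exactly N-1 internal ranks")
    padded = [1, *ranks, 1]
    return sum(padded[n] * size * padded[n + 1] for n, size in enumerate(mode_sizes))


# Hypothetical example: an order-10 tensor with every mode of size 10
# and every internal TT rank equal to 5.
mode_sizes = [10] * 10
ranks = [5] * 9
dense = full_storage(mode_sizes)
compressed = tt_storage(mode_sizes, ranks)
print(dense, compressed)
```

For fixed ranks the dense count grows exponentially with the number of modes, while the TT count grows only linearly, which is the sense in which low-rank tensor networks ease the curse of dimensionality.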

    Novel Architectures and Optimization Algorithms for Training Neural Networks and Applications

    The two main areas of Deep Learning are Unsupervised and Supervised Learning. Unsupervised Learning studies a class of data processing problems in which only descriptions of objects are known, without label information. Generative Adversarial Networks (GANs) have become among the most widely used unsupervised neural net models. A GAN combines two neural nets, generative and discriminative, that work simultaneously. We introduce a new family of discriminator loss functions that adopts a weighted sum of real and fake parts, which we call adaptive weighted loss functions. Using the gradient information, we can adaptively choose weights to train a discriminator in the direction that benefits the GAN's stability. Also, we propose several improvements to the GAN training schemes. One is self-correcting optimization for training a GAN discriminator on Speech Enhancement tasks, which helps avoid "harmful" training directions for parts of the discriminator loss. The other improvement is a consistency loss, which targets the inconsistency in time and time-frequency domains caused by Fourier Transforms. Contrary to Unsupervised Learning, Supervised Learning uses labels for each object, and it is required to find the relationship between objects and labels. Building computing methods to interpret and represent human language automatically is known as Natural Language Processing, which includes tasks such as word prediction and machine translation. In this area, we propose a novel Neumann-Cayley Gated Recurrent Unit (NC-GRU) architecture based on a Neumann series-based Scaled Cayley transformation. The NC-GRU uses orthogonal matrices to prevent exploding gradient problems and enhance long-term memory on various prediction tasks. In addition, we propose using our newly introduced NC-GRU unit inside a neural network model to create neural molecular fingerprints. Integrating novel NC-GRU fingerprints with Multi-Task Deep Neural Network schematics helps to improve the performance of several molecular-related tasks. We also introduce a new normalization method, Assorted-Time Normalization, that helps to preserve information from multiple consecutive time steps and normalize using them in recurrent-network-like architectures. Finally, we propose a Symmetry Structured Convolutional Neural Network (SCNN), an architecture with 2D structured symmetric features over spatial dimensions, that generates and preserves the symmetry structure in the network's convolutional layers.
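
The abstract names a Neumann series-based scaled Cayley transformation without giving its form. As a rough sketch under stated assumptions, the code below uses the commonly cited scaled Cayley form W = (I + A)^{-1}(I - A)D, with A skew-symmetric and D diagonal with entries of ±1, and approximates the inverse by the truncated Neumann series sum of (-A)^k, which converges when the norm of A is below 1. How the thesis actually applies the series inside the NC-GRU is not stated here, and all function names and example values are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b, sign=1.0):
    """Return a + sign * b, elementwise."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def neumann_inverse(a, terms):
    """Approximate (I + A)^{-1} by sum_{k=0}^{terms-1} (-A)^k (needs ||A|| < 1)."""
    n = len(a)
    neg_a = [[-x for x in row] for row in a]
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = matmul(power, neg_a)
        total = add(total, power)
    return total


def scaled_cayley(a, d_diag, terms=30):
    """W = (I + A)^{-1} (I - A) D, with the inverse replaced by a Neumann sum."""
    n = len(a)
    inverse = neumann_inverse(a, terms)
    w = matmul(inverse, add(identity(n), a, sign=-1.0))
    d = [[d_diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(w, d)


def orthogonality_defect(w):
    """Largest absolute entry of W^T W - I; zero for an exactly orthogonal W."""
    gram = matmul(transpose(w), w)
    n = len(w)
    return max(abs(gram[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


# Hypothetical skew-symmetric matrix with small norm, and a +/-1 scaling diagonal.
A = [[0.0, 0.3, -0.1],
     [-0.3, 0.0, 0.2],
     [0.1, -0.2, 0.0]]
D = [1.0, -1.0, 1.0]

W = scaled_cayley(A, D, terms=30)
defect_30 = orthogonality_defect(W)
defect_2 = orthogonality_defect(scaled_cayley(A, D, terms=2))
print(defect_30, defect_2)
```

With enough terms the result is orthogonal to within rounding error, while a heavily truncated series visibly is not. The appeal of the series is that it needs only matrix products and no explicit inverse.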

    Multiuser detection employing recurrent neural networks for DS-CDMA systems.

    Thesis (M.Sc.Eng.)-University of KwaZulu-Natal, 2006. Over the last decade, access to personal wireless communication networks has evolved to a point of necessity. Attached to the phenomenal growth of the telecommunications industry in recent times is an escalating demand for higher data rates and efficient spectrum utilization. This demand is fuelling the advancement of third generation (3G), as well as future, wireless networks. Current 3G technologies are adding a dimension of mobility to services that have become an integral part of modern everyday life. Wideband code division multiple access (WCDMA) is the standardized multiple access scheme for the 3G Universal Mobile Telecommunication System (UMTS). As an air interface solution, CDMA has received considerable interest over the past two decades and a great deal of current research is concerned with improving the application of CDMA in 3G systems. A factoring component of CDMA is multiuser detection (MUD), which is aimed at enhancing system capacity and performance, by optimally demodulating multiple interfering signals that overlap in time and frequency. This is a major research problem in multipoint-to-point communications. Due to the complexity associated with optimal maximum likelihood detection, many different sub-optimal solutions have been proposed. The focus of this dissertation is the application of neural networks for MUD in a direct sequence CDMA (DS-CDMA) system. Specifically, it explores how the Hopfield recurrent neural network (RNN) can be employed to give yet another sub-optimal solution to the optimization problem of MUD. There is great scope for neural networks in fields encompassing communications. This is primarily attributed to their non-linearity, adaptivity and key function as data classifiers. In the context of optimum multiuser detection, neural networks have been successfully employed to solve similar combinatorial optimization problems. The concepts of CDMA and MUD are discussed. The use of a vector-valued transmission model for DS-CDMA is illustrated, and common linear sub-optimal MUD schemes, as well as the maximum likelihood criterion, are reviewed. The performance of these sub-optimal MUD schemes is demonstrated. The Hopfield neural network (HNN) for combinatorial optimization is discussed. Basic concepts and techniques related to the field of statistical mechanics are introduced and it is shown how they may be employed to analyze neural classification. Stochastic techniques are considered in the context of improving the performance of the HNN. A neural-based receiver, which employs a stochastic HNN and a simulated annealing technique, is proposed. Its performance is analyzed by way of simulation in a communication channel that is affected by additive white Gaussian noise (AWGN). The performance of the proposed scheme is compared to that of the single-user matched filter, linear decorrelating and minimum mean-square error detectors, as well as the classical HNN and the stochastic Hopfield network (SHN) detectors. In conclusion, the feasibility of neural networks (in this case the HNN) for MUD in a DS-CDMA system is explored by quantifying the relative performance of the proposed model using simulation results and in view of implementation issues.
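
As an illustration of how a Hopfield-style network can act as a multiuser detector, here is a minimal sketch of the classical deterministic variant. It is not the stochastic, simulated-annealing receiver proposed in the dissertation. It assumes synchronous users with unit amplitudes, a known signature cross-correlation matrix R, and noise-free matched-filter outputs y = Rb. Each bit is updated asynchronously as the sign of its matched-filter output minus the interference estimated from the other users' current bits. The names and the example matrix are hypothetical.

```python
def sign(x):
    return 1 if x >= 0 else -1


def matched_filter_outputs(r, bits):
    """Noise-free matched-filter outputs y = R b for unit-amplitude users."""
    return [sum(r_kj * b_j for r_kj, b_j in zip(row, bits)) for row in r]


def hopfield_detect(r, y, max_sweeps=20):
    """Asynchronous Hopfield iteration: b_k <- sign(y_k - sum_{j != k} R_kj b_j)."""
    bits = [sign(v) for v in y]  # start from the single-user matched-filter decision
    for _ in range(max_sweeps):
        changed = False
        for k in range(len(bits)):
            interference = sum(r[k][j] * bits[j] for j in range(len(bits)) if j != k)
            new_bit = sign(y[k] - interference)
            if new_bit != bits[k]:
                bits[k] = new_bit
                changed = True
        if not changed:  # a full sweep with no flips is a fixed point
            break
    return bits


# Hypothetical 3-user cross-correlation matrix with strong interference on user 0.
R = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.2],
     [0.6, 0.2, 1.0]]
sent = [1, -1, -1]

y = matched_filter_outputs(R, sent)
matched_filter_decision = [sign(v) for v in y]
hopfield_decision = hopfield_detect(R, y)
print(matched_filter_decision, hopfield_decision)
```

In this small case the single-user matched filter gets user 0 wrong because the other two users' interference outweighs its own signal, while the iteration recovers the transmitted bits. With noise or heavier loading the deterministic network can settle in a local minimum, which is the motivation for stochastic variants.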

    Support vector machine prediction of HIV-1 drug resistance using The Viral Nucleotide patterns

    Student Number : 0213068F - MSc Dissertation - School of Computer Science - Faculty of Science. Drug resistance of the HI virus due to its fast replication and error-prone mutation is a key factor in the failure to combat the HIV epidemic. For this reason, performing pre-therapy drug resistance testing and administering appropriate drugs or combinations of drugs accordingly is very useful. There are two approaches to HIV drug resistance testing: phenotypic (clinical) and genotypic (based on the particular virus’s DNA). Genotyping tests HIV drug resistance by detecting specific mutations known to confer drug resistance. It is cheaper and can be computerised. However, it requires being able to know or learn what mutations confer drug resistance. Previous research using pattern recognition techniques has been promising, but the performance needs to be improved. It is also important to have techniques that can quickly learn new rules when faced with new mutations or drugs. A relatively recent addition to these techniques is Support Vector Machines (SVMs). SVMs have proved very successful in many benchmark applications such as face recognition and text recognition, and have also performed well in many computational biology problems where the number of features targeted is large compared to the number of available samples. This paper explores the use of SVMs in predicting the drug resistance of an HIV strain extracted from a patient based on the genetic sequence of those parts of the viral DNA encoding for the two enzymes, Reverse Transcriptase or Protease, which are critical for the replication of the HIV virus. In particular, it is the aim of this research to design the model without incorporating the biological knowledge at hand, to enable the resulting classifier to accommodate new drugs and mutations. To evaluate the performance of SVMs we used cross-validation to obtain an unbiased estimate on 2045 data points. The accuracy of classification and the area under the receiver operating characteristic curve (AUC) were used as performance measures. Furthermore, to compare the performance of our SVM model we also developed other prediction models based on popular classification algorithms, namely neural networks, decision trees and logistic regression. The results show that SVMs are a highly successful classifier and out-perform other techniques, with performance ranging between 94.13%–96.33% accuracy and 81.26%–97.49% AUC. Decision trees were rated second and logistic regression performed the worst.
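
The evaluation described above rests on cross-validation and two metrics, accuracy and AUC. The following is a minimal sketch of those pieces only, with no SVM training. The number of folds is not stated in the abstract, so the 10 folds below are an assumption, and the labels and scores are made-up illustration data rather than results from the dissertation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Fold layout for 2045 samples; k = 10 is an assumed value.
folds = list(k_fold_indices(2045, 10))

# Made-up labels (1 = resistant, 0 = susceptible) and classifier scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
predictions = [1 if s >= 0.5 else 0 for s in scores]

example_accuracy = accuracy(labels, predictions)
example_auc = auc(labels, scores)
print(len(folds), example_accuracy, example_auc)
```

Accuracy depends on the chosen decision threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two measures can rank classifiers differently.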

    Towards an integrated understanding of neural networks

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 123-136). Neural networks underpin both biological intelligence and modern AI systems, yet there is relatively little theory for how the observed behavior of these networks arises. Even the connectivity of neurons within the brain remains largely unknown, and popular deep learning algorithms lack theoretical justification or reliability guarantees. This thesis aims towards a more rigorous understanding of neural networks. We characterize and, where possible, prove essential properties of neural algorithms: expressivity, learning, and robustness. We show how observed emergent behavior can arise from network dynamics, and we develop algorithms for learning more about the network structure of the brain. By David Rolnick. Ph. D.