Novel Architectures and Optimization Algorithms for Training Neural Networks and Applications
The two main areas of Deep Learning are Unsupervised and Supervised Learning. Unsupervised Learning studies a class of data processing problems in which only descriptions of objects are available, without label information. Generative Adversarial Networks (GANs) have become one of the most widely used unsupervised neural network models. A GAN combines two neural networks, a generator and a discriminator, that are trained simultaneously. We introduce a new family of discriminator loss functions that adopt a weighted sum of the real and fake parts, which we call adaptive weighted loss functions. Using gradient information, we can adaptively choose the weights to train the discriminator in a direction that benefits the GAN's stability. We also propose several improvements to GAN training schemes. One is a self-correcting optimization scheme for training a GAN discriminator on Speech Enhancement tasks, which helps avoid "harmful" training directions for parts of the discriminator loss. The other is a consistency loss, which targets the inconsistency between the time and time-frequency domains caused by Fourier Transforms.

Contrary to Unsupervised Learning, Supervised Learning uses a label for each object, and the goal is to find the relationship between objects and labels. Building computational methods to automatically interpret and represent human language is known as Natural Language Processing, which includes tasks such as word prediction and machine translation. In this area, we propose a novel Neumann-Cayley Gated Recurrent Unit (NC-GRU) architecture based on a Neumann-series-based Scaled Cayley transformation. The NC-GRU uses orthogonal matrices to prevent exploding-gradient problems and to enhance long-term memory on various prediction tasks. In addition, we propose using the newly introduced NC-GRU unit inside neural network models to create neural molecular fingerprints.
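The abstract does not give the NC-GRU construction in detail, but the underlying idea of producing an orthogonal matrix from a Cayley transform with a Neumann-series approximation of the inverse can be illustrated in a minimal sketch. The truncation depth and the scaling of the skew-symmetric matrix below are assumptions for illustration, not the thesis's implementation:

```python
import numpy as np

def neumann_cayley(A, terms=6):
    """Approximate the Cayley transform W = (I + A)^(-1) (I - A) of a
    skew-symmetric A, replacing the exact inverse with the truncated
    Neumann series (I + A)^(-1) ~ sum_{k=0}^{terms-1} (-A)^k.
    Illustrative sketch only; convergence requires ||A|| < 1."""
    n = A.shape[0]
    inv_approx = np.zeros_like(A)
    term = np.eye(n)
    for _ in range(terms):
        inv_approx += term
        term = term @ (-A)          # next power of (-A)
    return inv_approx @ (np.eye(n) - A)

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) * 0.1   # small entries so the series converges
A = (M - M.T) / 2                        # skew-symmetric part
W = neumann_cayley(A)
print(np.max(np.abs(W.T @ W - np.eye(4))))  # small: W is near-orthogonal
```

For skew-symmetric A, the exact Cayley transform is exactly orthogonal; the truncated series trades a small orthogonality error for avoiding an explicit matrix inverse, which is the appeal of a Neumann-series-based formulation.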
Integrating the novel NC-GRU fingerprints with Multi-Task Deep Neural Network architectures helps improve performance on several molecular tasks. We also introduce a new normalization method, Assorted-Time Normalization, which preserves information from multiple consecutive time steps and normalizes over them in recurrent architectures. Finally, we propose a Symmetry Structured Convolutional Neural Network (SCNN), an architecture with 2D structured symmetric features over the spatial dimensions, which generates and preserves the symmetry structure in the network's convolutional layers.
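The adaptive weighted discriminator loss described earlier in the abstract — a weighted sum of the real and fake parts with weights chosen from gradient information — can be sketched as a toy scalar example. The specific weighting rule below (down-weighting the part whose gradient is already dominant) is a hypothetical stand-in, since the abstract does not spell out the rule:

```python
import numpy as np

def bce_real(logit):
    """Binary cross-entropy for a real sample: -log(sigmoid(logit))."""
    return np.log1p(np.exp(-logit))

def bce_fake(logit):
    """Binary cross-entropy for a fake sample: -log(1 - sigmoid(logit))."""
    return np.log1p(np.exp(logit))

def adaptive_weights(grad_real, grad_fake, eps=1e-8):
    """Toy adaptive rule: give each loss part a weight proportional to the
    OTHER part's gradient magnitude, so neither part dominates training.
    Illustrative assumption, not the thesis's exact scheme."""
    g_r = abs(grad_real) + eps
    g_f = abs(grad_fake) + eps
    total = g_r + g_f
    return g_f / total, g_r / total       # weights sum to 1

# Example with scalar discriminator logits.
logit_real, logit_fake = 0.5, -0.2
grad_r = -1.0 / (1.0 + np.exp(logit_real))   # d bce_real / d logit
grad_f = 1.0 / (1.0 + np.exp(-logit_fake))   # d bce_fake / d logit
a, b = adaptive_weights(grad_r, grad_f)
loss = a * bce_real(logit_real) + b * bce_fake(logit_fake)
```

In a full GAN, `logit_real` and `logit_fake` would be batch discriminator outputs and the gradients would be taken with respect to the discriminator parameters; the scalar version only shows the shape of the weighted combination.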
Using data-derived charge densities in electronic structure methods
For supplementary information, see: https://www.repository.cam.ac.uk/handle/1810/294380

In Condensed Matter Physics, the computational expense to evaluate the total potential
energy of a collection of atoms using standard ab initio methods is typically large. This limits
the scale of phenomena that can be studied in both length and time. Data-driven techniques
have established a pragmatic extension to ab initio calculations, balancing reductions in the
calculation time with potential losses of accuracy in the properties of interest. Both paradigms
complement one another and, when used appropriately, are valuable tools that enable and
stimulate research in Materials Science. Unlike traditional efforts, modern techniques to
include data employ flexible functional forms, extending the applicability of such methods
to a diverse range of physical quantities. Recently, interest in utilising data in total energy
calculations has turned towards the electron density. With an electron density that is close to
the ground state, data-derived kinetic energy functionals in orbital-free density functional
theory can be applied to evaluate the total energy without using gradients of the functional
with respect to the electron density. For this purpose, a number of approaches to calculate
data-derived densities have been proposed in recent years.
In this thesis, we begin by reviewing several fixed-form expressions to approximate the
potential energy of hexagonal layered crystals and show how a flexible form is essential to
fully utilise the available data. We then focus on developing new approaches to approximate
ground state electron densities and on novel applications that help to further unify data-driven
and ab initio techniques within electronic structure. By calculating reliable uncertainty
estimates, we show that data-derived densities can be incorporated into density functional
theory in a “safe” manner. We also show that with accurate initial densities and for systems
that otherwise have a poor initial estimate, we can reduce the number of self-consistent
field iterations that are necessary to reach self-consistency in Kohn-Sham density functional
theory. We hope that the work in this thesis will contribute to improving initial states
in density functional theory, support the application of data-derived orbital-free kinetic
energy functionals and encourage an ever closer and mutually beneficial cohesion between
data-driven and ab initio techniques throughout the Natural Sciences.

EPSRC Centre for Doctoral Training in Computational Methods for Materials Science, grant number EP/L015552/1
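The claim that a better initial density reduces the number of self-consistent field iterations can be illustrated with a toy fixed-point loop. The update map and mixing parameter below are stand-ins for a real Kohn-Sham density update, chosen only so the loop converges:

```python
import numpy as np

def scf(density0, mix=0.5, tol=1e-8, max_iter=200):
    """Toy self-consistent-field loop: iterate rho -> F(rho) with linear
    mixing until the density stops changing.  F is a hypothetical
    contraction standing in for a Kohn-Sham density update."""
    def F(rho):
        return np.tanh(rho) + 0.1
    rho = np.asarray(density0, dtype=float)
    for it in range(1, max_iter + 1):
        new = F(rho)
        if np.max(np.abs(new - rho)) < tol:
            return rho, it                    # converged
        rho = (1 - mix) * rho + mix * new     # linear mixing step
    return rho, max_iter

poor_guess = np.full(4, 5.0)                  # far from self-consistency
rho_star, n_poor = scf(poor_guess)
_, n_good = scf(rho_star + 1e-3)              # start near the fixed point
print(n_poor, n_good)                          # the better start needs fewer iterations
```

The same qualitative behaviour — iteration count falling as the initial density approaches the self-consistent one — is what makes data-derived initial densities attractive in Kohn-Sham calculations, though real SCF convergence is far less well-behaved than this contraction.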