    An examination and analysis of the Boltzmann machine, its mean field theory approximation, and learning algorithm

    It is currently believed that artificial neural network models may form the basis for intelligent computational devices. The Boltzmann Machine belongs to the class of recurrent artificial neural networks and uses a supervised learning algorithm to learn the mapping between input vectors and desired outputs. This study examines the parameters that influence the performance of the Boltzmann Machine learning algorithm, and also examines how performance can be improved through a naïve mean field theory approximation. The study was initiated to examine the hypothesis that the Boltzmann Machine learning algorithm, when used with the mean field approximation, is an efficient, reliable, and flexible model of machine learning. An empirical analysis of the performance of the algorithm supports this hypothesis. The performance of the algorithm is investigated by using it to train the Boltzmann Machine, and its mean field approximation, on the exclusive-OR function. Simulation results suggest that the mean field theory approximation learns faster than the Boltzmann Machine and shows better stability. The size of the network and the learning rate were found to have considerable impact on the performance of the algorithm, especially in the case of the mean field theory approximation. A comparison is made with the feed-forward back-propagation paradigm: the back-propagation network learns the exclusive-OR function eight times faster than the mean field approximation, but the mean field approximation demonstrates better reliability and stability. Because the mean field approximation is local and asynchronous, it has an advantage over back-propagation with regard to parallel implementation. The mean field approximation is also domain independent and structurally flexible. These features make the network suitable for use with a structural adaptation algorithm, allowing the network to modify its architecture in response to the external environment.
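    For readers unfamiliar with the algorithm this abstract describes, a minimal sketch may help. It uses the standard textbook formulation (±1 units, naive mean field fixed-point equations, and the two-phase clamped/free Boltzmann learning rule with correlations replaced by products of magnetizations); all names, sizes, and parameters below are illustrative, not taken from the thesis.

```python
import numpy as np

def mean_field_magnetizations(W, b, clamped=None, iters=50):
    """Naive mean field fixed point for a Boltzmann Machine with +/-1
    units: m_i = tanh(b_i + sum_j W_ij m_j). `clamped` maps a unit
    index to a fixed value (used to pin visible units to a pattern)."""
    m = np.zeros(len(b))
    clamped = clamped or {}
    for i, v in clamped.items():
        m[i] = v
    for _ in range(iters):
        for i in range(len(b)):
            if i not in clamped:
                m[i] = np.tanh(b[i] + W[i] @ m)
    return m

def mf_learning_step(W, b, patterns, lr=0.05):
    """One step of the Boltzmann learning rule,
    dW_ij = lr * (<s_i s_j>_clamped - <s_i s_j>_free),
    with both correlations approximated by products m_i * m_j."""
    clamped_corr = np.zeros_like(W)
    for clamped in patterns:              # each pattern clamps the visible units
        m = mean_field_magnetizations(W, b, clamped)
        clamped_corr += np.outer(m, m)
    clamped_corr /= len(patterns)
    m_free = mean_field_magnetizations(W, b)   # unclamped ("free") phase
    W += lr * (clamped_corr - np.outer(m_free, m_free))
    np.fill_diagonal(W, 0.0)              # no self-connections
    return W
```

    The speed advantage the abstract reports comes from exactly this substitution: the clamped and free correlations, which the exact Boltzmann Machine must estimate by slow Gibbs sampling, are read off deterministically from the mean field fixed points.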

    Improving variational methods via pairwise linear response identities

    Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing constraints on covariance, one can ensure consistency of linear response with the variational parameters, and in so doing inference of marginal probability distributions is improved. For the Bethe approximation and its generalizations, improvements are achieved with simple choices of the constraints. The approximations are presented as variational frameworks; iterative procedures related to message passing are provided for finding the minima.
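    As background for the linear-response idea the abstract builds on, here is a small sketch under the plain naive mean field treatment of an Ising/Boltzmann model (not the paper's constrained construction): covariances are estimated as the response of magnetizations to fields, C_ij = dm_i/dh_j, which at a naive mean field fixed point gives C^{-1} = diag(1/(1 - m_i^2)) - J. The couplings and fields below are invented for illustration.

```python
import numpy as np

def naive_mf_fixed_point(J, h, iters=200, damping=0.5):
    """Solve m_i = tanh(h_i + sum_j J_ij m_j) by damped iteration."""
    m = np.zeros(len(h))
    for _ in range(iters):
        m = (1 - damping) * np.tanh(h + J @ m) + damping * m
    return m

def linear_response_covariance(J, m):
    """Linear response covariance C_ij = dm_i/dh_j. Differentiating
    the fixed-point equations gives C^{-1} = diag(1/(1-m^2)) - J."""
    inv_C = np.diag(1.0 / (1.0 - m**2)) - J
    return np.linalg.inv(inv_C)

# Small example: 3-spin ferromagnet in a weak uniform field.
J = np.array([[0.0, 0.3, 0.3],
              [0.3, 0.0, 0.3],
              [0.3, 0.3, 0.0]])
h = np.full(3, 0.1)
m = naive_mf_fixed_point(J, h)
C = linear_response_covariance(J, m)
print("magnetizations:", m)
print("linear-response covariance:\n", C)
```

    The inconsistency the abstract refers to is visible here: the diagonal of C generally differs from the single-site variance 1 - m_i^2 implied by the variational marginals themselves, and the paper's covariance constraints are designed to remove exactly that kind of mismatch.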

    On Similarities between Inference in Game Theory and Machine Learning

    In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in doing so is to establish an equivalent vocabulary between the two domains, facilitating developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Conversely, we show how insights from game theory can be used to derive two improved mean field variational learning algorithms. First, we show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory; by analogy with fictitious play, we then suggest an improved update rule and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).
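    To make the fictitious play side of the analogy concrete, here is a minimal sketch of standard smooth fictitious play in a 2x2 coordination game, using the usual logit (Boltzmann) smooth best response, i.e. a softmax over expected payoffs, which is the point of contact with Bayesian inference. The payoff matrix, temperature, and prior counts are illustrative, not taken from the paper, and this is the baseline algorithm, not the authors' moderated variant.

```python
import numpy as np

# Row-player payoffs for a 2x2 coordination game in which action 0 is
# payoff-dominant and action 1 is risk-dominant.
A = np.array([[9.0, 0.0],
              [8.0, 7.0]])

def smooth_best_response(belief, payoffs, temperature=0.5):
    """Logit smooth best response: play action a with probability
    proportional to exp(E[payoff(a)] / temperature)."""
    q = payoffs @ belief                      # expected payoff per action
    z = np.exp((q - q.max()) / temperature)   # numerically stable softmax
    return z / z.sum()

def smooth_fictitious_play(steps=500, temperature=0.5, seed=0):
    """Two symmetric players each track empirical frequencies of the
    other's actions and play smooth best responses to those beliefs."""
    rng = np.random.default_rng(seed)
    counts = [np.ones(2), np.ones(2)]         # pseudo-counts (uniform prior)
    for _ in range(steps):
        beliefs = [counts[1] / counts[1].sum(),   # player 0's belief about 1
                   counts[0] / counts[0].sum()]   # player 1's belief about 0
        for p in (0, 1):
            probs = smooth_best_response(beliefs[p], A, temperature)
            counts[p][rng.choice(2, p=probs)] += 1
    return [c / c.sum() for c in counts]

print(smooth_fictitious_play())   # empirical strategy of each player
```

    The paper's moderated variant replaces the point-estimate belief (the normalized counts) with an integral over the posterior distribution of opponent strategies, which is what shifts convergence toward the payoff-dominant equilibrium.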

    Methodological contributions to the simulation of charge and energy transport in molecular materials

    This thesis is concerned with methodological developments for the study of charge and energy transport processes in molecular materials. That is, new approaches for investigating such processes are introduced and tested, rather than specific processes being explored in detail. The focus lies in particular on methods for studying organic semiconducting materials with high charge carrier mobilities or efficient exciton diffusion, although the methods presented are far more broadly applicable. First, we apply a method originally developed for charge transport in DNA strands, and later adapted by Heck et al. for organic semiconductors, to anthracene crystals. With it we compute the correct temperature dependence of the hole mobility. This dependence is closely tied to the underlying transport mechanism and, in the case of band-like transport as in anthracene, cannot be reproduced with hopping-based methods. We then introduce a method for computing exciton diffusion constants in molecular materials based on the direct propagation of the exciton wave function. To make such calculations feasible, approximations are introduced at several levels by exploiting the molecular structure. To test the new method, it is applied to exciton transport in anthracene, and in doing so we also discuss technical details that are equally relevant to the charge transport studies mentioned above. Propagating the exciton wave function requires a large number of excited-state electronic structure calculations, so a very fast method is needed. We use the approximate TD-DFTB method, which is based on DFT with a GGA functional. GGA functionals are known to be unreliable for extended π-electron systems, which are ubiquitous in organic semiconductors. Within DFT, so-called long-range corrected (LC) functionals solve this problem. We introduce LC functionals into TD-DFTB, which requires changes to the formalism. We show that this resolves the typical problems with π-systems and charge-transfer excitations, while the calculations are a thousand times faster than with conventional TD-DFT. Finally, we turn to the DFTB method itself. LC functionals contain a parameter that is ideally chosen in a system-specific way, and every adjustment requires new DFTB parameters to be computed. A set of atom-pairwise functions, called repulsive potentials, has so far required considerable manual effort in this process. We attempt to automate it by combining DFTB with methods from artificial intelligence.
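    At its core, "direct propagation of the exciton wave function" is time evolution under a site-based Hamiltonian. A minimal sketch under a toy Frenkel-exciton model (one excited state per molecule, nearest-neighbour couplings on a chain) is given below; all energies, couplings, and sizes are invented for illustration, whereas the thesis computes the corresponding matrix elements on the fly with TD-DFTB for the actual molecular system.

```python
import numpy as np
from scipy.linalg import expm

HBAR = 0.6582  # hbar in eV * fs

def frenkel_hamiltonian(n_sites, site_energy=2.0, coupling=0.05):
    """Toy Frenkel-exciton Hamiltonian (eV): diagonal molecular
    excitation energies, off-diagonal nearest-neighbour couplings."""
    H = np.diag(np.full(n_sites, site_energy))
    for i in range(n_sites - 1):
        H[i, i + 1] = H[i + 1, i] = coupling
    return H

def propagate(H, c0, dt=0.1, steps=2000):
    """Propagate i*hbar*dc/dt = H c with the exact short-time
    propagator U = exp(-i H dt / hbar); record the mean squared
    displacement (MSD) of the site populations at each step."""
    U = expm(-1j * H * dt / HBAR)
    sites = np.arange(len(c0))
    c, msd = c0.astype(complex), []
    for _ in range(steps):
        c = U @ c
        p = np.abs(c) ** 2
        x_mean = p @ sites
        msd.append(p @ (sites - x_mean) ** 2)
    return np.array(msd)

# Start the exciton on the central molecule of a 101-site chain; the
# growth of MSD(t) is the raw quantity behind a diffusion constant.
c0 = np.zeros(101); c0[50] = 1.0
print(propagate(frenkel_hamiltonian(101))[::500])
```

    In this fully coherent toy model the spreading is ballistic; the realistic simulations in the thesis include the coupling to nuclear motion, which is what turns the spreading diffusive and makes a diffusion constant well defined.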

    Online Multi-Stage Deep Architectures for Feature Extraction and Object Recognition

    Multi-stage visual architectures have recently found success in achieving high classification accuracies over image datasets with large variations in pose, lighting, and scale. Inspired by techniques currently at the forefront of deep learning, such architectures are typically composed of one or more layers of preprocessing, feature encoding, and pooling to extract features from raw images. Training these components traditionally relies on large sets of patches extracted from a potentially large image dataset. In this context, high-dimensional feature space representations are often helpful for obtaining the best classification performance and providing a higher degree of invariance to object transformations. Large datasets with high-dimensional features, however, complicate the implementation of visual architectures in memory-constrained environments. This dissertation constructs online learning replacements for the components within a multi-stage architecture and demonstrates that the proposed replacements (namely fuzzy competitive clustering, an incremental covariance estimator, and a multi-layer neural network) can offer performance competitive with their offline batch counterparts while providing a reduced memory footprint. The online nature of this solution allows for the development of a method for adjusting parameters within the architecture via stochastic gradient descent. Testing over multiple datasets shows the potential benefits of this methodology when appropriate priors on the initial parameters are unknown. Alternatives to batch-based decompositions for a whitening preprocessing stage, which take advantage of natural image statistics and allow simple dictionary learners to work well in the problem domain, are also explored. Expansions of the architecture using additional pooling statistics and multiple layers are presented and indicate that larger codebook sizes are not the only step forward to higher classification accuracies. Experimental results from these expansions further indicate the important role of sparsity and appropriate encodings within multi-stage visual feature extraction architectures.
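    Of the online replacements listed, the incremental covariance estimator is the easiest to show compactly. A minimal sketch follows, assuming the standard Welford-style streaming update feeding a ZCA whitening transform; the dissertation's exact estimator and whitening details may differ, and all names and sizes here are illustrative.

```python
import numpy as np

class IncrementalCovariance:
    """Streaming mean/covariance via the Welford-style update, so a
    whitening transform can be maintained without ever holding the
    full set of training patches in memory."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.M2 = np.zeros((dim, dim))  # running sum of residual outer products

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.M2 += np.outer(delta, x - self.mean)

    def covariance(self):
        return self.M2 / max(self.n - 1, 1)

    def whitening_matrix(self, eps=1e-5):
        """ZCA whitening matrix from the current covariance estimate."""
        vals, vecs = np.linalg.eigh(self.covariance())
        return vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T

# Feed patches one at a time; memory stays O(dim^2) however many arrive.
rng = np.random.default_rng(0)
est = IncrementalCovariance(dim=16)
for _ in range(10000):
    est.update(rng.normal(size=16))
W = est.whitening_matrix()   # apply as W @ (patch - est.mean)
```

    This is the trade the dissertation describes: the batch eigendecomposition over all patches is replaced by a fixed-size running statistic, at the cost of the estimate only gradually converging as patches stream in.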