    An examination and analysis of the Boltzmann machine, its mean field theory approximation, and learning algorithm

    It is currently believed that artificial neural network models may form the basis for intelligent computational devices. The Boltzmann Machine belongs to the class of recurrent artificial neural networks and uses a supervised learning algorithm to learn the mapping between input vectors and desired outputs. This study examines the parameters that influence the performance of the Boltzmann Machine learning algorithm, and also examines how performance can be improved through a naïve mean field theory approximation. The study was initiated to examine the hypothesis that the Boltzmann Machine learning algorithm, when used with the mean field approximation, is an efficient, reliable, and flexible model of machine learning. An empirical analysis of the performance of the algorithm supports this hypothesis. The performance of the algorithm is investigated by using it to train the Boltzmann Machine, and its mean field approximation, on the exclusive-OR function. Simulation results suggest that the mean field theory approximation learns faster than the Boltzmann Machine and shows better stability. The size of the network and the learning rate were found to have considerable impact on the performance of the algorithm, especially in the case of the mean field theory approximation. A comparison is made with the feed-forward back-propagation paradigm: the back-propagation network learns the exclusive-OR function eight times faster than the mean field approximation, but the mean field approximation demonstrates better reliability and stability. Because the mean field approximation is local and asynchronous, it has an advantage over back-propagation with regard to parallel implementation. The mean field approximation is also domain independent and structurally flexible. These features make the network suitable for use with a structural adaptation algorithm, allowing the network to modify its architecture in response to the external environment.
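    For readers unfamiliar with the algorithm this abstract describes, a minimal sketch may help. It uses the standard textbook formulation (±1 units, naive mean field fixed-point equations, and the two-phase clamped/free Boltzmann learning rule with correlations replaced by products of magnetizations); all names, sizes, and parameters below are illustrative, not taken from the thesis.

```python
import numpy as np

def mean_field_magnetizations(W, b, clamped=None, iters=50):
    """Naive mean field fixed point for a Boltzmann Machine with +/-1
    units: m_i = tanh(b_i + sum_j W_ij m_j). `clamped` maps a unit
    index to a fixed value (used to pin visible units to a pattern)."""
    m = np.zeros(len(b))
    clamped = clamped or {}
    for i, v in clamped.items():
        m[i] = v
    for _ in range(iters):
        for i in range(len(b)):
            if i not in clamped:
                m[i] = np.tanh(b[i] + W[i] @ m)
    return m

def mf_learning_step(W, b, patterns, lr=0.05):
    """One step of the Boltzmann learning rule,
    dW_ij = lr * (<s_i s_j>_clamped - <s_i s_j>_free),
    with both correlations approximated by products m_i * m_j."""
    clamped_corr = np.zeros_like(W)
    for clamped in patterns:              # each pattern clamps the visible units
        m = mean_field_magnetizations(W, b, clamped)
        clamped_corr += np.outer(m, m)
    clamped_corr /= len(patterns)
    m_free = mean_field_magnetizations(W, b)   # unclamped ("free") phase
    W += lr * (clamped_corr - np.outer(m_free, m_free))
    np.fill_diagonal(W, 0.0)              # no self-connections
    return W
```

    The speed advantage the abstract reports comes from exactly this substitution: the clamped and free correlations, which the exact Boltzmann Machine must estimate by slow Gibbs sampling, are read off deterministically from the mean field fixed points.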

    Improving variational methods via pairwise linear response identities

    Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing constraints on covariance, one can ensure consistency of linear response with the variational parameters, and in so doing inference of marginal probability distributions is improved. For the Bethe approximation and its generalizations, improvements are achieved with simple choices of the constraints. The approximations are presented as variational frameworks; iterative procedures related to message passing are provided for finding the minima.
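    As background for the linear-response idea the abstract builds on, here is a small sketch under the plain naive mean field treatment of an Ising/Boltzmann model (not the paper's constrained construction): covariances are estimated as the response of magnetizations to fields, C_ij = dm_i/dh_j, which at a naive mean field fixed point gives C^{-1} = diag(1/(1 - m_i^2)) - J. The couplings and fields below are invented for illustration.

```python
import numpy as np

def naive_mf_fixed_point(J, h, iters=200, damping=0.5):
    """Solve m_i = tanh(h_i + sum_j J_ij m_j) by damped iteration."""
    m = np.zeros(len(h))
    for _ in range(iters):
        m = (1 - damping) * np.tanh(h + J @ m) + damping * m
    return m

def linear_response_covariance(J, m):
    """Linear response covariance C_ij = dm_i/dh_j. Differentiating
    the fixed-point equations gives C^{-1} = diag(1/(1-m^2)) - J."""
    inv_C = np.diag(1.0 / (1.0 - m**2)) - J
    return np.linalg.inv(inv_C)

# Small example: 3-spin ferromagnet in a weak uniform field.
J = np.array([[0.0, 0.3, 0.3],
              [0.3, 0.0, 0.3],
              [0.3, 0.3, 0.0]])
h = np.full(3, 0.1)
m = naive_mf_fixed_point(J, h)
C = linear_response_covariance(J, m)
print("magnetizations:", m)
print("linear-response covariance:\n", C)
```

    The inconsistency the abstract refers to is visible here: the diagonal of C generally differs from the single-site variance 1 - m_i^2 implied by the variational marginals themselves, and the paper's covariance constraints are designed to remove exactly that kind of mismatch.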

    On Similarities between Inference in Game Theory and Machine Learning

    In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in doing so is to establish an equivalent vocabulary between the two domains, facilitating developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Conversely, we show how insights from game theory can be used to derive two improved mean field variational learning algorithms. First, we show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory; by analogy with fictitious play, we then suggest an improved update rule and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).
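    To make the fictitious play side of the analogy concrete, here is a minimal sketch of standard smooth fictitious play in a 2x2 coordination game, using the usual logit (Boltzmann) smooth best response, i.e. a softmax over expected payoffs, which is the point of contact with Bayesian inference. The payoff matrix, temperature, and prior counts are illustrative, not taken from the paper, and this is the baseline algorithm, not the authors' moderated variant.

```python
import numpy as np

# Row-player payoffs for a 2x2 coordination game in which action 0 is
# payoff-dominant and action 1 is risk-dominant.
A = np.array([[9.0, 0.0],
              [8.0, 7.0]])

def smooth_best_response(belief, payoffs, temperature=0.5):
    """Logit smooth best response: play action a with probability
    proportional to exp(E[payoff(a)] / temperature)."""
    q = payoffs @ belief                      # expected payoff per action
    z = np.exp((q - q.max()) / temperature)   # numerically stable softmax
    return z / z.sum()

def smooth_fictitious_play(steps=500, temperature=0.5, seed=0):
    """Two symmetric players each track empirical frequencies of the
    other's actions and play smooth best responses to those beliefs."""
    rng = np.random.default_rng(seed)
    counts = [np.ones(2), np.ones(2)]         # pseudo-counts (uniform prior)
    for _ in range(steps):
        beliefs = [counts[1] / counts[1].sum(),   # player 0's belief about 1
                   counts[0] / counts[0].sum()]   # player 1's belief about 0
        for p in (0, 1):
            probs = smooth_best_response(beliefs[p], A, temperature)
            counts[p][rng.choice(2, p=probs)] += 1
    return [c / c.sum() for c in counts]

print(smooth_fictitious_play())   # empirical strategy of each player
```

    The paper's moderated variant replaces the point-estimate belief (the normalized counts) with an integral over the posterior distribution of opponent strategies, which is what shifts convergence toward the payoff-dominant equilibrium.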

    Methodological contributions to the simulation of charge and energy transport in molecular materials

    This thesis is concerned with methodological developments for the study of charge and energy transport processes in molecular materials. That is, new approaches for investigating such processes are introduced and tested, rather than specific processes being explored in detail. The focus lies in particular on methods for studying organic semiconducting materials with high charge carrier mobilities or efficient exciton diffusion, although the methods presented are far more broadly applicable. First, we apply a method originally developed for charge transport in DNA strands, and later adapted by Heck et al. for organic semiconductors, to anthracene crystals. With it we compute the correct temperature dependence of the hole mobility. This dependence is closely tied to the underlying transport mechanism and, in the case of band-like transport as in anthracene, cannot be reproduced with hopping-based methods. We then introduce a method for computing exciton diffusion constants in molecular materials based on the direct propagation of the exciton wave function. To make such calculations feasible, approximations are introduced at several levels by exploiting the molecular structure. To test the new method, it is applied to exciton transport in anthracene, and in doing so we also discuss technical details that are equally relevant to the charge transport studies mentioned above. Propagating the exciton wave function requires a large number of excited-state electronic structure calculations, so a very fast method is needed. We use the approximate TD-DFTB method, which is based on DFT with a GGA functional. GGA functionals are known to be unreliable for extended π-electron systems, which are ubiquitous in organic semiconductors. Within DFT, so-called long-range corrected (LC) functionals solve this problem. We introduce LC functionals into TD-DFTB, which requires changes to the formalism. We show that this resolves the typical problems with π-systems and charge-transfer excitations, while the calculations are a thousand times faster than with conventional TD-DFT. Finally, we turn to the DFTB method itself. LC functionals contain a parameter that is ideally chosen in a system-specific way, and every adjustment requires new DFTB parameters to be computed. A set of atom-pairwise functions, called repulsive potentials, has so far required considerable manual effort in this process. We attempt to automate it by combining DFTB with methods from artificial intelligence.
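    At its core, "direct propagation of the exciton wave function" is time evolution under a site-based Hamiltonian. A minimal sketch under a toy Frenkel-exciton model (one excited state per molecule, nearest-neighbour couplings on a chain) is given below; all energies, couplings, and sizes are invented for illustration, whereas the thesis computes the corresponding matrix elements on the fly with TD-DFTB for the actual molecular system.

```python
import numpy as np
from scipy.linalg import expm

HBAR = 0.6582  # hbar in eV * fs

def frenkel_hamiltonian(n_sites, site_energy=2.0, coupling=0.05):
    """Toy Frenkel-exciton Hamiltonian (eV): diagonal molecular
    excitation energies, off-diagonal nearest-neighbour couplings."""
    H = np.diag(np.full(n_sites, site_energy))
    for i in range(n_sites - 1):
        H[i, i + 1] = H[i + 1, i] = coupling
    return H

def propagate(H, c0, dt=0.1, steps=2000):
    """Propagate i*hbar*dc/dt = H c with the exact short-time
    propagator U = exp(-i H dt / hbar); record the mean squared
    displacement (MSD) of the site populations at each step."""
    U = expm(-1j * H * dt / HBAR)
    sites = np.arange(len(c0))
    c, msd = c0.astype(complex), []
    for _ in range(steps):
        c = U @ c
        p = np.abs(c) ** 2
        x_mean = p @ sites
        msd.append(p @ (sites - x_mean) ** 2)
    return np.array(msd)

# Start the exciton on the central molecule of a 101-site chain; the
# growth of MSD(t) is the raw quantity behind a diffusion constant.
c0 = np.zeros(101); c0[50] = 1.0
print(propagate(frenkel_hamiltonian(101))[::500])
```

    In this fully coherent toy model the spreading is ballistic; the realistic simulations in the thesis include the coupling to nuclear motion, which is what turns the spreading diffusive and makes a diffusion constant well defined.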

    Online Multi-Stage Deep Architectures for Feature Extraction and Object Recognition

    Multi-stage visual architectures have recently found success in achieving high classification accuracies over image datasets with large variations in pose, lighting, and scale. Inspired by techniques currently at the forefront of deep learning, such architectures are typically composed of one or more layers of preprocessing, feature encoding, and pooling to extract features from raw images. Training these components traditionally relies on large sets of patches extracted from a potentially large image dataset. In this context, high-dimensional feature space representations are often helpful for obtaining the best classification performance and providing a higher degree of invariance to object transformations. Large datasets with high-dimensional features, however, complicate the implementation of visual architectures in memory-constrained environments. This dissertation constructs online learning replacements for the components within a multi-stage architecture and demonstrates that the proposed replacements (namely fuzzy competitive clustering, an incremental covariance estimator, and a multi-layer neural network) can offer performance competitive with their offline batch counterparts while providing a reduced memory footprint. The online nature of this solution allows for the development of a method for adjusting parameters within the architecture via stochastic gradient descent. Testing over multiple datasets shows the potential benefits of this methodology when appropriate priors on the initial parameters are unknown. Alternatives to batch-based decompositions for a whitening preprocessing stage, which take advantage of natural image statistics and allow simple dictionary learners to work well in the problem domain, are also explored. Expansions of the architecture using additional pooling statistics and multiple layers are presented and indicate that larger codebook sizes are not the only step forward to higher classification accuracies. Experimental results from these expansions further indicate the important role of sparsity and appropriate encodings within multi-stage visual feature extraction architectures.
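    Of the online replacements listed, the incremental covariance estimator is the easiest to show compactly. A minimal sketch follows, assuming the standard Welford-style streaming update feeding a ZCA whitening transform; the dissertation's exact estimator and whitening details may differ, and all names and sizes here are illustrative.

```python
import numpy as np

class IncrementalCovariance:
    """Streaming mean/covariance via the Welford-style update, so a
    whitening transform can be maintained without ever holding the
    full set of training patches in memory."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.M2 = np.zeros((dim, dim))  # running sum of residual outer products

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.M2 += np.outer(delta, x - self.mean)

    def covariance(self):
        return self.M2 / max(self.n - 1, 1)

    def whitening_matrix(self, eps=1e-5):
        """ZCA whitening matrix from the current covariance estimate."""
        vals, vecs = np.linalg.eigh(self.covariance())
        return vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T

# Feed patches one at a time; memory stays O(dim^2) however many arrive.
rng = np.random.default_rng(0)
est = IncrementalCovariance(dim=16)
for _ in range(10000):
    est.update(rng.normal(size=16))
W = est.whitening_matrix()   # apply as W @ (patch - est.mean)
```

    This is the trade the dissertation describes: the batch eigendecomposition over all patches is replaced by a fixed-size running statistic, at the cost of the estimate only gradually converging as patches stream in.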