Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods
We formulate the problem of neural network optimization as Bayesian
filtering, where the observations are the backpropagated gradients. While
neural network optimization has previously been studied using natural-gradient
methods, which are closely related to Bayesian inference, those methods were
unable to recover standard optimizers such as Adam and RMSprop with a
root-mean-square gradient normalizer, instead yielding a mean-square
normalizer. To recover the
root-mean-square normalizer, we find it necessary to account for the temporal
dynamics of all the other parameters as they are being optimized. The resulting
optimizer, AdaBayes, adaptively transitions between SGD-like and Adam-like
behaviour, automatically recovers AdamW, a state-of-the-art variant of Adam
with decoupled weight decay, and has generalisation performance competitive
with SGD.
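The mean-square vs. root-mean-square distinction the abstract refers to is visible in a plain Adam step, where the second-moment estimate is a running mean of squared gradients and the update divides by its square root. This is a sketch of standard Adam for a single parameter, not the AdaBayes update itself; the function name and scalar form are illustrative choices.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter.

    v is a mean-square gradient estimate; the denominator uses sqrt(v),
    i.e. the ROOT-mean-square normalizer the abstract discusses.
    """
    m = beta1 * m + (1 - beta1) * grad       # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)             # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

Replacing `math.sqrt(v_hat)` with `v_hat` would give the mean-square normalizer that earlier natural-gradient analyses produced instead.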
Modeling sequences and temporal networks with dynamic community structures
In evolving complex systems such as air traffic and social organizations,
collective effects emerge from their many components' dynamic interactions.
While the dynamic interactions can be represented by temporal networks with
nodes and links that change over time, they remain highly complex. It is
therefore often necessary to use methods that extract the temporal networks'
large-scale dynamic community structure. However, such methods are subject to
overfitting or suffer from effects of arbitrary, a priori imposed timescales,
which should instead be extracted from data. Here we simultaneously address
both problems and develop a principled data-driven method that determines
relevant timescales and identifies patterns of dynamics that take place on
networks as well as shape the networks themselves. We base our method on an
arbitrary-order Markov chain model with community structure, and develop a
nonparametric Bayesian inference framework that identifies the simplest such
model that can explain temporal interaction data.
Comment: 15 pages, 6 figures, 2 tables
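The building block of the method above is an arbitrary-order Markov chain over interaction events, whose sufficient statistics are counts of (history, next state) transitions. A minimal sketch of gathering those counts for a chosen order; the function name is a hypothetical illustration, not part of the paper's code.

```python
from collections import Counter

def markov_transition_counts(sequence, order=2):
    """Count (history -> next state) transitions for an order-n Markov chain.

    sequence: list of hashable states (e.g. interaction events).
    Returns a Counter keyed by (tuple_of_history_states, next_state).
    """
    counts = Counter()
    for i in range(len(sequence) - order):
        history = tuple(sequence[i:i + order])
        counts[(history, sequence[i + order])] += 1
    return counts
```

The nonparametric inference in the paper then selects the simplest order and community structure consistent with such counts, rather than fixing the order a priori.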
A network approach to topic models
One of the main computational and scientific challenges in the modern age is
to extract useful information from unstructured texts. Topic models are one
popular machine-learning approach which infers the latent topical structure of
a collection of documents. Despite their success, in particular that of their
most widely used variant, Latent Dirichlet Allocation (LDA), and numerous
applications in sociology, history, and linguistics, topic models are known to
suffer from severe conceptual and practical problems, e.g. a lack of
justification for the Bayesian priors, discrepancies with statistical
properties of real texts, and the inability to properly choose the number of
topics. Here we obtain a fresh view on the problem of identifying topical
structures by relating it to the problem of finding communities in complex
networks. This is achieved by representing text corpora as bipartite networks
of documents and words. By adapting existing community-detection methods --
using a stochastic block model (SBM) with non-parametric priors -- we obtain a
more versatile and principled framework for topic modeling (e.g., it
automatically detects the number of topics and hierarchically clusters both the
words and documents). The analysis of artificial and real corpora demonstrates
that our SBM approach leads to better topic models than LDA in terms of
statistical model selection. More importantly, our work shows how to formally
relate methods from community detection and topic modeling, opening the
possibility of cross-fertilization between these two fields.
Comment: 22 pages, 10 figures, code available at https://topsbm.github.io
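The starting point of the approach above is representing a corpus as a bipartite network of documents and words with edge weights given by occurrence counts. A minimal sketch of that representation only; the SBM inference itself lives in the linked code, and the function name here is an illustrative choice.

```python
from collections import Counter

def bipartite_edges(corpus):
    """Represent a corpus as a weighted bipartite network.

    corpus: list of document strings.
    Returns a Counter of edges (doc_index, word) -> occurrence count,
    the adjacency structure a community-detection method would consume.
    """
    edges = Counter()
    for d, text in enumerate(corpus):
        for w in text.split():
            edges[(d, w)] += 1
    return edges
```

Running community detection on this bipartite graph groups words and documents jointly, which is how topics and document clusters emerge without fixing their number in advance.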
Non-parametric Bayesian modeling of complex networks
Modeling structure in complex networks using Bayesian non-parametrics makes
it possible to specify flexible model structures and infer the adequate model
complexity from the observed data. This paper provides a gentle introduction to
non-parametric Bayesian modeling of complex networks: Using an infinite mixture
model as running example we go through the steps of deriving the model as an
infinite limit of a finite parametric model, inferring the model parameters by
Markov chain Monte Carlo, and checking the model's fit and predictive
performance. We explain how advanced non-parametric models for complex networks
can be derived and point out relevant literature.
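Taking the infinite limit of a finite mixture model, as the tutorial above describes, yields a Chinese restaurant process prior over partitions, which is what the MCMC sampler explores. A sketch of sampling a partition from that prior, under the standard CRP seating rule; this illustrates the prior only, not the paper's full network model, and the function name is hypothetical.

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample a partition of n items from a Chinese restaurant process prior.

    Item i joins existing cluster k with probability size_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha).
    Returns a list of cluster assignments.
    """
    rng = random.Random(seed)
    sizes = []   # current cluster sizes
    assign = []
    for i in range(n):
        r = rng.random() * (i + alpha)
        acc = 0.0
        for k, size in enumerate(sizes):
            acc += size
            if r < acc:
                sizes[k] += 1
                assign.append(k)
                break
        else:                      # no existing cluster chosen: open a new one
            sizes.append(1)
            assign.append(len(sizes) - 1)
    return assign
```

Because the number of clusters is not fixed, the posterior over such partitions lets the data determine model complexity, which is the point of the non-parametric treatment.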
Efficient inference of overlapping communities in complex networks
We discuss two views on extending existing methods for complex network
modeling which we dub the communities first and the networks first view,
respectively. Inspired by the networks first view that we attribute to White,
Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic
blockmodel (MNSBM), which seeks to separate the observed network into
subnetworks of different types and where the problem of inferring structure in
each subnetwork becomes easier. We show how this model is specified in a
generative Bayesian framework where parameters can be inferred efficiently
using Gibbs sampling. The result is an effective multiple-membership model
without the drawbacks of introducing complex definitions of "groups" and how
they interact. We demonstrate results on the recovery of planted structure in
synthetic networks and show very encouraging results on link prediction
performances using multiple-networks models on a number of real-world network
data sets.
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments.
Comment: 38 pages, 6 figures, 2 tables; applications in topic models and
Bayesian networks are studied. Simulation section is added.
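The second-order observed moments the method above relies on are, for topic models, pairwise word co-occurrence statistics averaged over documents. A minimal sketch of estimating such a moment matrix from tokenized documents; the normalization and function name are illustrative simplifications, not the paper's exact estimator.

```python
def second_order_moments(docs, vocab):
    """Empirical second-order moment matrix from word co-occurrences.

    docs: list of token lists; vocab: list of distinct words.
    M2[i][j] counts ordered co-occurrences of vocab[i] and vocab[j]
    within a document, averaged over documents.
    """
    idx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    M2 = [[0.0] * V for _ in range(V)]
    for words in docs:
        for a in words:
            for b in words:
                if a != b:          # cross-moments of distinct words only
                    M2[idx[a]][idx[b]] += 1.0
    n = len(docs)
    return [[x / n for x in row] for row in M2]
```

Identifiability from moments like these is what the expansion conditions on the topic-word matrix guarantee, so no distributional assumption on the latent topics is needed.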