
    Polya-gamma augmentations for factor models

    Bayesian inference for latent factor models, such as principal component and canonical correlation analysis, is easy for Gaussian likelihoods with conjugate priors, using either Gibbs sampling or mean-field variational approximation. For other likelihood potentials one needs either to resort to more complex sampling schemes or to specify dedicated forms for variational lower bounds. Recently, however, it was shown that for specific likelihoods related to the logistic function it is possible to augment the joint density with auxiliary variables following a Pólya-Gamma distribution, leading to closed-form updates for binary and over-dispersed count models. In this paper we describe how Gibbs sampling and mean-field variational approximation can be implemented for various latent factor models in these cases, presenting easy-to-implement and efficient inference schemes.
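    The closed-form updates rest on the Pólya-Gamma identity, which replaces each logistic term with a Gaussian one whose precision is the auxiliary variable's mean, E[ω] = (b/2c)·tanh(c/2) for ω ~ PG(b, c). As a minimal sketch (not the paper's factor-model code; the model and variable names here are ours), this is the resulting mean-field update loop for Bayesian logistic regression, the simplest case the augmentation covers:

```python
import numpy as np

def pg_mean(b, c):
    # E[omega] for omega ~ PG(b, c); the limit at c = 0 is b/4
    c = np.maximum(c, 1e-8)
    return b / (2.0 * c) * np.tanh(c / 2.0)

def vb_logistic(X, y, prior_prec=1.0, iters=50):
    """Mean-field VB for Bayesian logistic regression via PG augmentation.

    Prior w ~ N(0, I / prior_prec); labels y in {0, 1}.
    """
    n, d = X.shape
    kappa = y - 0.5                      # the PG-augmented "pseudo-observation"
    m = np.zeros(d)
    V = np.eye(d) / prior_prec
    for _ in range(iters):
        # c_i^2 = E_q[(x_i' w)^2] under q(w) = N(m, V)
        c = np.sqrt(np.einsum('ij,jk,ik->i', X, V + np.outer(m, m), X))
        omega = pg_mean(1.0, c)          # closed-form E_q[omega_i]
        # conjugate Gaussian update for q(w) given the PG means
        V = np.linalg.inv(X.T @ (omega[:, None] * X) + prior_prec * np.eye(d))
        m = V @ (X.T @ kappa)
    return m, V
```

    Each iteration alternates a closed-form Gaussian update for q(w) with the closed-form PG mean; this conjugate cycle is the pattern the paper extends to factor models.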

    Probabilistic Tensor Decomposition of Neural Population Spiking Activity

    The firing of neural populations is coordinated across cells, in time, and across experimental conditions or repeated experimental trials, and so a full understanding of the computational significance of neural responses must be based on a separation of these different contributions to structured activity. Tensor decomposition is an approach to untangling the influence of multiple factors in data that is common in many fields. However, despite some recent interest in neuroscience, wider applicability of the approach is hampered by the lack of a full probabilistic treatment allowing principled inference of a decomposition from non-Gaussian spike-count data. Here, we extend the Pólya-Gamma (PG) augmentation, previously used in sampling-based Bayesian inference, to implement scalable variational inference in non-conjugate spike-count models. Using this new approach, we develop techniques related to automatic relevance determination to infer the most appropriate tensor rank, as well as to incorporate priors based on known brain anatomy, such as the segregation of cell response properties by brain area. We apply the model to neural recordings taken under conditions of visual-vestibular sensory integration, revealing how the encoding of self- and visual-motion signals is modulated by the sensory information available to the animal.
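    The backbone of such a model is a CP (canonical polyadic) decomposition, which expresses a 3-way array (e.g. neurons × time × trials) as a sum of rank-one components. As an illustrative sketch of that backbone alone (plain least squares, without the Pólya-Gamma spike-count likelihood, ARD rank selection, or anatomical priors described above; all names are ours), a rank-R CP fit by alternating least squares looks like:

```python
import numpy as np

def unfold(T, mode):
    # mode-n unfolding: rows index the chosen mode, columns the rest
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    # column-wise Kronecker product; A's rows vary slowly
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def cp_als(T, rank, iters=300, seed=0):
    """Rank-R CP decomposition of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    F = [rng.normal(size=(s, rank)) for s in T.shape]
    for _ in range(iters):
        for n in range(3):
            a, b = [F[m] for m in range(3) if m != n]   # the two other factors
            gram = (a.T @ a) * (b.T @ b)                # Hadamard of Gram matrices
            F[n] = unfold(T, n) @ khatri_rao(a, b) @ np.linalg.pinv(gram)
    return F
```

    The probabilistic treatment described in the abstract replaces these least-squares updates with variational updates over posteriors on the factors.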

    Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes

    We present an approximate Bayesian inference approach for estimating the intensity of an inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model with a latent marked Poisson process and Pólya-Gamma random variables, we obtain a representation of the likelihood which is conjugate to the GP prior. We estimate the posterior using a variational free-form mean-field optimisation together with the framework of sparse GPs. Furthermore, as an alternative approximation we suggest a sparse Laplace approximation of the posterior, for which an efficient expectation-maximisation algorithm is derived to find the posterior's mode. Both algorithms compare well against exact inference obtained by a Markov chain Monte Carlo sampler and against a standard variational Gaussian approach solving the same model, while being one order of magnitude faster. Furthermore, the performance and speed of our method are competitive with those of another recently proposed Poisson process model based on a quadratic link function, while not being limited to GPs with squared exponential kernels and rectangular domains.
    Funding: DFG project 318763901, SFB 1294 "Data Assimilation: The Seamless Integration of Data and Models", subproject A06 "Approximate Bayesian Estimation and Model Selection for Stochastic Differential Equations".
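    The generative side of the model is simple to state: a maximal rate λ* is modulated by a sigmoid-transformed GP, λ(t) = λ*·σ(g(t)), which is exactly the form that admits thinning-based simulation. A minimal sketch of sampling from such a process on an interval (here g is an arbitrary fixed function standing in for a GP draw; the paper's contribution is the inference, not this simulation):

```python
import numpy as np

def sample_sigmoidal_cox(g, lam_max, T, rng):
    """Sample an inhomogeneous Poisson process with intensity
    lam(t) = lam_max * sigmoid(g(t)) on [0, T] by thinning."""
    n = rng.poisson(lam_max * T)              # candidates at the upper bound lam_max
    cand = rng.uniform(0.0, T, size=n)        # homogeneous candidate times
    accept = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-g(cand)))
    return np.sort(cand[accept])              # thinned, i.e. accepted, events

# example: a slowly oscillating log-odds function
events = sample_sigmoidal_cox(np.sin, lam_max=50.0, T=10.0,
                              rng=np.random.default_rng(0))
```

    The rejected candidates are, in effect, the latent marked Poisson process that the augmentation in the abstract makes explicit to restore conjugacy.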

    Scalable Bayesian Induction of Word Embeddings

    Traditional natural language processing has been shown to rely excessively on human-annotated corpora. However, the recent successes of machine translation and speech recognition, ascribed to the effective use of the increasing availability of web-scale data in the wild, have given momentum to a resurgent interest in modelling natural language with simple statistical models, such as the n-gram model, that are easily scaled. Indeed, words and word combinations provide all the representational machinery one needs for solving many natural language tasks. The degree of semantic similarity between two words is a function of the similarity of the linguistic contexts in which they appear. Word representations are mathematical objects, often vectors, that capture syntactic and semantic properties of a word. As a result, words that are semantic cognates have similar word representations, an important property that we will use widely. We claim that word representations provide a superb framework for unsupervised learning on unlabelled data by compactly representing the distributional properties of words. The current state-of-the-art word representation adopts the skip-gram model to train shallow neural networks and uses negative sampling, an idea borrowed from Noise Contrastive Estimation, as an efficient method of inducing embeddings. An alternative approach contends that the inherent multi-contextual nature of words calls for a more Canonical Correlation Analysis-like approach for best results. In this thesis we develop the first fully Bayesian model to induce word embeddings. The prominent contributions of this thesis are: 1. a crystallisation of the best practices from previous literature on word embeddings and matrix factorisation into a single hierarchical Bayesian model; 2. a scalable matrix factorisation technique for structured sparse data; 3. representation of the latent dimensions as continuous Gaussian densities instead of point estimates. We analyse a corpus of 170 million tokens and learn for each word form a vectorial representation based on the 8 surrounding context words, with a negative sampling rate of 2 per token. We would like to stress that while we certainly hope to beat the state of the art, our primary goal is to develop a stochastic and scalable Bayesian model. We evaluate the quality of the word embeddings on word analogy tasks as well as on other tasks such as word similarity and chunking, demonstrating competitive performance on standard benchmarks.
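    For orientation, the negative-sampling objective that the state-of-the-art skip-gram model optimises treats each (center, context) pair as a binary classification against k sampled noise words. A minimal numpy sketch of one SGD step on that objective (this is the point-estimate baseline, not the thesis's Bayesian model; array names are ours):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(V_in, V_out, center, context, negatives, lr=0.1):
    """One SGD step on the skip-gram negative-sampling loss for a single
    (center, context) pair with k sampled negative words."""
    v = V_in[center]
    ids = np.array([context] + list(negatives))
    labels = np.zeros(len(ids)); labels[0] = 1.0     # 1 for the true context word
    u = V_out[ids]
    scores = sigmoid(u @ v)
    g = scores - labels                               # dL/d(u_i . v) per word
    V_out[ids] -= lr * np.outer(g, v)                 # update output vectors
    V_in[center] -= lr * (g @ u)                      # update the center vector
    return -np.log(scores[0] + 1e-12) - np.sum(np.log(1.0 - scores[1:] + 1e-12))
```

    Repeating such steps over a corpus yields the point-estimate embeddings that the thesis replaces with full Gaussian posteriors over the latent dimensions.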

    Enhancing the phenotype of YG8sR mice, a model of Friedreich's ataxia, using shRNAs targeting the frataxin gene

    Friedreich's ataxia (FRDA) is the most common disabling neurodegenerative ataxia. It is a progressive recessive inherited disease that severely affects the nervous and cardiac systems. FRDA poses not only the challenge of a curative therapy but also that of an animal model reproducing the symptomatology. Depending on the number of GAA repeats, mouse models such as YG8sR, which contains between 250 and 300 GAA repeats, exhibit a more or less severe phenotype. Our study aims to enhance the phenotype of YG8sR mice by using short hairpin RNAs (shRNAs) targeting the frataxin mRNA to reduce the expression of this protein. After an in vitro efficacy test of the shRNAs in HeLa and HEK 293T cells, we were able to choose, among the 4 tested, 2 shRNAs capable of reducing the frataxin level: shRNA6 and shRNA1, which after transfection of 2 µg of DNA reduced frataxin levels in cells by 40% and 70%, respectively. When we injected intravenously 1.2×10¹² or 2.4×10¹² copies of AAV-PHP.B encoding these shRNAs, we observed weight loss, impaired motor skills and coordination, and diminished motor force in the YG8sR mice that received shRNA1 at 1.2×10¹² copies. We have therefore developed an improved mouse model (Imp-YG8sR) by further reducing the expression of frataxin with this dose of shRNA1. The more severe phenotype of these mice is closer to that of patients with Friedreich's ataxia than the original YG8sR model used without the shRNAs. Our Imp-YG8sR mouse model will therefore be beneficial for tests of gene therapies currently in development.

    The Bayesian Learning Rule

    We show that many machine-learning algorithms are specific instances of a single algorithm called the Bayesian learning rule. The rule, derived from Bayesian principles, yields a wide range of algorithms from fields such as optimization, deep learning, and graphical models. This includes classical algorithms such as ridge regression, Newton's method, and the Kalman filter, as well as modern deep-learning algorithms such as stochastic-gradient descent, RMSprop, and Dropout. The key idea in deriving such algorithms is to approximate the posterior using candidate distributions estimated by using natural gradients. Different candidate distributions result in different algorithms, and further approximations to natural gradients give rise to variants of those algorithms. Our work not only unifies, generalizes, and improves existing algorithms, but also helps us design new ones.
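    To make the claim concrete: for a Gaussian candidate q = N(m, S), the rule's natural-gradient step moves the precision toward the expected Hessian and the mean along a preconditioned expected gradient; with learning rate 1 and the expectations collapsed onto the mean, this is exactly Newton's method, one of the classical instances listed above. A minimal sketch on a quadratic loss, where the Gaussian expectations are available in closed form (variable names are ours):

```python
import numpy as np

def blr_gaussian_quadratic(H, b, rho=0.5, steps=80):
    """Bayesian learning rule with a full Gaussian candidate q = N(m, S)
    on the quadratic loss l(theta) = 0.5 theta'H theta - b'theta.
    For a quadratic, E_q[grad l] = H m - b and E_q[hess l] = H exactly."""
    d = len(b)
    m = np.zeros(d)
    P = np.eye(d)                     # precision S^{-1}, initialised at identity
    for _ in range(steps):
        P = (1.0 - rho) * P + rho * H                 # step on the precision
        m = m - rho * np.linalg.solve(P, H @ m - b)   # preconditioned mean step
    return m, P
```

    With rho = 1 the loop converges in a single step to m = H⁻¹b with precision H, recovering Newton's method; smaller rho gives damped variants, illustrating how one rule yields a family of algorithms.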