247 research outputs found

    Online but Accurate Inference for Latent Variable Models with Local Gibbs Sampling

    We study parameter inference in large-scale latent variable models. We first propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We then propose a novel inference method for the frequentist estimation of parameters that adapts MCMC methods to online inference of latent variable models with the proper use of local Gibbs sampling. Then, for latent Dirichlet allocation, we provide an extensive set of experiments and comparisons with existing work, where our new approach outperforms all previously proposed methods. In particular, using Gibbs sampling for latent variable inference is superior to variational inference in terms of test log-likelihoods. Moreover, Bayesian inference through variational methods performs poorly, sometimes leading to worse fits with latent variables of higher dimensionality.

    Working Document on Gloss Ontology

    This document describes the Gloss Ontology. The ontology and associated class model are organised into several packages. Section 2 describes each package in detail, while Section 3 contains a summary of the whole ontology.

    Calcium imaging in the ant Camponotus fellah reveals a conserved odour-similarity space in insects and mammals

    Background: Olfactory systems create representations of the chemical world in the animal brain. Recordings of odour-evoked activity in the primary olfactory centres of vertebrates and insects have suggested similar rules for odour processing, in particular through spatial organization of chemical information in their functional units, the glomeruli. Similarity between odour representations can be extracted from across-glomerulus patterns in a wide range of species, from insects to vertebrates, but comparison of odour similarity in such diverse taxa has not been addressed. In the present study, we asked how 11 aliphatic odorants previously tested in honeybees and rats are represented in the antennal lobe of the ant Camponotus fellah, a social insect that relies on olfaction for food search and social communication. Results: Using calcium imaging of specifically stained second-order neurons, we show that these odours induce specific activity patterns in the ant antennal lobe. Using multidimensional analysis, we show that clustering of odours is similar in ants, bees and rats. Moreover, odour similarity is highly correlated in all three species. Conclusion: This suggests the existence of similar coding rules in the neural olfactory spaces of species whose evolutionary divergence happened hundreds of millions of years ago.

    Learning Determinantal Point Processes in Sublinear Time

    Under review for AISTATS 2017. We propose a new class of determinantal point processes (DPPs) which can be manipulated for inference and parameter learning in potentially sublinear time in the number of items. This class, based on a specific low-rank factorization of the marginal kernel, is particularly suited to a subclass of continuous DPPs and DPPs defined on exponentially many items. We apply this new class to modelling text documents as sampling a DPP of sentences, and propose a conditional maximum likelihood formulation to model topic proportions, which is made possible with no approximation for our class of DPPs. We present an application to document summarization with a DPP on 2^{500} items.

    Decentralized Topic Modelling with Latent Dirichlet Allocation

    Privacy-preserving networks can be modelled as decentralized networks (e.g., sensors, connected objects, smartphones), where communication between nodes of the network is not controlled by a master or central node. For this type of network, the main issue is to gather or learn global information on the network (e.g., by optimizing a global cost function) while keeping the sensitive information at each node. In this work, we focus on text information that agents do not want to share (e.g., text messages, emails, confidential reports). We use recent advances in decentralized optimization and topic models to infer topics from a graph with limited communication. We propose a method to adapt the latent Dirichlet allocation (LDA) model to decentralized optimization, and show on synthetic data that we still recover similar parameters and similar performance at each node as with stochastic methods that access all the information in the graph.

    Assessing the Impact of Bycatch on Dolphin Populations: The Case of the Common Dolphin in the Eastern North Atlantic

    Fisheries interactions have been implicated in the decline of many marine vertebrates worldwide. In the eastern North Atlantic, at least 1000 common dolphins (Delphinus delphis) are bycaught each year, particularly in pelagic pair-trawls. We have assessed the resulting impact of bycatch on this population using a demographic modeling approach. We relied on a sample of females stranded along the French Atlantic and western Channel coasts. Strandings represent an extensive source of demographic information to monitor our study population. Necropsy analysis provided an estimate of individual age and reproductive state. We then estimated effective survivorship (including natural and human-induced mortality), age at first reproduction and pregnancy rates. Reproductive parameters were consistent with the literature, but effective survivorship was unexpectedly low. Demographic parameters were then used as inputs in two models. A constant parameter matrix gave an effective growth rate of −5.5±0.5%, corresponding to the current situation (including bycatch mortality). Subsequently, deterministic projections suggested that the population would be reduced to 20% of its current size in 30 years and would be extinct in 100 years. The demographic invariant model suggested a maximum growth rate of +4.5±0.09%, corresponding to the optimal demographic situation. Then, a risk analysis incorporating Potential Biological Removal (PBR), based on two plausible scenarios for stock structure, suggested that the bycatch level was unsustainable for the neritic population of the Bay of Biscay under a two-stock scenario. In-depth assessment of stock structure and improved observer programs to provide scientifically robust bycatch estimates are needed. An effective conservation measure would be to reduce bycatch to less than 50% of the current level in the neritic stock to reach PBR. Our approach provided indicators of the status and trajectory of the common dolphin population in the eastern North Atlantic and therefore proved to be a valuable tool for management, applicable to other dolphin populations.
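    As a rough check on the projection quoted in the abstract above, the short Python sketch below compounds a constant annual growth rate over a number of years. It assumes simple geometric growth, N_t = N_0 (1 + r)^t, which is only a simplification of the matrix model the authors describe. The function name `projected_fraction` is hypothetical and not taken from the paper.

```python
def projected_fraction(annual_growth_rate: float, years: int) -> float:
    """Fraction of the initial population left after `years`,
    assuming a constant annual growth rate (geometric growth).
    This is an illustrative simplification, not the authors' model."""
    return (1.0 + annual_growth_rate) ** years


# Effective growth rate reported in the abstract: -5.5% per year (+/- 0.5%).
central = projected_fraction(-0.055, 30)      # central estimate
optimistic = projected_fraction(-0.050, 30)   # rate at the upper bound
pessimistic = projected_fraction(-0.060, 30)  # rate at the lower bound

print(f"30-year fraction, central rate: {central:.3f}")
print(f"30-year range: {pessimistic:.3f} to {optimistic:.3f}")
print(f"100-year fraction, central rate: {projected_fraction(-0.055, 100):.4f}")
```

    Under this assumption the central rate leaves about 18% of the population after 30 years, and the ±0.5% bounds bracket the 20% figure in the abstract. Geometric decline never reaches exactly zero, so the stated extinction within 100 years presumably depends on an extinction threshold that the abstract does not give.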

    Exploiting crowd sourced reviews to explain movie recommendation

    Streaming services such as Netflix, M-Go, and Hulu use advanced recommender systems to help their customers identify relevant content quickly and easily. These recommenders display the list of recommended movies organized in sublists labeled with the genre or with more specific labels. Unfortunately, existing methods to extract these labeled sublists require human annotators to manually label movies, which is time-consuming and biased by the views of the annotators. In this paper, we design a method that relies on crowd-sourced reviews to automatically identify groups of similar movies and label these groups. Our method takes the content of movie reviews available online as input for an algorithm based on Latent Dirichlet Allocation (LDA) that identifies groups of similar movies. We separate the set of similar movies that share the same combination of genres into sublists, and personalize the movies shown in each sublist using matrix factorization. The results of a side-by-side comparison of our method against Technicolor's M-Go VoD service are encouraging.