9 research outputs found

    Fast matrix computations for functional additive models

    It is common in functional data analysis to look at a set of related functions: a set of learning curves, a set of brain signals, a set of spatial maps, etc. One way to express relatedness is through an additive model, whereby each individual function $g_i(x)$ is assumed to be a variation around some shared mean $f(x)$. Gaussian processes provide an elegant way of constructing such additive models, but suffer from computational difficulties arising from the matrix operations that need to be performed. Recently, Heersink & Furrer showed that functional additive models give rise to covariance matrices of a specific form they called quasi-Kronecker (QK), whose inverses are relatively tractable. We show that under additional assumptions the two-level additive model leads to a class of matrices we call restricted quasi-Kronecker (rQK), which enjoy many interesting properties. In particular, we formulate matrix factorisations whose complexity scales only linearly in the number of functions in the latent field, an enormous improvement over the cubic scaling of naïve approaches. We describe how to leverage the properties of rQK matrices for inference in latent Gaussian models.
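    The key computational trick can be illustrated concretely. Below is a minimal sketch (not the paper's implementation) of the two-level model y_i = f + g_i, whose stacked covariance has the quasi-Kronecker form C = J_n ⊗ K_f + I_n ⊗ K_g with J_n the all-ones matrix: a single Householder reflection block-diagonalises C, so a solve against it needs only two m × m Cholesky factorisations however many functions there are. The kernels, grid sizes, and jitter below are illustrative assumptions.

```python
# Sketch: solving with a quasi-Kronecker covariance C = J_n (x) K_f + I_n (x) K_g
# in two m x m factorisations, verified against the naive O((nm)^3) solve.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf(x, ell):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2) + 1e-8 * np.eye(len(x))

m, n = 30, 5                        # grid points per function, number of functions
x = np.linspace(0, 1, m)
K_f, K_g = rbf(x, 0.3), rbf(x, 0.1)

# Householder reflection H mapping the all-ones vector onto a multiple of e_1;
# H is symmetric orthogonal and diagonalises J_n = 1 1^T as H J H = diag(n, 0, ..., 0).
v = np.ones(n); v[0] += np.sqrt(n); v /= np.linalg.norm(v)
H = np.eye(n) - 2.0 * np.outer(v, v)

def solve_qk(b):
    """Solve C z = b using only two m x m Cholesky factorisations."""
    B = H @ b.reshape(n, m)                  # rotate into the eigenbasis of J_n
    Z = np.empty_like(B)
    Z[0] = cho_solve(cho_factor(n * K_f + K_g), B[0])
    Z[1:] = cho_solve(cho_factor(K_g), B[1:].T).T
    return (H @ Z).ravel()                   # rotate back

# Check against the naive dense solve on a small problem.
C = np.kron(np.ones((n, n)), K_f) + np.kron(np.eye(n), K_g)
b = np.random.default_rng(0).normal(size=n * m)
assert np.allclose(solve_qk(b), np.linalg.solve(C, b))
```

    Written densely, the rotation costs O(n²m); applying the Householder reflection implicitly brings it down to O(nm), which is where the linear scaling in the number of functions comes from.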

    Using auditory classification images for the identification of fine acoustic cues used in speech perception

    An essential step in understanding the processes underlying the general mechanism of perceptual categorization is to identify which portions of a physical stimulation modulate the behavior of our perceptual system. More specifically, in the context of speech comprehension, it remains a major open challenge to understand which information is used to categorize a speech stimulus as one phoneme or another: the auditory primitives relevant for the categorical perception of speech are still unknown. Here we propose to adapt a method relying on a Generalized Linear Model (GLM) with smoothness priors, already used in the visual domain for the estimation of so-called classification images, to auditory experiments. This statistical model offers a rigorous framework for dealing with non-Gaussian noise, as is often the case in the auditory modality, and limits the amount of noise in the estimated template by enforcing smoother solutions. By applying this technique to a two-alternative forced choice experiment between the stimuli "aba" and "ada" in noise with an adaptive SNR, we confirm that the second formant transition is key for classifying phonemes into /b/ or /d/ in noise, and that its estimation by the auditory system is a relative measurement across spectral bands, made in relation to the perceived height of the second formant in the preceding syllable. Through this example, we show how the GLM-with-smoothness-priors approach can be applied to the identification of fine functional acoustic cues in speech perception. Finally, we discuss some assumptions of the model in the specific case of speech perception.
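    As a rough illustration of the estimation idea (not the authors' code), a classification image under a GLM with smoothness priors can be read as penalised logistic regression: the template is the weight vector, and the smoothness prior becomes an L2 penalty on its second differences. Everything below, including the synthetic "observer", is an assumption made for the sake of a runnable example.

```python
# Sketch: classification-image estimation as logistic regression with a
# second-difference smoothness penalty, on synthetic trials.
import numpy as np
from scipy.optimize import minimize
from scipy.sparse import diags

rng = np.random.default_rng(1)
T, d = 2000, 64                               # trials, stimulus dimensions (e.g. time-frequency bins)
true = np.sin(np.linspace(0, 3 * np.pi, d))   # assumed "true" internal template
S = rng.normal(size=(T, d))                   # noise fields added to the stimuli
p = 1 / (1 + np.exp(-(S @ true)))             # simulated observer's choice probabilities
r = rng.binomial(1, p)                        # binary responses

D = diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(d - 2, d)).toarray()
lam = 5.0                                     # smoothness weight (would be set by cross-validation)

def neg_log_posterior(beta):
    z = S @ beta
    ll = np.sum(r * z - np.logaddexp(0.0, z))  # Bernoulli log-likelihood, logistic link
    return -ll + lam * np.sum((D @ beta) ** 2)

beta_hat = minimize(neg_log_posterior, np.zeros(d), method="L-BFGS-B").x
```

    Increasing lam trades fidelity to the trials for smoothness of the recovered template, which is exactly the noise-limiting role the abstract attributes to the prior.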

    Probabilistic models for structured sparsity


    Change blindness: eradication of gestalt strategies

    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks; and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.

    Advances in scalable learning and sampling of unnormalised models

    We study probabilistic models that are known incompletely, up to an intractable normalising constant. To reap the full benefit of such models, two tasks must be solved: learning and sampling. These two tasks have been subject to decades of research, and yet significant challenges still persist. Traditional approaches often suffer from poor scalability with respect to dimensionality and model complexity, generally rendering them inapplicable to models parameterised by deep neural networks. In this thesis, we contribute a new set of methods for addressing this scalability problem. We first explore the problem of learning unnormalised models. Our investigation begins with a well-known learning principle, Noise-contrastive Estimation, whose underlying mechanism is that of density-ratio estimation. By examining why existing density-ratio estimators scale poorly, we identify a new framework, telescoping density-ratio estimation (TRE), that can learn ratios between highly dissimilar densities in high-dimensional spaces. Our experiments demonstrate that TRE not only yields substantial improvements for the learning of deep unnormalised models, but can do the same for a broader set of tasks including mutual information estimation and representation learning. Subsequently, we explore the problem of sampling unnormalised models. A large literature on Markov chain Monte Carlo (MCMC) can be leveraged here, and in continuous domains, gradient-based samplers such as the Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte Carlo are excellent options. However, there has been substantially less progress in MCMC for discrete domains. To advance this subfield, we introduce several discrete Metropolis-Hastings samplers that are conceptually inspired by MALA, and demonstrate their strong empirical performance across a range of challenging sampling tasks.
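    The telescoping idea is easy to state in code. The toy sketch below (one-dimensional Gaussians and linear logistic classifiers, not the deep networks used in the thesis) chains classifiers between adjacent "waymark" distributions; the sum of their log-odds estimates the log density ratio between the two endpoint distributions.

```python
# Sketch: telescoping density-ratio estimation between p = N(0,1) and q = N(4,1).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N = 50_000
xp = rng.normal(0.0, 1.0, N)                  # samples from p = N(0, 1)
xq = rng.normal(4.0, 1.0, N)                  # samples from q = N(4, 1)

# Waymark samples x_k = sqrt(1 - a_k^2) x_p + a_k x_q interpolate between
# p (a = 0) and q (a = 1); here each waymark distribution is N(4 a_k, 1).
alphas = np.linspace(0.0, 1.0, 6)
waymarks = [np.sqrt(1 - a**2) * xp + a * xq for a in alphas]

# One classifier per rung; balanced log-odds estimate log p_k(x) / p_{k+1}(x).
rungs = []
for lo, hi in zip(waymarks[:-1], waymarks[1:]):
    X = np.concatenate([lo, hi])[:, None]
    y = np.concatenate([np.ones(N), np.zeros(N)])   # 1 = nearer-to-p rung
    rungs.append(LogisticRegression().fit(X, y))

def log_ratio(x):
    """log p(x)/q(x) as the telescoping sum of per-rung log-odds."""
    x = np.atleast_2d(x).T
    return sum(clf.decision_function(x) for clf in rungs)

# The true log-ratio here is 8 - 4x; the chained estimate should be close.
print(log_ratio(np.array([0.0, 2.0, 4.0])))   # approx [8, 0, -8]
```

    The point of the decomposition is that each rung is an easy, well-overlapping classification problem even when p and q themselves barely overlap, which is exactly the failure mode of a single density-ratio estimator.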

    Scalable Algorithms for the Analysis of Massive Networks

    Network analysis aims to unveil non-trivial insights from networked data by studying relationship patterns between the entities of a network. Among these insights, a popular one is to quantify the importance of an entity with respect to the others according to some criterion. Another is to find the most suitable matching partner for each participant of a network, knowing the pairwise preferences of the participants to be matched with each other, a problem known as Maximum Weighted Matching (MWM). Since the notion of importance is tied to the application under consideration, numerous centrality measures have been introduced. Many of these measures, however, were conceived at a time when computing power was very limited and networks were much smaller than today's, so scalability to large datasets was not a design concern. Today, massive networks with millions of edges are ubiquitous, and a complete exact computation of traditional centrality measures is often too time-consuming. This issue is amplified if our objective is to find the group of k vertices that is most central as a group. Scalable algorithms to identify highly central (groups of) vertices in massive graphs are thus of pivotal importance for large-scale network analysis. In addition to their size, today's networks often evolve over time, which poses the challenge of efficiently updating results after a change occurs. Hence, efficient dynamic algorithms are essential for modern network analysis pipelines. In this work, we propose scalable algorithms for identifying important vertices in a network, and for efficiently updating them in evolving networks. In real-world graphs with hundreds of millions of edges, most of our algorithms require seconds to a few minutes to perform these tasks. Further, we extend a state-of-the-art algorithm for MWM to dynamic graphs. Experiments show that our dynamic MWM algorithm handles updates in graphs with billions of edges in milliseconds.
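    To make the group-centrality task concrete, here is an illustrative greedy heuristic (not one of the thesis' scalable algorithms): grow a set S of k vertices, at each step adding the vertex with the largest marginal gain in group harmonic centrality, Σ_v 1/dist(S, v). networkx and the small example graph are assumptions for readability; this naive version is far from the performance the thesis reports.

```python
# Sketch: greedy selection of a group of k vertices by group harmonic centrality.
import networkx as nx

def greedy_group_harmonic(G, k):
    dist_to_S = {v: float("inf") for v in G}     # dist(S, v); S starts empty
    S = set()
    for _ in range(k):
        best, best_gain, best_dist = None, -1.0, None
        for u in G.nodes - S:
            d_u = nx.single_source_shortest_path_length(G, u)
            # Marginal gain: improvement of 1/dist(S, v) wherever u is closer.
            gain = sum(
                1.0 / d - (1.0 / dist_to_S[v] if dist_to_S[v] < float("inf") else 0.0)
                for v, d in d_u.items() if 0 < d < dist_to_S[v]
            )
            if gain > best_gain:
                best, best_gain, best_dist = u, gain, d_u
        S.add(best)
        for v, d in best_dist.items():           # update dist(S, v) with the new member
            dist_to_S[v] = min(dist_to_S[v], d)
    return S

G = nx.karate_club_graph()
print(greedy_group_harmonic(G, 3))
```

    Each greedy step runs a full single-source BFS per candidate vertex, so the whole thing is roughly O(k·n·m); the engineering contribution of scalable group-centrality algorithms is precisely to avoid this brute-force cost.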

    In silico modelling of parasite dynamics

    Understanding host-parasite systems is challenging if biologists employ experimental approaches alone; mathematical models can help uncover deeper knowledge about host infection dynamics. Previous experimental studies have explored the infrapopulation dynamics of the ectoparasites Gyrodactylus turnbulli and G. bullatarudis on their fish host, Poecilia reticulata. However, important open biological questions remain concerning parasite microhabitat preference, host survival, parasite virulence, and the transmission dynamics of different Gyrodactylus strains across different host populations over time. This thesis investigates these questions mathematically, to better understand the complexity of the gyrodactylid-fish system, using a sophisticated multi-state Markov model (MSM) and a novel individual-based stochastic simulation model. The infection dynamics of three different gyrodactylid strains are compared across three different host populations. A modified approximate Bayesian computation (ABC) scheme with sequential Monte Carlo (SMC) and sequential importance sampling (SIS) is developed for calibrating the novel stochastic model against existing empirical data and an auxiliary stochastic model. In addition, an extended local-linear regression (with L2 regularisation) is proposed for ABC post-processing analysis. Advanced statistics and an MSM are used to assess spatio-temporal parasite dynamics. A linear birth-death process with catastrophic extinction (B-D-C process) serves as the auxiliary model for the complex simulation model to refine the modified ABC's summary statistics; theoretical justifications and parameter estimation techniques for the B-D-C process are also provided. Simulating the B-D-C process with τ-leaping additionally offers insights into accelerating the complex simulation model, by proposing a reasonable error threshold based on the trade-off between simulation accuracy and computational speed. The mathematical models can be extended and adapted to other host-parasite systems, and the modified ABC methodologies can also aid in efficiently calibrating other multi-parameter models with a high-dimensional set of correlated or independent summary statistics.
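    The auxiliary B-D-C model is simple enough to sketch directly. Below is a minimal τ-leaping simulator for a linear birth-death process with catastrophic extinction; all rates and the leap size τ are illustrative assumptions rather than the thesis' calibrated values, and smaller τ trades speed for accuracy, exactly the trade-off the abstract describes.

```python
# Sketch: tau-leaping simulation of a linear birth-death process with
# catastrophic extinction (B-D-C).
import numpy as np

def bdc_tau_leap(n0, birth, death, catastrophe, t_end, tau, rng):
    """Approximate one B-D-C trajectory with fixed leap size tau."""
    t, n, path = 0.0, n0, [(0.0, n0)]
    while t < t_end and n > 0:
        # A catastrophe wipes out the whole population at rate `catastrophe`.
        if rng.random() < 1.0 - np.exp(-catastrophe * tau):
            n = 0
        else:
            # Leap: Poisson numbers of births and deaths over [t, t + tau],
            # with per-capita rates `birth` and `death`.
            n += rng.poisson(birth * n * tau) - rng.poisson(death * n * tau)
            n = max(n, 0)
        t += tau
        path.append((t, n))
    return path

rng = np.random.default_rng(42)
path = bdc_tau_leap(n0=2, birth=0.6, death=0.4, catastrophe=0.05,
                    t_end=20.0, tau=0.25, rng=rng)
print(path[-1])
```

    In an ABC setting, cheap simulators like this one are run many thousands of times, which is why choosing τ via an explicit error threshold matters for overall calibration cost.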

    Auxiliary variable Markov chain Monte Carlo methods

    Markov chain Monte Carlo (MCMC) methods are a widely applicable class of algorithms for estimating integrals in statistical inference problems. A common approach in MCMC methods is to introduce additional auxiliary variables into the Markov chain state and perform transitions in the joint space of target and auxiliary variables. In this thesis we consider novel methods for using auxiliary variables within MCMC methods to allow approximate inference in otherwise intractable models and to improve sampling performance in models exhibiting challenging properties such as multimodality. We first consider the pseudo-marginal framework. This extends the Metropolis–Hastings algorithm to cases where we only have access to an unbiased estimator of the density of the target distribution. The resulting chains can sometimes show 'sticking' behaviour, where long series of proposed updates are rejected. Further, the algorithms can be difficult to tune, and it is not immediately clear how to generalise the approach to alternative transition operators. We show that if the auxiliary variables used in the density estimator are included in the chain state, it is possible to use new transition operators, such as those based on slice-sampling algorithms, within a pseudo-marginal setting. This auxiliary pseudo-marginal approach leads to methods that are easier to tune and is often able to improve sampling efficiency over existing approaches. As a second contribution we consider inference in probabilistic models defined via a generative process, with the probability density of the outputs of this process only implicitly defined. The approximate Bayesian computation (ABC) framework allows inference in such models when conditioning on the values of observed model variables, by making the approximation that generated observed variables are 'close' to, rather than exactly equal to, the observed data. Although this makes the inference problem more tractable, the approximation error introduced in ABC methods can be difficult to quantify, and standard algorithms tend to perform poorly when conditioning on high-dimensional observations. This often requires further approximation by reducing the observations to lower-dimensional summary statistics. We show how including all of the random variables used in generating model outputs as auxiliary variables in a Markov chain state can allow the use of more efficient and robust MCMC methods, such as slice sampling and Hamiltonian Monte Carlo (HMC), within an ABC framework. In some cases this allows inference when conditioning on the full set of observed values, where standard ABC methods require reduction to lower-dimensional summaries for tractability. Further, we introduce a novel constrained HMC method for performing inference in a restricted class of differentiable generative models, which allows conditioning the generated observed variables to be arbitrarily close to observed data while maintaining computational tractability. As a final topic, we consider the use of an auxiliary temperature variable in MCMC methods to improve exploration of multimodal target densities and to allow estimation of normalising constants. Existing approaches such as simulated tempering and annealed importance sampling use temperature variables that take on only a discrete set of values. The performance of these methods can be sensitive to the number and spacing of the temperature values used, and the discrete nature of the temperature variable prevents the use of gradient-based methods such as HMC to update the temperature alongside the target variables. We introduce new MCMC methods which instead use a continuous temperature variable. This both removes the need to tune the choice of discrete temperature values and allows the temperature variable to be updated jointly with the target variables within an HMC method.
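    To ground the pseudo-marginal framework described above, here is a minimal sketch of pseudo-marginal Metropolis–Hastings on a toy model (a Gaussian latent variable integrated out by importance sampling; the model and all constants are assumptions, not examples from the thesis). Because the likelihood estimator is unbiased, the chain still targets the exact posterior; the stored noisy estimate for the current state is what produces the 'sticking' behaviour when it happens to be an over-estimate.

```python
# Sketch: pseudo-marginal Metropolis-Hastings with an unbiased
# importance-sampling estimate of the likelihood.
import numpy as np

rng = np.random.default_rng(3)
y = 1.5                                              # a single observation

def log_lik_estimate(theta, n_particles=32):
    """Unbiased estimate of p(y | theta) = integral of p(y | z) p(z | theta) dz."""
    z = rng.normal(theta, 1.0, n_particles)          # z | theta ~ N(theta, 1)
    w = np.exp(-0.5 * (y - z) ** 2) / np.sqrt(2 * np.pi)   # p(y | z) with y | z ~ N(z, 1)
    return np.log(np.mean(w))

def pseudo_marginal_mh(n_iters, step=0.8):
    log_prior = lambda t: -0.5 * t**2                # theta ~ N(0, 1), up to a constant
    theta = 0.0
    log_post_hat = log_prior(theta) + log_lik_estimate(theta)
    samples = []
    for _ in range(n_iters):
        prop = theta + step * rng.normal()
        log_post_prop = log_prior(prop) + log_lik_estimate(prop)
        # The noisy estimate stands in for the exact density; crucially, the
        # current state's estimate is kept, never recomputed.
        if np.log(rng.random()) < log_post_prop - log_post_hat:
            theta, log_post_hat = prop, log_post_prop
        samples.append(theta)
    return np.array(samples)

print(pseudo_marginal_mh(5000).mean())               # posterior mean is y/3 = 0.5 here
```

    The auxiliary pseudo-marginal idea in the thesis amounts to putting the estimator's random variables (here, the particles z) into the chain state as well, so that moves other than Metropolis–Hastings, such as slice sampling, can act on them.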