10 research outputs found

    VaiPhy: a Variational Inference Based Algorithm for Phylogeny

    Full text link
    Phylogenetics is a classical methodology in computational biology that today has become highly relevant for medical investigation of single-cell data, e.g., in the context of cancer development. The exponential size of the tree space is, unfortunately, a substantial obstacle for Bayesian phylogenetic inference using Markov chain Monte Carlo based methods since these rely on local operations. And although more recent variational inference (VI) based methods offer speed improvements, they rely on expensive auto-differentiation operations for learning the variational parameters. We propose VaiPhy, a remarkably fast VI based algorithm for approximate posterior inference in an augmented tree space. VaiPhy produces marginal log-likelihood estimates on par with the state-of-the-art methods on real data and is considerably faster since it does not require auto-differentiation. Instead, VaiPhy combines coordinate ascent update equations with two novel sampling schemes: (i) SLANTIS, a proposal distribution for tree topologies in the augmented tree space, and (ii) the JC sampler, to the best of our knowledge, the first-ever scheme for sampling branch lengths directly from the popular Jukes-Cantor model. We compare VaiPhy in terms of density estimation and runtime. Additionally, we evaluate the reproducibility of the baselines. We provide our code on GitHub: \url{https://github.com/Lagergren-Lab/VaiPhy}.Comment: NeurIPS-22 conference pape

    Learning with MISELBO: The Mixture Cookbook

    Full text link
    Mixture models in variational inference (VI) is an active field of research. Recent works have established their connection to multiple importance sampling (MIS) through the MISELBO and advanced the use of ensemble approximations for large-scale problems. However, as we show here, an independent learning of the ensemble components can lead to suboptimal diversity. Hence, we study the effect of instead using MISELBO as an objective function for learning mixtures, and we propose the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network. Two major insights led to the construction of this novel composite model. First, mixture models have potential to be off-the-shelf tools for practitioners to obtain more flexible posterior approximations in VAEs. Therefore, we make them more accessible by demonstrating how to apply them to four popular architectures. Second, the mixture components cooperate in order to cover the target distribution while trying to maximize their diversity when MISELBO is the objective function. We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling. Finally, we demonstrate the superiority of the Mixture VAEs' learned feature representations on both image and single-cell transcriptome data, and obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets. Code available here: \url{https://github.com/Lagergren-Lab/MixtureVAEs}

    KL/TV Reshuffling: Statistical Distance Based Offspring Selection in SMC Methods

    No full text
    Over the years, sequential Monte Carlo (SMC) theory, and, equivalently, particle filter (PF) theory, has enjoyed much attention from researchers. However, the development of innovative resampling methods, also known as offspring selection methods, has long been declining, with most of the popular schemes dating back two decades. In particular, the set of deterministic offspring selection methods is limited. In light of this, and inspired by variational inference, we propose offspring selection schemes that multiply or discard particles in order to minimize statistical distances between relevant distributions. By regarding offspring selection as a problem of minimizing statistical distances, we further bridge the gap between optimisation-based density estimation and SMC theory. Our contribution is twofold: we provide novel, deterministic offspring selection schemes, and we extend the class of SMC algorithms by using particle likelihoods instead of importance weights when doing offspring selection. Our proposed methods outperform or compare favourably with the two most popular resampling schemes on density-estimation benchmark tests commonly used in the SMC and particle Markov chain Monte Carlo (PMCMC) literature.
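The proposed KL/TV schemes are not detailed in the abstract. For context, here is a minimal sketch of systematic resampling, one of the standard offspring-selection schemes that new methods in this area are typically benchmarked against (the implementation is ours, an illustrative baseline rather than the paper's method):

```python
import random

def systematic_resample(weights, seed=0):
    """Systematic resampling: draw one shared uniform offset and take n
    evenly spaced points through the normalized cumulative weights,
    returning the index of the surviving particle for each offspring slot."""
    rng = random.Random(seed)
    n = len(weights)
    total = sum(weights)
    u = rng.random()  # single shared offset in [0, 1)
    positions = [(u + i) / n for i in range(n)]
    cumulative, c = [], 0.0
    for w in weights:
        c += w / total
        cumulative.append(c)
    cumulative[-1] = 1.0  # guard against floating-point shortfall
    indexes, i = [], 0
    for pos in positions:
        while pos > cumulative[i]:
            i += 1
        indexes.append(i)
    return indexes
```

With equal weights every particle survives exactly once, and a particle carrying all the weight is duplicated into every slot; deterministic KL/TV-style selection replaces this randomized duplication with an optimisation over offspring counts.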

    Applicability of a Translucent Barrier Based Model of Noise

    No full text
    The aim of this project was to create our own data set consisting of images of fruits and vegetables. A subset of the data set was composed of images where the fruits and vegetables were obscured by a plastic bag. We then evaluated the difficulty of this data set using a simple kernel machine algorithm, whose task was to classify the different types of fruits and vegetables present in the data set. Classification performance drops considerably when the obscured subset is introduced. We also created the data set in different pixel dimensions, substantially reducing the computation time of the algorithm without a large drop in classification performance. This enables algorithms whose complexity is highly dependent on input dimension size to use the data set. From our different experimental setups we were able to conclude that the machine outperforms humans on small input dimensions, given that the humans had no prior knowledge of the data set.

    Multiple Importance Sampling ELBO and Deep Ensembles of Variational Approximations

    Get PDF
    In variational inference (VI), the marginal log-likelihood is estimated using the standard evidence lower bound (ELBO) or improved versions such as the importance weighted ELBO (IWELBO). We propose the multiple importance sampling ELBO (MISELBO), a \textit{versatile} yet \textit{simple} framework. MISELBO is applicable in both amortized and classical VI, and it uses ensembles, e.g., deep ensembles, of independently inferred variational approximations. As far as we are aware, the concept of deep ensembles in amortized VI has not previously been established. We prove that MISELBO provides a tighter bound than the average of standard ELBOs, and demonstrate empirically that it gives tighter bounds than the average of IWELBOs. MISELBO is evaluated in density-estimation experiments that include MNIST and several real-data phylogenetic tree inference problems. First, on the MNIST dataset, MISELBO boosts the density-estimation performance of a state-of-the-art model, the nouveau VAE. Second, in the phylogenetic tree inference setting, our framework enhances a state-of-the-art VI algorithm that uses normalizing flows. On top of its technical benefits, MISELBO unveils connections between VI and recent advances in the importance sampling literature, paving the way for further methodological advances. We provide our code at \url{https://github.com/Lagergren-Lab/MISELBO}. Comment: AISTATS 2022.

    VaiPhy: a Variational Inference Based Algorithm for Phylogeny

    No full text
    Phylogenetics is a classical methodology in computational biology that today has become highly relevant for medical investigation of single-cell data, e.g., in the context of cancer development. The exponential size of the tree space is unfortunately a formidable obstacle for current Bayesian phylogenetic inference using Markov chain Monte Carlo based methods since these rely on local operations. And although more recent variational inference (VI) based methods offer speed improvements, they rely on expensive auto-differentiation operations for learning the variational parameters. We propose VaiPhy, a remarkably fast VI based algorithm for approximate posterior inference in an augmented tree space. VaiPhy produces marginal log-likelihood estimates on par with the state-of-the-art methods on real data, and is considerably faster since it does not require auto-differentiation. Instead, VaiPhy combines coordinate ascent update equations with two novel sampling schemes: (i) SLANTIS, a proposal distribution for tree topologies in the augmented tree space, and (ii) the JC sampler, to the best of our knowledge, the first-ever scheme for sampling branch lengths directly from the popular Jukes-Cantor model. We compare VaiPhy in terms of density estimation and runtime. Additionally, we evaluate the reproducibility of the baselines. We provide our code on GitHub: https://github.com/Lagergren-Lab/VaiPhy
