VaiPhy: a Variational Inference Based Algorithm for Phylogeny
Phylogenetics is a classical methodology in computational biology that today
has become highly relevant for medical investigation of single-cell data, e.g.,
in the context of cancer development. The exponential size of the tree space
is, unfortunately, a substantial obstacle for Bayesian phylogenetic inference
using Markov chain Monte Carlo based methods since these rely on local
operations. Although more recent variational inference (VI) based methods
offer speed improvements, they rely on expensive auto-differentiation
operations for learning the variational parameters. We propose VaiPhy, a
remarkably fast VI based algorithm for approximate posterior inference in an
augmented tree space. VaiPhy produces marginal log-likelihood estimates on par
with the state-of-the-art methods on real data and is considerably faster since
it does not require auto-differentiation. Instead, VaiPhy combines coordinate
ascent update equations with two novel sampling schemes: (i) SLANTIS, a
proposal distribution for tree topologies in the augmented tree space, and (ii)
the JC sampler, to the best of our knowledge, the first-ever scheme for
sampling branch lengths directly from the popular Jukes-Cantor model. We
compare VaiPhy in terms of density estimation and runtime. Additionally, we
evaluate the reproducibility of the baselines. We provide our code on GitHub:
\url{https://github.com/Lagergren-Lab/VaiPhy}. Comment: NeurIPS-22 conference paper.
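The JC sampler itself is specified in the paper rather than in this abstract. As illustrative context only, the sketch below shows how a branch length could be drawn under the standard Jukes-Cantor (JC69) substitution model for a single branch, using a flat prior and a grid-based inverse-CDF draw. The function names, the prior, and the discretisation are assumptions made here for illustration; this is not VaiPhy's actual JC sampler.

import numpy as np

def jc69_log_likelihood(b, n_same, n_diff):
    # JC69 transition probabilities across a branch of length b
    # (expected substitutions per site):
    #   P(same nucleotide at both ends)      = 1/4 + 3/4 * exp(-4b/3)
    #   P(one specific different nucleotide) = 1/4 - 1/4 * exp(-4b/3)
    p_same = 0.25 + 0.75 * np.exp(-4.0 * b / 3.0)
    p_diff = 0.25 - 0.25 * np.exp(-4.0 * b / 3.0)
    return n_same * np.log(p_same) + n_diff * np.log(p_diff)

def sample_branch_length(n_same, n_diff, b_max=5.0, n_grid=2000, rng=None):
    # Draw one branch length from the flat-prior posterior by normalising the
    # JC69 likelihood over a grid of candidate lengths (discretised inverse CDF).
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(1e-6, b_max, n_grid)
    log_post = jc69_log_likelihood(grid, n_same, n_diff)
    weights = np.exp(log_post - log_post.max())
    weights /= weights.sum()
    return rng.choice(grid, p=weights)

# Example: two aligned sequences of 100 sites, differing at 12 of them.
print(sample_branch_length(n_same=88, n_diff=12))

The grid approximation above is only meant to make the Jukes-Cantor quantities concrete; the paper's JC sampler draws branch lengths directly from the model without such a discretisation, and the exact scheme is given in the paper and repository.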
Novel likelihood-based inference techniques for sequential data with medical and biological applications
The probabilistic approach is crucial in modern machine learning, as it provides transparency and quantification of uncertainty. This thesis is concerned with probabilistic building blocks, i.e., probabilistic graphical models (PGMs), followed by the application of standard deterministic approximate inference, i.e., Expectation-Maximization (EM) and Variational Inference (VI). The contributions concern improvements to parameter learning with EM and, most importantly, novel probabilistic models and new VI methodology for phylogenetic inference. Firstly, this thesis improves upon the vanilla EM algorithm for hidden Markov models (HMMs) and mixtures of HMMs (MHMMs). The proposed constrained EM algorithm for HMMs compensates for the lack of long-range context in HMMs. The two other proposed novel regularized EM algorithms provide better local optima for parameter learning of MHMMs, particularly in cancer analysis. The novel EM algorithms are merely modifications of the standard EM algorithm that do not add any extra complexity, unlike other modifications targeting the context and poor-local-optima issues. Secondly, this thesis introduces one novel and one augmented PGM together with VI frameworks for robust and fast Bayesian inference. The first method, CopyMix, uses a single-phase framework to simultaneously provide clonal decomposition and copy number profiling of single-cell cancer data. Thus, in contrast to previous approaches, it does not pursue the two objectives in a sequential and ad hoc fashion, which is prone to introducing artifacts and errors. The second method provides an augmented PGM with a faster framework for phylogenetic inference; specifically, a novel natural gradient-based VI algorithm is devised. Regarding the cancer analysis, this thesis concludes that CopyMix is superior to MHMMs, even though the two novel EM algorithms proposed in this thesis partially improve the performance of clonal tumor decomposition. The empirical support presented throughout this thesis confirms that the proposed likelihood-based methods and optimization tools provide opportunities for better analysis algorithms, particularly suited for cancer research.
VaiPhy: a Variational Inference Based Algorithm for Phylogeny
Phylogenetics is a classical methodology in computational biology that today has become highly relevant for medical investigation of single-cell data, e.g., in the context of cancer development. The exponential size of the tree space is unfortunately a formidable obstacle for current Bayesian phylogenetic inference using Markov chain Monte Carlo based methods since these rely on local operations. Although more recent variational inference (VI) based methods offer speed improvements, they rely on expensive auto-differentiation operations for learning the variational parameters. We propose VaiPhy, a remarkably fast VI based algorithm for approximate posterior inference in an augmented tree space. VaiPhy produces marginal log-likelihood estimates on par with the state-of-the-art methods on real data, and is considerably faster since it does not require auto-differentiation. Instead, VaiPhy combines coordinate ascent update equations with two novel sampling schemes: (i) SLANTIS, a proposal distribution for tree topologies in the augmented tree space, and (ii) the JC sampler, to the best of our knowledge the first-ever scheme for sampling branch lengths directly from the popular Jukes-Cantor model. We compare VaiPhy in terms of density estimation and runtime. Additionally, we evaluate the reproducibility of the baselines. We provide our code on GitHub: https://github.com/Lagergren-Lab/VaiPhy.