214 research outputs found

Generalized Species Sampling Priors with Latent Beta reinforcements

Author: Blei D.
Blei D.M.
Cardin N.
Curtis C.
Edoardo M. Airoldi
Fabrizio Leisen
Federico Bassetti
Ferguson J.D.
Fortini S.
Jara A.
Michele Guindani
Müller P.
Park J.H.
Sudderth E.B.
Sun W.
Thiago Costa
Publication date: 01/01/2014

Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet process and the two-parameter Poisson-Dirichlet process. Unlike existing work, the proposed construction provides a complete characterization of the joint process. We then propose the use of such a process as a prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet process mixtures and hidden Markov models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data. Comment: To appear in the Journal of the American Statistical Association.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

PubMed Central

eScholarship - University of California

Kent Academic Repository

FigShare

Measuring the Influence of Observations in HMMs through the Kullback-Leibler Distance

Author: Nuel Gregory
Perduca Vittorio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

We measure the influence of individual observations on the sequence of hidden states of a Hidden Markov Model (HMM) by means of the Kullback-Leibler distance (KLD). Namely, we consider the KLD between the conditional distribution of the hidden state chain given the complete sequence of observations and the conditional distribution of the hidden chain given all the observations but the one under consideration. We introduce a linear-complexity algorithm for computing the influence of all the observations. As an illustration, we investigate the application of our algorithm to the problem of detecting outliers in HMM data series.
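The influence measure described in this abstract can be illustrated on a toy model. The sketch below is not the authors' linear-complexity algorithm: it enumerates every hidden path by brute force, which is only feasible for very short sequences. The function names, the toy parameters, and the direction of the divergence (full-data posterior relative to leave-one-out posterior) are all assumptions made for illustration.

```python
import itertools
import math


def path_posterior(init, trans, emit, obs, skip=None):
    """Posterior over all hidden paths, by brute-force enumeration.

    The emission term at index `skip` is dropped, i.e. that
    observation is treated as missing. (Hypothetical helper.)
    """
    n_states = len(init)
    weights = {}
    for path in itertools.product(range(n_states), repeat=len(obs)):
        w = init[path[0]]
        for t, s in enumerate(path):
            if t > 0:
                w *= trans[path[t - 1]][s]
            if t != skip:
                w *= emit[s][obs[t]]
        weights[path] = w
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


def influence(init, trans, emit, obs):
    """KL(full posterior || leave-one-out posterior) for each observation.

    The direction of the divergence is an assumption of this sketch.
    """
    full = path_posterior(init, trans, emit, obs)
    scores = []
    for i in range(len(obs)):
        loo = path_posterior(init, trans, emit, obs, skip=i)
        kl = sum(p * math.log(p / loo[path])
                 for path, p in full.items() if p > 0)
        scores.append(kl)
    return scores


# Toy two-state HMM with three symbols; symbol 2 is equally likely
# in both states, so it carries no information about the hidden chain.
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.7, 0.1, 0.2], [0.1, 0.7, 0.2]]
obs = [0, 0, 1, 2, 0]

scores = influence(init, trans, emit, obs)
for t, score in enumerate(scores):
    print(f"observation {t} (symbol {obs[t]}): influence {score:.4f}")
```

In this toy setting an observation whose emission probability is the same in every state has influence zero (up to rounding), while an observation that disagrees with its neighbours gets a comparatively large score, which is the intuition behind using the measure for outlier detection.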

arXiv.org e-Print Archive

HAL Descartes

Assessing Population Level Genetic Instability via Moving Average

Author: Betensky Rebecca
Cai Tianxi
McDaniel Samuel
Publication venue: Collection of Biostatistics Research Archive
Publication date: 06/11/2007

Collection Of Biostatistics Research Archive

S-estimation of hidden Markov models

Author: Farcomeni Alessio
L. Greco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

A method for robust estimation of dynamic mixtures of multivariate distributions is proposed. The EM algorithm is modified by replacing the classical M-step with high-breakdown S-estimation of location and scatter, performed by using the bisquare multivariate S-estimator. Estimates are obtained by solving a system of estimating equations that are characterized by component-specific sets of weights, based on robust Mahalanobis-type distances. Convergence of the resulting algorithm is proved, and its finite-sample behavior is investigated by means of a brief simulation study and an application to a multivariate time series of daily returns for seven stock markets.

ART

Archivio della ricerca- Università di Roma La Sapienza

Functional characterization and annotation of trait-associated genomic regions by transcriptome analysis

Author: Du Yang (gnd: 1062825780)
Publication venue: Universität Rostock Rostock
Publication date: 01/01/2014

In this work, two novel implementations have been presented, which could assist in the design and data analysis of high-throughput genomic experiments. An efficient and flexible tiling probe selection pipeline utilizing the penalized uniqueness score has been implemented, which could be employed in the design of various types and scales of genome tiling tasks. A novel hidden semi-Markov model (HSMM) implementation is made available within the Bioconductor project, which provides a unified interface for segmenting genomic data in a wide range of research subjects.

Rostocker Dokumentenserver

Hidden Markov Models

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection and engineering. I hope that readers will find this book useful and helpful for their own research.

Directory of Open Access Books (DOAB)

Robust unmixing of tumor states in array comparative genomic hybridization data

Author: Atkins
Ayshwarya Subramanian
Beerenwinkel
Bild
Boyd
Chan
Charalampos Tsourakakis
Comon
David Tolliver
Desper
Ehrlich
Etzioni
Gerstung
Golub
Guha
Höglund
Kamb
Lamy
Lee
Liu
Navin
Olshen
Packer
Paik
Pearson
Pegram
Pennington
Perou
Russell Schwartz
Schwartz
Schölkopf
Shackney
Sorlie
Sotiriou
Stanley Shackney
Zhou
Publication venue: Oxford University Press

Motivation: Tumorigenesis is an evolutionary process by which tumor cells acquire sequences of mutations leading to increased growth, invasiveness and eventually metastasis. It is hoped that by identifying the common patterns of mutations underlying major cancer sub-types, we can better understand the molecular basis of tumor development and identify new diagnostics and therapeutic targets. This goal has motivated several attempts to apply evolutionary tree reconstruction methods to assays of tumor state. Inference of tumor evolution is in principle aided by the fact that tumors are heterogeneous, retaining remnant populations from different stages of their development together with contaminating healthy cell populations. In practice, though, this heterogeneity complicates interpretation of tumor data because distinct cell types are conflated by common methods for assaying the tumor state. We previously proposed a method to computationally infer cell populations from measures of tumor-wide gene expression through a geometric interpretation of mixture type separation, but this approach deals poorly with noisy and outlier data.

Crossref

PubMed Central

Gene Copy Number Analysis for Family Data Using Semiparametric Copula Model

Author: Bonney George
Chen Guanjie
Rotimi Charles
Yuan Ao
Zhou Zhong-Cheng
Publication venue: Libertas Academica
Publication date: 01/01/2008

Gene copy number changes are common characteristics of many genetic disorders. A new technology, array comparative genomic hybridization (a-CGH), is widely used today to screen for gains and losses in cancers and other genetic diseases with high resolution at the genome level or for specific chromosomal regions. Statistical methods for analyzing such a-CGH data have been developed. However, most of the existing methods are for data on unrelated individuals, and their results explain horizontal variations in copy number changes. It is potentially meaningful to develop a statistical method that allows for the analysis of family data to investigate the vertical kinship effects as well. Here we consider a semiparametric model based on a clustering method in which the marginal distributions are estimated nonparametrically, and the familial dependence structure is modeled by a copula. The model is illustrated and evaluated using simulated data. Our results show that the proposed method is more robust than the commonly used multivariate normal model. Finally, we demonstrate the utility of our method using a real dataset.

Directory of Open Access Journals

PubMed Central

Approximation algorithms for speeding up dynamic programming and denoising aCGH data

Author: Barry D.
Charalampos E. Tsourakakis
Ding J.
Gary L. Miller
Jagadish H. V.
Lejeune J.
Maria A. Tsiarli
Richard Peng
Russell Schwartz
Shi Y.
Viti F.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref