2,057 research outputs found

Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data

Author: McNicholas Paul D.
Rothstein Steven J.
Silva Anjali
Subedi Sanjeena
Publication date: 22/07/2018

Three-way data structures, characterized by three entities (the units, the variables, and the occasions), are frequent in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for n genes across p conditions at r occasions. Matrix-variate distributions offer a natural way to model three-way data, and mixtures of matrix-variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks. In this work, a mixture of matrix-variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By considering the matrix-variate structure, full information on the conditions and occasions of the RNA sequencing dataset is considered simultaneously, and the number of covariance parameters to be estimated is reduced. A Markov chain Monte Carlo expectation-maximization algorithm is used for parameter estimation, and information criteria are used for model selection. The models are applied to both real and simulated data, giving favourable clustering results.
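
To make the claim about fewer covariance parameters concrete, the following is a minimal sketch of a generic matrix-variate Poisson-log normal mixture. The notation (Y, θ, M_g, Φ_g, Ω_g, π_g, G) is an assumption chosen for illustration and is not taken from the paper; normalization terms such as library size are omitted.

```latex
% Each gene has an r x p count matrix Y (occasions by conditions).
% Latent log-means theta follow a matrix-variate normal distribution.
\begin{align*}
Y_{ij} \mid \theta_{ij} &\sim \mathrm{Poisson}\!\left(e^{\theta_{ij}}\right),
  \qquad i = 1,\dots,r,\; j = 1,\dots,p, \\
\theta &\sim \mathcal{N}_{r \times p}(M_g, \Phi_g, \Omega_g)
  \;\Longleftrightarrow\;
  \operatorname{vec}(\theta) \sim \mathcal{N}_{rp}\!\left(\operatorname{vec}(M_g),\, \Omega_g \otimes \Phi_g\right), \\
f(Y) &= \sum_{g=1}^{G} \pi_g \, f(Y \mid M_g, \Phi_g, \Omega_g),
  \qquad \sum_{g=1}^{G} \pi_g = 1 .
\end{align*}
% Covariance parameters per component:
%   unstructured rp x rp covariance:  rp(rp+1)/2
%   Kronecker structure:              r(r+1)/2 + p(p+1)/2
% Example with r = 2, p = 3:  21 versus 3 + 6 = 9.
```

Under this sketch, the Kronecker structure replaces one rp × rp covariance matrix with an r × r matrix Φ_g and a p × p matrix Ω_g, which is where the reduction in parameters comes from.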

arXiv.org e-Print Archive

Clustering high-throughput sequencing data with Poisson mixture models

Author: Celeux Gilles
Martin-Magniette Marie-Laure
Maugis-Rabusseau Cathy
Rau Andrea
Publication venue: HAL CCSD
Publication date: 01/01/2011

In recent years, gene expression studies have increasingly made use of next-generation sequencing technology. In turn, research concerning the appropriate statistical methods for the analysis of digital gene expression has flourished, primarily in the context of normalization and differential analysis. In this work, we focus on the question of clustering digital gene expression profiles as a means to discover groups of co-expressed genes. We propose two parameterizations of a Poisson mixture model to cluster expression profiles of high-throughput sequencing data. A set of simulation studies compares the performance of the proposed models with that of an approach developed for a similar type of data, namely serial analysis of gene expression (SAGE). We also study the performance of these approaches on two real high-throughput sequencing data sets. The R package HTSCluster used to implement the proposed Poisson mixture models is available on CRAN.
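
As a rough illustration of the idea of clustering count profiles with a Poisson mixture, here is a minimal pure-Python EM sketch. It is not the HTSCluster implementation and does not reproduce the paper's two parameterizations: it assumes independent Poisson counts per condition with one mean per cluster and condition, and it omits library-size normalization. The function names `log_poisson_pmf` and `fit_poisson_mixture` are hypothetical.

```python
import math
import random


def log_poisson_pmf(y, lam):
    """Log of the Poisson probability mass function at count y with mean lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def fit_poisson_mixture(counts, k, n_iter=50, seed=0):
    """Fit a k-component Poisson mixture by EM (illustrative sketch only).

    counts: list of rows (genes), each a list of non-negative integer counts.
    Returns mixing proportions, per-cluster means, and hard cluster labels.
    """
    rng = random.Random(seed)
    n, p = len(counts), len(counts[0])
    # Initialise cluster means from k randomly chosen rows; the 0.5 offset
    # keeps every mean strictly positive so log(lam) is defined.
    starts = rng.sample(range(n), k)
    lam = [[counts[i][j] + 0.5 for j in range(p)] for i in starts]
    pi = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed in log space.
        resp = []
        for row in counts:
            logw = [
                math.log(pi[c])
                + sum(log_poisson_pmf(y, lam[c][j]) for j, y in enumerate(row))
                for c in range(k)
            ]
            m = max(logw)
            w = [math.exp(v - m) for v in logw]
            total = sum(w)
            resp.append([v / total for v in w])
        # M-step: update proportions and means from the responsibilities.
        for c in range(k):
            nk = sum(r[c] for r in resp)
            pi[c] = max(nk / n, 1e-12)  # floor avoids log(0) next iteration
            for j in range(p):
                weighted = sum(r[c] * row[j] for r, row in zip(resp, counts))
                lam[c][j] = max(weighted / max(nk, 1e-12), 1e-6)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return pi, lam, labels


if __name__ == "__main__":
    toy = [[1, 2], [0, 1], [2, 1], [50, 60], [55, 58], [48, 62]]
    proportions, means, labels = fit_poisson_mixture(toy, k=2)
    print(labels)
```

On the toy data, the three low-count rows and the three high-count rows should fall into separate clusters; which cluster receives which index depends on the random initialisation.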

HAL Evry

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

HAL-INSA Toulouse

ProdInra

Hal-Diderot

Detecting differential usage of exons from RNA-Seq data

Author: Alejandro Reyes
Simon Anders
Wolfgang Huber
Publication date: 25/01/2012

RNA-Seq is a powerful tool for the study of alternative splicing and other forms of alternative isoform expression. Understanding the regulation of these processes requires comparisons between treatments, tissues, or conditions. For the analysis of such experiments, we present _DEXSeq_, a statistical method to test for differential exon usage in RNA-Seq data. _DEXSeq_ employs generalized linear models and offers good detection power and reliable control of false discoveries by taking biological variation into account. An implementation is available as an R/Bioconductor package.

CiteSeerX

Crossref

Nature Precedings

Bayesian Modeling Approaches for Temporal Dynamics in RNA-seq Data

Author: Oh Sunghee
Song Seongho
Publication venue: IntechOpen
Publication date: 02/05/2018

Differential expression analysis has played a central role over the last decades in addressing a variety of biological questions by characterizing abnormal patterns of cellular and molecular function. To date, the identification of differentially expressed genes and isoforms has focused increasingly on temporal dynamics across a series of time points. Bayesian strategies have been successfully applied to a range of high-throughput data platforms, for instance in differential expression analysis and network modules for transcriptome data, peak callers for ChIP-seq data, target prediction for microRNA data, and meta-methods spanning different platforms. In this chapter, we discuss how our methodological work based on Bayesian models addresses important questions that arise in the temporal dynamics of RNA-seq data.

IntechOpen

Crossref

Statistical methods for high-throughput genomic data

Author: De Beuf Kristof
Publication venue: Ghent University. Faculty of Bioscience Engineering
Publication date: 01/01/2013

Ghent University Academic Bibliography

Bayesian Methods for Gene Expression Analysis from High-Throughput Sequencing data

Author: Glaus Peter
Publication date: 01/08/2014

The University of Manchester - Institutional Repository