8,616 research outputs found

Multivariate approach to the analysis of correlated RNA-seq data

Author: 박현진
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/02/2017

Thesis (Master's) -- Seoul National University Graduate School: Department of Statistics, February 2017. 박태성.

High-throughput RNA-seq technology has emerged as a powerful tool for understanding the molecular basis of phenotype variation in biology, including disease. Recently, correlated RNA-seq datasets have begun to be generated. Although several approaches have been proposed for identifying differentially expressed genes (DEGs), few can analyze correlated RNA-seq data. We expect the simultaneous analysis of correlated RNA-seq data to increase the power of detecting DEGs. In this paper, we propose a multivariate method for finding DEGs in correlated RNA-seq data based on the Generalized Estimating Equations (GEE) approach. The advantage of the proposed method is that it analyzes correlated RNA-seq datasets simultaneously while accounting for the correlations among them. Through real data analysis and simulation studies, we show that our multivariate approach has higher power to detect DEGs than existing methods.

Contents:
1 Introduction
1.1 Background
1.2 Purpose
2 Material and Methods
2.1 Real RNA-seq datasets
2.1.1 Diet data
2.1.2 Toxicity data
2.2 Review of commonly used approach
2.2.1 edgeR
2.2.2 DESeq
2.2.3 limma+voom
2.3 Proposed approach: GEE method
3 Simulations
3.1 Simulation Settings
3.1.1 Different number of DEGs
3.1.2 Different value of φ
3.1.3 Different number of correlated datasets
3.2 Results of Simulation
4 Application to Real Data
5 Discussion
Bibliography
Abstract (in Korean)

Identification and characterization of dysregulated P-element induced wimpy testis-interacting RNAs in head and neck squamous cell carcinoma.

Author: Ku Jonjei
Kuo Selena Z
Li Pin Xue
Ongkeko Weg M
Saad Maarouf A
Wang-Rodriguez Jessica
Yu Michael Andrew
Zheng Hao
Publication venue: eScholarship, University of California
Publication date: 01/03/2019

It is clear that alcohol consumption is a major risk factor in the pathogenesis of head and neck squamous cell carcinoma (HNSCC); however, the molecular mechanism underlying the pathogenesis of alcohol-associated HNSCC remains poorly understood. The aim of the present study was to identify and characterize P-element-induced wimpy testis (PIWI)-interacting RNAs (piRNAs) and PIWI proteins dysregulated in alcohol-associated HNSCC to elucidate their function in the development of this cancer. Using next generation RNA-sequencing (RNA-seq) data obtained from 40 HNSCC patients, the piRNA and PIWI protein expression of HNSCC samples was compared between alcohol drinkers and non-drinkers. A separate piRNA expression RNA-seq analysis of 18 non-smoker HNSCC patients was also conducted. To verify piRNA expression, reverse transcription-quantitative polymerase chain reaction (RT-qPCR) was performed on the most differentially expressed alcohol-associated piRNAs in ethanol and acetaldehyde-treated normal oral keratinocytes. The correlation between piRNA expression and patient survival was analyzed using Kaplan-Meier estimators and multivariate Cox proportional hazard models. A comparison between alcohol drinking and non-drinking HNSCC patients demonstrated that a panel of 3,223 piRNA transcripts were consistently detected and differentially expressed. RNA-seq analysis and in vitro RT-qPCR verification revealed that 4 of these piRNAs, piR-35373, piR-266308, piR-58510 and piR-38034, were significantly dysregulated between drinking and non-drinking cohorts. Of these four piRNAs, low expression of piR-58510 and piR-35373 significantly correlated with improved patient survival. Furthermore, human PIWI-like protein 4 was consistently upregulated in ethanol and acetaldehyde-treated normal oral keratinocytes. 
These results demonstrate that alcohol consumption may cause dysregulation of piRNA expression in HNSCC, and in vitro verification identified 4 piRNAs that may be involved in the pathogenesis of alcohol-associated HNSCC.
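
The abstract above mentions Kaplan-Meier estimators for the survival analysis. As a minimal sketch of that estimator only (not the authors' pipeline), the product-limit formula S(t) = Π (1 - d_i / n_i) can be written in plain Python. The function name `kaplan_meier` and the toy cohort are hypothetical and do not come from the study.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if the event (death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)   # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # product-limit step: multiply by the fraction surviving this event time
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]      # remove both deaths and censored patients
    return curve

# Hypothetical toy cohort (illustrative numbers, not data from the study)
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(durations, events)
```

Comparing two such curves, for example patients with low versus high expression of one piRNA, would typically be followed by a log-rank test or a Cox model, which this sketch does not cover.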

eScholarship - University of California

Inferring evolutionary histories of pathway regulation from transcriptional profiling data

Author: Brem Rachel B.
Hsu Tiffany Y.
Mostovoy Yulia
Schraiber Joshua G.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 25/07/2013

One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.

Comment: 30 pages, 12 figures, 2 tables, contact authors for supplementary table

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Variance component score test for time-course gene set analysis of longitudinal RNA-seq data

Author: Agniel Denis
Hejblum Boris P
Publication date: 01/01/2017

As gene expression measurement technology is shifting from microarrays to sequencing, the statistical tools available for their analysis must be adapted since RNA-seq data are measured as counts. Recently, it has been proposed to tackle the count nature of these data by modeling log-count reads per million as continuous variables, using nonparametric regression to account for their inherent heteroscedasticity. Adopting such a framework, we propose tcgsaseq, a principled, model-free and efficient top-down method for detecting longitudinal changes in RNA-seq gene sets. Considering gene sets defined a priori, tcgsaseq identifies those whose expression varies over time, based on an original variance component score test accounting for both covariates and heteroscedasticity without assuming any specific parametric distribution for the transformed counts. We demonstrate that despite the presence of a nonparametric component, our test statistic has a simple form and limiting distribution, and both may be computed quickly. A permutation version of the test is additionally proposed for very small sample sizes. Applied to both simulated data and two real datasets, the proposed method is shown to exhibit very good statistical properties, with an increase in stability and power when compared to the state-of-the-art methods ROAST, edgeR and DESeq2, which can fail to control the type I error under certain realistic settings. We have made the method available for the community in the R package tcgsaseq.

Comment: 23 pages, 6 figures, typo corrections & acceptance acknowledgemen
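
To illustrate the transformation the abstract refers to, here is a minimal sketch that converts raw counts to log2 counts per million. The offsets (0.5 added to each count, 1 added to the library size) follow the convention used by limma's voom; whether tcgsaseq uses exactly these offsets is an assumption, and the function name `log2_cpm` and the toy counts are hypothetical.

```python
import math

def log2_cpm(counts, prior=0.5):
    """Map {sample: {gene: raw_count}} to {sample: {gene: log2 counts per million}}.

    Assumed voom-style offsets: `prior` is added to each count and 1 to the
    library size, so that zero counts remain finite after the log.
    """
    transformed = {}
    for sample, gene_counts in counts.items():
        lib_size = sum(gene_counts.values())  # total reads in this sample
        transformed[sample] = {
            gene: math.log2((c + prior) / (lib_size + 1.0) * 1e6)
            for gene, c in gene_counts.items()
        }
    return transformed

# Hypothetical toy counts for one sample
raw = {"s1": {"g1": 0, "g2": 999_999}}
logged = log2_cpm(raw)
```

The transformed values are then treated as continuous, and their mean-dependent variance (heteroscedasticity) is handled separately, for instance through precision weights estimated by nonparametric regression as the abstract describes.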

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL - UPEC / UPEM

Network-based approaches to explore complex biological systems towards network medicine

Author: Conte Federica
Farina Lorenzo
Fiscon Giulia
Paci Paola
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Network medicine relies on different types of networks, from the molecular level of protein–protein interactions to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbations within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting connectivity significance instead of connection density. The past few years have witnessed increasing interest in understanding the molecular mechanisms of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore the lncRNA-associated ceRNA activity in breast invasive carcinoma. In addition, a very promising example of a co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype. 
Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes.
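
The "connectivity significance" mentioned above can be sketched as a hypergeometric tail probability: for a candidate node with a given degree, how likely is it to have at least the observed number of links to seed (known disease) genes by chance? The code below is a simplified illustration under that assumption, not the published DIAMOnD implementation. The function name `connectivity_pvalue` and the toy graph are hypothetical.

```python
from math import comb

def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """P(X >= seed_links) for X ~ Hypergeometric(n_nodes, n_seeds, degree)."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / total

# Hypothetical toy PPI network as an adjacency map (undirected)
graph = {
    "A": {"B", "C", "X"},
    "B": {"A", "X"},
    "C": {"A", "Y"},
    "X": {"A", "B"},
    "Y": {"C", "Z"},
    "Z": {"Y"},
}
seeds = {"A", "B"}

# Score every non-seed node by how surprising its links to the seeds are
pvalues = {
    node: connectivity_pvalue(len(graph), len(seeds), len(nbrs), len(nbrs & seeds))
    for node, nbrs in graph.items()
    if node not in seeds
}
best = min(pvalues, key=pvalues.get)  # most significantly connected candidate
```

In an iterative scheme of this kind, the best-scoring node would be added to the seed set and the scores recomputed. Ranking by p-value rather than by raw link count keeps high-degree hubs from being favored simply because they touch many nodes.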

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza