Seeking unique and common biological themes in multiple gene lists or datasets: pathway pattern extraction pipeline for pathway-level comparative analysis

Author: Anney Che
Ming Yi
Robert M Stephens
Uma Mudunuri
Yi Ming
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself.
Results: We now introduce a new Pathway Pattern Extraction Pipeline (PPEP), which extends the existing WPS application by providing a new pathway-level comparative analysis scheme. To facilitate comparing and correlating results from different studies and sources, PPEP contains new interfaces that allow evaluation of the pathway-level enrichment patterns across multiple gene lists. As an exploratory tool, this analysis pipeline may help reveal the underlying biological themes at both the pathway and gene levels. The analysis scheme provided by PPEP begins with multiple gene lists, which may be derived from different studies in terms of the biological contexts, applied technologies, or methodologies. These lists are then subjected to pathway-level comparative analysis for extraction of pathway-level patterns. This analysis pipeline helps to explore the commonality or uniqueness of these lists at the level of pathways or biological processes from different but relevant biological systems using a combination of statistical enrichment measurements, pathway-level pattern extraction, and graphical display of the relationships of genes and their associated pathways as Gene-Term Association Networks (GTANs) within the WPS platform. As a proof of concept, we have used the new method to analyze many datasets from our collaborators as well as some public microarray datasets.
Conclusion: This tool provides a new pathway-level analysis scheme for integrative and comparative analysis of data derived from different but relevant systems. The tool is freely available as a Pathway Pattern Extraction Pipeline implemented in our existing software package WPS, which can be obtained at http://www.abcc.ncifcrf.gov/wps/wps_index.php
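The idea of comparing gene lists at the pathway level rather than the gene level can be illustrated with a minimal Python sketch. It is not the PPEP or WPS implementation: the abstract does not say which enrichment statistic is used, so a one-sided hypergeometric test is assumed here, and the function names, gene identifiers, pathway names, and background size are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k pathway genes in a list of
    n genes drawn from a background of N genes, K of which are in the pathway."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrichment_pattern(gene_lists, pathways, background_size):
    """Return {pathway: {list_name: p-value}} so lists can be compared per pathway.

    Assumes every gene in the lists and pathways belongs to the background.
    """
    pattern = {}
    for pathway_name, members in pathways.items():
        member_set = set(members)
        row = {}
        for list_name, genes in gene_lists.items():
            gene_set = set(genes)
            overlap = len(gene_set & member_set)
            row[list_name] = hypergeom_pvalue(
                overlap, len(gene_set), len(member_set), background_size
            )
        pattern[pathway_name] = row
    return pattern


# Two hypothetical studies that share no genes but hit the same pathway.
gene_lists = {
    "study_A": ["g1", "g2", "g3"],
    "study_B": ["g4", "g5", "g6"],
}
pathways = {
    "pathway_X": ["g1", "g2", "g4", "g5"],
    "pathway_Y": ["g7", "g8"],
}
pattern = enrichment_pattern(gene_lists, pathways, background_size=20)
for name, row in pattern.items():
    print(name, row)
```

In this toy input the two lists have an empty gene-level intersection, yet both receive the same p-value for pathway_X, which is the kind of shared theme a gene-overlap approach would miss.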

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Transformation of metabolism with age and lifestyle in Antarctic seals: a case study of systems biology approach to cross-species microarray experiment

Author: Andrey Ptitsyn
Shane Kanatous
Publication date: 28/06/2009

*_Background:_* The metabolic transformation that changes Weddell seal pups born on land into aquatic animals is not only interesting for the study of general biology, but it also provides a model for the acquired and congenital muscle disorders which are associated with oxygen metabolism in skeletal muscle. However, the analysis of gene expression in seals is hampered by the lack of specific microarrays and the very limited annotation of known Weddell seal (_Leptonychotes weddellii_) genes.

*_Results:_* Muscle samples from newborn, juvenile, and adult Weddell seals were collected during an Antarctic expedition. Extracted RNA was hybridized on Affymetrix Human Expression chips. Preliminary studies showed a detectable signal from at least 7000 probe sets present in all samples and replicates. Relative expression levels for these genes were used for further analysis of the biological pathways implicated in the metabolic transformation that occurs in the transition from newborn, to juvenile, to adult seals. Cytoskeletal remodeling, WNT signaling, FAK signaling, hypoxia-induced HIF1 activation, and insulin regulation were identified as being among the most important biological pathways involved in the transformation.

*_Conclusion:_* In spite of certain losses in specificity and sensitivity, the cross-species application of gene expression microarrays is capable of solving challenging puzzles in biology. A Systems Biology approach based on gene interaction patterns can compensate adequately for the lack of species-specific genomics information.

Nature Precedings

A systems biology design and implementation of novel bioinformatics software tools for high throughput gene expression analysis

Author: Khan Mohsin Amir Faiz
Publication venue: Brunel University School of Health Sciences and Social Care PhD Theses
Publication date: 01/01/2009

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Microarray technology has revolutionized the field of molecular biology by offering an efficient and cost-effective platform for the simultaneous quantification of thousands of genes or even entire genomes in a single experiment. Unlike Southern blotting, which is restricted to the measurement of one gene at a time, microarrays offer biologists the opportunity to carry out genome-wide experiments in order to gain a systems-level understanding of cell regulation and control. The application of bioinformatics to gene expression analysis has attracted a great deal of attention in the recent past because of specific algorithms and software solutions that attempt to present complex multidimensional microarray data in a biologically coherent fashion that the biologist can understand. This has given rise to some exciting prospects for deciphering microarray data by helping us refine our understanding of the underlying physiological dynamics of disease. Although much progress is being made in the development of specialized bioinformatics software pipelines for decoding large volumes of gene expression data in the context of systems biology, several gaps remain. Perhaps the most notable of these is the increasing demand for software solutions that specialize in automating the comparison of multiple gene expression profiles derived from microarray experiments sharing a common biological theme. This is an important challenge, since common genes with similar expression patterns across different biological conditions are likely to be involved in the same biological process and hence may share the same regulatory signatures. The potential benefits of this in refining our understanding of the physiology of disease are undeniable.
The research presented in this thesis provides a systematic walkthrough of a series of software pipelines developed for the purpose of streamlining gene expression analysis in a systems biology context. Firstly, we present BiSAn, a software tool that deciphers expression data from the perspective of transcriptional regulation. Following this, we present Genome Interaction Analyzer (GIA), which analyzes microarray data in the integrative framework of transcription factor binding sites, protein-protein interactions and molecular pathways. The final contribution is a software pipeline called MicroPath, which analyzes multiple sets of gene expression profiles and attempts to extract common regulatory signatures that may be implicating the biological question

Brunel University Research Archive

Discovering study-specific gene regulatory networks

Author: A Lysenko
AL Barabási
Alberto de la Fuente
Allan Tucker
Artem Lysenko
B Grigorova
B Zhang
D Baek
D Heckerman
DA Samac
DJ Spiegelhalter
E Baalmann
E Segal
E Steele
E Wientjes
F Alakwaa
F Llorente
H Parkinson
J Choi
J Friedman
J Hartigan
J Zhang
JA Ihalainen
JT Damkjær
K Ando
L Marri
M Ashburner
Mansoor Saqi
N Friedman
N Meinshausen
N Mochizuki
O Thimm
P Erdős
P Kirk
P Langfelder
PE Jensen
R Srinivasan
RA Irizarry
S Anvar
S Dash
S Infanger
S Madeira
S Swift
S Zhang
Stephen Swift
T Obayashi
Tanya Curtis
U Andersson
U Sengupta
Valeria Bo
WY Bang
Y Kluger
Y Kwon
YJ Kim
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2014

This article has been made available through the Brunel Open Access Publishing Fund. Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Because these datasets typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have recently been used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks specific to subsets of studies can be identified reliably and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.
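The notion of a subnetwork that is distinctive to one cluster of studies, rather than common to all, can be sketched with plain set operations in Python. This is only an illustration under stated assumptions: it takes per-study edge sets as given (it does not fit glasso and does not compute the intra or inter prediction accuracies described above), and the helper names, the `min_support` parameter, and the study and gene labels are hypothetical.

```python
def edge(u, v):
    """Undirected edge represented as an order-independent pair."""
    return frozenset((u, v))


def unique_edges(networks, cluster, min_support=1.0):
    """Edges found in at least `min_support` of the studies in `cluster`
    and in no study outside it.

    networks: {study_name: set of edges}
    cluster:  collection of study names treated as one experimental condition
    """
    inside = [networks[s] for s in cluster]
    outside = [edges for s, edges in networks.items() if s not in cluster]
    seen_outside = set().union(*outside)
    candidates = set().union(*inside)
    return {
        e
        for e in candidates
        if e not in seen_outside
        and sum(e in net for net in inside) / len(inside) >= min_support
    }


# Hypothetical per-study networks; s1 and s2 share a condition, s3 does not.
networks = {
    "s1": {edge("a", "b"), edge("b", "c")},
    "s2": {edge("a", "b")},
    "s3": {edge("b", "c"), edge("c", "d")},
}
specific = unique_edges(networks, {"s1", "s2"})
print(specific)
```

Lowering `min_support` trades robustness for sensitivity: an edge seen in only some studies of the cluster is then kept, provided it never appears outside the cluster.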

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Rothamsted Repository

Brunel University Research Archive

Error, reproducibility and sensitivity: a pipeline for data processing of Agilent oligonucleotide expression arrays

Author: AR Dabney
Benjamin Chain
BM Bolstad
BP Durbin
BS Everitt
CR Hampton
D Wang
E Birney
Helen Bowen
J Fan
J Rasaiyaah
Jane Rasaiyaah
Jhen Tsang
John Hammond
JP Hammond
L Shi
M Noursadeghi
M Sultan
Mahdad Noursadeghi
MN McCall
PA 't Hoen
TA Patterson
TC Kroll
WE Johnson
Wilfried Posch
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples.
Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators.
Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and of a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
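The inter-array variability estimate quoted above (an SD in log2 units, also expressed relative to the mean log signal) can be illustrated with a short Python sketch. It is not the authors' R function, which the abstract does not show: the function name, the input layout, and the toy intensities are assumptions made purely for illustration.

```python
from math import log2
from statistics import mean, stdev


def interarray_sd(replicates):
    """Summarise variability between replicate arrays.

    replicates: list of arrays, each a list of positive raw intensities
                for the same genes in the same order.
    Returns (mean per-gene SD in log2 units, that SD as a fraction of the
    mean log2 signal).
    """
    logged = [[log2(x) for x in array] for array in replicates]
    # One SD per gene, taken across the replicate arrays.
    per_gene_sd = [stdev(values) for values in zip(*logged)]
    mean_sd = mean(per_gene_sd)
    mean_log_signal = mean(v for array in logged for v in array)
    return mean_sd, mean_sd / mean_log_signal


# Two hypothetical replicate arrays measuring the same two genes.
replicates = [
    [256.0, 1024.0],
    [1024.0, 4096.0],
]
sd_log2, sd_fraction = interarray_sd(replicates)
print(sd_log2, sd_fraction)
```

Working on the log2 scale means a fixed SD corresponds to a fixed fold-change, which is why the abstract can report a single figure across genes of very different intensity.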

Central Archive at the University of Reading

Crossref

Springer - Publisher Connector

UCL Discovery

PubMed Central

Warwick Research Archives Portal Repository

Model-based clustering with data correction for removing artifacts in gene expression data

Author: Raftery Adrian E.
Yeung Ka Yee
Young William Chad
Publication date: 19/02/2016

The NIH Library of Integrated Network-based Cellular Signatures (LINCS) contains gene expression data from over a million experiments, using Luminex Bead technology. Only 500 colors are used to measure the expression levels of the 1,000 landmark genes, and the data for the resulting pairs of genes are deconvolved. The raw data are sometimes inadequate for reliable deconvolution, leading to artifacts in the final processed data. These include the expression levels of paired genes being flipped or given the same value, and clusters of values that are not at the true expression level. We propose a new method called model-based clustering with data correction (MCDC) that is able to identify and correct these three kinds of artifacts simultaneously. We show that MCDC improves the resulting gene expression data in terms of agreement with external baselines, as well as improving results from subsequent analysis.
Comment: 28 pages

arXiv.org e-Print Archive

University of Washington: UW Tacoma Digital Commons

How to understand the cell by breaking it: network analysis of gene perturbation screens

Author: Markowetz Florian
Publication venue: Public Library of Science (PLoS)
Publication date: 26/11/2009

Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state-of-the-art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome.
Comment: Review based on ISMB 2009 tutorial; after two rounds of revision

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central