ResponseNet: revealing signaling and regulatory networks linking genetic and transcriptomic screening data
Cellular response to stimuli is typically complex and involves both regulatory and metabolic processes. Large-scale experimental efforts to identify components of these processes often comprise genetic screening and transcriptomic profiling assays. We previously established that in yeast genetic screens tend to identify response regulators, while transcriptomic profiling assays tend to identify components of metabolic processes. ResponseNet is a network-optimization approach that integrates the results from these assays with data of known molecular interactions. Specifically, ResponseNet identifies a high-probability sub-network, composed of signaling and regulatory molecular interaction paths, through which putative response regulators may lead to the measured transcriptomic changes. Computationally, this is achieved by formulating a minimum-cost flow optimization problem and solving it efficiently using linear programming tools. The ResponseNet web server offers a simple interface for applying ResponseNet. Users can upload weighted lists of proteins and genes and obtain a sparse, weighted, molecular interaction sub-network connecting their data. The predicted sub-network and its gene ontology enrichment analysis are presented graphically or as text. Consequently, the ResponseNet web server enables researchers who were previously limited to separate analysis of their distinct, large-scale experiments to meaningfully integrate their data and substantially expand their understanding of the underlying cellular response. ResponseNet is available at http://bioinfo.bgu.ac.il/respnet.
Funding: Seventh Framework Programme (European Commission) (FP7-PEOPLE-MCA-IRG); United States-Israel Binational Science Foundation (Grant 2009323)
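The minimum-cost flow idea in the ResponseNet abstract can be illustrated with a small sketch. It is not the ResponseNet formulation itself: the node names, probabilities, and the choice of edge cost as -log(probability) are assumptions made for illustration, and the flow is computed with successive shortest paths (Bellman-Ford) in plain Python rather than with a linear programming solver. Routing one unit of flow from an artificial source `S` (attached to genetic hits) to an artificial sink `T` (attached to a transcriptomic target) then selects the most probable connecting path.

```python
import math

def min_cost_flow(edges, source, sink, demand):
    """Route `demand` units from source to sink at minimum total cost.

    edges: list of (u, v, capacity, cost). Returns (total_cost, {(u, v): flow}).
    Uses successive shortest paths with Bellman-Ford on the residual graph.
    """
    to, cap, cost, adj = [], [], [], {}
    for u, v, c, w in edges:
        # Edge 2k is the forward arc, edge 2k+1 its residual (backward) arc.
        for a, b, cc, ww in ((u, v, c, w), (v, u, 0.0, -w)):
            adj.setdefault(a, []).append(len(to))
            adj.setdefault(b, [])
            to.append(b)
            cap.append(cc)
            cost.append(ww)
    eps, sent, total = 1e-12, 0.0, 0.0
    while demand - sent > eps:
        dist = {n: math.inf for n in adj}
        dist[source] = 0.0
        prev = {}
        for _ in range(len(adj) - 1):
            changed = False
            for u in adj:
                if dist[u] == math.inf:
                    continue
                for i in adj[u]:
                    if cap[i] > eps and dist[u] + cost[i] < dist[to[i]] - eps:
                        dist[to[i]] = dist[u] + cost[i]
                        prev[to[i]] = i
                        changed = True
            if not changed:
                break
        if dist[sink] == math.inf:
            raise ValueError("demand cannot be routed")
        # Find the bottleneck capacity along the shortest path, then augment.
        push, node = demand - sent, sink
        while node != source:
            push = min(push, cap[prev[node]])
            node = to[prev[node] ^ 1]
        node = sink
        while node != source:
            i = prev[node]
            cap[i] -= push
            cap[i ^ 1] += push
            node = to[i ^ 1]
        sent += push
        total += push * dist[sink]
    flow = {}
    for k, (u, v, _, _) in enumerate(edges):
        if cap[2 * k + 1] > eps:  # residual capacity of the backward arc = flow sent
            flow[(u, v)] = cap[2 * k + 1]
    return total, flow

# Hypothetical toy network: two genetic hits, two regulators, one target gene.
p = lambda prob: -math.log(prob)  # assumed cost: more probable edges are cheaper
edges = [
    ("S", "hitA", 1, 0.0), ("S", "hitB", 1, 0.0),
    ("hitA", "tf1", 1, p(0.9)), ("hitB", "tf1", 1, p(0.3)), ("hitB", "tf2", 1, p(0.8)),
    ("tf1", "geneX", 1, p(0.9)), ("tf2", "geneX", 1, p(0.5)),
    ("geneX", "T", 1, 0.0),
]
total, flow = min_cost_flow(edges, "S", "T", 1.0)
print(sorted(flow), math.exp(-total))
```

Because costs are negative log-probabilities, minimizing the summed cost along a path is the same as maximizing the product of its edge probabilities, which is why the selected sub-network can be read as a high-probability one.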
Mapping transcription mechanisms from multimodal genomic data
Background
Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the sheer size of the data.
Results
We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans-regulating eQTLs. Applied to a dataset of leukemia patients, the method identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate.
Conclusions
The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia.
Funding: National Human Genome Research Institute (U.S.) (R01HG003354); National Institute of Allergy and Infectious Diseases (U.S.) (U19 AI067854-05); National Heart, Lung, and Blood Institute (grant T32 HL007427-28); National Institutes of Health (U.S.) (grant K99 LM009826)
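The abstract does not say which information-theoretic quantity is used to link SNPs and transcripts. As an assumption for illustration only, the sketch below scores a SNP-transcript pair with the plug-in estimate of mutual information between a discrete genotype and a binned expression level; the sample values are invented.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Hypothetical data: genotype coded as minor-allele count, expression binned.
genotype = [0, 0, 1, 1]
linked_expr = ["low", "low", "high", "high"]      # fully determined by genotype
unlinked_expr = ["low", "high", "low", "high"]    # independent of genotype

mi_linked = mutual_information(genotype, linked_expr)
mi_unlinked = mutual_information(genotype, unlinked_expr)
print(mi_linked, mi_unlinked)
```

A pair scoring well above what is seen for permuted data would be a candidate eQTL under this reading; how such scores are assembled into a full map is beyond what the abstract specifies.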
A direct comparison of protein interaction confidence assignment schemes
BACKGROUND: Recent technological advances have enabled high-throughput measurements of protein-protein interactions in the cell, producing large protein interaction networks for various species at an ever-growing pace. However, common technologies like yeast two-hybrid may experience high rates of false positive detection. To combat false positive discoveries, a number of different methods have been recently developed that associate confidence scores with protein interactions. Here, we perform a rigorous comparative analysis and performance assessment among these different methods. RESULTS: We measure the extent to which each set of confidence scores correlates with similarity of the interacting proteins in terms of function, expression, pattern of sequence conservation, and homology to interacting proteins in other species. We also employ a new metric, the Signal-to-Noise Ratio of protein complexes embedded in each network, to assess the power of the different methods. Seven confidence assignment schemes, including those of Bader et al., Deane et al., Deng et al., Sharan et al., and Qi et al., are compared in this work. CONCLUSION: Although the performance of each assignment scheme varies depending on the particular metric used for assessment, we observe that Deng et al. yields the best performance overall (in three out of four viable measures). Importantly, we also find that utilizing any of the probability assignment schemes is always more beneficial than assuming all observed interactions to be true or equally likely.
Global alignment of protein-protein interaction networks by graph matching methods
Aligning protein-protein interaction (PPI) networks of different species has drawn considerable interest recently. This problem is important for investigating evolutionarily conserved pathways or protein complexes across species, and for helping to identify functional orthologs through the detection of conserved interactions. It is, however, a difficult combinatorial problem, for which only heuristic methods have been proposed so far. We reformulate the PPI alignment as a graph matching problem, and investigate how state-of-the-art graph matching algorithms can be used for that purpose. We differentiate between two alignment problems, depending on whether strict constraints on protein matches are given, based on sequence similarity, or whether the goal is instead to find an optimal compromise between sequence similarity and interaction conservation in the alignment. We propose new methods for both cases, and assess their performance on the alignment of the yeast and fly PPI networks. The new methods consistently outperform state-of-the-art algorithms, retrieving in particular 78% more conserved interactions than IsoRank for a given level of sequence similarity. Availability: http://cbio.ensmp.fr/proj/graphm_ppi/; additional data and code are available upon request. Contact: [email protected]. Preprint version.
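To make the two quantities traded off in the second alignment problem concrete, here is a minimal sketch. It counts conserved interactions under a given protein mapping and combines that count with summed sequence similarity using a weight `lam`. The weight, the linear form of the score, the protein names, and the similarity values are all assumptions for illustration, and the exhaustive search over permutations only works for toy inputs; it stands in for, and is not, the graph matching algorithms the abstract refers to.

```python
from itertools import permutations

def conserved_interactions(edges_a, edges_b, mapping):
    """Count edges (u, v) in network A whose images also interact in network B."""
    b = {frozenset(e) for e in edges_b}
    return sum(
        1 for u, v in edges_a
        if u in mapping and v in mapping
        and frozenset((mapping[u], mapping[v])) in b
    )

def alignment_score(edges_a, edges_b, mapping, seq_sim, lam):
    """Assumed objective: lam * conserved edges + (1 - lam) * sequence similarity."""
    sim = sum(seq_sim.get(pair, 0.0) for pair in mapping.items())
    return lam * conserved_interactions(edges_a, edges_b, mapping) + (1 - lam) * sim

def best_alignment(nodes_a, nodes_b, edges_a, edges_b, seq_sim, lam):
    """Brute-force search over one-to-one mappings (toy sizes only)."""
    candidates = (dict(zip(nodes_a, perm)) for perm in permutations(nodes_b))
    return max(candidates,
               key=lambda m: alignment_score(edges_a, edges_b, m, seq_sim, lam))

# Hypothetical toy networks: a three-protein path in each species.
yeast, fly = ["y1", "y2", "y3"], ["f1", "f2", "f3"]
yeast_edges = [("y1", "y2"), ("y2", "y3")]
fly_edges = [("f1", "f2"), ("f2", "f3")]
seq_sim = {("y1", "f1"): 0.9, ("y2", "f3"): 0.8, ("y3", "f2"): 0.7,
           ("y2", "f2"): 0.2, ("y3", "f3"): 0.1}

topology_first = best_alignment(yeast, fly, yeast_edges, fly_edges, seq_sim, lam=0.8)
sequence_first = best_alignment(yeast, fly, yeast_edges, fly_edges, seq_sim, lam=0.2)
print(topology_first, sequence_first)
```

With a high `lam` the search prefers the mapping that preserves both interactions, while a low `lam` prefers the mapping with the strongest sequence matches even though it conserves only one interaction.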
eQED: an efficient method for interpreting eQTL associations using protein networks
Analysis of expression quantitative trait loci (eQTLs) is an emerging technique in which individuals are genotyped across a panel of genetic markers and, simultaneously, phenotyped using DNA microarrays. Because of the spacing of markers and linkage disequilibrium, each marker may be near many genes, making it difficult to finely map which of these genes are the causal factors responsible for the observed changes in the downstream expression. To address this challenge, we present an efficient method for prioritizing candidate genes at a locus. This approach, called 'eQTL electrical diagrams' (eQED), integrates eQTLs with protein interaction networks by modeling the two data sets as a wiring diagram of current sources and resistors. eQED achieved a 79% accuracy in recovering a reference set of regulator–target pairs in yeast, which is significantly higher than the performance of three competing methods. eQED also annotates 368 protein–protein interactions with their directionality of information flow with an accuracy of approximately 75%.
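The wiring-diagram analogy can be sketched as a small resistor network. What follows is one possible reading, not the eQED implementation: the locus is held at unit voltage, the target gene is grounded, each edge carries an assumed conductance, and candidates are ranked by the current entering the network through them. The gene names and conductance values are hypothetical, and node voltages are found by simple iterative relaxation of Kirchhoff's current law instead of a matrix solve.

```python
def candidate_currents(conductance, source, ground, candidates,
                       tol=1e-12, max_iter=100000):
    """Current flowing from `source` into each candidate node.

    conductance: {(u, v): g} for undirected resistor edges.
    The source is fixed at 1 V and the ground at 0 V; every other node
    relaxes to the conductance-weighted mean of its neighbours' voltages.
    """
    nbrs = {}
    for (u, v), g in conductance.items():
        nbrs.setdefault(u, {})[v] = g
        nbrs.setdefault(v, {})[u] = g
    volt = {n: 0.0 for n in nbrs}
    volt[source] = 1.0
    free = [n for n in nbrs if n not in (source, ground)]
    for _ in range(max_iter):
        delta = 0.0
        for n in free:
            new = sum(g * volt[m] for m, g in nbrs[n].items()) / sum(nbrs[n].values())
            delta = max(delta, abs(new - volt[n]))
            volt[n] = new
        if delta < tol:
            break
    # Ohm's law on each locus-to-candidate edge.
    return {c: nbrs[source][c] * (volt[source] - volt[c]) for c in candidates}

# Hypothetical circuit: candidate c1 reaches the target directly,
# candidate c2 only through an intermediate protein p.
conductance = {
    ("locus", "c1"): 1.0, ("locus", "c2"): 1.0,
    ("c1", "target"): 1.0,
    ("c2", "p"): 1.0, ("p", "target"): 1.0,
}
currents = candidate_currents(conductance, "locus", "target", ["c1", "c2"])
print(currents)
```

In this toy circuit the candidate with the shorter, lower-resistance route to the target carries more current and would be ranked first. How conductances are derived from eQTL significance and interaction confidence is not stated in the abstract and is left open here.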
Increased brain expression of GPNMB is associated with genome wide significant risk for Parkinson's disease on chromosome 7p15.3
Genome wide association studies (GWAS) for Parkinson's disease (PD) have previously revealed a significant association with a locus on chromosome 7p15.3, initially designated as the glycoprotein non-metastatic melanoma protein B (GPNMB) locus. In this study, the functional consequences of this association on expression were explored in depth by integrating different expression quantitative trait locus (eQTL) datasets (Braineac, CAGEseq, GTEx, and Phenotype-Genotype Integrator (PheGenI)). Top risk SNP rs199347 eQTLs demonstrated increased expression of GPNMB, KLHL7, and NUPL2 with the major allele (AA) in brain, with most significant eQTLs in cortical regions, followed by putamen. In addition, decreased expression of the antisense RNA KLHL7-AS1 was observed in GTEx. Furthermore, rs199347 is an eQTL with long non-coding RNA (AC005082.12) in human tissues other than brain. Interestingly, transcript-specific eQTLs in immune-related tissues (spleen and lymphoblastoid cells) for NUPL2 and KLHL7-AS1 were observed, which suggests a complex functional role of this eQTL in specific tissues and cell types at specific time points. Significantly increased expression of GPNMB linked to rs199347 was consistent across all datasets, and taken in combination with the risk SNP being located within the GPNMB gene, these results suggest that increased expression of GPNMB is the causative link explaining the association of this locus with PD. However, other transcript eQTLs and subsequent functional roles cannot be excluded. This highlights the importance of further investigations to understand the functional interactions between the coding genes, antisense, and non-coding RNA species considering the tissue and cell-type specificity to understand the underlying biological mechanisms in PD.
PheNetic: network-based interpretation of molecular profiling data
Molecular profiling experiments have become standard in current wet-lab practices. Classically, enrichment analysis has been used to identify biological functions related to these experimental results. Combining molecular profiling results with the wealth of currently available interactomics data, however, offers the opportunity to identify the molecular mechanism behind an observed molecular phenotype. In this paper, we therefore introduce 'PheNetic', a user-friendly web server for inferring a sub-network based on probabilistic logical querying. PheNetic extracts from an interactome the sub-network that best explains genes prioritized through a molecular profiling experiment. Depending on its run mode, PheNetic searches either for a regulatory mechanism that gave rise to the observed molecular phenotype or for the pathways (in)activated in the molecular phenotype. The web server provides access to a large number of interactomes, making sub-network inference readily applicable to a wide variety of organisms. The inferred sub-networks can be interactively visualized in the browser. PheNetic's method and use are illustrated using an example analysis of differential expression results of ampicillin-treated Escherichia coli cells. The PheNetic web service is available at http://bioinformatics.intec.ugent.be/phenetic/
Webb Miller and Trey Ideker To Receive Top International Bioinformatics Awards for 2009 from the International Society for Computational Biology
Computational solutions for omics data
High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets.
Funding: National Institutes of Health (U.S.) (Grant GM081871)
FunSimMat: a comprehensive functional similarity database
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values, although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML-RPC interface gives automatic online access to FunSimMat for programs and remote services.
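The abstract does not list which semantic similarity measures the database provides. As an illustration of what such a measure looks like, the sketch below computes two well-known ones, Resnik and Lin similarity, on a tiny made-up ontology. The term names, the parent links, and the annotation probabilities are all invented, and nothing here reflects the FunSimMat interface or data.

```python
import math

def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology DAG."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen

def information_content(term, prob):
    """IC(t) = log(1 / p(t)); rarer terms are more informative."""
    return math.log(1.0 / prob[term])

def resnik(t1, t2, parents, prob):
    """IC of the most informative common ancestor of the two terms."""
    common = ancestors(t1, parents) & ancestors(t2, parents)
    return max((information_content(t, prob) for t in common), default=0.0)

def lin(t1, t2, parents, prob):
    """Resnik similarity normalised by the terms' own information content."""
    denom = information_content(t1, prob) + information_content(t2, prob)
    return 2.0 * resnik(t1, t2, parents, prob) / denom if denom > 0 else 0.0

# Hypothetical ontology fragment and annotation probabilities.
parents = {
    "dna_binding": ["binding"], "rna_binding": ["binding"],
    "binding": ["root"], "catalysis": ["root"],
}
prob = {"root": 1.0, "binding": 0.5, "catalysis": 0.4,
        "dna_binding": 0.1, "rna_binding": 0.2}

sim_related = resnik("dna_binding", "rna_binding", parents, prob)
sim_unrelated = resnik("dna_binding", "catalysis", parents, prob)
sim_lin = lin("dna_binding", "rna_binding", parents, prob)
print(sim_related, sim_unrelated, sim_lin)
```

Two terms that share a specific parent score higher than two terms whose only common ancestor is the root, which is the behaviour a functional similarity score between annotated proteins builds on.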
