Annotation Enrichment Analysis: An Alternative Method for Evaluating the Functional Properties of Gene Sets

Author: Girvan Michelle
Glass Kimberly
Publication date: 03/05/2013

Gene annotation databases (compendiums maintained by the scientific community that describe the biological functions performed by individual genes) are commonly used to evaluate the functional properties of experimentally derived gene sets. Overlap statistics, such as Fisher's Exact Test (FET), are often employed to assess these associations, but they do not account for non-uniformity in the number of genes annotated to individual functions or in the number of functions associated with individual genes. We find that FET is strongly biased toward over-estimating overlap significance if a gene set has an unusually high number of annotations. To correct for these biases, we develop Annotation Enrichment Analysis (AEA), which properly accounts for the non-uniformity of annotations. We show that AEA is able to identify biologically meaningful functional enrichments that are obscured by numerous false-positive enrichment scores in FET, and we therefore suggest it be used to more accurately assess the biological properties of gene sets.
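As a rough illustration of the overlap statistic this abstract criticizes, the one-sided Fisher's Exact Test for a gene set against one annotation term reduces to a hypergeometric upper tail. The sketch below is not AEA and does not correct for the annotation bias described above; the function name `overlap_pvalue` and its parameter names are hypothetical, chosen only for this example.

```python
from math import comb

def overlap_pvalue(n_universe, n_annotated, n_set, n_overlap):
    """One-sided overlap p-value (hypothetical helper, not from the paper).

    Returns P(X >= n_overlap) where X is the number of annotated genes
    in a random draw of n_set genes from a universe of n_universe genes,
    n_annotated of which carry the annotation term.
    """
    total = comb(n_universe, n_set)          # all possible gene sets of this size
    upper = min(n_annotated, n_set)          # largest overlap that is possible
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_set - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers (assumed, not from the paper): 10 genes in the universe,
# 5 annotated to the term, a gene set of 5, all 5 of which are annotated.
p = overlap_pvalue(10, 5, 5, 5)
print(p)
```

The test treats every gene as equally likely to be drawn, which is exactly the uniformity assumption the abstract says is violated when genes differ widely in their number of annotations.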

arXiv.org e-Print Archive

Harvard University - DASH

Progress and challenges in the computational prediction of gene function using networks: 2012-2013 update

Author: Gillis J.
Pavlidis P.
Publication venue: 'F1000 Research Ltd'
Publication date: 01/10/2013

In an opinion published in 2012, we reviewed and discussed our studies of how gene network-based guilt-by-association (GBA) is impacted by confounds related to gene multifunctionality. We found such confounds account for a significant part of the GBA signal, and as a result meaningfully evaluating and applying computationally guided GBA is more challenging than generally appreciated. We proposed that effort currently spent on incrementally improving algorithms would be better spent in identifying the features of data that do yield novel functional insights. We also suggested that part of the problem is the reliance by computational biologists on gold standard annotations such as the Gene Ontology. In the year since, there has been continued heavy activity in GBA-based research, including work that contributes to our understanding of the issues we raised. Here we review some of the most relevant recent work, including work that points to new areas of progress and challenges.

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

The Impact of Multifunctional Genes on "Guilt by Association" Analysis

Author: Jesse Gillis
Paul Pavlidis
Publication venue: Public Library of Science
Publication date: 01/01/2011

Many previous studies have shown that by using variants of “guilt-by-association”, gene function predictions can be made with very high statistical confidence. In these studies, it is assumed that the “associations” in the data (e.g., protein interaction partners) of a gene are necessary in establishing “guilt”. In this paper we show that multifunctionality, rather than association, is a primary driver of gene function prediction. We first show that knowledge of the degree of multifunctionality alone can produce astonishingly strong performance when used as a predictor of gene function. We then demonstrate how multifunctionality is encoded in gene interaction data (such as protein interactions and coexpression networks) and how this can feed forward into gene function prediction algorithms. We find that high-quality gene function predictions can be made using data that possesses no information on which gene interacts with which. By examining a wide range of networks from mouse, human and yeast, as well as multiple prediction methods and evaluation metrics, we provide evidence that this problem is pervasive and does not reflect the failings of any particular algorithm or data type. We propose computational controls that can be used to assess gene function prediction performance more meaningfully. We suggest that this source of bias due to multifunctionality is important to control for, with widespread implications for the interpretation of genomics studies.
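The idea that multifunctionality alone can predict gene function can be sketched with a single fixed ranking: order genes by how many functions they are annotated to, then score that one ranking against every function with AUROC. This is a minimal sketch under stated assumptions, not the paper's procedure. Using a raw annotation count as the multifunctionality score is a simplification, and the gene names, function labels, and the helper names `auroc` and `multifunctionality_auroc` are all hypothetical.

```python
def auroc(scores, labels):
    """AUROC by pairwise comparison of positives and negatives; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def multifunctionality_auroc(annotations):
    """Score one ranking (annotation count per gene) against every function.

    `annotations` maps gene -> set of function labels (assumed input shape).
    No interaction or network data is used at all.
    """
    genes = sorted(annotations)
    counts = [len(annotations[g]) for g in genes]   # simplistic multifunctionality score
    functions = set().union(*annotations.values())
    result = {}
    for f in sorted(functions):
        labels = [f in annotations[g] for g in genes]
        if all(labels) or not any(labels):
            continue                                # AUROC is undefined for this function
        result[f] = auroc(counts, labels)
    return result

# Toy annotations (invented for illustration only).
toy = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"A"},
    "g4": set(),
}
print(multifunctionality_auroc(toy))
```

In this deliberately nested toy example the count-based ranking separates members from non-members for every function, which illustrates how a predictor can look strong without any knowledge of which gene interacts with which.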

Public Library of Science (PLOS)

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

The Spermatophore in Glossina morsitans morsitans: Insights into Male Contributions to Reproduction.

Author: Abd-Alla Adly MM
Aksoy Emre
Aksoy Serap
Attardo Geoffrey M
Benoit Joshua B
Malacrida Anna R
Michalkova Veronika
Scolari Francesca
Takac Peter
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Male Seminal Fluid Proteins (SFPs) transferred during copulation modulate female reproductive physiology and behavior, impacting sperm storage/use, ovulation, oviposition, and remating receptivity. These capabilities make them ideal targets for developing novel methods of insect disease vector control. Little is known about the nature of SFPs in the viviparous tsetse flies (Diptera: Glossinidae), vectors of Human and Animal African trypanosomiasis. In tsetse, male ejaculate is assembled into a capsule-like spermatophore structure visible post-copulation in the female uterus. We applied high-throughput approaches to uncover the composition of the spermatophore in Glossina morsitans morsitans. We found that both male accessory glands and testes contribute to its formation. The male accessory glands produce a small number of abundant novel proteins with as yet unknown functions, in addition to enzyme inhibitors and peptidase regulators. The testes contribute sperm in addition to a diverse array of less abundant proteins associated with binding, oxidoreductase/transferase activities, cytoskeletal and lipid/carbohydrate transporter functions. Proteins encoded by female-biased genes are also found in the spermatophore. About half of the proteins display sequence conservation relative to other Diptera, and low similarity to SFPs from other studied species, possibly reflecting both their fast evolutionary pace and the divergent nature of tsetse's viviparous biology.

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

PubMed Central

eScholarship - University of California

HNRNPK maintains epidermal progenitor function through transcription of proliferation genes and degrading differentiation promoting mRNAs.

Author: Chen Yifang
Harismendy Olivier
Jones Jackson
Li Jingting
Ling Ji
Sen George L
Tiwari Manisha
Wang Ying
Xu Xiaojun
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Maintenance of high-turnover tissues such as the epidermis requires a balance between stem cell proliferation and differentiation. The molecular mechanisms governing this process are an area of investigation. Here we show that HNRNPK, a multifunctional protein, is necessary to prevent premature differentiation and to sustain the proliferative capacity of epidermal stem and progenitor cells. To prevent premature differentiation of progenitor cells, HNRNPK is necessary for DDX6 to bind a subset of mRNAs that code for transcription factors that promote differentiation. Upon binding, these mRNAs, such as GRHL3, KLF4, and ZNF750, are degraded through the mRNA degradation pathway, which prevents premature differentiation. To sustain the proliferative capacity of the epidermis, HNRNPK is necessary for RNA Polymerase II binding to proliferation/self-renewal genes such as MYC, CYR61, FGFBP1, EGFR, and cyclins to promote their expression. Our study establishes a prominent role for HNRNPK in maintaining adult tissue self-renewal through both transcriptional and post-transcriptional mechanisms.

eScholarship - University of California

Gene expression profiling of connective tissue growth factor (CTGF) stimulated primary human tenon fibroblasts reveals an inflammatory and wound healing response in vitro

Author: Gebhardt Susanne
Kneitz Susanne
Mueller Thomas D.
Nickel Joachim
Schlunck Guenther
Sebald Walter
Seher Axel
ter Vehn Tobias Meyer
Publication venue: Molecular Vision
Publication date: 01/01/2011

Purpose: The biologic relevance of human connective tissue growth factor (hCTGF) for primary human tenon fibroblasts (HTFs) was investigated by RNA expression profiling using Affymetrix (TM) oligonucleotide array technology to identify genes that are regulated by hCTGF. Methods: Recombinant hCTGF was expressed in HEK293T cells and purified by affinity and gel chromatography. Specificity and biologic activity of hCTGF were confirmed by biosensor interaction analysis and proliferation assays. For RNA expression profiling, HTFs were stimulated with hCTGF for 48 h and analyzed using Affymetrix (TM) oligonucleotide array technology. Results were validated by real-time RT-PCR. Results: hCTGF induces various groups of genes responsible for a wound healing and inflammatory response in HTFs. A new subset of CTGF-inducible inflammatory genes was discovered (e.g., chemokine [C-X-C motif] ligand 1 [CXCL1], chemokine [C-X-C motif] ligand 6 [CXCL6], interleukin 6 [IL6], and interleukin 8 [IL8]). We also identified genes that can transmit the known biologic functions initiated by CTGF such as proliferation and extracellular matrix remodelling. Of special interest is a group of genes, e.g., osteoglycin (OGN) and osteomodulin (OMD), which are known to play a key role in osteoblast biology. Conclusions: This study specifies the important role of hCTGF for primary tenon fibroblast function. The RNA expression profile yields new insights into the relevance of hCTGF in influencing biologic processes like wound healing, inflammation, proliferation, and extracellular matrix remodelling in vitro via transcriptional regulation of specific genes. The results suggest that CTGF potentially acts as a modulating factor in inflammatory and wound healing response in fibroblasts of the human eye.

PubMed Central

Online-Publikations-Server der Universität Würzburg

Application of transcriptomics for predicting protein interaction networks, drug targets and drug candidates

Author: Kankanige Dulshani
Liyanage Liwan (R8073)
O'Connor Michael D. (R15206)
Publication venue: Switzerland, Frontiers Research Foundation
Publication date: 01/01/2022

Protein interaction pathways and networks are critically required for a vast range of biological processes. Improved discovery of candidate druggable proteins within specific cell, tissue and disease contexts will aid development of new treatments. Predicting protein interaction networks from gene expression data can provide valuable insights into normal and disease biology. For example, the resulting protein networks can be used to identify potentially druggable targets and drug candidates for testing in cell and animal disease models. The advent of whole-transcriptome expression profiling techniques, which catalogue protein-coding genes expressed within cells and tissues, has enabled development of individual algorithms for particular tasks. For example: (i) gene ontology algorithms that predict gene/protein subsets involved in related cell processes; (ii) algorithms that predict intracellular protein interaction pathways; and (iii) algorithms that correlate druggable protein targets with known drugs and/or drug candidates. This review examines approaches, advantages and disadvantages of existing gene expression, gene ontology, and protein network prediction algorithms. Using this framework, we examine current efforts to combine these algorithms into pipelines to enable identification of druggable targets, and associated known drugs, using gene expression datasets. In doing so, new opportunities are identified for development of powerful algorithm pipelines, suitable for wide use by non-bioinformaticians, that can predict protein interaction networks, druggable proteins, and related drugs from user gene expression datasets.

PubMed Central

Western Sydney ResearchDirect