133 research outputs found

    KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases

    High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn
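The enrichment step described above can be illustrated with a minimal sketch. The abstract does not say which statistical test KOBAS 2.0 uses, so the one-sided hypergeometric test below is an assumption (a common choice for over-representation analysis), and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_input, n_hits):
    """One-sided hypergeometric p-value, P(X >= n_hits).

    n_background: genes in the background set
    n_annotated:  background genes annotated to the pathway or disease
    n_input:      genes in the input list
    n_hits:       input genes annotated to the pathway or disease

    Assumption: this is a generic over-representation test, not
    necessarily the test implemented by KOBAS 2.0.
    """
    upper = min(n_annotated, n_input)
    total = comb(n_background, n_input)
    # Sum the probability of observing n_hits or more annotated genes.
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_input - i)
        for i in range(n_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only: 20000 background genes,
    # 150 in the pathway, 300 input genes, 12 of them in the pathway.
    print(hypergeom_enrichment_p(20000, 150, 300, 12))
```

When many pathways and diseases are tested at once, the resulting p-values would normally need a multiple-testing correction before being interpreted.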

    How to understand the cell by breaking it: network analysis of gene perturbation screens

    Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state-of-the-art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio

    Gene expression patterns in anterior pituitary associated with quantitative measure of oestrous behaviour in dairy cows

    Intensive selection for high milk yield in dairy cows has raised production levels substantially but at the cost of reduced fertility, which manifests in different ways including reduced expression of oestrous behaviour. The genomic regulation of oestrous behaviour in bovines remains largely unknown. Here, we aimed to identify and study those genes that were associated with oestrous behaviour among genes expressed in the bovine anterior pituitary either at the start of oestrous cycle or at the mid-cycle (around day 12 of cycle), or regardless of the phase of cycle. Oestrous behaviour was recorded in each of 28 primiparous cows from 30 days in milk onwards till the day of their sacrifice (between 77 and 139 days in milk) and quantified as heat scores. An average heat score value was calculated for each cow from heat scores observed during consecutive oestrous cycles excluding the cycle on the day of sacrifice. A microarray experiment was designed to measure gene expression in the anterior pituitary of these cows, 14 of which were sacrificed at the start of oestrous cycle (day 0) and 14 around day 12 of cycle (day 12). Gene expression was modelled as a function of the orthogonally transformed average heat score values using a Bayesian hierarchical mixed model on data from day 0 cows alone (analysis 1), day 12 cows alone (analysis 2) and the combined data from day 0 and day 12 cows (analysis 3). Genes whose expression patterns showed significant linear or non-linear relationships with average heat scores were identified in all three analyses (177, 142 and 118 genes, respectively). Gene ontology terms enriched among genes identified in analysis 1 revealed processes associated with expression of oestrous behaviour whereas the terms enriched among genes identified in analysis 2 and 3 were general processes which may facilitate proper expression of oestrous behaviour at the subsequent oestrus. 
    Studying these genes will help to improve our understanding of the genomic regulation of oestrous behaviour, ultimately leading to better management strategies and tools to improve or monitor reproductive performance in bovines

    Application of transcriptomics for predicting protein interaction networks, drug targets and drug candidates

    Protein interaction pathways and networks are critically required for a vast range of biological processes. Improved discovery of candidate druggable proteins within specific cell, tissue and disease contexts will aid development of new treatments. Predicting protein interaction networks from gene expression data can provide valuable insights into normal and disease biology. For example, the resulting protein networks can be used to identify potentially druggable targets and drug candidates for testing in cell and animal disease models. The advent of whole-transcriptome expression profiling techniques, which catalogue protein-coding genes expressed within cells and tissues, has enabled development of individual algorithms for particular tasks. For example: (i) gene ontology algorithms that predict gene/protein subsets involved in related cell processes; (ii) algorithms that predict intracellular protein interaction pathways; and (iii) algorithms that correlate druggable protein targets with known drugs and/or drug candidates. This review examines approaches, advantages and disadvantages of existing gene expression, gene ontology, and protein network prediction algorithms. Using this framework, we examine current efforts to combine these algorithms into pipelines to enable identification of druggable targets, and associated known drugs, using gene expression datasets. In doing so, new opportunities are identified for development of powerful algorithm pipelines, suitable for wide use by non-bioinformaticians, that can predict protein interaction networks, druggable proteins, and related drugs from user gene expression datase

    GOing Bayesian: model-based gene set analysis of genome-scale data

    The interpretation of data-driven experiments in genomics often involves a search for biological categories that are enriched for the responder genes identified by the experiments. However, knowledge bases such as the Gene Ontology (GO) contain hundreds or thousands of categories with very high overlap between categories. Thus, enrichment analysis performed on one category at a time frequently returns large numbers of correlated categories, leaving the choice of the most relevant ones to the user's interpretation

    Extensive differential protein phosphorylation as intraerythrocytic Plasmodium falciparum schizonts develop into extracellular invasive merozoites

    Pathology of the most lethal form of malaria is caused by Plasmodium falciparum asexual blood stages and initiated by merozoite invasion of erythrocytes. We present a phosphoproteome analysis of extracellular merozoites revealing 1765 unique phosphorylation sites including 785 sites not previously detected in schizonts. All MS data have been deposited in the ProteomeXchange with identifier PXD001684 (http://proteomecentral.proteomexchange.org/dataset/PXD001684). The observed differential phosphorylation between extra- and intraerythrocytic life-cycle stages was confirmed using both phospho-site and phospho-motif specific antibodies and is consistent with the core motif [K/R]xx[pS/pT] being highly represented in merozoite phosphoproteins. Comparative bioinformatic analyses highlighted protein sets and pathways with established roles in invasion. Within the merozoite phosphoprotein interaction network, a subnetwork of 119 proteins with potential roles in cellular movement and invasion was identified. The analysis suggested that this subnetwork is coregulated by a further small subnetwork of protein kinase A (PKA), two calcium-dependent protein kinases (CDPKs), a phosphatidyl inositol kinase (PI3K), and a GCN2-like eIF2-kinase with a predicted role in translational arrest and associated changes in the ubiquitinome. To test this notion experimentally, we examined the overall ubiquitination level in intracellular schizonts versus extracellular merozoites and found it highly upregulated in merozoites. We propose that alterations in the phosphoproteome and ubiquitinome reflect a starvation-induced translational arrest as intracellular schizonts transform into extracellular merozoites
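The core motif [K/R]xx[pS/pT] named above can be read as a sequence pattern: a lysine or arginine, any two residues, then a serine or threonine that carries the phosphate. The sketch below only scans a plain amino-acid sequence for candidate sites in that context; it is an illustration, not the authors' analysis pipeline, and a sequence alone cannot show whether a site is actually phosphorylated. The function name and the toy sequence are hypothetical.

```python
import re

# Lookahead so that overlapping motif occurrences are all reported.
# [KR] = Lys or Arg, ".." = any two residues, [ST] = Ser or Thr acceptor.
MOTIF = re.compile(r"(?=([KR]..[ST]))")


def find_candidate_sites(sequence):
    """Return 1-based positions of S/T residues in a [K/R]xx[S/T] context."""
    # m.start() is the 0-based index of K/R; the S/T sits three residues
    # downstream, so its 1-based position is m.start() + 4.
    return [m.start() + 4 for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(find_candidate_sites("AKAASRTTS"))  # [5, 9]
```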

    Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires

    Background: Genome size and complexity, as measured by the number of genes or protein domains, is remarkably similar in most extant eukaryotes and generally exhibits no correlation with their morphological complexity. Underlying trends in the evolution of the functional content and capabilities of different eukaryotic genomes might be hidden by simultaneous gains and losses of genes. Results: We reconstructed the domain repertoires of putative ancestral species at major divergence points, including the last eukaryotic common ancestor (LECA). We show that, surprisingly, during eukaryotic evolution domain losses in general outnumber domain gains. Only at the base of the animal and the vertebrate sub-trees do domain gains outnumber domain losses. The observed gain/loss balance has a distinct functional bias, most strikingly seen during animal evolution, where most of the gains represent domains involved in regulation and most of the losses represent domains with metabolic functions. This trend is so consistent that clustering of genomes according to their functional profiles results in an organization similar to the tree of life. Furthermore, our results indicate that metabolic functions lost during animal evolution are likely being replaced by the metabolic capabilities of symbiotic organisms such as gut microbes. Conclusions: While protein domain gains and losses are common throughout eukaryote evolution, losses oftentimes outweigh gains and lead to significant differences in functional profiles. Results presented here provide additional arguments for a complex last eukaryotic common ancestor, but also show a general trend of losses in metabolic capabilities and gain in regulatory complexity during the rise of animals

    The impact of focused Gene Ontology curation of specific mammalian systems.

    The Gene Ontology (GO) resource provides dynamic controlled vocabularies to provide an information-rich resource to aid in the consistent description of the functional attributes and subcellular locations of gene products from all taxonomic groups (www.geneontology.org). System-focused projects, such as the Renal and Cardiovascular GO Annotation Initiatives, aim to provide detailed GO data for proteins implicated in specific organ development and function. Such projects support the rapid evaluation of new experimental data and aid in the generation of novel biological insights to help alleviate human disease. This paper describes the improvement of GO data for renal and cardiovascular research communities and demonstrates that the cardiovascular-focused GO annotations, created over the past three years, have led to an evident improvement of microarray interpretation. The reanalysis of cardiovascular microarray datasets confirms the need to continue to improve the annotation of the human proteome. Availability: GO annotation data is freely available from: ftp://ftp.geneontology.org/pub/go/gene-associations

    Using predictive specificity to determine when gene set analysis is biologically meaningful

    Gene set analysis, which translates gene lists into enriched functions, is among the most common bioinformatic methods. Yet few would advocate taking the results at face value. Not only is there no agreement on the algorithms themselves, there is no agreement on how to benchmark them. In this paper, we evaluate the robustness and uniqueness of enrichment results as a means of assessing methods even where correctness is unknown. We show that heavily annotated ('multifunctional') genes are likely to appear in genomics study results and drive the generation of biologically non-specific enrichment results as well as highly fragile significances. By providing a means of determining where enrichment analyses report non-specific and non-robust findings, we are able to assess where we can be confident in their use. We find significant progress in recent bias correction methods for enrichment and provide our own software implementation. Our approach can be readily adapted to any pre-existing package