7,228 research outputs found

    Domain-mediated interactions for protein subfamily identification

    Get PDF
    Within a protein family, proteins with the same domain often exhibit different cellular functions, despite the shared evolutionary history and molecular function of the domain. We hypothesized that domain-mediated interactions (DMIs) may categorize a protein family into subfamilies because the diversified functions of a single domain often depend on interacting partners of domains. Here we systematically identified DMI subfamilies, in which proteins share domains with DMI partners, as well as with various functional and physical interaction networks in individual species. In humans, DMI subfamily members are associated with similar diseases, including cancers, and are frequently co-associated with the same diseases. DMI information relates to the functional and evolutionary subdivisions of human kinases. In yeast, DMI subfamilies contain proteins with similar phenotypic outcomes from specific chemical treatments. Therefore, the systematic investigation here provides insights into the diverse functions of subfamilies derived from a protein family with a link-centric approach and suggests a useful resource for annotating the functions and phenotypic outcomes of proteins.11Ysciescopu

    TF2Network : predicting transcription factor regulators and gene regulatory networks in Arabidopsis using publicly available binding site information

    Get PDF
    A gene regulatory network (GRN) is a collection of regulatory interactions between transcription factors (TFs) and their target genes. GRNs control different biological processes and have been instrumental to understand the organization and complexity of gene regulation. Although various experimental methods have been used to map GRNs in Arabidop-sis thaliana, their limited throughput combined with the large number of TFs makes that for many genes our knowledge about regulating TFs is incomplete. We introduce TF2Network, a tool that exploits the vast amount of TF binding site information and enables the delineation of GRNs by detecting potential regulators for a set of co-expressed or functionally related genes. Validation using two experimental benchmarks reveals that TF2Network predicts the correct regulator in 75-92% of the test sets. Furthermore, our tool is robust to noise in the input gene sets, has a low false discovery rate, and shows a better performance to recover correct regulators compared to other plant tools. TF2Network is accessible through a web interface where GRNs are interactively visualized and annotated with various types of experimental functional information. TF2Network was used to perform systematic functional and regulatory gene annotations, identifying new TFs involved in circadian rhythm and stress response

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    Get PDF
    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but using also the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progresses toward a grid-enabled chemogenomic knowledge space are discussed.Comment: 43 pages, 4 figures, to appear in Malaria Journa

    Conserved noncoding sequences highlight shared components of regulatory networks in dicotyledonous plants

    Get PDF
    Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNS is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research

    Systems analysis of host-parasite interactions.

    Get PDF
    Parasitic diseases caused by protozoan pathogens lead to hundreds of thousands of deaths per year in addition to substantial suffering and socioeconomic decline for millions of people worldwide. The lack of effective vaccines coupled with the widespread emergence of drug-resistant parasites necessitates that the research community take an active role in understanding host-parasite infection biology in order to develop improved therapeutics. Recent advances in next-generation sequencing and the rapid development of publicly accessible genomic databases for many human pathogens have facilitated the application of systems biology to the study of host-parasite interactions. Over the past decade, these technologies have led to the discovery of many important biological processes governing parasitic disease. The integration and interpretation of high-throughput -omic data will undoubtedly generate extraordinary insight into host-parasite interaction networks essential to navigate the intricacies of these complex systems. As systems analysis continues to build the foundation for our understanding of host-parasite biology, this will provide the framework necessary to drive drug discovery research forward and accelerate the development of new antiparasitic therapies

    Gene prioritization in Type 2 Diabetes using domain interactions and network analysis

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Identification of disease genes for Type 2 Diabetes (T2D) by traditional methods has yielded limited success. Based on our previous observation that T2D may result from disturbed protein-protein interactions affected through disrupting modular domain interactions, here we have designed an approach to rank the candidates in the T2D linked genomic regions as plausible disease genes.</p> <p>Results</p> <p>Our approach integrates Weight value (Wv) method followed by prioritization using clustering coefficients derived from domain interaction network. Wv for each candidate is calculated based on the assumption that disease genes might be functionally related, mainly facilitated by interactions among domains of the interacting proteins. The benchmarking using a test dataset comprising of both known T2D genes and non-T2D genes revealed that Wv method had a sensitivity and specificity of 0.74 and 0.96 respectively with 9 fold enrichment. The candidate genes having a Wv > 0.5 were called High Weight Elements (HWEs). Further, we ranked HWEs by using the network property-the clustering coefficient (C<sub>i</sub>). Each HWE with a C<sub>i </sub>< 0.015 was prioritized as plausible disease candidates (HWEc) as previous studies indicate that disease genes tend to avoid dense clustering (with an average C<sub>i </sub>of 0.015). This method further prioritized the identified disease genes with a sensitivity of 0.32 and a specificity of 0.98 and enriched the candidate list by 6.8 fold. Thus, from the dataset of 4052 positional candidates the method ranked 435 to be most likely disease candidates. The gene ontology sharing for the candidates showed higher representation of metabolic and signaling processes. The approach also captured genes with unknown functions which were characterized by network motif analysis.</p> <p>Conclusions</p> <p>Prioritization of positional candidates is essential for cost-effective and an expedited discovery of disease genes. Here, we demonstrate a novel approach for disease candidate prioritization from numerous loci linked to T2D.</p
    corecore