50 research outputs found

    Beyond structural genomics: computational approaches for the identification of ligand binding sites in protein structures

    Structural genomics projects have revealed structures for a large number of proteins of unknown function. Understanding the interactions between these proteins and their ligands would provide an initial step in their functional characterization. Binding site identification methods are a fast and cost-effective way to facilitate the characterization of functionally important protein regions. In this review we describe our recently developed methods for binding site identification in the context of existing methods. The advantage of energy-based approaches is emphasized, since they provide flexibility in the identification and characterization of different types of binding sites.

    EASYMIFS and SITEHOUND: a toolkit for the identification of ligand-binding sites in protein structures

    Summary: SITEHOUND uses Molecular Interaction Fields (MIFs) produced by EASYMIFS to identify protein structure regions that show a high propensity for interaction with ligands. The type of binding site identified depends on the probe atom used in the MIF calculation. The input to EASYMIFS is a PDB file of a protein structure; the output MIF serves as input to SITEHOUND, which in turn produces a list of putative binding sites. Extensive testing of SITEHOUND for the detection of binding sites for drug-like molecules and phosphorylated ligands has been carried out. Availability: EASYMIFS and SITEHOUND executables for Linux, Mac OS X, and MS Windows operating systems are freely available for download from http://sitehound.sanchezlab.org/download.html
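
    The MIF-to-sites step described in this summary can be illustrated with a small sketch. It is not the SITEHOUND implementation: the function name `rank_sites`, the energy cutoff, the linking distance, and the use of single-linkage grouping are all assumptions made for illustration. The sketch keeps grid points whose probe interaction energy is favorable, groups nearby points into clusters, and ranks clusters by their summed energy.

```python
from itertools import combinations
from math import dist


def rank_sites(grid, energy_cutoff=-1.0, link_dist=1.5):
    """Group favorable grid points into putative sites (illustrative only).

    grid: list of ((x, y, z), energy) pairs, e.g. read from a MIF.
    energy_cutoff and link_dist are arbitrary placeholder values.
    Returns a list of sites, most favorable (lowest total energy) first.
    """
    # Keep only grid points with a favorable probe interaction energy.
    pts = [(p, e) for p, e in grid if e <= energy_cutoff]

    # Union-find over points; points closer than link_dist are linked.
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(pts)), 2):
        if dist(pts[i][0], pts[j][0]) <= link_dist:
            parent[find(i)] = find(j)

    # Collect the members of each cluster.
    clusters = {}
    for i, (p, e) in enumerate(pts):
        clusters.setdefault(find(i), []).append((p, e))

    sites = [
        {"points": [p for p, _ in members],
         "total_energy": sum(e for _, e in members)}
        for members in clusters.values()
    ]
    return sorted(sites, key=lambda s: s["total_energy"])
```

    Changing the probe atom changes the energies in `grid`, which is how, in a sketch like this, the type of site recovered would depend on the probe.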

    Automated identification of binding sites for phosphorylated ligands in protein structures

    Phosphorylation is a crucial step in many cellular processes, ranging from metabolic reactions involved in energy transformation to signaling cascades. In many instances, protein domains specifically recognize the phosphogroup. Knowledge of the binding site provides insights into the interaction, and it can also be exploited for therapeutic purposes. Previous studies have shown that proteins interacting with phosphogroups are highly heterogeneous, and no single property can be used to reliably identify the binding site. Here we present an energy-based computational procedure that exploits the protein three-dimensional structure to identify binding sites involved in the recognition of phosphogroups. The procedure is validated on three datasets containing more than 200 proteins binding to ATP, phosphopeptides, and phosphosugars. A comparison against three other generic binding site identification approaches shows higher accuracy values for our method, with a correct identification rate in the 80–90% range for the top three predicted sites. Addition of conservation information further improves the performance. The method presented here can be used as a first step in functional annotation or to guide mutagenesis experiments and further studies such as molecular docking.

    Building towards precision medicine: empowering medical professionals for the next revolution

    A new paradigm in disease classification, diagnosis and treatment is rapidly approaching. Known as precision medicine, this new healthcare model incorporates and integrates genetic information, microbiome data, and information on patients’ environment and lifestyle to better identify and classify disease processes, and to provide custom-tailored therapeutic solutions. In spite of its promises, precision medicine faces several challenges that need to be overcome to successfully implement this new healthcare model. In this paper we identify four main areas that require attention: data, tools and systems, regulations, and people. While there are important ongoing efforts for addressing the first three areas, we argue that the human factor needs to be taken into consideration as well. In particular, we discuss several studies that show how primary care physicians and clinicians in general feel underequipped to interpret genetic tests and direct-to-consumer genomic tests. Considering the importance of genetic information for precision medicine applications, this is a pressing issue that needs to be addressed. To increase the number of professionals with the necessary expertise to correctly interpret the genomics profiles of their patients, we propose several strategies that involve medical curriculum reforms, specialist training, and ongoing physician training.

    Systematic assessment of accuracy of comparative model of proteins belonging to different structural fold classes

    In the absence of experimental structures, comparative modeling continues to be the chosen method for retrieving structural information on target proteins. However, models lack the accuracy of experimental structures. Alignment error and structural divergence (between target and template) influence model accuracy the most. Here, we examine the potential additional impact of backbone geometry, as our previous studies have suggested that the structural class (all-α, αβ, all-β) of a protein may influence the accuracy of its model. In the twilight zone (sequence identity ≤ 30%) and at a similar level of target-template divergence, the accuracy of protein models does indeed follow the trend all-α > αβ > all-β. This is mainly because the alignment accuracy follows the same trend (all-α > αβ > all-β), with backbone geometry playing only a minor role. Differences in the diversity of sequences belonging to different structural classes lead to the observed accuracy differences, thus enabling the accuracy of alignments/models to be estimated a priori in a class-dependent manner. This study provides a systematic description of and quantifies the structural class-dependent effect in comparative modeling. The study also suggests that datasets for large-scale sequence/structure analyses should have equal representations of different structural classes to avoid class-dependent bias.

    Two critical positions in zinc finger domains are heavily mutated in three human cancer types

    A major goal of cancer genomics is to identify somatic mutations that play a role in tumor initiation or progression. Somatic mutations within transcription factors are of particular interest, as gene expression dysregulation is widespread in cancers. The substantial gene expression variation evident across tumors suggests that numerous regulatory factors are likely to be involved and that somatic mutations within them may not occur at high frequencies across patient cohorts, thereby complicating efforts to uncover which ones are cancer-relevant. Here we analyze somatic mutations within the largest family of human transcription factors, namely those that bind DNA via Cys2His2 zinc finger domains. Specifically, to home in on important mutations within these genes, we aggregated somatic mutations across all of them by their positions within Cys2His2 zinc finger domains. Remarkably, we found that for three classes of cancers profiled by The Cancer Genome Atlas (TCGA), namely Uterine Corpus Endometrial Carcinoma, Colon and Rectal Adenocarcinomas, and Skin Cutaneous Melanoma, two specific, functionally important positions within zinc finger domains are mutated significantly more often than expected by chance, with alterations in 18%, 10% and 43% of tumors, respectively. Numerous zinc finger genes are affected, with those containing Krüppel-associated box (KRAB) repressor domains preferentially targeted by these mutations. Further, the genes with these mutations also have high overall missense mutation rates, are expressed at levels comparable to those of known cancer genes, and together have biological process annotations that are consistent with roles in cancers. Altogether, we introduce evidence broadly implicating mutations within a diverse set of zinc finger proteins as relevant for cancer, and propose that they contribute to the widespread transcriptional dysregulation observed in cancer cells.
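
    The aggregation step in this abstract, pooling mutations from many genes by their position within a shared domain, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the gene names, domain coordinates, the 23-residue domain length, and the uniform binomial null are all hypothetical choices made for the example.

```python
from collections import Counter
from math import comb


def aggregate_by_domain_position(mutations, domains):
    """Count mutations per domain-relative position across all genes.

    mutations: iterable of (gene, residue_index) pairs.
    domains: dict mapping gene -> list of (start, end) inclusive ranges.
    Mutations falling outside every annotated domain are ignored.
    """
    counts = Counter()
    for gene, pos in mutations:
        for start, end in domains.get(gene, []):
            if start <= pos <= end:
                counts[pos - start + 1] += 1  # 1-based position in domain
                break
    return counts


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); a simple uniform-null assumption."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical genes and coordinates, for illustration only.
domains = {"GENE_A": [(10, 32)], "GENE_B": [(5, 27), (33, 55)]}
mutations = [("GENE_A", 18), ("GENE_B", 13), ("GENE_B", 41),
             ("GENE_A", 50), ("GENE_B", 6)]
counts = aggregate_by_domain_position(mutations, domains)
in_domain = sum(counts.values())
# Chance that one position out of 23 collects at least this many hits.
p_value = binomial_tail(counts[9], in_domain, 1 / 23)
```

    Pooling by domain position is what lets a signal emerge even when no single gene is mutated at high frequency, which is the difficulty the abstract describes.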

    Identifying enriched drug fragments as possible candidates for metabolic engineering

    Background: Fragment-based approaches have now become an important component of the drug discovery process. At the same time, pharmaceutical chemists are more often turning to the natural world and its extremely large and diverse collection of natural compounds to discover new leads that can potentially be turned into drugs. In this study we introduce and discuss a computational pipeline to automatically extract statistically overrepresented chemical fragments in therapeutic classes, and search for similar fragments in a large database of natural products. By systematically identifying enriched fragments in therapeutic groups, we are able to extract and focus on a few fragments that are likely to be active or structurally important. Results: We show that several therapeutic classes (including antibacterial, antineoplastic, and drugs active on the cardiovascular system, among others) have enriched fragments that are also found in many natural compounds. Further, our method is able to detect fragments shared by a drug and a natural product even when the global similarity between the two molecules is generally low. Conclusions: A further development of this computational pipeline is to help predict putative therapeutic activities of natural compounds, and to help identify novel leads for drug discovery.
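
    To make "statistically overrepresented fragments" concrete, here is a minimal sketch of one way such enrichment could be scored. The abstract does not say which test the pipeline uses; a one-sided hypergeometric test is assumed here, and the function names, the input layout, and the 0.05 threshold are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn from N, of which K are 'successes'."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / comb(N, n)


def enriched_fragments(drug_fragments, drug_class, target_class, alpha=0.05):
    """Return (fragment, p_value) pairs enriched in target_class.

    drug_fragments: dict drug -> set of fragment identifiers.
    drug_class: dict drug -> therapeutic class label.
    alpha is an arbitrary cutoff; no multiple-testing correction is applied.
    """
    drugs = list(drug_fragments)
    N = len(drugs)                                   # all drugs
    in_class = [d for d in drugs if drug_class[d] == target_class]
    n = len(in_class)                                # drugs in the class
    fragments = set().union(*drug_fragments.values()) if drugs else set()

    results = []
    for frag in fragments:
        K = sum(frag in drug_fragments[d] for d in drugs)     # drugs with frag
        k = sum(frag in drug_fragments[d] for d in in_class)  # ...in the class
        p = hypergeom_sf(k, N, K, n)
        if p < alpha:
            results.append((frag, p))
    return sorted(results, key=lambda r: r[1])
```

    A real analysis over many fragments and classes would also need a multiple-testing correction, which this sketch omits.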