5,414 research outputs found

    A biophysical approach to large-scale protein-DNA binding data

    Get PDF
    About this book * Cutting-edge genome analysis methods from leading bioinformaticians An accurate description of current scientific developments in the field of bioinformatics and computational implementation is presented by research of the BioSapiens Network of Excellence. Bioinformatics is essential for annotating the structure and function of genes, proteins and the analysis of complete genomes and to molecular biology and biochemistry. Included is an overview of bioinformatics, the full spectrum of genome annotation approaches including; genome analysis and gene prediction, gene regulation analysis and expression, genome variation and QTL analysis, large scale protein annotation of function and structure, annotation and prediction of protein interactions, and the organization and annotation of molecular networks and biochemical pathways. Also covered is a technical framework to organize and represent genome data using the DAS technology and work in the annotation of two large genomic sets: HIV/HCV viral genomes and splicing alternatives potentially encoded in 1% of the human genome

    FLORA: a novel method to predict protein function from structure in diverse superfamilies

    Get PDF
    Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    Get PDF
    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but using also the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progresses toward a grid-enabled chemogenomic knowledge space are discussed.Comment: 43 pages, 4 figures, to appear in Malaria Journa

    Comprehensive structural classification of ligand binding motifs in proteins

    Get PDF
    Comprehensive knowledge of protein-ligand interactions should provide a useful basis for annotating protein functions, studying protein evolution, engineering enzymatic activity, and designing drugs. To investigate the diversity and universality of ligand binding sites in protein structures, we conducted the all-against-all atomic-level structural comparison of over 180,000 ligand binding sites found in all the known structures in the Protein Data Bank by using a recently developed database search and alignment algorithm. By applying a hybrid top-down-bottom-up clustering analysis to the comparison results, we determined approximately 3000 well-defined structural motifs of ligand binding sites. Apart from a handful of exceptions, most structural motifs were found to be confined within single families or superfamilies, and to be associated with particular ligands. Furthermore, we analyzed the components of the similarity network and enumerated more than 4000 pairs of ligand binding sites that were shared across different protein folds.Comment: 13 pages, 8 figure

    Effects of Spaceflight on Human Induced Pluripotent Stem Cell-Derived Cardiomyocyte Structure and Function.

    Get PDF
    With extended stays aboard the International Space Station (ISS) becoming commonplace, there is a need to better understand the effects of microgravity on cardiac function. We utilized human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) to study the effects of microgravity on cell-level cardiac function and gene expression. The hiPSC-CMs were cultured aboard the ISS for 5.5 weeks and their gene expression, structure, and functions were compared with ground control hiPSC-CMs. Exposure to microgravity on the ISS caused alterations in hiPSC-CM calcium handling. RNA-sequencing analysis demonstrated that 2,635 genes were differentially expressed among flight, post-flight, and ground control samples, including genes involved in mitochondrial metabolism. This study represents the first use of hiPSC technology to model the effects of spaceflight on human cardiomyocyte structure and function

    Bridging topological and functional information in protein interaction networks by short loops profiling

    Get PDF
    Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection.This research was supported by the Biotechnology and Biological Sciences Research Council (BB/H018409/1 to AP, ACCC and FF, and BB/J016284/1 to NSBT) and by the Leukaemia & Lymphoma Research (to NSBT and FF). SSC is funded by a Leukaemia & Lymphoma Research Gordon Piller PhD Studentship

    De novo discovery of structural motifs in RNA 3D structures through clustering

    Get PDF
    As functional components in three-dimensional (3D) conformation of an RNA, the RNA structural motifs provide an easy way to associate the molecular architectures with their biological mechanisms. In the past years, many computational tools have been developed to search motif instances by using the existing knowledge of well-studied families. Recently, with the rapidly increasing number of resolved RNA 3D structures, there is an urgent need to discover novel motifs with the newly presented information. In this work, we classify all the loops in non-redundant RNA 3D structures to detect plausible RNA structural motif families by using a clustering pipeline. Compared with other clustering approaches, our method has two benefits: first, the underlying alignment algorithm is tolerant to the variations in 3D structures. Second, sophisticated downstream analysis has been performed to ensure the clusters are valid and easily applied to further research. The final clustering results contain many interesting new variants of known motif families, such as GNAA tetraloop, kink-turn, sarcin-ricin and T-loop. We have also discovered potential novel functional motifs conserved in ribosomal RNA, sgRNA, SRP RNA, riboswitch and ribozyme.National Institute of General Medical Sciences of the National Institutes of Health (NIH NIGMS) (R01GM102515)Funding for open access charge: NIH NIGMS [R01 GM102515
    • …
    corecore