13 research outputs found
Agglomerative clustering of fragment 3D structures based on pairwise RMSD
In structural biology, many fragment-based 3D modeling methods require fragment libraries. They represent the whole set of possible 3D structures (conformations) observed experimentally for each fragment, with a chosen precision. In docking, for this precision, it is important to have as few prototypes as possible inside the libraries. One way to create a library is to cluster all observed conformations in order to retain only the representative prototypes. The most common measure of 3D similarity is the Root Mean Squared Deviation (RMSD) applied after a structural superposition. But this RMSD after alignment is not a metric, which means that distance-based clustering is not applicable. Current alternative methods, based on an approximation of the RMSD or internal coordinates, retrieve too many prototypes. We propose a new type of clustering which meets our needs, based on hierarchical agglomerative clustering. The linkage criterion for agglomerating two clusters is the radius of the minimal ball enclosing them. The prototypes are the centers of the balls at the end of the clustering process. They constitute a cover of all possible conformations within a given RMSD. We discuss the complexity issues associated with solving the quadratic programming problems that produce the minimal enclosing balls.
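The linkage criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Euclidean points rather than aligned 3D conformations, and it replaces the quadratic programming step with the Badoiu-Clarkson iteration, which only approximates the minimal enclosing ball. The function names `enclosing_ball` and `meb_agglomerative` and the parameters `eps` and `n_iter` are hypothetical.

```python
import numpy as np

def enclosing_ball(points, n_iter=100):
    """Approximate minimal enclosing ball (Badoiu-Clarkson iteration).

    Assumption: Euclidean points. The returned radius is the true maximum
    distance to the returned center, so the ball always encloses the
    points even when it is not exactly minimal.
    """
    center = points[0].astype(float)
    for k in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        # Move the center a shrinking step towards the farthest point.
        center = center + (farthest - center) / (k + 1)
    radius = float(np.max(np.linalg.norm(points - center, axis=1)))
    return center, radius

def meb_agglomerative(points, eps, n_iter=100):
    """Greedy agglomerative clustering with enclosing-ball radius linkage.

    At each step, merge the two clusters whose union has the smallest
    enclosing-ball radius; stop when that radius would exceed eps.
    Naive O(n^3) search, intended for small illustrative inputs only.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members = clusters[a] + clusters[b]
                _, r = enclosing_ball(points[members], n_iter)
                if best is None or r < best[0]:
                    best = (r, a, b)
        if best[0] > eps:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    # Prototypes are the centers of the final balls.
    prototypes = np.array(
        [enclosing_ball(points[c], n_iter)[0] for c in clusters]
    )
    return clusters, prototypes
```

Because each merge is accepted only when the enclosing radius stays within `eps`, every point ends up within `eps` of the prototype of its cluster, which is the cover property the abstract refers to.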
Inferring epsilon-nets of Finite Sets in a RKHS
We introduce a method to derive ε-nets of finite sets. It operates in a reproducing kernel Hilbert space. Its principle combines two well-known tools of empirical inference: the hierarchical agglomerative clustering and the computation of minimum enclosing balls. It produces ε-nets whose cardinalities are smaller than those obtained with state-of-the-art methods.
New clustering method to infer prototypes covering the 3D structures of nucleic acid fragments
In structural biology, many fragment-based 3D modeling methods require fragment libraries. They represent the whole set of possible 3D structures (conformations) observed experimentally for each fragment, with a chosen precision. In docking, for this precision, it is important to have as few prototypes as possible inside the libraries. One way to create a library is to cluster all observed conformations in order to retain only the representative prototypes. The most common measure of 3D similarity is the Root Mean Squared Deviation (RMSD) applied after a structural superposition. But this RMSD after alignment is not a metric, which means that distance-based clustering is not applicable. Current alternative methods, based on an approximation of the RMSD or internal coordinates, retrieve too many prototypes. We propose a new type of clustering which meets our needs, based on hierarchical agglomerative clustering. The linkage criterion for agglomerating two clusters is the radius of the minimal ball enclosing them. The prototypes are the centers of the balls at the end of the clustering process. They constitute a cover of all possible conformations within a given RMSD. We discuss the complexity issues associated with solving the quadratic programming problems that produce the minimal enclosing balls.
Docking of RNA Hairpin on Protein Using a Fragment-Based Method
We introduce an extension of our fragment-based method for ssRNA-protein docking, which remains a challenging problem. The extension is dedicated to hairpins and makes use of the geometrical features of this secondary structure. An initial evaluation indicates that it is promising and could help overcome the limitations of state-of-the-art fragment-based methods.
Feature extraction for the clustering of small 3D structures: application to RNA fragments
Structural libraries of fragments are commonly used to model or design the 3D structure of biomolecules (drugs, peptides, nucleic acids). They typically approximate all possible local conformations of these molecules within a given precision, by a set of well-chosen representative fragments. Such a set can be obtained by clustering a larger set of fragments whose structures have been solved experimentally, using a suitable clustering algorithm and measure of dissimilarity between fragments. A commonly used measure of dissimilarity in structural biology is the root mean square deviation (RMSD), whose exact computation requires a pairwise structural alignment. But this alignment is highly time-consuming and not applicable to a very large initial set of fragments. We propose here an approach based on feature extraction to perform an effective clustering, while avoiding a computationally expensive full pairwise alignment. Using as an example poly-A RNA fragments of 3 nucleotides (3-nt), we searched for internal coordinates whose differences can best approximate the RMSD between two fragments without any superposition. We found that the simple differences of internal distances and angles can provide a lower bound on the RMSD, allowing us to filter out pairs for which the RMSD does not need to be computed. We can then compute the exact values for only the small RMSDs, and use them to apply more effective clustering methods. We present this strategy and its application to 39431 RNA 3-nt fragments, which could be approximated by only 3258 representative prototypes with 1 Å accuracy.
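The filtering idea in this abstract can be sketched with one simple bound. The abstract does not give the authors' features or formula, so the bound below is an assumption chosen for illustration: for two fragments of N atoms, the RMSD after optimal superposition is at least sqrt(sum over atom pairs of (d_ij^X - d_ij^Y)^2) / N, where d_ij are intra-fragment distances. This follows from the triangle inequality and uses distances only, not angles. The function names are hypothetical.

```python
import numpy as np

def kabsch_rmsd(X, Y):
    """Exact RMSD after optimal rigid superposition (Kabsch algorithm).

    X and Y are (N, 3) arrays with atoms in matching order.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    U, S, Vt = np.linalg.svd(X.T @ Y)
    # Flip the smallest singular value if the best fit is a reflection.
    if np.linalg.det(U @ Vt) < 0:
        S[-1] = -S[-1]
    msd = (np.sum(X ** 2) + np.sum(Y ** 2) - 2.0 * np.sum(S)) / len(X)
    return float(np.sqrt(max(msd, 0.0)))

def distance_lower_bound(X, Y):
    """Lower bound on the superposed RMSD from internal distances only.

    No alignment is needed because intra-fragment distances are
    invariant under rotation and translation.
    """
    n = len(X)
    iu = np.triu_indices(n, k=1)
    dX = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)[iu]
    dY = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)[iu]
    return float(np.sqrt(np.sum((dX - dY) ** 2)) / n)

def needs_exact_rmsd(X, Y, cutoff):
    """Return False when the bound already proves RMSD > cutoff."""
    return distance_lower_bound(X, Y) <= cutoff
```

With a clustering cutoff such as 1 Å, `needs_exact_rmsd` would let a pipeline skip the superposition for every pair whose bound already exceeds the cutoff. How many pairs are filtered depends on how tight the chosen bound is.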
ProtNAff: Protein-bound Nucleic Acid filters and fragment libraries
Motivation: Atomistic models of Nucleic Acids (NA) fragments can be used to model the 3D structures of specific protein-NA interactions and address the problem of great NA flexibility, especially in their single-stranded regions. One way to obtain relevant NA fragments is to extract them from existing 3D structures corresponding to the targeted context (e.g. specific 2D structures, protein families, sequences) and to learn from them. Several databases exist for specific NA 3D motifs, especially in RNA, but none can handle the variety of possible contexts. Results: This paper presents protNAff, a new pipeline for the conception of searchable databases on the 2D and 3D structures of protein-bound NA, the selection of context-specific (regions of) NA structures by combinations of filters, and the creation of context-specific NA fragment libraries. The strength of this pipeline is its modularity, allowing users to adapt it to many specific modeling problems. As examples, the pipeline is applied to the quantitative analysis of (i) the sequence-specificity of trinucleotide conformations, (ii) the conformational diversity of RNA at several levels of resolution, (iii) the effect of protein binding on RNA local conformations, and (iv) the protein-binding propensity of RNA hairpin loops of various lengths. Availability: The source code is freely available for download at URL https://github.com/isaureCdB/protNAff. The database and the trinucleotide fragment library are downloadable at URL https://zenodo.org/record/6483823#.YmbVhFxByV4
NAfragDB: A Multi-Purpose Structural Database of Nucleic-Acid/Protein Complexes for Advanced Users
Many structural bioinformatics databases support automated searches via a limited number of pre-defined criteria and their combinations. Here, we present NAfragDB, a structural database of nucleic-acid (NA) - protein complexes that supports arbitrarily advanced queries and any combination thereof via Python-written requests directly on the raw data.
High throughput phenotyping for complex traits: case study for nitrogen response in wheat based on the PhénoBlé project
Crop response to abiotic stress is a complex trait, muddled by genotype by environment interactions, and multiple underlying traits. This is typically the case for response to nitrogen in wheat. High throughput phenotyping provides access to intermediate level traits that can help in understanding, screening, and ultimately ameliorating nitrogen response. We provide results on the use of proximal remote sensing technologies used to investigate the response of an elite panel of French bread wheats to nitrogen, obtained via a project entitled PhénoBlé. These results have allowed us to identify certain remote sensing proxies for radiation interception and radiation use efficiency as promising traits for screening germplasm response to nitrogen. Based on these results, as well as multilocal trials screening the same panel for nitrogen response, we propose avenues for implementing these technologies, with a focus on plant breeding. We also present technological evolutions from the initial prototype system to facilitate wide adoption. Finally, we conclude by drawing parallels with the requirements for investigating complex crop-phytobiome interactions for improved tolerance to stresses.
HSP70 sequestration by free α-globin promotes ineffective erythropoiesis in β-thalassaemia
β-Thalassaemia major (β-TM) is an inherited haemoglobinopathy caused by a quantitative defect in the synthesis of β-globin chains of haemoglobin, leading to the accumulation of free α-globin chains that form toxic aggregates. Despite extensive knowledge of the molecular defects causing β-TM, little is known of the mechanisms responsible for the ineffective erythropoiesis observed in the condition, which is characterized by accelerated erythroid differentiation, maturation arrest and apoptosis at the polychromatophilic stage. We have previously demonstrated that normal human erythroid maturation requires a transient activation of caspase-3 at the later stages of maturation. Although erythroid transcription factor GATA-1, the master transcriptional factor of erythropoiesis, is a caspase-3 target, it is not cleaved during erythroid differentiation. We have shown that, in human erythroblasts, the chaperone heat shock protein 70 (HSP70) is constitutively expressed and, at later stages of maturation, translocates into the nucleus and protects GATA-1 from caspase-3 cleavage. The primary role of this ubiquitous chaperone is to participate in the refolding of proteins denatured by cytoplasmic stress, thus preventing their aggregation. Here we show in vitro that during the maturation of human β-TM erythroblasts, HSP70 interacts directly with free α-globin chains. As a consequence, HSP70 is sequestrated in the cytoplasm and GATA-1 is no longer protected, resulting in end-stage maturation arrest and apoptosis. Transduction of a nuclear-targeted HSP70 mutant or a caspase-3-uncleavable GATA-1 mutant restores terminal maturation of β-TM erythroblasts, which may provide a rationale for new targeted therapies of β-T