7,837 research outputs found

    Modular architecture of nucleotide-binding pockets

    Recently, modularity has emerged as a general attribute of complex biological systems. This is probably because modular systems lend themselves readily to optimization via random mutation followed by natural selection. Although they are not traditionally considered to evolve by this process, biological ligands are also modular, being composed of recurring chemical fragments, and moreover they exhibit similarities reminiscent of mutations (e.g. the few atoms differentiating adenine and guanine). Many ligands are also promiscuous in the sense that they bind to many different protein folds. Here, we investigated whether ligand chemical modularity is reflected in an underlying modularity of binding sites across unrelated proteins. We chose nucleotides as paradigmatic ligands, because they can be described as composed of well-defined fragments (nucleobase, ribose and phosphates) and are quite abundant both in nature and in protein structure databases. We found that nucleotide-binding sites do indeed show a modular organization and are composed of fragment-specific protein structural motifs, which parallel the modular structure of their ligands. Through an analysis of the distribution of these motifs in different proteins and in different folds, we discuss the evolutionary implications of these findings and argue that the structural features we observed can arise either through divergence from a common ancestor or through convergent evolution.

    A structural classification of protein-protein interactions for detection of convergently evolved motifs and for prediction of protein binding sites on sequence level

    BACKGROUND: A long-standing challenge in the post-genomic era of bioinformatics is the prediction of protein-protein interactions, and ultimately the prediction of protein functions. The problem is intrinsically harder when only amino acid sequences are available, but a solution is then more universally applicable. So far, the problem of uncovering protein-protein interactions has been addressed in a variety of ways, both experimentally and computationally. MOTIVATION: The central problem is: How can protein complexes with solved three-dimensional structure be utilized to identify and classify protein binding sites, and how can knowledge be inferred from this classification such that protein interactions can be predicted for proteins without solved structure? The underlying hypothesis is that protein binding sites are often restricted to a small number of residues, which are additionally often well conserved in order to maintain an interaction. Therefore, the signal-to-noise ratio in binding sites is expected to be higher than in other parts of the surface. This enables binding site detection in unknown proteins when homology-based annotation transfer fails. APPROACH: The problem is addressed by first investigating how geometrical aspects of domain-domain associations can lead to a rigorous structural classification of the multitude of protein interface types. The interface types are explored with respect to two aspects: First, how do interface types with one-sided homology reveal convergently evolved motifs? Second, how can sequential descriptors for local structural features be derived from the interface type classification? Then, the use of sequential representations for binding sites in order to predict protein interactions is investigated. The underlying algorithms are based on machine learning techniques, in particular Hidden Markov Models. RESULTS: This work includes a novel approach to a comprehensive geometrical classification of domain interfaces. Alternative structural domain associations are found for 40% of all family-family interactions. Evaluation of the classification algorithm on a hand-curated set of interfaces yielded a precision of 83% and a recall of 95%. For the first time, a systematic screen of convergently evolved motifs in 102,000 protein-protein interactions with structural information is derived. With respect to this dataset, all cases related to viral mimicry of human interface bindings are identified. Finally, a library of 740 motif descriptors for binding site recognition (encoded as Hidden Markov Models) is generated and cross-validated. Tests for the significance of motifs are provided. The usefulness of descriptors for protein-ligand binding sites is demonstrated for the case of "ATP-binding", where a precision of 89% is achieved, thus outperforming comparable motifs from PROSITE. In particular, a novel descriptor for a P-loop variant has been used to identify ATP-binding sites in 60 protein sequences that have not been annotated before by existing motif databases.

    The RNA world: hypotheses, facts and experimental results.

    A biochemical world that would have existed before the contemporary DNA-RNA-protein world was baptized "The RNA World" by Walter Gilbert in 1986; such a world had already been proposed during the preceding decades by Carl Woese, Francis Crick and Leslie Orgel. By demonstrating the remarkable diversity of the RNA molecule, molecular biology proved these predictions. RNA, present in all living cells, performs structural and metabolic functions, many of which were unsuspected only a few years ago. A truly modern "RNA world" exists in each cell; it contains RNAs in various forms (short and long fragments, single- and double-stranded) endowed with multiple roles (informational, catalytic, serving as templates, guides or defense), certain molecules even being capable of carrying out several of these functions.

    Minimalistic Peptide-Based Supramolecular Systems Relevant to the Chemical Origin of Life

    All forms of life are based on biopolymers, which are made up of a selection of simple building blocks, such as amino acids, nucleotides, fatty acids and sugars. Their individual properties govern their interactions, giving rise to complex supramolecular structures with highly specialized functionality, including ligand recognition, catalysis and compartmentalization. In this thesis, we aim to answer the question of whether short peptides could have acted as precursors of modern proteins during prebiotic evolution. Using a combination of experimental and computational techniques, we screened a large molecular search space for peptide sequences that are capable of forming supramolecular complexes with adenosine triphosphate (ATP), life’s ubiquitous energy currency, and uridine triphosphate (UTP). Our results demonstrate that peptides as short as heptamers can form dynamic supramolecular complexes, adapt their structure to a ligand upon binding, undergo phase separation into spatially confined compartments and catalyze nucleotide hydrolysis.

    Genome information processing by the INO80 chromatin remodeler positions nucleosomes [preprint]

    The fundamental molecular determinants by which ATP-dependent chromatin remodelers organize nucleosomes across eukaryotic genomes remain largely elusive. Here, chromatin reconstitutions on physiological, whole-genome templates reveal how remodelers read and translate genomic information into nucleosome positions. Using the yeast genome and the multi-subunit INO80 remodeler as a paradigm, we identify DNA shape/mechanics-encoded signature motifs as sufficient for nucleosome positioning and distinct from known DNA sequence preferences of histones. INO80 processes such information through an allosteric interplay between its core and Arp8 modules that probes mechanical properties of nucleosomal and linker DNA. At promoters, INO80 integrates this readout of DNA shape/mechanics with a readout of co-evolved sequence motifs via interaction with general regulatory factors bound to these motifs. Our findings establish a molecular mechanism for robust and yet adjustable +1 nucleosome positioning and, more generally, remodelers as information processing hubs that enable active organization and allosteric regulation of the first level of chromatin.

    A genome-wide structure-based survey of nucleotide binding proteins in M. tuberculosis

    Nucleoside triphosphates (NTPs) form an important class of small-molecule ligands that participate in, and are essential to, a large number of biological processes. Here, we seek to identify the NTP-binding proteome (NTPome) in M. tuberculosis (M.tb), a deadly pathogen. Identifying the NTPome is useful not only for gaining functional insights into the individual proteins but also for identifying useful drug targets. From an earlier study, we had structural models of M.tb at a proteome scale, from which a set of 13,858 small-molecule binding pockets was identified. We use a set of NTP-binding sub-structural motifs derived from a previous study to scan the M.tb pocketome, and find that 1,768 proteins, or 43% of the proteome, can theoretically bind NTP ligands. Using an experimental proteomics approach involving dye-ligand affinity chromatography, we confirm NTP binding to 47 different proteins, of which 4 are hypothetical proteins. Our analysis also provides the precise list of binding site residues in each case, and the probable ligand binding pose. As the list includes a number of known and potential drug targets, the identification of NTP binding can directly facilitate structure-based drug design for these targets.

    The cellular protein nucleolin preferentially binds long-looped G-quadruplex nucleic acids

    BACKGROUND: G-quadruplexes (G4s) are four-stranded nucleic acid structures that form in G-rich sequences. Nucleolin (NCL) is a cellular protein reported to exert functions upon G4 recognition, such as induction of neurodegenerative diseases and activation of tumor and virus mechanisms. We here aimed at defining NCL/G4 binding determinants. METHODS: Electrophoretic mobility shift assay was used to detect NCL/G4 binding; circular dichroism to assess G4 folding, topology and stability; dimethylsulfate footprinting to detect G bases involved in G4 folding. RESULTS: The purified full-length human NCL was initially tested on telomeric G4 target sequences to allow for modulation of loop, conformation, length, G-tract number and stability. G4s in promoter regions with more complex sequences were next employed. We found that NCL binding to G4s heavily relies on G4 loop length, independently of the conformation and oligonucleotide/loop sequence. Low-stability G4s are preferred. When alternative G4 conformations are possible, those with longer loops are preferred upon binding to NCL, even if G-tracts need to be spared from G4 folding. CONCLUSIONS: Our data provide insight into how G4s and the associated proteins may control the ON/OFF molecular switch to several pathological processes, including neurodegeneration and tumor and virus activation. Understanding these regulatory determinants is the first step towards the development of targeted therapies. GENERAL SIGNIFICANCE: The indication that NCL binding preferentially stimulates and induces folding of G4s containing long loops suggests NCL's ability to modify the overall structure and steric hindrance of the involved nucleic acid regions. This protein-induced modification of the G4 structure may represent a cellular mechanosensor mechanism for molecular signaling and disease pathogenesis.
    Lago, Sara; Tosoni, Elena; Nadai, Matteo; Palumbo, Manlio; Richter, Sara N

    Deciphering the subunit interaction in the crenarchaeal archaellum

    The archaeal motility structure, the archaellum, is an intriguing hybrid of the function and architecture of two distinct motility organelles, the bacterial flagellum and the type IV pilus (T4P), respectively. This rotating T4P is an astonishing example of evolutionary adaptation and indeed represents a unique, third way to move. This microbial structure was, however, long ignored, and while many bacterial structures have already been well characterized, knowledge about the archaellum remains scarce. The studies performed so far were restricted to motility in Euryarchaeota and included physiological and genetic analyses of a few species. Here we present a detailed systematic and structural analysis of the crenarchaeal archaellum using the thermoacidophile Sulfolobus acidocaldarius as a model organism. S. acidocaldarius has the most minimalistic known archaellum system, composed of only seven Fla proteins. In-frame deletion strain analysis revealed all seven fla genes to be essential for proper archaellum assembly. All these mutants were non-motile, conclusively linking the archaellum of Crenarchaeota with their swimming motility. Moreover, using immunoblot analysis we found archaella biosynthesis to be induced under nutrient-depleting conditions. We could also demonstrate that although all seven fla genes are clustered in one genomic locus, they are expressed in two different transcriptional units. Thus the genes encoding the archaellin FlaB and the remaining structural components FlaXGFHIJ are expressed separately. The main focus of this work was, however, the structural aspect of the S. acidocaldarius archaellum. Thus we present here a detailed biochemical and structural characterization of the two cytosolic components of the archaellum: the RecA family protein FlaH and the ATPase FlaI. By elucidating the interaction network of FlaH and FlaI within the archaellum, we could place them together with FlaX and FlaJ as structural components of the basal body. The ATPase FlaI was successfully crystallized in hexameric form and we could solve this structure at 2.0 Å resolution. The FlaI hexamer forms a crown-like structure with subunits in three different conformational states, assembled together in a rare cross-subunit interacting fashion. Further analysis also revealed that the enzymatic activity and system specificity of FlaI are structurally separated, since the ATPase is restricted to the C-terminal domain, while the functional part is represented by the N-terminal domain. We demonstrated moreover that FlaI has a dual role and is involved in generating the energy necessary for both archaellum assembly and rotation. The functions of FlaI could be uncoupled by deleting the first 29 amino acids of the N-terminus, resulting in an archaellated but non-motile phenotype. FlaH was characterized as an ATP-binding protein, since no ATPase activity could be detected. It has a well-conserved Walker A motif but an incomplete Walker B motif, and as we could show with in vivo and in vitro analysis, both motifs are important for ATP binding and were also essential for archaella assembly and motility. The structure of FlaH was solved at 2.3 Å resolution, revealing the presence of a bound ATP molecule, supporting the hypothesis that FlaH does not hydrolyze ATP. Structural similarities to the CII domain of KaiC and a demonstrated autophosphorylation activity suggest that FlaH plays a regulatory role and controls archaellum assembly/function in a phosphorylation-dependent manner. Taken together, the data presented here provide insights into the role of the archaellum of S. acidocaldarius, its genomic organization and its unique molecular architecture. Furthermore, our structural analysis revealed differences between the motor proteins within the archaellum and the related bacterial systems, elucidating the phenomenon of the rotating type IV pilus. However, many questions regarding the archaellum remain open and present a challenge for further motility studies in Archaea.