    Inter-protein sequence co-evolution predicts known physical interactions in bacterial ribosomes and the trp operon

    Interaction between proteins is a fundamental mechanism that underlies virtually all biological processes. Many important interactions are conserved across a large variety of species. The need to maintain interaction leads to a high degree of co-evolution between residues in the interface between partner proteins. The inference of protein-protein interaction networks from the rapidly growing sequence databases is one of the most formidable tasks in systems biology today. We propose here a novel approach based on the Direct-Coupling Analysis of the co-evolution between inter-protein residue pairs. We use ribosomal and trp operon proteins as test cases: for the small and large ribosomal subunits, our approach predicts protein-interaction partners at true-positive rates of 70% and 90%, respectively, within the first 10 predictions, with areas of 0.69 and 0.81, respectively, under the ROC curves for all predictions. In the trp operon, it assigns the two largest interaction scores to the only two interactions experimentally known. At the level of residue interactions, we show that for the small and the large ribosomal subunit our approach predicts interacting residues in the system with true-positive rates of 60% and 85%, respectively, in the first 20 predictions. We use artificial data to show that the performance of our approach depends crucially on the size of the joint multiple sequence alignments, and we analyze how many sequences would be necessary for a perfect prediction if the sequences were sampled from the same model that we use for prediction. Given the performance of our approach on the test data, we speculate that it can be used to detect new interactions, especially in light of the rapid growth of available sequence data.
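
The two evaluation measures quoted in this abstract can be illustrated with a minimal sketch. It assumes that the "true-positive rate within the first k predictions" means the fraction of the k highest-scoring candidate pairs that are known interactions, and it computes the ROC area through the rank-sum identity. The function names and the toy scores and labels are hypothetical and are not taken from the paper.

```python
def top_k_true_positive_rate(scores, labels, k):
    """Fraction of known interactions among the k highest-scoring predictions."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum identity.

    Equals the probability that a randomly chosen positive outscores a
    randomly chosen negative, with ties counted as one half.
    """
    positives = [s for s, l in zip(scores, labels) if l]
    negatives = [s for s, l in zip(scores, labels) if not l]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up interaction scores for six candidate protein pairs (illustration only).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]  # 1 = known interaction, 0 = no known interaction

print(top_k_true_positive_rate(scores, labels, k=2))
print(roc_auc(scores, labels))
```

The pairwise loop is quadratic in the number of candidates, which is acceptable for a few hundred protein pairs; a sort-based rank computation would be preferable for larger sets.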

    Genesis of ancestral haplotypes: RNA modifications and reverse transcription–mediated polymorphisms

    Understanding the genesis of the block haplotype structure of the genome is a major challenge. With the completion of the sequencing of the Human Genome and the initiation of the HapMap project, the concept that the chromosomes of the mammalian genome are a mosaic, or patchwork, of conserved extended block haplotype sequences is now accepted by the mainstream genomics research community. Ancestral Haplotypes (AHs) can be viewed as a recombined string of smaller Polymorphic Frozen Blocks (PFBs). How have such variant extended DNA sequence tracts emerged in evolution? Here the relevant literature on the problem is reviewed from various fields of molecular and cell biology, particularly molecular immunology and comparative and functional genomics. Based on our synthesis we then advance a testable molecular and cellular model. A critical part of the analysis concerns the origin of the strand-biased mutation signatures in the transcribed regions of the human and higher primate genome, A-to-G versus T-to-C (ratio ~1.5 fold) and C-to-T versus G-to-A (≥1.5 fold). A comparison and evaluation of the current state of the fields of immunoglobulin Somatic Hypermutation (SHM) and Transcription-Coupled DNA Repair focused on how mutations in newly synthesized RNA might be copied back to DNA, thus accounting for some of the genome-wide strand biases (e.g., the A-to-G vs T-to-C component of the strand-biased spectrum). We hypothesize that the genesis of PFBs and extended AHs occurs during mutagenic episodes in evolution (e.g., retroviral infections) and that many of the critical DNA sequence diversifying events occur first at the RNA level, e.g., recombination between RNA strings resulting in tandem and dispersed RNA duplications (retroduplications), RNA mutations via adenosine-to-inosine pre-mRNA editing events, as well as error-prone RNA synthesis. These are then copied back into DNA by a cellular reverse transcription process (also likely to be error-prone) that we have called "reverse transcription-mediated long DNA conversion." Finally, we suggest that all these activities and others can be envisaged as being brought physically under the umbrella of special sites in the nucleus involved in transcription known as "transcription factories."

    MicroRNA in control of gene expression: An overview of nuclear functions

    The finding that small non-coding RNAs (ncRNAs) are able to control gene expression in a sequence-specific manner has had a massive impact on biology. Recent improvements in high-throughput sequencing and computational prediction methods have allowed the discovery and classification of several types of ncRNAs. Based on their precursor structures, biogenesis pathways and modes of action, ncRNAs are classified as small interfering RNAs (siRNAs), microRNAs (miRNAs), PIWI-interacting RNAs (piRNAs), endogenous small interfering RNAs (endo-siRNAs or esiRNAs), promoter-associated RNAs (pRNAs), small nucleolar RNAs (snoRNAs) and sno-derived RNAs. Among these, miRNAs appear as important cytoplasmic regulators of gene expression. miRNAs act as post-transcriptional regulators of their messenger RNA (mRNA) targets via mRNA degradation and/or translational repression. However, it is becoming evident that miRNAs also have specific nuclear functions. Among these, the most studied and debated activity is the miRNA-guided transcriptional control of gene expression. Although available data detail quite precisely the effectors of this activity, the mechanisms by which miRNAs identify their gene targets to control transcription are still a matter of debate. Here, we focus on nuclear functions of miRNAs and on alternative mechanisms of target recognition, at the promoter level, by miRNAs in carrying out transcriptional gene silencing.

    AIP1 is a novel Agenet/Tudor domain protein from Arabidopsis that interacts with regulators of DNA replication, transcription and chromatin remodeling

    Background: DNA replication and transcription are dynamic processes regulating plant development that are dependent on chromatin accessibility. Proteins belonging to the Agenet/Tudor domain family are known as histone modification "readers" and classified as chromatin remodeling proteins. Histone modifications and chromatin remodeling have profound effects on gene expression as well as on DNA replication, but how these processes are integrated has not been completely elucidated. It is clear that members of the Agenet/Tudor family are important regulators of development, playing roles not well known in plants. Methods: Bioinformatics and phylogenetic analyses of the Agenet/Tudor Family domain in the plant kingdom were carried out with sequences from available complete genome databases. 3D structure predictions of Agenet/Tudor domains were calculated by the I-TASSER server. Protein interactions were tested in two-hybrid, GST pulldown, semi-in vivo pulldown and Tandem Affinity Purification assays. Gene function was studied in a T-DNA insertion GABI-line. Results: In the present work we analyzed the family of Agenet/Tudor domain proteins in the plant kingdom and we mapped the organization of this family throughout plant evolution. Furthermore, we characterized a member from Arabidopsis thaliana named AIP1 that harbors Agenet/Tudor and DUF724 domains. AIP1 interacts with ABAP1, a plant regulator of DNA replication licensing and gene transcription, with a plant histone modification "reader" (LHP1) and with non-modified histones. AIP1 is expressed in reproductive tissues and its down-regulation delays flower development timing. Also, expression of ABAP1 and LHP1 target genes was repressed in flower buds of plants with reduced levels of AIP1. Conclusions: AIP1 is a novel Agenet/Tudor domain protein in plants that could act as a link between DNA replication, transcription and chromatin remodeling during flower development.

    Red Queen Coevolution on Fitness Landscapes

    Species do not merely evolve; they also coevolve with other organisms. Coevolution is a major force driving interacting species to continuously evolve, exploring their fitness landscapes. Coevolution involves the coupling of species fitness landscapes, linking species genetic changes with their inter-specific ecological interactions. Here we first introduce the Red Queen hypothesis of evolution, commenting on some theoretical aspects and empirical evidence. As an introduction to the fitness landscape concept, we review key issues on evolution on simple and rugged fitness landscapes. Then we present key modeling examples of coevolution on different fitness landscapes at different scales, from RNA viruses to complex ecosystems and macroevolution. Comment: 40 pages, 12 figures. To appear in "Recent Advances in the Theory and Application of Fitness Landscapes" (H. Richter and A. Engelbrecht, eds.). Springer Series in Emergence, Complexity, and Computation, 201

    Combinatorial RNA Design: Designability and Structure-Approximating Algorithm

    In this work, we consider the Combinatorial RNA Design problem, a minimal instance of the RNA design problem which aims at finding a sequence that admits a given target as its unique base pair maximizing structure. We provide complete characterizations for the structures that can be designed using restricted alphabets. Under a classic four-letter alphabet, we provide a complete characterization of designable structures without unpaired bases. When unpaired bases are allowed, we provide partial characterizations for classes of designable/undesignable structures, and show that the class of designable structures is closed under the stutter operation. Membership of a given structure in any of the classes can be tested in linear time and, for positive instances, a solution can be found in linear time. Finally, we consider a structure-approximating version of the problem that allows bands (helices) to be extended and, assuming that the input structure avoids two motifs, we provide a linear-time algorithm that produces a designable structure with at most twice as many base pairs as the input structure. Comment: CPM - 26th Annual Symposium on Combinatorial Pattern Matching, Jun 2015, Ischia Island, Italy. LNCS, 201
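
The notion of a "unique base pair maximizing structure" can be made concrete with a small sketch. It is a Nussinov-style dynamic program, not the linear-time method of the paper, and it rests on assumptions that the abstract does not state: only Watson-Crick pairs (A-U, G-C) are allowed, structures are non-crossing, and there is no minimum hairpin length. The function name `max_pairs_and_count` is hypothetical. A sequence is a design for some structure exactly when the returned count is 1.

```python
from functools import lru_cache

# Assumed pairing rule: Watson-Crick pairs only (no G-U wobble pairs).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_pairs_and_count(seq):
    """Return (maximum number of base pairs, number of structures reaching it).

    Structures are non-crossing sets of base pairs; no minimum hairpin
    length is enforced (an assumption of this sketch).
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        # Optimum over the subsequence seq[i..j], inclusive.
        if i >= j:
            return (0, 1)  # empty or single base: only the empty structure
        # Case 1: position i stays unpaired.
        options = [best(i + 1, j)]
        # Case 2: position i pairs with some k; the inside and the outside
        # parts are then independent, so pair counts add and the numbers
        # of optimal structures multiply.
        for k in range(i + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                inner_pairs, inner_count = best(i + 1, k - 1)
                outer_pairs, outer_count = best(k + 1, j)
                options.append(
                    (inner_pairs + outer_pairs + 1, inner_count * outer_count)
                )
        top = max(pairs for pairs, _ in options)
        return (top, sum(count for pairs, count in options if pairs == top))

    return best(0, len(seq) - 1)


print(max_pairs_and_count("GGCC"))  # one optimal structure: "(())"
print(max_pairs_and_count("GCGC"))  # two optimal structures: "()()" and "(())"
```

Under these assumptions "GGCC" is a valid design for the structure "(())", whereas "GCGC" is a design for neither "()()" nor "(())", because both structures reach the maximum of two base pairs.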

    Comprehensive structural classification of ligand binding motifs in proteins

    Comprehensive knowledge of protein-ligand interactions should provide a useful basis for annotating protein functions, studying protein evolution, engineering enzymatic activity, and designing drugs. To investigate the diversity and universality of ligand binding sites in protein structures, we conducted an all-against-all atomic-level structural comparison of over 180,000 ligand binding sites found in all the known structures in the Protein Data Bank by using a recently developed database search and alignment algorithm. By applying a hybrid top-down-bottom-up clustering analysis to the comparison results, we determined approximately 3000 well-defined structural motifs of ligand binding sites. Apart from a handful of exceptions, most structural motifs were found to be confined within single families or superfamilies, and to be associated with particular ligands. Furthermore, we analyzed the components of the similarity network and enumerated more than 4000 pairs of ligand binding sites that were shared across different protein folds. Comment: 13 pages, 8 figures