4,097 research outputs found

    Bayesian genome assembly and assessment by Markov Chain Monte Carlo sampling

    Most genome assemblers construct point estimates, choosing a genome sequence from among many alternative hypotheses that are supported by the data. We present a Markov Chain Monte Carlo approach to sequence assembly that instead generates distributions of assembly hypotheses with posterior probabilities, providing an explicit statistical framework for evaluating alternative hypotheses and assessing assembly uncertainty. We implement this approach in a prototype assembler and illustrate its application to the bacteriophage PhiX174. Comment: 17 pages, 5 figures
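
    The idea of sampling assembly hypotheses in proportion to their posterior probability can be illustrated with a toy Metropolis sampler. This is a minimal sketch, not the prototype assembler described in the abstract: the three candidate sequences, their posterior weights, and the function name `metropolis_sample` are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def metropolis_sample(log_post, hypotheses, n_steps, seed=0):
    """Estimate posterior frequencies of discrete hypotheses by Metropolis sampling.

    log_post maps each hypothesis to an unnormalized log posterior (hypothetical values).
    """
    rng = random.Random(seed)
    current = hypotheses[0]
    counts = Counter()
    for _ in range(n_steps):
        # Symmetric proposal: pick any hypothesis uniformly at random.
        proposal = rng.choice(hypotheses)
        log_ratio = log_post[proposal] - log_post[current]
        # Accept with probability min(1, posterior ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        counts[current] += 1
    return {h: counts[h] / n_steps for h in hypotheses}


# Hypothetical candidate assemblies with made-up posterior weights.
toy_log_post = {
    "ACGT": math.log(0.7),
    "ACGA": math.log(0.2),
    "ACCT": math.log(0.1),
}
freqs = metropolis_sample(toy_log_post, list(toy_log_post), n_steps=20000)
for hypothesis, freq in freqs.items():
    print(hypothesis, round(freq, 3))
```

    The sampled frequencies approximate the normalized posterior, so the output is a distribution over hypotheses rather than a single point estimate. A real assembler would need a far larger hypothesis space and a likelihood derived from the reads.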

    Specificity Determination by paralogous winged helix-turn-helix transcription factors

    Transcription factors (TFs) localize to regulatory regions throughout the genome, where they exert physical or enzymatic control over the transcriptional machinery and regulate expression of target genes. Despite the substantial diversity of TFs found across all kingdoms of life, most belong to a relatively small number of structural families characterized by homologous DNA-binding domains (DBDs). In homologous DBDs, highly-conserved DNA-contacting residues define a characteristic ‘recognition potential’, or the limited sequence space containing high-affinity binding sites. Specificity-determining residues (SDRs) alter DNA binding preferences to further delineate this sequence space between homologous TFs, enabling functional divergence through the recognition of distinct genomic binding sites. This thesis explores the divergent DNA-binding preferences among dimeric, winged helix-turn-helix (wHTH) TFs belonging to the OmpR sub-family. As the terminal effectors of orthogonal two-component signaling pathways in Escherichia coli, OmpR paralogs bind distinct genomic sequences and regulate the expression of largely non-overlapping gene networks. Using high-throughput SELEX, I discover multiple sources of variation in DNA-binding, including the spacing and orientation of monomer sites as well as a novel binding ‘mode’ with unique half-site preferences (but retaining dimeric architecture). Surprisingly, given the diversity of residues observed occupying positions in contact with DNA, there are only minor quantitative differences in sequence-specificity between OmpR paralogs. Combining phylogenetic, structural, and biological information, I then define a comprehensive set of putative SDRs, which, although distributed broadly across the protein:DNA interface, preferentially localize to the major groove of the DNA helix. 
Direct specificity profiling of SDR variants reveals that individual SDRs impact local base preferences as well as global structural properties of the protein:DNA complex. This study demonstrates clearly that OmpR family TFs possess multiple ‘axes of divergence’, including base recognition, dimeric architecture, and structural attributes of the protein:DNA complex. It also provides evidence for a common structural ‘code’ for DNA-binding by OmpR homologues, and demonstrates that surprisingly modest residue changes can enable recognition of highly divergent sequence motifs. Importantly, well-characterized genomic binding sites for many of the TFs in this study diverge substantially from the presented de novo models, and it is unclear how mutations may affect binding in more complex environments. Further analysis using native sequences is required to build combined models of cis- and trans-evolution of two-component regulatory networks.

    Association of Protein Helices and Assembly of Foldamers: Stories in Membrane and Aqueous Environments

    Solvents play an important role in the association and assembly of molecules. Here we studied solvent effects on proteins and organic chemicals in different contexts. First, X-ray crystal structures show that helix dimers in membrane- and water-soluble proteins have distinct behaviors in packing and sequence selection. Transmembrane dimers are stabilized by compact packing and hydrogen bonding between small residues. Meanwhile, water-soluble dimers utilize hydrophobic residues for packing irrespective of the size of the interface, and tight dimers are rare. Secondly, we apply these results to a complex system in which a designed protein binds to single-walled carbon nanotubes in aqueous environments. Previous designs of the hexameric helical bundles utilized leucine and alanine residues to make two distinct helix-helix interfaces. Our molecular dynamics simulations showed that the alanine-comprising interface is much more labile than the leucine-comprising one. This result can be explained by the scarcity of tight soluble helix dimers mentioned above. Thus, more stable modular helix-helix interfaces have to be employed to design peptides that bind to carbon nanotubes with higher affinities. Lastly, we describe the serendipitous discovery of a crystalline framework structure formed by an amphiphilic triarylamide foldamer. Foldamers are peptide-like polymers of non-natural monomers arranged in defined sequence and chain length that are able to adopt protein-like secondary and tertiary structures. In contrast with traditional metal-organic and organic frameworks, which exploit strong directional coordination and hydrogen bonding for assembly in organic solvents, the crystal herein is built up from a combination of noncovalent hydrophobic, hydrogen-bonded, and electrostatic interactions in aqueous solution. The structure has a honeycomb geometry in which each cubicle is a truncated octahedron. 
A new supramolecular synthon, which encompasses hydrogen bonding and π-π stacking, was discovered in the crystal structure. Through NMR experiments we probed the oligomeric states of the foldamer in the early stages prior to crystallization. The hierarchic crystal structure is discussed in terms of supramolecular synthons in crystal engineering.

    Introduction to protein folding for physicists

    The prediction of the three-dimensional native structure of proteins from the knowledge of their amino acid sequence, known as the protein folding problem, is one of the most important yet unsolved issues of modern science. Since the conformational behaviour of flexible molecules is nothing more than a complex physical problem, increasingly more physicists are moving into the study of protein systems, bringing with them powerful mathematical and computational tools, as well as the sharp intuition and deep images inherent to the physics discipline. This work attempts to facilitate the first steps of such a transition. In order to achieve this goal, we provide an exhaustive account of the reasons underlying the protein folding problem's enormous relevance and summarize the present-day status of the methods aimed at solving it. We also provide an introduction to the particular structure of these biological heteropolymers, and we physically define the problem, stating the assumptions behind this (commonly implicit) definition. Finally, we review the 'special flavor' of statistical mechanics that is typically used to study the astronomically large phase spaces of macromolecules. Throughout the whole work, much material that is found scattered in the literature has been put together here to improve comprehension and to serve as a handy reference. Comment: 53 pages, 18 figures, the figures are at a low resolution due to arXiv restrictions, for high-res figures, go to http://www.pabloechenique.co

    Replication-guided nucleosome packing and nucleosome breathing expedite the formation of dense arrays

    The first level of genome packaging in eukaryotic cells involves the formation of dense nucleosome arrays, with DNA coverage near 90% in yeasts. How cells achieve such high coverage within a short time, e.g., after DNA replication, remains poorly understood. It is known that random sequential adsorption of impenetrable particles on a line reaches high density extremely slowly, due to a jamming phenomenon. The nucleosome-shifting action of remodeling enzymes has been proposed as a mechanism to resolve such jams. Here, we suggest two biophysical mechanisms which assist rapid filling of DNA with nucleosomes, and we quantitatively characterize these mechanisms within mathematical models. First, we show that the 'softness' of nucleosomes, due to nucleosome breathing and stepwise nucleosome assembly, significantly alters the filling behavior, speeding up the process relative to 'hard' particles with fixed, mutually exclusive DNA footprints. Second, we explore model scenarios in which the progression of the replication fork could eliminate nucleosome jamming, either by rapid filling in its wake or via memory of the parental nucleosome positions. Taken together, our results suggest that biophysical effects promote rapid nucleosome filling, making the reassembly of densely packed nucleosomes after DNA replication a simpler task for cells than was previously thought.
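
    The jamming effect mentioned in this abstract can be reproduced with a short simulation of random sequential adsorption of hard particles on a discrete line. This is a minimal sketch under assumed parameters, not the authors' model: the lattice length, the footprint of 10 sites, the attempt counts, and the function name `rsa_coverage` are hypothetical, and no breathing, remodeling, or replication-fork effects are included.

```python
import random


def rsa_coverage(length, footprint, attempts, seed=0):
    """Fraction of a discrete line covered after random sequential adsorption.

    Each attempt picks a random start site; the particle sticks only if all
    sites under its footprint are free. Particles never move or detach.
    """
    rng = random.Random(seed)
    occupied = [False] * length
    placed = 0
    for _ in range(attempts):
        start = rng.randrange(length - footprint + 1)
        if not any(occupied[start:start + footprint]):
            for i in range(start, start + footprint):
                occupied[i] = True
            placed += 1
    return placed * footprint / length


# Coverage rises quickly at first, then stalls well below full coverage
# because the remaining gaps are too small to fit another particle.
for attempts in (100, 1000, 10000, 100000):
    print(attempts, round(rsa_coverage(1000, 10, attempts), 3))
```

    Because the same seed replays the same sequence of attempts, the printed coverage is non-decreasing in the number of attempts, and the plateau illustrates why hard particles alone struggle to approach the roughly 90% coverage cited for yeast.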

    Insights into the Development and Evolution of Exaggerated Traits Using De Novo Transcriptomes of Two Species of Horned Scarab Beetles

    Scarab beetles exhibit an astonishing variety of rigid exo-skeletal outgrowths, known as “horns”. These traits are often sexually dimorphic and vary dramatically across species in size, shape, location, and allometry with body size. In many species, the horn exhibits disproportionate growth resulting in an exaggerated allometric relationship with body size, as compared to other traits, such as wings, that grow proportionately with body size. Depending on the species, the smallest males either do not produce a horn at all, or they produce a disproportionately small horn for their body size. While the diversity of horn shapes and their behavioural ecology have been reasonably well studied, we know far less about the proximate mechanisms that regulate horn growth. Thus, using 454 pyrosequencing, we generated transcriptome profiles, during horn growth and development, in two different scarab beetle species: the Asian rhinoceros beetle, Trypoxylus dichotomus, and the dung beetle, Onthophagus nigriventris. We obtained over half a million reads for each species that were assembled into over 6,000 and 16,000 contigs respectively. We combined these data with previously published studies to look for signatures of molecular evolution. We found a small subset of genes with horn-biased expression showing evidence for recent positive selection, as is expected with sexual selection on horn size. We also found evidence of relaxed selection present in genes that demonstrated biased expression between horned and horn-less morphs, consistent with the theory of developmental decoupling of phenotypically plastic traits.