80 research outputs found

    Alignment-free local structural search by writhe decomposition

    Motivation: Rapid methods for protein structure search enable biological discoveries based on flexibly defined structural similarity, unleashing the power of the ever greater number of solved protein structures. Projection methods show promise for the development of fast structural database search solutions. Projection methods map a structure to a point in a high-dimensional space and compare two structures by measuring distance between their projected points. These methods offer a tremendous increase in speed over residue-level structural alignment methods. However, current projection methods are not practical, partly because they are unable to identify local similarities
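The projection idea described in this abstract (map each structure to a point, then compare points by distance) can be illustrated with a minimal sketch. The descriptor used here, a normalised histogram of pairwise coordinate distances, is an assumption chosen for brevity; it is not the writhe decomposition of the paper, and the names `project` and `projected_distance` are hypothetical.

```python
import math
from itertools import combinations

def project(coords, n_bins=8, max_dist=40.0):
    """Map a list of (x, y, z) points to a fixed-length vector.

    Assumed descriptor: normalised histogram of all pairwise distances.
    """
    hist = [0.0] * n_bins
    pairs = 0
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)
        # Distances beyond max_dist fall into the last bin.
        idx = min(int(d / max_dist * n_bins), n_bins - 1)
        hist[idx] += 1.0
        pairs += 1
    return [h / pairs for h in hist] if pairs else hist

def projected_distance(u, v):
    """Compare two structures by the Euclidean distance between their points."""
    return math.dist(u, v)

# Toy helix-like backbone, a translated copy, and a stretched variant.
helix = [(math.cos(i), math.sin(i), 1.5 * i) for i in range(10)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in helix]
stretched = [(x, y, 2.0 * z) for x, y, z in helix]

same = projected_distance(project(helix), project(shifted))
different = projected_distance(project(helix), project(stretched))
```

Because each structure becomes a fixed-length vector, a database comparison costs one vector distance per entry rather than one residue-level alignment. A global descriptor like this one also shows the limitation the abstract mentions: it cannot localise a similarity to part of a structure.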

    Single DNA conformations and biological function

    From a nanoscience perspective, cellular processes and their reduced in vitro imitations provide extraordinary examples of highly robust few- or single-molecule reaction pathways. Prime examples are biochemical reactions involving DNA molecules and the coupling of these reactions to the physical conformations of DNA. In this review, we summarise recent results on the following phenomena: We investigate the biophysical properties of DNA-looping and the equilibrium configurations of DNA-knots, whose relevance to biological processes is increasingly appreciated. We discuss how random DNA-looping may be related to the efficiency of the target search process of proteins for their specific binding site on the DNA molecule. And we dwell on the spontaneous formation of intermittent DNA nanobubbles and their importance for biological processes, such as transcription initiation. The physical properties of DNA may indeed turn out to be particularly suitable for the use of DNA in nanosensing applications. Comment: 53 pages, 45 figures. Slightly revised version of a review article that is going to appear in the J. Comput. Theoret. Nanoscience; some typos corrected

    Efficient large-scale protein sequence comparison and gene matching to identify orthologs and co-orthologs

    Broadly, computational ortholog assignment is a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy where, at each iteration, the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/∼kmahmood/afree. EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/∼kmahmood/EGM2
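To make the k-mer based sort-join idea concrete, here is a small sketch under stated assumptions. It is not the afree implementation: the function name `shared_kmer_counts`, the choice of k = 3, and the scoring by number of distinct shared k-mers are all illustrative.

```python
from collections import Counter
from itertools import combinations, groupby

def shared_kmer_counts(seqs, k=3):
    """Count distinct k-mers shared by each pair of sequences (sort, then join).

    `seqs` maps a sequence id to its amino-acid string.
    """
    # Sort step: one (k-mer, sequence id) record per distinct k-mer per sequence.
    records = sorted(
        (kmer, sid)
        for sid, seq in seqs.items()
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}
    )
    pair_counts = Counter()
    # Join step: records with the same k-mer are adjacent after sorting,
    # so every pair of ids inside a group shares that k-mer.
    for _, group in groupby(records, key=lambda r: r[0]):
        ids = [sid for _, sid in group]
        for a, b in combinations(ids, 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Toy input: s1 and s2 differ only in the last residue; s3 is unrelated.
toy = {"s1": "MKVLAAG", "s2": "MKVLAAC", "s3": "WWWWWWW"}
counts = shared_kmer_counts(toy)
```

The appeal of this pattern is that only sequence pairs sharing at least one k-mer are ever touched, which avoids an explicit all-against-all comparison. Pairs above a chosen shared-count threshold would then be passed on as putative homologs.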

    Computational Investigations of Biomolecular Motions and Interactions in Genomic Maintenance and Regulation

    The most critical biochemistry in an organism supports the central dogma of molecular biology: transcription of DNA to RNA and translation of RNA to peptide sequence. Proteins are then responsible for catalyzing, regulating and ensuring the fidelity of transcription and translation. At the heart of these processes lie selective biomolecular interactions and specific dynamics that are necessary for complex formation and catalytic activity. Through advanced biophysical and computational methods, it has become possible to probe these macromolecular dynamics and interactions at the molecular and atomic levels to tease out their underlying physical bases. To the end of a more thorough understanding of these physical bases, we have performed studies to probe the motions and interactions intrinsic to the function of biomolecular complexes: modeling the dual-base flipping strategy of alkylpurine glycosylase D, dynamically tracing evolution and epistasis in the 3-ketosteroid family of nuclear receptors, discovering the allosteric and conformational aspects of transcription regulation in liver receptor homologue 1, leveraging specific contacts in tyrosyl-DNA phosphodiesterase 2 for the development of novel inhibitor scaffolds, and detailing the experimentally observed connection between solvation and sequence-specific binding affinity in PU.1-DNA complexes at the atomic level. While each study seeks to solve system-specific problems, the collection outlines a general and broadly applicable description of the biophysical motivations of biochemical processes

    Molecular Dynamics Study of Supercoiled DNA Minicircles Tightly Bent and Supercoiled DNA in Atomistic Resolution

    Towards a complete understanding of the DNA response to superhelical stress, sequence-dependent structural disruptions in supercoiled DNA minicircles of ~100 base pairs were examined through a series of atomistic MD simulations. The results showed how some subtle structural characteristics of DNA affect defect formation, including flexibility at the base pair step level and anisotropy, for which dynamic information is available only from atomistic MD simulations. Longer supercoiled DNA minicircles (240-340 bp) adopt writhed conformations. Writhe can be calculated by a Gauss integral performed along the DNA central axis path. A new mathematical definition of the DNA central axis path was developed for a more exact writhe calculation. Finally, an atomistic representation of supercoiled 336 base pair minicircles was obtained by fitting the DNA structure from explicitly solvated MD simulations into density maps from electron cryo-tomography. Structural data were analysed and provided a plausible explanation for the mechanism of sequence-specific binding of the enzyme topoisomerase 1B to negatively supercoiled DNA
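The Gauss integral mentioned in this abstract is Wr = (1/4π) ∮∮ (dr1 × dr2) · (r1 − r2) / |r1 − r2|³ taken over the closed axis curve. A minimal discretised sketch follows. It assumes the central axis is already available as a closed polyline, so it does not reproduce the axis definition developed in the thesis, and the midpoint-sum approximation is a simple choice rather than the author's method.

```python
import math

def writhe(points):
    """Approximate the writhe of a closed polyline given as (x, y, z) tuples."""
    n = len(points)
    # Segment vectors and midpoints; the last point connects back to the first.
    segs, mids = [], []
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        segs.append(tuple(b - a for a, b in zip(p, q)))
        mids.append(tuple((a + b) / 2 for a, b in zip(p, q)))
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = tuple(a - b for a, b in zip(mids[i], mids[j]))
            dist = math.sqrt(sum(c * c for c in d))
            (ax, ay, az), (bx, by, bz) = segs[i], segs[j]
            cross = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
            total += sum(c * e for c, e in zip(cross, d)) / dist ** 3
    # Each unordered pair was visited once; the double integral counts it twice.
    return 2.0 * total / (4.0 * math.pi)

def sample(curve, n=200):
    """Sample a parametric closed curve at n evenly spaced parameter values."""
    return [curve(2.0 * math.pi * i / n) for i in range(n)]

# A planar circle, a figure-eight lifted apart at its crossing, and its mirror.
circle = sample(lambda t: (math.cos(t), math.sin(t), 0.0))
eight = sample(lambda t: (math.sin(t), math.sin(2 * t), 0.3 * math.cos(t)))
mirror = [(x, y, -z) for x, y, z in eight]

wr_circle, wr_eight, wr_mirror = writhe(circle), writhe(eight), writhe(mirror)
```

A planar curve has zero writhe, and reflecting a curve flips the sign of its writhe, so the three toy curves give a quick sanity check. The accuracy of the sum depends on the sampling density relative to the closest approach of distant strands.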

    Construction and dynamics of knotted fields in soft matter systems

    Knotted fields are physical fields containing knotted, linked, or otherwise topologically interesting structure. They occur in a wide variety of physical systems — fluids, superfluids, electromagnetism, optics and high energy physics to name a few. Far from being passive structures, the occurrence of knotting in a physical field often modifies its overall properties, rendering their study interesting from both a theoretical and practical point of view. In this thesis, we focus on knotted fields in ‘soft matter’ systems, systems which may be loosely characterised as those in which geometry plays a fundamental role, and which undergo substantial deformations in response to external forces, changes in temperature etc. Such systems are often experimentally accessible, making them natural testbeds for exploring knotted fields in all their guises. After providing an introduction to knotted fields with a focus on soft matter in the first chapter, in the second we introduce a method of explicitly constructing such fields for any knotted curve based on Maxwell’s solid angle construction. We discuss its theory, emphasising a fundamental homotopy formula as unifying methods for computing the solid angle, as well as describing a naturally induced curve framing, which we show is related to the writhe of the curve before using it to characterise the local structure in the neighbourhood of the knot. We then discuss its practical implementation, giving examples of its use and providing C code. In subsequent chapters we use this methodology to initialise simulations in our study of knotted fields in two soft matter systems: excitable media and twist-bend nematics. In excitable media we provide a systematic survey of knot dynamics up to crossing number eight, finding generically unsteady behaviour driven by a wave-slapping mechanism. 
    Nevertheless, we also find novel complex knotted structures and characterise their geometry and steady state motion, as well as greatly expanding upon previous evidence to demonstrate the ability of the dynamics to untangle geometries without reconnection. In twist-bend nematics we describe their fundamental geometry, that of bend. The zeros of bend are a set of lines with rich geometric and topological structure. We characterise their local structure, describe how they are canonically oriented and discuss a notion of their self-linking. We then describe their topological significance, showing that these zeros compute Skyrmion and Hopfion numbers, with accompanying simulations in twist-bend nematics

    Modelling the Extensionally Driven Transitions of DNA

    Empirical measurements on DNA under tension show a jump by a factor of ≈ 1.5 − 1.7 in the relative extension at applied force of ≈ 65 − 70 pN, indicating a structural transition. The still ambiguously characterised stretched ‘phase’ is known as S-DNA. Using atomistic and coarse-grained Monte Carlo simulations we study DNA over-stretching in the presence of organic salts Ethidium Bromide (EtBr) and Arginine (an amino acid present in the RecA binding cleft). We present planar-stacked triplet disproportionated DNA as a solution phase of the double helix under tension, and dub it ‘Σ DNA’, with the three right-facing points of the Σ character serving as a mnemonic for the three grouped bases. Like unstretched Watson-Crick base paired DNA structures, the structure of the Σ phase is linked to function: the partitioning of bases into codons of three base-pairs each is the first stage of operation of recombinase enzymes such as RecA, facilitating alignment of homologous or near-homologous sequences for genetic exchange or repair. By showing that this process does not require any very sophisticated manipulation of the DNA, we position it as potentially appearing as an early step in the development of life, and correlate the postulated sequence of incorporation of amino acids (GADV then GADVESPLIT and then the full 20 residue set of canonical amino acids) into molecular biology with the ease of Σ-formation for sequences including the associated codons. To further investigate the dependence of stretching behaviour on the concentration of intercalating salt molecules, we present a physically motivated coarse-grained force-field for DNA under tension and use it to qualitatively reproduce regimes of force-extension behaviour which are not atomistically accessible

    Protein Design and Structure Determination at High-Precision

    Due to the complementarity of the protein design and folding problems, progress on either front has consistently advanced the other. Although both problems remain major challenges, computational protein design has benefited amply from protein structure prediction methods. Likewise, the fields of structure prediction and structural biology have widely adopted techniques from the protein design field. The work I present here aims to put forward new protein design as well as structure determination strategies with the objective of achieving maximum precision. Both strategies capitalise on two premises: the first is that localising the sampling problem allows for exhaustive and finer-granularity solution searching, while the second is that accelerated temporal dynamics can allow for directed and accurate exploration of energy landscapes. In the presented protein design projects, the level of precision was evaluated by comparing the coordinates from the experimental structures of the designs to their in silico models. In the structure determination projects, by contrast, precision was evaluated by how well a determined structure ensemble reproduces various experimental observables. Since all of the previous design work utilising conserved supersecondary structures has aimed at constructing repeat proteins by amplifying a single fragment, my first project aims at designing an asymmetric globular (i.e. non-repetitive) fold from two unrelated supersecondary structures. I thereby conceive an interface-driven strategy aiming at constructing a viable intramolecular interface across the participating supersecondary structures. I report the successful design of the target fold, which agrees with the experimental NMR structure at atomic precision (backbone RMSD of 0.9 Å); the designed protein was substantially more stable than its closest natural counterpart.
    Through the second project I aim to demonstrate the capacity of this interface-driven strategy to tackle the more difficult problem of novel fold design. The computational design of novel folds persists as a profound challenge, as in this case the association between structural and sequence features is absent a priori. This has kept most of the previous design efforts within the known fold space. I accordingly have expanded my interface-design methods, with the goal of achieving efficient sampling at maximum topological control. As a demonstration I conceive and design a novel corrugated protein architecture that does not exist in nature. The resulting NMR and X-ray structures for two different designs agree with the in silico models at atomic precision. In the third project I develop a new generalised method for mapping protein conformational populations from NMR data by unravelling the distribution of states that underlie the experimentally acquired average quantities. The CoMAND method not only provides a quantitative mapping of the probabilities of the constituent microstates, but is also capable of extracting previously untapped structural information and solving structures de novo from a single NOESY experiment. I further present a detailed protocol that produces highly refined, dynamics-based ensembles without any recourse to heuristics or knowledge-based scoring. Finally, I validate the method’s precision by using the refined ensemble to quantitatively predict NMR observables that are orthogonal to the NOESY data