12 research outputs found

    A Unified Dynamic Programming Framework for the Analysis of Interacting Nucleic Acid Strands: Enhanced Models, Scalability, and Speed

    Dynamic programming algorithms within the NUPACK software suite enable analysis of nucleic acid sequences over complex and test tube ensembles containing arbitrary numbers of interacting strand species, serving the needs of researchers in molecular programming, nucleic acid nanotechnology, synthetic biology, and across the life sciences. Here, to enhance the underlying physical model, ensure scalability for large calculations, and achieve dramatic speedups when calculating diverse physical quantities over complex and test tube ensembles, we introduce a unified dynamic programming framework that combines three ingredients: (1) recursions that specify the dependencies between subproblems and incorporate the details of the structural ensemble and the free energy model, (2) evaluation algebras that define the mathematical form of each subproblem, (3) operation orders that specify the computational trajectory through the dependency graph of subproblems. The physical model is enhanced using new recursions that operate over the complex ensemble including coaxial and dangle stacking subensembles. The recursions are coded generically and then compiled with a quantity-specific evaluation algebra and operation order to generate an executable for each physical quantity: partition function, equilibrium base-pairing probabilities, MFE energy and proxy structure, suboptimal proxy structures, and Boltzmann sampled structures. For large complexes (e.g., 30 000 nt), scalability is achieved for partition function calculations using an overflow-safe evaluation algebra, and for equilibrium base-pairing probabilities using a backtrack-free operation order. A new blockwise operation order that treats subcomplex blocks for the complex species in a test tube ensemble enables dramatic speedups (e.g., 20–120× ) using vectorization and caching. 
    With these performance enhancements, equilibrium analysis of substantial test tube ensembles can be performed in ≤ 1 min on a single computational core (e.g., partition function and equilibrium concentration for all complex species of up to six strands formed from two strand species of 300 nt each, or for all complex species of up to two strands formed from 80 strand species of 100 nt each). A new sampling algorithm simultaneously samples multiple structures from the complex ensemble to yield speedups of an order of magnitude or more as the number of structures increases above ≈10³. These advances are available within the NUPACK 4.0 code base (www.nupack.org), which can be flexibly scripted using the all-new NUPACK Python module.
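
The idea of pairing one generic recursion with interchangeable evaluation algebras can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not NUPACK's recursions, energy model, or API: the recursion is a simple Nussinov-style base-pairing decomposition for a single strand, `PAIR_ENERGY` and `KT` are made-up constants, and the log-domain algebra is only a stand-in for the overflow-safe algebra mentioned in the abstract.

```python
import math
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable

KT = 0.6            # hypothetical thermal energy (kcal/mol)
PAIR_ENERGY = -1.0  # hypothetical free energy per base pair (kcal/mol)
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


@dataclass(frozen=True)
class Algebra:
    """An evaluation algebra: how subproblem values are combined."""
    one: float                              # value of the empty structure
    plus: Callable[[float, float], float]   # combine alternative structures
    times: Callable[[float, float], float]  # combine independent substructures
    weight: Callable[[float], float]        # map a free energy into the algebra


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


# Same recursion, different physical quantities.
COUNT = Algebra(1, lambda a, b: a + b, lambda a, b: a * b, lambda e: 1)
PARTITION = Algebra(1.0, lambda a, b: a + b, lambda a, b: a * b,
                    lambda e: math.exp(-e / KT))
MFE = Algebra(0.0, min, lambda a, b: a + b, lambda e: e)
# Log-domain partition function: avoids overflow for large ensembles.
LOG_PARTITION = Algebra(0.0, log_add, lambda a, b: a + b, lambda e: -e / KT)


def evaluate(seq: str, alg: Algebra, min_loop: int = 3):
    """Evaluate the toy recursion over all unpseudoknotted structures of seq."""

    @lru_cache(maxsize=None)
    def q(i: int, j: int):
        if j - i < 1:          # empty or single base: only the open structure
            return alg.one
        total = q(i, j - 1)    # case 1: base j is unpaired
        for k in range(i, j - min_loop):   # case 2: base j pairs with base k
            if (seq[k], seq[j]) in CAN_PAIR:
                term = alg.times(alg.times(q(i, k - 1), q(k + 1, j - 1)),
                                 alg.weight(PAIR_ENERGY))
                total = alg.plus(total, term)
        return total

    return q(0, len(seq) - 1)


if __name__ == "__main__":
    s = "GGGAAACCC"
    print("structures:", evaluate(s, COUNT))
    print("partition function:", evaluate(s, PARTITION))
    print("MFE:", evaluate(s, MFE))
    print("log partition function:", evaluate(s, LOG_PARTITION))
```

Only the `Algebra` instance changes between quantities; the recursion in `evaluate` is written once. The memoized top-down traversal used here is just one possible operation order.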

    Embedded Mean-Field Theory

    We introduce embedded mean-field theory (EMFT), an approach that flexibly allows for the embedding of one mean-field theory in another without the need to specify or fix the number of particles in each subsystem. EMFT is simple, is well-defined without recourse to parameters, and inherits the simple gradient theory of the parent mean-field theories. In this paper, we report extensive benchmarking of EMFT for the case where the subsystems are treated using different levels of Kohn–Sham theory, using PBE or B3LYP/6-31G* in the high-level subsystem and LDA/STO-3G in the low-level subsystem; we also investigate different levels of density fitting in the two subsystems. Over a wide range of chemical problems, we find EMFT to perform accurately and stably, smoothly converging to the high-level of theory as the active subsystem becomes larger. In most cases, the performance is at least as good as that of ONIOM, but the advantages of EMFT are highlighted by examples that involve partitions across multiple bonds or through aromatic systems and by examples that involve more complicated electronic structure. EMFT is simple and parameter free, and based on the tests provided here, it offers an appealing new approach to a multiscale electronic structure

    Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust

    In situ hybridization based on the mechanism of the hybridization chain reaction (HCR) has addressed multi-decade challenges that impeded imaging of mRNA expression in diverse organisms, offering a unique combination of multiplexing, quantitation, sensitivity, resolution and versatility. Here, with third-generation in situ HCR, we augment these capabilities using probes and amplifiers that combine to provide automatic background suppression throughout the protocol, ensuring that reagents will not generate amplified background even if they bind non-specifically within the sample. Automatic background suppression dramatically enhances performance and robustness, combining the benefits of a higher signal-to-background ratio with the convenience of using unoptimized probe sets for new targets and organisms. In situ HCR v3.0 enables three multiplexed quantitative analysis modes: (1) qHCR imaging – analog mRNA relative quantitation with subcellular resolution in the anatomical context of whole-mount vertebrate embryos; (2) qHCR flow cytometry – analog mRNA relative quantitation for high-throughput expression profiling of mammalian and bacterial cells; and (3) dHCR imaging – digital mRNA absolute quantitation via single-molecule imaging in thick autofluorescent samples

    Embedded mean-field theory: Toward a large-scale ab-initio molecular dynamics

    Mitigating the trade-off between accuracy and computational cost is at the heart of quantum chemistry. A natural approach to this problem is a quantum embedding method, which treats a small subset of a whole system at a high level of theory while treating the rest at a low level of theory. Although many attempts have been made to develop quantum embedding theories (especially in the context of embedding a correlated wavefunction method into a mean-field method), there is no approach specialized for embedding mean-field theories without a priori user-level input for the number of electrons in each subsystem. We introduce embedded mean-field theory (EMFT), an approach that allows for embedding of one mean-field theory in another without the need to specify or fix the number of particles in each subsystem. Its gradient theory is notably simple, as it merely inherits the gradient theory of the parent mean-field theories. We report extensive benchmark calculations of EMFT for the case where the subsystems are treated using different levels of Kohn-Sham theory. In most cases, the performance is at least as good as that of ONIOM, a widely used embedding method, but the advantages of EMFT are highlighted by examples that involve partitions across multiple bonds or through aromatic systems and by examples that involve more complicated electronic structure. Furthermore, another variant of EMFT, embedding of Kohn-Sham theory in Density Functional Tight-Binding (DFTB), is formulated. As DFTB avoids the evaluation of electron repulsion integrals, which is a fundamental bottleneck in Kohn-Sham theory, this variant will reduce computational costs more substantially and hence is very appealing in multi-scale electronic structure theory.

    Correction to Embedded Mean-Field Theory


    NUPACK: Analysis and Design of Nucleic Acid Structures, Devices, and Systems

    NUPACK is a growing software suite for the analysis and design of nucleic acid structures, devices, and systems serving the needs of researchers in the fields of nucleic acid nanotechnology, molecular programming, synthetic biology, and across the life sciences. NUPACK algorithms are unique in treating complex and test tube ensembles containing arbitrary numbers of interacting strand species, providing crucial tools for capturing concentration effects essential to analyzing and designing the intermolecular interactions that are a hallmark of these fields. The all-new NUPACK web app (nupack.org) has been re-architected for the cloud, leveraging a cluster that scales dynamically in response to user demand to enable rapid job submission and result inspection even at times of peak user demand. The web app exploits the all-new NUPACK 4 scientific code base as its backend, offering enhanced physical models (coaxial and dangle stacking subensembles), dramatic speedups (20-120x for test tube analysis), and increased scalability for large complexes. NUPACK 4 algorithms can also be run locally using the all-new NUPACK Python module
