A Unified Dynamic Programming Framework for the Analysis of Interacting Nucleic Acid Strands: Enhanced Models, Scalability, and Speed
Dynamic programming algorithms within the NUPACK software suite enable analysis of nucleic acid sequences over complex and test tube ensembles containing arbitrary numbers of interacting strand species, serving the needs of researchers in molecular programming, nucleic acid nanotechnology, synthetic biology, and across the life sciences. Here, to enhance the underlying physical model, ensure scalability for large calculations, and achieve dramatic speedups when calculating diverse physical quantities over complex and test tube ensembles, we introduce a unified dynamic programming framework that combines three ingredients: (1) recursions that specify the dependencies between subproblems and incorporate the details of the structural ensemble and the free energy model, (2) evaluation algebras that define the mathematical form of each subproblem, (3) operation orders that specify the computational trajectory through the dependency graph of subproblems. The physical model is enhanced using new recursions that operate over the complex ensemble including coaxial and dangle stacking subensembles. The recursions are coded generically and then compiled with a quantity-specific evaluation algebra and operation order to generate an executable for each physical quantity: partition function, equilibrium base-pairing probabilities, MFE energy and proxy structure, suboptimal proxy structures, and Boltzmann sampled structures. For large complexes (e.g., 30 000 nt), scalability is achieved for partition function calculations using an overflow-safe evaluation algebra, and for equilibrium base-pairing probabilities using a backtrack-free operation order. A new blockwise operation order that treats subcomplex blocks for the complex species in a test tube ensemble enables dramatic speedups (e.g., 20–120×) using vectorization and caching.
With these performance enhancements, equilibrium analysis of substantial test tube ensembles can be performed in ≤ 1 min on a single computational core (e.g., partition function and equilibrium concentration for all complex species of up to six strands formed from two strand species of 300 nt each, or for all complex species of up to two strands formed from 80 strand species of 100 nt each). A new sampling algorithm simultaneously samples multiple structures from the complex ensemble to yield speedups of an order of magnitude or more as the number of structures increases above ≈10³. These advances are available within the NUPACK 4.0 code base (www.nupack.org), which can be flexibly scripted using the all-new NUPACK Python module.
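The separation between a generic recursion and a quantity-specific evaluation algebra can be illustrated with a small sketch. This is not NUPACK's recursion, energy model, or API: it assumes a toy Nussinov-style recursion over a single strand, a hypothetical energy of -1 unit per base pair with kT = 1, and the hypothetical names `Algebra` and `evaluate`. The same recursion code is run with three algebras to obtain a structure count, a partition function, and a minimum free energy.

```python
import math
from collections import namedtuple

# An "evaluation algebra" here is two operators plus two constants:
# plus combines alternative structures, times combines independent parts,
# one is the value of the empty structure, pair is the value of one base pair.
Algebra = namedtuple("Algebra", "plus times one pair")

# Hypothetical toy model: every base pair contributes -1 energy unit, kT = 1.
COUNT = Algebra(lambda a, b: a + b, lambda a, b: a * b, 1, 1)
PARTITION = Algebra(lambda a, b: a + b, lambda a, b: a * b, 1.0, math.exp(1.0))
MFE = Algebra(min, lambda a, b: a + b, 0.0, -1.0)

CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def evaluate(seq, alg, min_loop=3):
    """Run one generic recursion over seq under the given algebra."""
    memo = {}

    def q(i, j):
        # Value over all structures of the subsequence seq[i..j] (inclusive).
        if j < i:
            return alg.one
        if (i, j) not in memo:
            # Case 1: base i is unpaired.
            total = q(i + 1, j)
            # Case 2: base i pairs with base k, enclosing at least min_loop bases.
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in CAN_PAIR:
                    inner = alg.times(q(i + 1, k - 1), q(k + 1, j))
                    total = alg.plus(total, alg.times(alg.pair, inner))
            memo[(i, j)] = total
        return memo[(i, j)]

    return q(0, len(seq) - 1)
```

For the sequence `"GAAAC"` the only structures are the open strand and the single G-C pair, so `evaluate("GAAAC", COUNT)` gives 2, `evaluate("GAAAC", MFE)` gives -1.0, and `evaluate("GAAAC", PARTITION)` gives 1 + e. Swapping the algebra changes the quantity without touching the recursion.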
Cray performance data from five benchmarks
The five benchmark programs discussed in TM-88956, February 1987, were run on the CRAY X-MP/24 under different operating systems and compilers. Performance data is reported for runs under early versions of UNICOS and CFT77. The most recent data includes a system of configuration for an X-MP hardware upgrade. Performance figures for the Y-MP are shown for comparison. Differences in the figures are analyzed and discussed.
Forecasting Time Series with VARMA Recursions on Graphs
Graph-based techniques emerged as a choice to deal with the dimensionality
issues in modeling multivariate time series. However, there is yet no complete
understanding of how the underlying structure could be exploited to ease this
task. This work provides contributions in this direction by considering the
forecasting of a process evolving over a graph. We make use of the
(approximate) time-vertex stationarity assumption, i.e., time-varying graph
signals whose first and second order statistical moments are invariant over
time and correlated to a known graph topology. The latter is combined with VAR
and VARMA models to tackle the dimensionality issues present in predicting the
temporal evolution of multivariate time series. We find that by projecting
the data to the graph spectral domain: (i) the multivariate model estimation
reduces to that of fitting a number of uncorrelated univariate ARMA models and
(ii) an optimal low-rank data representation can be exploited so as to further
reduce the estimation costs. In the case that the multivariate process can be
observed at a subset of nodes, the proposed models extend naturally to Kalman
filtering on graphs allowing for optimal tracking. Numerical experiments with
both synthetic and real data validate the proposed approach and highlight its
benefits over state-of-the-art alternatives.
Comment: submitted to the IEEE Transactions on Signal Processing
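Claim (i) of the abstract above, that projecting onto the graph spectral domain decouples the multivariate model into univariate ones, can be checked on a minimal example. The sketch below is not the paper's estimator: it assumes a two-node graph, a noise-free VAR(1) model whose coefficient matrix is a first-order polynomial of the Laplacian, and hypothetical coefficients `A0` and `A1`. Each graph Fourier coefficient then follows its own scalar AR(1) recursion.

```python
import math

S = 1.0 / math.sqrt(2.0)
LAPLACIAN = [[1.0, -1.0], [-1.0, 1.0]]  # two nodes joined by one edge
U = [[S, S], [S, -S]]                   # columns are the Laplacian eigenvectors
EIGENVALUES = [0.0, 2.0]                # matching Laplacian eigenvalues


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gft(x):
    """Graph Fourier transform of a graph signal: U^T x."""
    u_transposed = [list(col) for col in zip(*U)]
    return matvec(u_transposed, x)


# Hypothetical VAR(1) coefficient matrix A = A0 * I + A1 * L (a graph filter).
A0, A1 = 0.9, -0.2
A = [[A0 * (i == j) + A1 * LAPLACIAN[i][j] for j in range(2)] for i in range(2)]

# In the spectral domain the same model is one scalar AR(1) coefficient per frequency.
SPECTRAL_COEFFS = [A0 + A1 * lam for lam in EIGENVALUES]


def simulate(x0, steps):
    """Noise-free vertex-domain recursion x_t = A x_{t-1}."""
    xs = [x0]
    for _ in range(steps):
        xs.append(matvec(A, xs[-1]))
    return xs
```

Transforming every snapshot of `simulate([1.0, 3.0], 5)` with `gft` gives two sequences that each satisfy a scalar recursion with the matching entry of `SPECTRAL_COEFFS`, so in this toy setting fitting the 2×2 matrix reduces to fitting two scalar coefficients.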
Distributed Recursive Least-Squares: Stability and Performance Analysis
The recursive least-squares (RLS) algorithm has well-documented merits for
reducing complexity and storage requirements, when it comes to online
estimation of stationary signals as well as for tracking slowly-varying
nonstationary processes. In this paper, a distributed recursive least-squares
(D-RLS) algorithm is developed for cooperative estimation using ad hoc wireless
sensor networks. Distributed iterations are obtained by minimizing a separable
reformulation of the exponentially-weighted least-squares cost, using the
alternating-minimization algorithm. Sensors carry out reduced-complexity tasks
locally, and exchange messages with one-hop neighbors to consent on the
network-wide estimates adaptively. A steady-state mean-square error (MSE)
performance analysis of D-RLS is conducted, by studying a stochastically-driven
'averaged' system that approximates the D-RLS dynamics asymptotically in time.
For sensor observations that are linearly related to the time-invariant
parameter vector sought, the simplifying independence setting assumptions
facilitate deriving accurate closed-form expressions for the MSE steady-state
values. The problems of mean- and MSE-sense stability of D-RLS are also
investigated, and easily-checkable sufficient conditions are derived under
which a steady-state is attained. Without resorting to diminishing step-sizes
which compromise the tracking ability of D-RLS, stability ensures that per
sensor estimates hover inside a ball of finite radius centered at the true
parameter vector, with high-probability, even when inter-sensor communication
links are noisy. Interestingly, computer simulations demonstrate that the
theoretical findings are accurate also in the pragmatic settings whereby
sensors acquire temporally-correlated data.
Comment: 30 pages, 4 figures, submitted to IEEE Transactions on Signal Processing
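For readers unfamiliar with the building block, the following is a minimal sketch of the standard single-node exponentially weighted RLS update. It is not the distributed D-RLS algorithm of the abstract above, which additionally exchanges messages between neighboring sensors. The function names, the forgetting factor `lam`, and the initialization constant `delta` are assumptions chosen for illustration.

```python
def rls_step(w, P, x, y, lam=0.98):
    """One exponentially weighted RLS update for the linear model y ~ w . x.

    w is the current estimate, P the (symmetric) inverse-correlation matrix,
    x the regressor, y the observation, lam the forgetting factor in (0, 1].
    """
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    err = y - sum(w[i] * x[i] for i in range(n))  # a priori prediction error
    w_new = [w[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T and Px can be reused here.
    P_new = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return w_new, P_new


def run_rls(samples, n, lam=0.98, delta=1e3):
    """Process (x, y) pairs in order, starting from w = 0 and P = delta * I."""
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x, y in samples:
        w, P = rls_step(w, P, x, y, lam)
    return w
```

With noiseless observations generated from a fixed parameter vector and regressors that excite every direction, `run_rls` recovers that vector to within the small bias introduced by the `delta` initialization. A forgetting factor below 1 keeps the ability to track slow parameter drift, at the cost of a nonzero steady-state error once noise is present.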
On the Learning Behavior of Adaptive Networks - Part I: Transient Analysis
This work carries out a detailed transient analysis of the learning behavior
of multi-agent networks, and reveals interesting results about the learning
abilities of distributed strategies. Among other results, the analysis reveals
how combination policies influence the learning process of networked agents,
and how these policies can steer the convergence point towards any of many
possible Pareto optimal solutions. The results also establish that the learning
process of an adaptive network undergoes three (rather than two) well-defined
stages of evolution with distinctive convergence rates during the first two
stages, while attaining a finite mean-square-error (MSE) level in the last
stage. The analysis reveals what aspects of the network topology influence
performance directly and suggests design procedures that can optimize
performance by adjusting the relevant topology parameters. Interestingly, it is
further shown that, in the adaptation regime, each agent in a sparsely
connected network is able to achieve the same performance level as that of a
centralized stochastic-gradient strategy even for left-stochastic combination
strategies. These results lead to a deeper understanding and useful insights on
the convergence behavior of coupled distributed learners. The results also lead
to effective design mechanisms to help diffuse information more thoroughly over
networks.
Comment: to appear in IEEE Transactions on Information Theory, 201
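The role of a combination policy can be made concrete with one common distributed strategy, adapt-then-combine diffusion LMS. Choosing this strategy is an assumption of the sketch; the abstract above does not name a specific algorithm. The scalar parameter, the deterministic regressors, the step size `mu`, and the function name are hypothetical. Each agent takes a local LMS step and then mixes its neighbors' intermediate estimates using a left-stochastic matrix, meaning that each column sums to one.

```python
import math


def atc_diffusion_lms(weights, steps=500, mu=0.1, w_true=2.0):
    """Adapt-then-combine diffusion LMS for a scalar parameter.

    weights[l][k] is the weight agent k assigns to agent l's intermediate
    estimate; every column of weights must sum to one (left-stochastic).
    """
    n = len(weights)
    w = [0.0] * n
    for t in range(steps):
        psi = []
        for k in range(n):
            u = 1.0 + 0.5 * math.sin(t + k)  # hypothetical regressor of agent k
            d = u * w_true                   # noiseless measurement (assumption)
            # Adapt: local stochastic-gradient (LMS) step.
            psi.append(w[k] + mu * u * (d - u * w[k]))
        # Combine: each agent averages the intermediate estimates it receives.
        w = [sum(weights[l][k] * psi[l] for l in range(n)) for k in range(n)]
    return w


# Three agents on a line (0 - 1 - 2). Columns sum to one; rows do not.
LINE_WEIGHTS = [
    [0.5, 0.25, 0.0],
    [0.5, 0.5, 0.5],
    [0.0, 0.25, 0.5],
]
```

Calling `atc_diffusion_lms(LINE_WEIGHTS)` drives all three estimates toward `w_true` in this noiseless setting. With measurement noise the agents would instead settle at a finite mean-square-error level, and that level depends on the choice of combination weights.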