6,829 research outputs found

    Least-biased correction of extended dynamical systems using observational data

    We consider dynamical systems evolving near an equilibrium statistical state where the interest is in modelling long-term behavior that is consistent with thermodynamic constraints. We adjust the distribution using an entropy-optimizing formulation that can be computed on-the-fly, making possible partial corrections using incomplete information, for example measured data or data computed from a different model (or the same model at a different scale). We employ a thermostatting technique to sample the target distribution with the aim of capturing relevant statistical features while introducing only mild dynamical perturbations (thermostats). The method is tested for a point vortex fluid model on the sphere, and we demonstrate both convergence of equilibrium quantities and the ability of the formulation to balance stationary and transient-regime errors. Comment: 27 pages

    Symbol Synchronization for SDR Using a Polyphase Filterbank Based on an FPGA

    This paper proposes a highly efficient symbol synchronization subsystem for Software Defined Radio. The proposed feedback phase-locked loop timing synchronizer is suitable for parallel implementation on an FPGA. The polyphase FIR filter simultaneously performs matched filtering and arbitrary interpolation between acquired samples. Determination of the proper sampling instant is achieved by selecting a suitable polyphase filterbank using a derived index. This index is determined from the output of either the Zero-Crossing or the Gardner Timing Error Detector. The paper focuses extensively on simulation of the proposed synchronization system. On the basis of this simulation, a complete, fully pipelined VHDL description model is created. This model is composed of a fully parallel polyphase filterbank based on distributed arithmetic, a timing error detector and an interpolation control block. Finally, RTL synthesis on an Altera Cyclone IV FPGA is presented and resource utilization in comparison with a conventional model is analyzed.
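
    As a rough illustration of the index-selection idea in the abstract above, the sketch below shows a textbook Gardner timing error detector at two samples per symbol, plus a mapping from a fractional timing offset to a polyphase branch. The function names, the sign convention, and the `mu`-to-branch mapping are assumptions made for illustration; they are not taken from the paper's VHDL design.

```python
import math


def gardner_error(prev_symbol, midpoint, curr_symbol):
    """Textbook Gardner timing error for real-valued samples.

    prev_symbol and curr_symbol are samples one symbol period apart;
    midpoint is the sample halfway between them. Sign conventions vary
    between references; this one is an assumption.
    """
    return (curr_symbol - prev_symbol) * midpoint


def select_branch(mu, num_phases):
    """Map a fractional timing offset mu to a polyphase branch index.

    Hypothetical helper: mu is wrapped into [0, 1) and quantized to one
    of num_phases filter branches.
    """
    wrapped = mu % 1.0
    return min(int(math.floor(wrapped * num_phases)), num_phases - 1)


# A +1 -> -1 transition sampled on time has a zero midpoint, so no error.
on_time = gardner_error(1.0, 0.0, -1.0)
# A non-zero midpoint signals a timing offset; the sign gives the direction.
off_time = gardner_error(1.0, 0.3, -1.0)
# Without a symbol transition the detector yields no information.
no_transition = gardner_error(1.0, 1.0, 1.0)
```

    In a closed loop, the detector output would normally pass through a loop filter before updating `mu`; that stage is omitted here.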

    Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations

    Although double-precision floating-point arithmetic currently dominates high-performance computing, there is increasing interest in smaller and simpler arithmetic types. The main reasons are potential improvements in energy efficiency, memory footprint and bandwidth. However, simply switching to lower-precision types typically results in increased numerical errors. We investigate approaches to improving the accuracy of reduced-precision fixed-point arithmetic types, using examples in an important domain for numerical computation in neuroscience: the solution of Ordinary Differential Equations (ODEs). The Izhikevich neuron model is used to demonstrate that rounding has an important role in producing accurate spike timings from explicit ODE solution algorithms. In particular, fixed-point arithmetic with stochastic rounding consistently results in smaller errors compared to single-precision floating-point and fixed-point arithmetic with round-to-nearest across a range of neuron behaviours and ODE solvers. A computationally much cheaper alternative is also investigated, inspired by the concept of dither, a widely understood mechanism for providing resolution below the least significant bit (LSB) in digital signal processing. These results will have implications for the solution of ODEs in other subject areas, and should also be directly relevant to the huge range of practical problems that are represented by Partial Differential Equations (PDEs). Comment: Submitted to Philosophical Transactions of the Royal Society
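
    The effect described above can be seen in a minimal sketch that contrasts round-to-nearest with stochastic rounding on a fixed-point grid. This is not the paper's implementation and does not involve the Izhikevich model: the helper names, the 8 fractional bits, and the repeated-small-increment test are assumptions chosen only to show why stochastic rounding can preserve updates smaller than one LSB on average.

```python
import math
import random


def round_nearest(x, frac_bits):
    """Quantize x to a grid of 2**-frac_bits using round-to-nearest."""
    scale = 1 << frac_bits
    return math.floor(x * scale + 0.5) / scale


def round_stochastic(x, frac_bits, rng):
    """Quantize x to the same grid, rounding up with probability equal
    to the fractional distance above the lower grid point."""
    scale = 1 << frac_bits
    scaled = x * scale
    lower = math.floor(scaled)
    round_up = rng.random() < (scaled - lower)
    return (lower + (1 if round_up else 0)) / scale


def accumulate(round_fn, increment=0.001, steps=1000):
    """Add a small increment repeatedly, rounding after every step."""
    total = 0.0
    for _ in range(steps):
        total = round_fn(total + increment)
    return total


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# The increment (0.001) is below half an LSB (1/512), so round-to-nearest
# discards it on every step and the sum never leaves zero.
nearest_total = accumulate(lambda v: round_nearest(v, 8))
# Stochastic rounding keeps the increment in expectation, so the sum
# ends up near the exact value of 1.0.
stochastic_total = accumulate(lambda v: round_stochastic(v, 8, rng))
```

    The same stagnation can affect an explicit ODE step whose update falls below the fixed-point resolution, which is one way to read the abstract's claim about rounding and spike timing.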

    Power-efficient design of 16-bit mixed-operand multipliers

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 53). Multiplication is an expensive and slow arithmetic operation that plays an important role in many DSP algorithms. It usually lies on the critical-delay paths, affecting the performance of the system as well as consuming considerable power. Consequently, significant improvements in both power and performance can be achieved in the overall DSP system by carefully designing and optimizing the power and performance of the multiplier. This thesis explores several circuit-level techniques for power-efficient multiplier design, including supply voltage reduction, efficient multiplication algorithms, low-power circuit logic styles, and transistor sizing using dynamic and static tuners. Based on these techniques, several 16-bit multipliers have been successfully designed and implemented in 0.13 µm CMOS technology at supply voltages of 1.5 V and 0.9 V. The multipliers are modified to handle multiplications of two 16-bit operands, each of which can be in either signed-magnitude or two's-complement format. Examining the power-performance characteristics of these multipliers reveals that both array and tree structures are feasible solutions for designing 16-bit multipliers, and that complementary CMOS and single-ended CPL-TG logic styles are promising candidates for power-efficient design. The appropriate choice of structure and logic style depends on the power and performance constraints of the particular design. By Sataporn Pornpromlikit. M.Eng.

    Diffuse interface models of locally inextensible vesicles in a viscous fluid

    We present a new diffuse interface model for the dynamics of inextensible vesicles in a viscous fluid. A new feature of this work is the implementation of the local inextensibility condition in the diffuse interface context. Local inextensibility is enforced by using a local Lagrange multiplier, which provides the necessary tension force at the interface. To solve for the local Lagrange multiplier, we introduce a new equation whose solution essentially provides a harmonic extension of the local Lagrange multiplier off the interface while maintaining the local inextensibility constraint near the interface. To make the method more robust, we develop a local relaxation scheme that dynamically corrects local stretching/compression errors, thereby preventing their accumulation. Asymptotic analysis is presented that shows that our new system converges to a relaxed version of the inextensible sharp interface model. This is also verified numerically. Although the model does not depend on dimension, we present numerical simulations only in 2D. To solve the 2D equations numerically, we develop an efficient algorithm combining an operator splitting approach with adaptive finite elements where the Navier-Stokes equations are implicitly coupled to the diffuse interface inextensibility equation. Numerical simulations of a single vesicle in a shear flow at different Reynolds numbers demonstrate that errors in enforcing local inextensibility may accumulate and lead to large differences in the dynamics in the tumbling regime and differences in the inclination angle of vesicles in the tank-treading regime. The local relaxation algorithm is shown to effectively prevent this accumulation by driving the system back to its equilibrium state when errors in local inextensibility arise. Comment: 25 pages

    On the initial estimate of interface forces in FETI methods

    The Balanced Domain Decomposition (BDD) method and the Finite Element Tearing and Interconnecting (FETI) method are two commonly used non-overlapping domain decomposition methods. Due to strong theoretical and numerical similarities, these two methods are generally considered to be equally efficient. However, in some particular cases, such as structures with strong heterogeneities, FETI requires a large number of iterations to compute the solution compared to BDD. In this paper, the origin of the poor efficiency of FETI in these cases is traced back to poor initial estimates of the interface stresses. To improve the estimation of interface forces, a novel strategy for splitting interface forces between neighboring substructures is proposed. The additional computational cost incurred is not significant. This yields a new initialization for the FETI method and restores numerical efficiency, making FETI comparable to BDD even for problems where FETI was performing poorly. Various simple test problems are presented to discuss the efficiency of the proposed strategy and to illustrate the resulting numerical equivalence between the BDD and FETI solvers.

    Low-Power, Low-Cost, & High-Performance Digital Designs : Multi-bit Signed Multiplier design using 32nm CMOS Technology

    Binary multipliers are ubiquitous in digital hardware. Digital multipliers, along with adders, play a major role in computing, communication, and control devices. Multipliers are used mainly in digital signal and image processing, the central processing units (CPUs) of computers, high-performance and parallel scientific computing, machine learning, physical layer design of communication equipment, etc. The widespread presence of, and increasing demand for, low-power, low-cost, and high-performance digital hardware motivated this work on developing optimized multiplier designs. Two optimized designs are proposed in this work. One is an optimized 8 x 8 Booth multiplier architecture implemented using 32nm CMOS technology. Synthesis (pre-layout) and post-layout results show that the delay is reduced by 24.7% and 25.6% respectively, the area is reduced by 5.5% and 15% respectively, the power consumption is reduced by 21.5% and 26.6% respectively, and the area-delay-product is reduced by 28.8% and 36.8% respectively when compared to the performance results obtained for the state-of-the-art 8 x 8 Booth multiplier designed using 32nm CMOS technology with 1.05 V supply voltage at 500 MHz input frequency. The other is a novel radix-8 structure with 3-bit grouping to reduce the number of partial products, along with effective partial product reduction schemes for 8 x 8, 16 x 16, 32 x 32, and 64 x 64 signed multipliers. Comparing the performance results of the (synthesized, post-layout) designs of sizes 32 x 32 and 64 x 64 based on the simple novel radix-8 structure with the estimated performance measurements for the optimized Booth multiplier design presented in this work, a reduction in delay of (2.64%, 0.47%) and (2.74%, 18.04%) respectively, and a reduction in area-delay-product of (12.12%, -5.17%) and (17.82%, 12.91%) respectively can be observed. With the use of the higher radix structure, delay, area, and power consumption can be further reduced. Appropriate adder deployment, further exploration of optimized grouping or compression strategies, and the application of more low-power design techniques such as power-gating, multi-Vt MOS transistor utilization, multi-VDD domain creation, etc., help, along with higher radix structures, to realize more efficient multiplier designs.
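
    To make the 3-bit grouping concrete, the sketch below shows standard textbook radix-8 Booth recoding in software. It is not the novel structure proposed in the thesis, and the function names are hypothetical. Each digit is formed from three new multiplier bits plus one overlap bit, so an 8-bit signed multiplier needs three partial products, compared with four for radix-4 recoding.

```python
def booth_radix8_digits(value, width):
    """Recode a two's-complement integer into radix-8 Booth digits.

    Returns digits d_i in [-4, 4], least significant first, such that
    value == sum(d_i * 8**i). The operand is implicitly sign-extended
    to a multiple of three bits.
    """
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")

    def bit(k):
        # Bit -1 is the implicit zero below the LSB; Python's >> on
        # negative ints sign-extends, which gives the bits above width.
        return 0 if k < 0 else (value >> k) & 1

    num_digits = -(-width // 3)  # ceil(width / 3)
    return [
        -4 * bit(3 * i + 2) + 2 * bit(3 * i + 1) + bit(3 * i) + bit(3 * i - 1)
        for i in range(num_digits)
    ]


def booth_multiply(a, b, width):
    """Multiply a by b using one shifted partial product per Booth digit."""
    digits = booth_radix8_digits(b, width)
    return sum((d * a) << (3 * i) for i, d in enumerate(digits))


digits_of_minus_128 = booth_radix8_digits(-128, 8)
product = booth_multiply(-77, 45, 8)
```

    In hardware, the digits of magnitude 3 require the multiple 3a, which needs an extra adder; this is the usual cost of radix-8 compared with radix-4 recoding.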