190 research outputs found

    Finite State Machines With Input Multiplexing: A Performance Study

    Get PDF
    Finite state machines with input multiplexing (FSMIMs) have been proposed in previous works as a technique for efficient mapping FSMs into ROM memory. In this paper, we propose a new architecture for implementing FSMIMs, called FSMIM with state-based input selection, whose goal is to achieve a further reduction in memory usage. This paper also describes in detail the algorithms for generating FSMIMs used by the tool FSMIM-Gen, which has been developed and made available on the Internet for free public use. A comparative study in terms of speed and area between FSMIM approaches and other field programmable gate array-based techniques is presented. The results show that the FSMIM approaches obtain huge reductions in the look-up table (LUT) usage by using a small number of embedded memory blocks. In addition, speed improvements over conventional LUT-based implementations have been obtained in many cases

    ROM-Based Finite State Machine Implementation in Low Cost FPGAs

    Get PDF
    This work presents a technique for the resource optimization of input multiplexed ROM-based Finite State Machines. This technique exploits the don't care value of the inputs to reduce the memory size as well as multiplexer complexity. This technique has been applied to a publicly available FSM benchmarks and implemented in a low-cost FPGA. Results have been compared with tools supported ROM and standard logic cells implementations. In a significant number of test cases, the proposed technique is the best design alternative, both in resource requirements and speed

    Generic low power reconfigurable distributed arithmetic processor

    Get PDF
    Higher performance, lower cost, increasingly minimizing integrated circuit components, and higher packaging density of chips are ongoing goals of the microelectronic and computer industry. As these goals are being achieved, however, power consumption and flexibility are increasingly becoming bottlenecks that need to be addressed with the new technology in Very Large-Scale Integrated (VLSI) design. For modern systems, more energy is required to support the powerful computational capability which accords with the increasing requirements, and these requirements cause the change of standards not only in audio and video broadcasting but also in communication such as wireless connection and network protocols. Powerful flexibility and low consumption are repellent, but their combination in one system is the ultimate goal of designers. A generic domain-specific low-power reconfigurable processor for the distributed arithmetic algorithm is presented in this dissertation. This domain reconfigurable processor features high efficiency in terms of area, power and delay, which approaches the performance of an ASIC design, while retaining the flexibility of programmable platforms. The architecture not only supports typical distributed arithmetic algorithms which can be found in most still picture compression standards and video conferencing standards, but also offers implementation ability for other distributed arithmetic algorithms found in digital signal processing, telecommunication protocols and automatic control. In this processor, a simple reconfigurable low power control unit is implemented with good performance in area, power and timing. The generic characteristic of the architecture makes it applicable for any small and medium size finite state machines which can be used as control units to implement complex system behaviour and can be found in almost all engineering disciplines. Furthermore, to map target applications efficiently onto the proposed architecture, a new algorithm is introduced for searching for the best common sharing terms set and it keeps the area and power consumption of the implementation at low level. The software implementation of this algorithm is presented, which can be used not only for the proposed architecture in this dissertation but also for all the implementations with adder-based distributed arithmetic algorithms. In addition, some low power design techniques are applied in the architecture, such as unsymmetrical design style including unsymmetrical interconnection arranging, unsymmetrical PTBs selection and unsymmetrical mapping basic computing units. All these design techniques achieve extraordinary power consumption saving. It is believed that they can be extended to more low power designs and architectures. The processor presented in this dissertation can be used to implement complex, high performance distributed arithmetic algorithms for communication and image processing applications with low cost in area and power compared with the traditional methods

    Decomposition tool targeting FPGA architectures

    Full text link
    The growing interest in the field of logic synthesis targeting Field Programmable Gate Arrays (FPGA) and the active research carried out by a number of research groups in the area of functional decomposition is the prime motivation for this thesis. Logic synthesis has been an area of interest in many universities all over the world. The work involves the study and implementation of techniques and methods in logic synthesis. In this work, a logic synthesis tool has been developed implementing the aspects of general and complete Decomposition method based on functional decomposition techniques [4]. The tool is aimed at producing outputs faster and more efficient than the available software. C++ Standard template library is used to develop this tool. The output of this tool is designed to be compatible with the available vendor software. The tool has been tested on MCNC benchmarks and those created keeping in mind the industry requirements

    Hardware implementation of non-bonded forces in molecular dynamics simulations

    Get PDF
    Molecular Dynamics is a computational method based on classical mechanics to describe the behavior of a molecular system. This method is used in biomolecular simulations, which are intended to contribute to the study and advance of nanotechnology, medicine, chemistry and biology. Software implementations of Molecular Dynamics simulations can spend most of time computing the non-bonded interactions. This work presents the design and implementation of an FPGA-based coprocessor that accelerates MD simulations by computing in parallel the non-bonded interactions, specifically, the van der Waals and the electrostatic interactions. These interactions are modeled as the Lennard-Jones 6-12 potential and the direct-space Ewald summation, respectively. In addition, this work introduces a novel variable transformation of the potential energy functions, and a novel interpolation method with pseudo-floating-point representation to compute the short-range forces. Also, it uses a combination of fixed-point and floating-point arithmetic to obtain the best of both representations. The FPGA coprocessor is a memory-mapped system connected to a host by PCI Express, and is provided with interruption capabilities to improve parallelization. Its main block is based on a single functional pipeline, and is connected via Avalon Bus to other peripherals such as the PCIe Hard-IP and the SG-DMA. It is implemented on an Altera¿s EP2AGX125EF35C4 device, can process 16k particles, and is configured to store up to 16 different types of particles. Simulations in a custom C-application for MD that only computes non-bonded forces become up to 12.5x faster using the FPGA coprocessor when considering 12500 atoms.PregradoINGENIERO(A) EN ELECTRÓNIC
    corecore