
1,195 research outputs found

Hiding State in CλaSH Hardware Descriptions

Author: Baaij Christiaan
Gerards Marco
Kooijman Matthijs
Kuper Jan
Publication venue: Utrecht University
Publication date: 01/01/2010

Synchronous hardware can be modelled as a mapping from input and state to output and a new state; such mappings are referred to as transition functions. It is natural to use a functional language to implement transition functions. The CλaSH compiler is capable of translating transition functions to VHDL. Modelling hardware using multiple components is convenient. Components in CλaSH can be considered as instantiations of functions. To avoid packing and unpacking state when composing components, functions are lifted to arrows. Using arrows reduces the chance of errors, because the state no longer has to be (un)packed manually. Furthermore, the Haskell do-syntax for arrows increases the readability of hardware designs. This is demonstrated using a realistic example of a circuit which consists of multiple components.
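As a rough illustration of the transition-function view described in this abstract, the sketch below models a multiply-accumulate component as a function from state and input to new state and output, and steps it once per clock cycle. It is written in Python for brevity rather than in CλaSH or Haskell, and the names `mac_step` and `simulate` are hypothetical; they do not come from the paper.

```python
from typing import Callable, Iterable, List, Tuple

# A transition function maps (state, input) to (new_state, output).
Transition = Callable[[int, Tuple[int, int]], Tuple[int, int]]


def mac_step(acc: int, inp: Tuple[int, int]) -> Tuple[int, int]:
    """One clock cycle of a multiply-accumulate; the state is the accumulator."""
    x, y = inp
    new_acc = acc + x * y
    return new_acc, new_acc  # the output equals the updated accumulator


def simulate(step: Transition, state: int,
             inputs: Iterable[Tuple[int, int]]) -> List[int]:
    """Apply the transition function once per input, threading the state through."""
    outputs = []
    for inp in inputs:
        state, out = step(state, inp)
        outputs.append(out)
    return outputs
```

Threading `state` by hand inside `simulate` is the kind of packing and unpacking that, according to the abstract, lifting functions to arrows removes when components are composed.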

University of Twente Research Information

A Dual Digital Signal Processor VME Board For Instrumentation And Control Applications

Author: Dong H.
Flood R.
Hovater C.
Musson J.
Publication date: 01/11/2001

A Dual Digital Signal Processing VME Board was developed for the Continuous Electron Beam Accelerator Facility (CEBAF) Beam Current Monitor (BCM) system at Jefferson Lab. It is a versatile general-purpose digital signal processing board using an open architecture, which allows for adaptation to various applications. The base design uses two independent Texas Instruments (TI) TMS320C6711 processors, which are 900 MFLOPS floating-point digital signal processors (DSPs). Applications that require a fixed-point DSP can be implemented by replacing the baseline DSP with the pin-for-pin compatible TMS320C6211. The design can be manufactured with a reduced chip set without redesigning the printed circuit board. For example, it can be implemented as a single-channel DSP with no analog I/O.

arXiv.org e-Print Archive

UNT Digital Library

Design, development and use of the finite element machine

Author: Adams L. M.
Voigt R. C.

Some of the considerations that went into the design of the Finite Element Machine, a research asynchronous parallel computer, are described. The present status of the system is also discussed, along with some indication of the type of results that were obtained.

NASA Technical Reports Server

Reducing Power Consumption Using Customized Numerical Representations in Digital Hearing Aids

Author: Hemmeter Eric E.
Publication venue: Washington University Open Scholarship
Publication date: 01/05/2003

This thesis examines the effects of changing the numerical representation of audio signals in digital hearing aids to minimize power consumption. Within the hearing aid design a majority of the power used is consumed in the many finite impulse response filters. The main processing involved in these filters is a multiply-accumulate function. We examine the power consumption of 12 different multiply-accumulate units that use the following numerical representations: a 16-bit linear representation, a 9-bit logarithmic representation, and 10 different floating-point representations ranging from 9 to 13 bits. A selection of the multiply-accumulators are simulated using a continuous-circuit simulator. The power estimates from this are compared with signal transition counts from a discrete event simulator to quantify the relationship between transition counts and power consumption. This relationship is then used to examine other numerical representations.
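To make the role of the multiply-accumulate operation concrete, here is a minimal sketch of a direct-form finite impulse response filter in which every inner-loop step is one multiply-accumulate. It uses ordinary Python floats, not any of the customized representations the thesis studies, and the function name `fir` is an assumption made for illustration.

```python
from typing import List, Sequence


def fir(coeffs: Sequence[float], samples: Sequence[float]) -> List[float]:
    """Direct-form FIR filter: each output is a sum of coefficient-sample products."""
    outputs = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += c * samples[n - k]  # one multiply-accumulate
        outputs.append(acc)
    return outputs
```

A filter with T taps performs up to T multiply-accumulates per output sample, which is consistent with the abstract's observation that this operation dominates the filters' processing.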

Washington University St. Louis: Open Scholarship

A standard-cell self-timed multiplier for energy and area critical synchronous systems

Author: Killpack Kip C.
Myers Chris J.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2001

This paper describes the design of a standard-cell self-timed multiplier for use in energy and area critical synchronous systems. The area of this multiplier is bounded by N rather than N² as seen in more traditional combinational parallel array designs, where N is the word size. Energy has a polynomial growth with word size, but has a coefficient that is much smaller than that seen in a combinational array design. Although the multiplier is self-timed, it can be embedded in a synchronous system appearing as a combinational element. This paper presents latency, area, and energy estimates for the multiplier implemented at various word sizes, and compares these numbers with a traditional combinational array multiplier. The self-timed multiplier uses 1/3 the energy and 1/7 the area of the combinational design for a 24-bit word size.

The University of Utah: J. Willard Marriott Digital Library

Low-Power and Reconfigurable Asynchronous ASIC Design Implementing Recurrent Neural Networks

Author: Nelson Spencer
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2021

Artificial intelligence (AI) has experienced a tremendous surge in recent years, resulting in high demand for a wide array of implementations of algorithms in the field. With the rise of Internet-of-Things devices, the need for artificial intelligence algorithms implemented in hardware with tight design restrictions has become even more prevalent. In terms of low power and area, ASIC implementations are the best case. However, these implementations suffer from high non-recurring engineering costs, long time-to-market, and a complete lack of flexibility, which significantly hurts their appeal in an environment where time-to-market is so critical. The time-to-market gap can be shortened through the use of reconfigurable solutions, such as FPGAs, but these come with high cost per unit and significant power and area deficiencies compared with their ASIC counterparts. To bridge these gaps, this dissertation work develops two methodologies to improve the usability of ASIC implementations of neural networks in these applications. The first methodology achieves substantial reductions in design time for asynchronous implementations of a set of AI algorithms known as Recurrent Neural Networks (RNNs) by analyzing the possible architectures and implementing a library of generic or easily altered components that can be used to quickly implement a chosen RNN architecture. A tapeout of this method was completed using as few as 112 hours of labor by the designer from RNN selection to a DRC/LVS clean chip layout ready for fabrication. The second methodology develops a flow to implement a set of RNNs in a single reconfigurable ASIC, offering a middle ground between fully reconfigurable solutions and completely application-specific implementations. This reconfigurable design is capable of representing thousands of possible RNN configurations in a single IC. A tapeout of this design was also completed, with both tapeouts using the TSMC 65nm bulk CMOS process.

ScholarWorks@UARK

UARK (University of Arkansas)