
747 research outputs found

A New Family of High-Performance Parallel Decimal Multipliers

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Improved Combined Binary/Decimal Fixed-Point Multipliers

Author: Brian Hickmann
Mark Erle
Michael Schulte
Publication venue
Publication date: 03/04/2020
Field of study

Decimal multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents several combined binary/decimal fixed-point multipliers that use the BCD-4221 recoding for the decimal digits. This allows the use of binary carry-save hardware to perform decimal addition with a small correction. Our proposed designs contain several novel improvements over previously published designs. These include an improved reduction tree organization to reduce the area and delay of the multiplier and improved reduction tree components that leverage the redundant decimal encodings to help reduce delay. A novel split reduction tree architecture is also introduced that reduces the delay of the binary product with only a small increase in total area. Area and delay estimates are presented that show that the proposed designs have significant area improvements over separate binary and decimal multipliers while still maintaining similar latencies for both decimal and binary operations.
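
The carry-save property mentioned in the abstract can be illustrated with a small sketch. It assumes the usual reading of BCD-4221 as a 4-bit code with bit weights (4, 2, 2, 1); the function names are hypothetical and the code is not taken from the paper. Because the weights sum to 9, every 4-bit pattern is a valid decimal digit, so a bitwise binary 3:2 compressor applied to three 4221-coded digits yields a sum digit and a carry digit that are again valid 4221 codes. The carry digit carries weight 2, and that doubling is the decimal-specific step (how the paper implements it is not shown here).

```python
from itertools import product

# Assumed bit weights of the BCD-4221 code, most significant bit first.
WEIGHTS = (4, 2, 2, 1)


def value_4221(bits):
    """Decimal value of a 4-bit tuple under the (4, 2, 2, 1) weights."""
    return sum(w * b for w, b in zip(WEIGHTS, bits))


def encode_4221(digit):
    """Return one valid 4221 code for a digit 0..9 (codes are not unique)."""
    for bits in product((0, 1), repeat=4):
        if value_4221(bits) == digit:
            return bits
    raise ValueError("digit must be in 0..9")


def carry_save_3to2(a, b, c):
    """Bitwise binary full adder over three 4-bit codes.

    Returns (s, h) such that
    value(a) + value(b) + value(c) == value(s) + 2 * value(h).
    """
    s = tuple(x ^ y ^ z for x, y, z in zip(a, b, c))
    h = tuple((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))
    return s, h


if __name__ == "__main__":
    a, b, c = encode_4221(7), encode_4221(8), encode_4221(9)
    s, h = carry_save_3to2(a, b, c)
    print(value_4221(s), value_4221(h), value_4221(s) + 2 * value_4221(h))
```

The identity holds for every choice of 4-bit patterns, not only the ones returned by `encode_4221`, which is what lets ordinary binary compressors be reused on decimal digits in this encoding.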

CiteSeerX

Optimal noise-canceling networks

Author: Dunkel Jörn
Ronellenfitsch Henrik
Wilczek Michael
Publication venue: 'American Physical Society (APS)'
Publication date: 03/10/2018
Field of study

Natural and artificial networks, from the cerebral cortex to large-scale power grids, face the challenge of converting noisy inputs into robust signals. The input fluctuations often exhibit complex yet statistically reproducible correlations that reflect underlying internal or environmental processes such as synaptic noise or atmospheric turbulence. This raises the practically and biophysically relevant question of whether and how noise-filtering can be hard-wired directly into a network's architecture. By considering generic phase oscillator arrays under cost constraints, we explore here analytically and numerically the design, efficiency and topology of noise-canceling networks. Specifically, we find that when the input fluctuations become more correlated in space or time, optimal network architectures become sparser and more hierarchically organized, resembling the vasculature in plants or animals. More broadly, our results provide concrete guiding principles for designing more robust and efficient power grids and sensor networks. Comment: 6 pages, 3 figures, supplementary material

arXiv.org e-Print Archive

DSpace@MIT

MPG.PuRe

Area and Power Efficient Booth's Multipliers Based on Non-Redundant Radix-4 Signed-Digit Encoding

Author: Gopalaswamy P
Sankaram R Bheema
Publication venue: Kakinada Institute of Engineering and Technology for Women
Publication date: 01/01/2017
Field of study

In this paper, we introduce an architecture of pre-encoded multipliers for Digital Signal Processing applications based on off-line encoding of coefficients. To this end, the Non-Redundant radix-4 Signed-Digit (NR4SD) encoding technique, which uses the digit values {-1, 0, +1, +2} or {-2, -1, 0, +1}, is proposed, leading to a multiplier design with a less complex partial-product implementation. Extensive experimental analysis verifies that the proposed pre-encoded NR4SD multipliers, including the coefficients memory, are more area and power efficient than the conventional Modified Booth scheme.
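
To make the two digit sets concrete, here is a minimal sketch of a non-redundant radix-4 recoding in software. It is an illustration under assumptions, not the paper's encoder circuit: the function names and the `digit_set` parameter are hypothetical, and the loop works on unbounded Python integers rather than fixed-width two's complement words.

```python
def recode_radix4(value, digit_set="minus"):
    """Recode an integer into radix-4 digits, least significant first.

    digit_set="minus" uses {-2, -1, 0, +1};
    digit_set="plus"  uses {-1, 0, +1, +2}.
    Each integer has exactly one digit string per set (non-redundant).
    """
    if digit_set == "minus":
        threshold = 2  # remainders 2 and 3 become -2 and -1
    elif digit_set == "plus":
        threshold = 3  # only remainder 3 becomes -1
    else:
        raise ValueError("digit_set must be 'minus' or 'plus'")

    digits = []
    while value != 0:
        r = value % 4  # Python's % always returns 0..3 here
        if r >= threshold:
            r -= 4
        digits.append(r)
        value = (value - r) // 4  # exact division, remainder removed
    return digits or [0]


def multiply_pre_encoded(x, digits):
    """Multiply x by a coefficient given as pre-encoded radix-4 digits.

    Every partial product is 0, +/-x or +/-2x, i.e. a shift and an
    optional negation of x, then weighted by 4**i.
    """
    return sum((d * x) << (2 * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    coeff = recode_radix4(-23, "minus")  # encoded once, "off-line"
    print(coeff, multiply_pre_encoded(5, coeff))
```

The point of the off-line step is that the coefficient digits are computed once and stored, so only the shift, negate and add work remains at run time.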

International Journal of Science Engineering and Advance Technology (IJSEAT)

KAVUAKA: a low-power application-specific processor architecture for digital hearing aids

Author: Gerlach Lukas
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2021
Field of study

The power consumption of digital hearing aids is very restricted due to their small physical size, and the available hardware resources for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low power consumption and high processing performance. The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized with state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming algorithms, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16 % fewer cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6 % to 17 % with isolation and by-pass techniques. The hardware-induced audio latency is 34 % lower compared to related audio interfaces for a frame size of 64 samples. The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) using a 40 nm low-power technology. The die size is 3.6 mm². Each of the processors and co-processors contains individual customizations and hardware features with a datapath width varying between 24 and 64 bits. The core area of the 64-bit processor configuration is 0.134 mm². The processors are organized in two clusters that share memory, an audio interface, co-processors and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for the SoC and 0.6 mW for the 64-bit processor. Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging, instruction scheduling and register allocation. The KAVUAKA processor architecture is compared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements.

Institutionelles Repositorium der Leibniz Universität Hannover

Applications of satellite technology for regional organizations (Project ASTRO)

Author: Schilling D. L.
Wecker S. C.
Publication venue
Publication date
Field of study

The direct arithmetic processing of adaptive delta modulation (ADM) encoded signals, conversion from ADM encoded signals to pulse code modulation (PCM) encoded signals, and conversion from PCM to ADM encoded signals are discussed. It is shown that signals which are ADM encoded can be arithmetically processed directly, without first decoding. Operating on the DM bit stream, and employing only standard digital hardware, the sum, difference and product can be obtained in PCM and ADM format

NASA Technical Reports Server

MIDAS, prototype Multivariate Interactive Digital Analysis System for large area earth resources surveys. Volume 1: System description

Author: Christenson D.
Gordon M.
Kistler R.
Kriegler F.
Lampert S.
Marshall R.
Mclaughlin R.
Publication venue
Publication date
Field of study

A third-generation, fast, low-cost, multispectral recognition system (MIDAS) able to keep pace with the large quantity and high rates of data acquisition from large regions with present and projected sensors is described. The program can process a complete ERTS frame in forty seconds and provide a color map of sixteen constituent categories in a few minutes. A principal objective of the MIDAS program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in the overall program are described. The system contains a midi-computer to control the various high-speed processing elements in the data path, a preprocessor to condition data, and a classifier which implements an all-digital prototype multivariate Gaussian maximum likelihood or a Bayesian decision algorithm. Sufficient software was developed to perform signature extraction, control the preprocessor, compute classifier coefficients, control the classifier operation, operate the color display and printer, and diagnose operation.

NASA Technical Reports Server

Approximation Opportunities in Edge Computing Hardware : A Systematic Literature Review

Author: Damsgaard Hans Jakob
Nurmi Jari
Ometov Aleksandr
Publication venue
Publication date: 29/11/2022
Field of study

With the increasing popularity of the Internet of Things and massive Machine Type Communication technologies, the number of connected devices is rising. However, while these devices bring valuable benefits to our lives, bandwidth and latency constraints challenge Cloud processing of the associated data volumes. A promising solution to these challenges is the combination of Edge and approximate computing techniques, which allows for data processing nearer to the user. This paper aims to survey the potential benefits of these paradigms' intersection. We provide a state-of-the-art review of circuit-level and architecture-level hardware techniques and popular applications. We also outline essential future research directions.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Trepo - Institutional Repository of Tampere University


Machine division

Author: Earl E. Swartzlander Jr.
Inwook Kong
Publication venue: United States Patent and Trademark Office
Publication date: 21/05/2010
Field of study

Techniques are generally described that include methods, devices, systems and/or apparatus for dividing a numerator by a denominator. Some example methods may include selecting a first numerical factor stored in an electronic storage media. The first numerical factor may be multiplied by a numerator at least in part using a first logic circuit configured to perform multiplication. The first numerical factor may also be multiplied by a denominator. A second numerical factor may be calculated based, at least in part, on an approximation of a square of the difference between unity and the product of the first numerical factor and the denominator. The second numerical factor may be multiplied by the product of the numerator and the first numerical factor at least in part using the first logic circuit, to generate an approximation of a quotient of the numerator and the denominator. Board of Regents, University of Texas System
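
The sequence of steps in this abstract resembles multiplicative (series-expansion) division, and a rough sketch may help. Everything specific below is an assumption made for illustration: the 4-bit lookup table, the restriction of the denominator to [1, 2), the floating-point arithmetic, and in particular the form of the second factor, taken here as F2 = 1 + e + e² with e = 1 − F1·D. The abstract only says that F2 is based on an approximation of e².

```python
TABLE_BITS = 4  # assumed table size; not specified in the source

# First factors: reciprocals of the midpoint of each sub-interval of [1, 2).
RECIP_TABLE = [
    1.0 / (1.0 + (i + 0.5) / 2**TABLE_BITS) for i in range(2**TABLE_BITS)
]


def approx_divide(numerator, denominator):
    """Approximate numerator / denominator for a denominator in [1, 2)."""
    if not 1.0 <= denominator < 2.0:
        raise ValueError("this sketch assumes a normalized denominator in [1, 2)")

    # Select the first factor F1 from the stored table.
    index = int((denominator - 1.0) * 2**TABLE_BITS)
    f1 = RECIP_TABLE[index]

    # Multiply both numerator and denominator by F1.
    n1 = numerator * f1
    d1 = denominator * f1  # close to 1 by construction

    # Second factor from the difference e = 1 - F1*D and its square.
    e = 1.0 - d1
    f2 = 1.0 + e + e * e

    # Quotient approximation: (N * F1) * F2.
    return n1 * f2


if __name__ == "__main__":
    print(approx_divide(3.0, 1.7), 3.0 / 1.7)
```

Under these assumptions the relative error is |e|³, since (1 − e)(1 + e + e²) = 1 − e³, and |e| stays below 1/32 with the 16-entry table.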

Texas ScholarWorks

A novel technique for fast multiplication

Author: Beckhoff G. F.
Farooqui Aamir A.
Sait Sadiq M.
Publication venue
Publication date
Field of study

In this paper we present the design of a new high-speed multiplication unit. The design is based on non-overlapped scanning of 3-bit fields of the multiplier. In this technique the partial products of the multiplicand and three bits of the multiplier are pre-calculated using only hardwired shifts. These partial products are then added using a tree of carry-save adders, and finally the sum and carry vectors are added using a carry-lookahead adder. In the case of 2's complement multiplication the tree of carry-save adders also receives a correction output produced in parallel with the partial products. The algorithm is modelled in a hardware description language and its VLSI chip implemented. The performance of the new design is compared with that of other recent ones proposed in the literature.
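
The scheme described above can be sketched in software as follows. This is a simplified model under stated assumptions, not the paper's design: it handles unsigned operands only (the 2's complement correction term is omitted), it chains carry-save adders sequentially instead of arranging them in a tree, an ordinary integer addition stands in for the carry-lookahead adder, and all function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """3:2 compressor on integers: a + b + c == sum_vec + carry_vec."""
    sum_vec = a ^ b ^ c
    carry_vec = ((a & b) | (a & c) | (b & c)) << 1
    return sum_vec, carry_vec


def multiply_3bit_scan(multiplicand, multiplier):
    """Unsigned multiply by scanning non-overlapping 3-bit multiplier fields."""
    if multiplicand < 0 or multiplier < 0:
        raise ValueError("this sketch handles unsigned operands only")

    sum_vec, carry_vec = 0, 0
    position = 0
    while multiplier >> position:
        field = (multiplier >> position) & 0b111  # next 3-bit field
        # Each set bit of the field selects a shifted copy of the
        # multiplicand (x1, x2 or x4), so only shifts are needed.
        for bit in range(3):
            if (field >> bit) & 1:
                shifted = multiplicand << (position + bit)
                sum_vec, carry_vec = carry_save_add(sum_vec, carry_vec, shifted)
        position += 3

    # Final carry-propagate addition of the sum and carry vectors.
    return sum_vec + carry_vec


if __name__ == "__main__":
    print(multiply_3bit_scan(13, 11))
```

Keeping the running total as separate sum and carry vectors means no carries propagate until the final addition, which is the property the carry-save adder tree exploits in hardware.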