Search CORE

2,044 research outputs found

Residue Number Systems: a Survey

Author: Nannarelli Alberto
Re Marco
Publication venue: Technical University of Denmark, DTU Informatics, Building 321
Publication date: 01/01/2008

Application-Specific Number Representation

Author: Fu Haohuan
Publication venue: Computing, Imperial College London
Publication date: 01/02/2009

Reconfigurable devices, such as Field Programmable Gate Arrays (FPGAs), enable application-specific number representations. Well-known number formats include fixed-point, floating-point, logarithmic number system (LNS), and residue number system (RNS). Such different number representations lead to different arithmetic designs and error behaviours, thus producing implementations with different performance, accuracy, and cost. To investigate the design options in number representations, the first part of this thesis presents a platform that enables automated exploration of the number representation design space. The second part of the thesis shows case studies that optimise the designs for area, latency or throughput from the perspective of number representations. Automated design space exploration in the first part addresses the following two major issues: (1) Automation requires arithmetic unit generation. This thesis provides optimised arithmetic library generators for logarithmic and residue arithmetic units, which support a wide range of bit widths and achieve significant improvement over previous designs. (2) Generation of arithmetic units requires specifying the bit widths for each variable. This thesis describes an automatic bit-width optimisation tool called R-Tool, which combines dynamic and static analysis methods, and supports different number systems (fixed-point, floating-point, and LNS numbers). Putting it all together, the second part explores the effects of application-specific number representation on practical benchmarks, such as radiative Monte Carlo simulation and seismic imaging computations. Experimental results show that customising the number representations brings benefits to hardware implementations: by selecting a more appropriate number format, we can reduce the area cost by up to 73.5% and improve the throughput by 14.2% to 34.1%; by performing the bit-width optimisation, we can further reduce the area cost by 9.7% to 17.3%.
On the performance side, hardware implementations with customised number formats achieve 5 to potentially over 40 times speedup over software implementations.

Spiral - Imperial College Digital Repository

Fast scaling in the residue number system

Author: Kong Y.
Phillips B.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Copyright © 2009 IEEE. A new scheme for precisely scaling numbers in the residue number system (RNS) is presented. The scale factor K can be any number coprime to the RNS moduli. Lookup table implementations are used as a basis for comparisons between the new scheme and scaling schemes from the literature. It is shown that the new scheme decreases hardware complexity compared to previous schemes without affecting time complexity. Yinan Kong and Braden Phillips
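To make the idea of scaling by a factor K coprime to the moduli concrete, here is a minimal Python sketch. It is not the lookup-table scheme from the paper. It only shows the textbook identity floor(X/K) = (X - (X mod K)) * K^(-1) applied in each residue channel, which works because K has a multiplicative inverse modulo every modulus. The moduli (7, 11, 13) and the function names are assumptions chosen for illustration, and the value X mod K is obtained here by a full reconstruction, standing in for whatever base-extension step a real design would use.

```python
from math import prod

# Hypothetical pairwise-coprime moduli chosen for illustration only.
MODULI = (7, 11, 13)
M = prod(MODULI)  # dynamic range: 1001


def to_rns(x):
    """Binary-to-residue conversion: one residue per modulus."""
    return tuple(x % m for m in MODULI)


def from_rns(residues):
    """Residue-to-binary conversion via the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = M // m
        total += r * partial * pow(partial, -1, m)
    return total % M


def scale(residues, k):
    """Return the residues of floor(X / k), for k coprime to every modulus.

    X - (X mod k) is an exact multiple of k, so dividing by k is the same
    as multiplying by the inverse of k in each channel.
    """
    # Assumption: X mod k is found by full reconstruction for simplicity.
    x_mod_k = from_rns(residues) % k
    return tuple(((r - x_mod_k) * pow(k, -1, m)) % m
                 for r, m in zip(residues, MODULI))


scaled = from_rns(scale(to_rns(500), 3))  # floor(500 / 3)
```

The modular inverse `pow(k, -1, m)` needs Python 3.8 or later and raises `ValueError` if K shares a factor with a modulus, which is the coprimality condition stated in the abstract.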

Adelaide Research & Scholarship

Macquarie University ResearchOnline

Towards the AlexNet Moment for Homomorphic Encryption: HCNN, the First Homomorphic CNN on Encrypted Data with GPUs

Author: Aung Khin Mi Mi
Badawi Ahmad Al
Chandrasekhar Vijay Ramaseshan
Chao Jin
Lin Jie
Mun Chan Fook
Nan Xiao
Sim Jun Jie
Tan Benjamin Hong Meng
Publication date: 18/08/2020

Deep Learning as a Service (DLaaS) stands as a promising solution for cloud-based inference applications. In this setting, the cloud has a pre-learned model whereas the user has samples on which she wants to run the model. The biggest concern with DLaaS is user privacy if the input samples are sensitive data. We provide here an efficient privacy-preserving system by employing high-end technologies such as Fully Homomorphic Encryption (FHE), Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs). FHE, with its widely-known feature of computing on encrypted data, empowers a wide range of privacy-concerned applications. This comes at a high cost, as it requires enormous computing power. In this paper, we show how to accelerate the performance of running CNNs on encrypted data with GPUs. We evaluated two CNNs to homomorphically classify the MNIST and CIFAR-10 datasets. Our solution achieved a sufficient security level (> 80 bit) and reasonable classification accuracy (99% and 77.55% for MNIST and CIFAR-10, respectively). In terms of latency, we could classify an image in 5.16 seconds and 304.43 seconds for MNIST and CIFAR-10, respectively. Our system can also classify a batch of images (> 8,000) without extra overhead.

arXiv.org e-Print Archive

Cryptology ePrint Archive

Number Systems for Deep Neural Network Architectures: A Survey

Author: Al-Qutayri Mahmoud
Alsuhli Ghada
Mohammad Baker
Sakellariou Vasileios
Saleh Hani
Stouraitis Thanos
Publication date: 11/07/2023

Deep neural networks (DNNs) have become an enabling component for a myriad of artificial intelligence applications. DNNs have sometimes shown superior performance, even compared to humans, in cases such as self-driving and health applications. Because of their computational complexity, deploying DNNs in resource-constrained devices still faces many challenges related to computing complexity, energy efficiency, latency, and cost. To this end, several research directions are being pursued by both academia and industry to accelerate and efficiently implement DNNs. One important direction is determining the appropriate data representation for the massive amount of data involved in DNN processing. Using conventional number systems has been found to be sub-optimal for DNNs. Alternatively, a great body of research focuses on exploring suitable number systems. This article aims to provide a comprehensive survey and discussion about alternative number systems for more efficient representations of DNN data. Various number systems (conventional/unconventional) exploited for DNNs are discussed. The impact of these number systems on the performance and hardware design of DNNs is considered. In addition, this paper highlights the challenges associated with each number system and various solutions that are proposed for addressing them. The reader will be able to understand the importance of an efficient number system for DNNs, learn about the widely used number systems for DNNs, understand the trade-offs between various number systems, and consider various design aspects that affect the impact of number systems on DNN performance. In addition, the recent trends and related research opportunities will be highlighted. Comment: 28 pages

arXiv.org e-Print Archive

TOA Estimation of Chirp Signal in Dense Multipath Environment for Low-Cost Acoustic Ranging

Author: Chen Minlin
Wang Xinheng
Wang Zhi
Zhang Lei
Publication venue: IEEE
Publication date: 25/07/2018

In this paper, a novel time of arrival (TOA) estimation method is proposed based on an iterative cleaning process to extract the first path signal. The purpose is to address the challenge that, in dense multipath indoor environments, the power of the first path component is normally smaller than that of other multipath components, so the traditional matched filtering (MF)-based TOA estimator causes huge errors. Along with parameter estimation, the proposed process detects and extracts the first path component by eliminating the strongest multipath component at each iteration, using a band-elimination filter in the fractional Fourier domain. To further improve the stability, a slack threshold and a strict threshold are introduced. Six simple and easily calculated termination criteria are proposed to monitor the iterative process. When the iterative 'cleaning' process is done, the outputs include the enhanced first path component and its estimated parameters. Based on these outputs, an optimal reference signal for the MF estimator can be constructed, and a more accurate TOA estimation can be conveniently obtained. The results from numerical simulations and experimental investigations verified that, for acoustic chirp signal TOA estimation, the accuracy of the proposed method is superior to that obtained by the conventional MF estimators.
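For orientation, the conventional MF baseline that the abstract compares against can be sketched in a few lines of Python. This is only the plain matched-filter estimator in an idealised single-path, noise-free setting. It does not implement the iterative cleaning process, the fractional Fourier filtering, or the thresholds described above. The chirp parameters, signal lengths, and function names are all assumptions made for illustration.

```python
import math


def chirp(n, f0=0.05, f1=0.25):
    """Linear chirp of n samples; f0 and f1 are the start and end
    frequencies in cycles per sample (hypothetical values)."""
    rate = (f1 - f0) / n
    return [math.cos(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
            for t in range(n)]


def matched_filter_toa(received, reference):
    """Return the lag, in samples, at which the magnitude of the
    cross-correlation between received and reference is largest."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(len(received) - len(reference) + 1):
        val = abs(sum(received[lag + i] * s
                      for i, s in enumerate(reference)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Single-path, noise-free test signal: the chirp arrives after 40 samples.
reference = chirp(64)
true_delay = 40
received = [0.0] * 200
for i, s in enumerate(reference):
    received[true_delay + i] += s

estimated_delay = matched_filter_toa(received, reference)
```

With a single clean path the correlation peak sits at the true delay. The difficulty the abstract describes arises when a later, stronger path is added to `received`: the largest peak may then belong to that path rather than to the first arrival.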

University of Liverpool Repository

UWL Repository

Residue Number System Based Building Blocks for Applications in Digital Signal Processing

Author: Younes Dina
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2013

This doctoral thesis deals with designing residue number system based building blocks to enhance the performance of digital signal processing applications. The residue number system (RNS) is a non-weighted number system that provides carry-free, parallel, high-speed, secure and fault-tolerant arithmetic operations. These features make it very attractive for use in high-performance and fault-tolerant digital signal processing (DSP) applications. A typical RNS system consists of three main components. The first is the binary-to-residue converter, which computes the RNS equivalent of the inputs represented in the binary number system. The second is a set of parallel residue arithmetic units that perform arithmetic operations on the operands already represented in RNS. The last is the residue-to-binary converter, which converts the outputs back into their binary representation. The main aim of this thesis was to propose novel structures for the basic components of this system, to be used later as fundamental units in DSP applications. The thesis covers the improvement and design of novel structures for these components, and the simulation and verification of their efficiency via FPGA implementation. In addition to suggesting novel structures of basic RNS components, it presents a detailed study of different moduli sets that compares them and determines the most efficient one for different dynamic range requirements. One of the main outcomes of this thesis is establishing and verifying the main condition that should be met when choosing a moduli set in order to improve the timing performance of a DSP application. An RNS-based image processing application is also proposed. Its efficiency, in terms of timing performance and power consumption, is demonstrated by comparing it with a binary-based one.
Finally, the main considerations that should be taken into account when choosing between the binary number system and RNS are discussed in detail.
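The three-component structure described in the abstract (binary-to-residue converter, parallel residue arithmetic, residue-to-binary converter) can be sketched in Python as follows. This is an illustrative software model, not one of the hardware structures proposed in the thesis. The moduli set {2^n - 1, 2^n, 2^n + 1} with n = 4 is an assumption, picked because it is a commonly studied set, and all function names are hypothetical.

```python
from math import prod

# Assumed moduli set {2^n - 1, 2^n, 2^n + 1} with n = 4, i.e. (15, 16, 17).
N = 4
MODULI = (2**N - 1, 2**N, 2**N + 1)
DYNAMIC_RANGE = prod(MODULI)  # results must stay below 4080


def binary_to_residue(x):
    """Forward converter: reduce the input modulo each modulus."""
    return tuple(x % m for m in MODULI)


def residue_add(a, b):
    """Channel-wise addition; no carry passes between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def residue_mul(a, b):
    """Channel-wise multiplication; each channel is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def residue_to_binary(residues):
    """Reverse converter using the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = DYNAMIC_RANGE // m
        total += r * partial * pow(partial, -1, m)
    return total % DYNAMIC_RANGE


# Compute a * b + a entirely in the residue domain, then convert back.
a, b = 37, 58
ra, rb = binary_to_residue(a), binary_to_residue(b)
result = residue_to_binary(residue_add(residue_mul(ra, rb), ra))
```

The result is only meaningful while the true value stays inside the dynamic range, which is why the choice of moduli set matters for a given application.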

Digital library of Brno University of Technology

National Repository of Grey Literature

Gaussian Message Passing for Overloaded Massive MIMO-NOMA

Author: Guan Yong Liang
Huang Chongwen
Li Ying
Liu Lei
Yuen Chau
Publication date: 01/01/2018

This paper considers a low-complexity Gaussian Message Passing (GMP) scheme for coded massive Multiple-Input Multiple-Output (MIMO) systems with Non-Orthogonal Multiple Access (massive MIMO-NOMA), in which a base station with $N_s$ antennas serves $N_u$ sources simultaneously in the same frequency. Both $N_u$ and $N_s$ are large numbers, and we consider the overloaded cases with $N_u > N_s$. The GMP for MIMO-NOMA is a message passing algorithm operating on a fully-connected loopy factor graph, which is well understood to fail to converge due to the correlation problem. In this paper, we utilize the large-scale property of the system to simplify the convergence analysis of the GMP under the overloaded condition. First, we prove that the variances of the GMP definitely converge to the mean square error (MSE) of Linear Minimum Mean Square Error (LMMSE) multi-user detection. Secondly, the means of the traditional GMP will fail to converge when $N_u/N_s < (\sqrt{2}-1)^{-2} \approx 5.83$. Therefore, we propose and derive a new convergent GMP called scale-and-add GMP (SA-GMP), which always converges to the LMMSE multi-user detection performance for any $N_u/N_s > 1$, and show that it has a faster convergence speed than the traditional GMP with the same complexity. Finally, numerical results are provided to verify the validity and accuracy of the theoretical results presented. Comment: Accepted by IEEE TWC, 16 pages, 11 figures
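The numerical value of the convergence threshold quoted in the abstract can be checked by rationalising the denominator. This is a short worked step added for the reader, using only the expression stated above:

```latex
\[
(\sqrt{2}-1)^{-2}
  = \left(\frac{1}{\sqrt{2}-1}\right)^{2}
  = (\sqrt{2}+1)^{2}
  = 3 + 2\sqrt{2}
  \approx 5.83,
\]
% since (\sqrt{2}-1)(\sqrt{2}+1) = 2 - 1 = 1.
```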

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Obstacle avoidance and distance measurement for unmanned aerial vehicles using monocular vision

Author: N Aswini
S V Uma
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/10/2019

Unmanned Aerial Vehicles, commonly known as drones, are better suited for "dull, dirty, or dangerous" missions than manned aircraft. A drone can either be remotely controlled or travel along a predefined path using complex automation algorithms built in during its development. In general, an Unmanned Aerial Vehicle (UAV) is the combination of a drone in the air and a control system on the ground. Designing a UAV means integrating hardware, software, sensors, actuators, communication systems and payloads into a single unit for the application involved. The most challenging problem in making UAVs completely autonomous is obstacle avoidance. In this paper, a novel method to detect frontal obstacles using a monocular camera is proposed. Computer vision algorithms such as Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Feature (SURF) are used to detect frontal obstacles, and then the distance of the obstacle from the camera is calculated. To meet the defined objectives, the designed system is tested with self-developed videos captured by a DJI Phantom 4 Pro.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Searching for continuous gravitational wave sources in binary systems

Author: Dhurandhar Sanjeev V.
Vecchio Alberto
Publication venue: 'American Physical Society (APS)'
Publication date: 24/11/2000

We consider the problem of searching for continuous gravitational wave sources orbiting a companion object. This issue is of particular interest because the LMXB's, and among them Sco X-1, might be marginally detectable with 2 years coherent observation time by the Earth-based laser interferometers expected to come on line by 2002, and clearly observable by the second generation of detectors. Moreover, several radio pulsars, which could be deemed to be CW sources, are found to orbit a companion star or planet, and the LIGO/VIRGO/GEO network plans to continuously monitor such systems. We estimate the computational costs for a search launched over the additional five parameters describing generic elliptical orbits using match filtering techniques. These techniques provide the optimal signal-to-noise ratio and also a very clear and transparent theoretical framework. We provide ready-to-use analytical expressions for the number of templates required to carry out the searches in the astrophysically relevant regions of the parameter space, and how the computational cost scales with the ranges of the parameters. We also determine the critical accuracy to which a particular parameter must be known, so that no search is needed for it. In order to disentangle the computational burden involved in the orbital motion of the CW source, from the other source parameters (position in the sky and spin-down), and reduce the complexity of the analysis, we assume that the source is monochromatic and its location in the sky is exactly known. The orbital elements, on the other hand, are either assumed to be completely unknown or only partly known. We apply our theoretical analysis to Sco X-1 and the neutron stars with binary companions which are listed in the radio pulsar catalogue. Comment: 31 pages, LaTeX, 6 eps figures, submitted to PR

arXiv.org e-Print Archive

University of Birmingham Research Portal

CERN Document Server

MPG.PuRe