20 research outputs found

Energy efficient hardware acceleration of multimedia processing tools

Author: Kinane Andrew
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/01/2006

The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents on modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work in this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant prior art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions. Results show an improvement on state of the art algorithms with future potential for even greater savings.

Irish Universities

DCU Online Research Access Service

Review of Rounding Based Approximate Multiplier (ROBA) For Digital Signal Processing

Author: Malviya Monika
Vyas Prof. Pankaj
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 15/04/2019

The fundamental idea of the rounding-based approximate multiplier relies on the rounding of numbers. This multiplier can be applied to both signed and unsigned numbers. This paper studies a Rounding Based Approximate (ROBA) multiplier that is fast yet energy efficient. The approach is to round the operands to the nearest exponent of two. In this way the computationally intensive part of the multiplication is omitted, improving speed and energy consumption at the cost of a small error. This approach is applicable to both signed and unsigned multiplications. The efficiency of the ROBA multiplier is evaluated by comparing its performance with those of some approximate and exact multipliers using different design parameters.
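
The rounding idea in this abstract can be illustrated with a short Python sketch. It is not the paper's hardware design: it assumes that the omitted "computationally intensive part" is the product of the two rounding errors, which follows from the identity A·B = Ar·B + A·Br − Ar·Br + (Ar − A)(Br − B), where Ar and Br are the operands rounded to the nearest power of two. The function names and the tie-breaking rule (ties round up) are hypothetical choices made for illustration.

```python
def round_to_power_of_two(x: int) -> int:
    """Round a positive integer to the nearest power of two (ties round up)."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    lower = 1 << (x.bit_length() - 1)  # largest power of two <= x
    upper = lower << 1                 # smallest power of two > x
    return lower if (x - lower) < (upper - x) else upper


def roba_multiply(a: int, b: int) -> int:
    """Approximate a * b using only shifts, additions and one subtraction.

    Assumption (not stated in the abstract): the dropped term is
    (ar - a) * (br - b), the product of the two rounding errors.
    """
    if a == 0 or b == 0:
        return 0
    sign = -1 if (a < 0) != (b < 0) else 1
    a, b = abs(a), abs(b)
    shift_a = round_to_power_of_two(a).bit_length() - 1  # log2(ar)
    shift_b = round_to_power_of_two(b).bit_length() - 1  # log2(br)
    # ar*b + a*br - ar*br, where every product is a plain shift
    approx = (b << shift_a) + (a << shift_b) - (1 << (shift_a + shift_b))
    return sign * approx


if __name__ == "__main__":
    for a, b in [(7, 5), (8, 5), (-7, 5), (100, 37)]:
        print(a, b, "exact:", a * b, "approx:", roba_multiply(a, b))
```

Under this assumption the result is exact whenever either operand is already a power of two, because the dropped error product is then zero.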

International Journal on Recent and Innovation Trends in Computing and Communication

Design and Implementation of Complexity Reduced Digital Signal Processors for Low Power Biomedical Applications

Author: Eminaga Y.
Publication date: 01/01/2019

Wearable health monitoring systems, which are capable of signal sensing, acquisition, local processing and transmission, can provide remote care with supervised, independent living. A generic biopotential signal (such as Electrocardiogram (ECG) and Electroencephalogram (EEG)) processing platform consists of four main functional components. The signals acquired by the electrodes are amplified and preconditioned by the (1) Analog-Front-End (AFE) and are then digitized via the (2) Analog-to-Digital Converter (ADC) for further processing. The local digital signal processing is usually handled by a custom designed (3) Digital Signal Processor (DSP) which is responsible for any one or a combination of signal processing algorithms such as noise detection, noise/artefact removal, feature extraction, classification and compression. The digitally processed data is then transmitted via the (4) transmitter, which is renowned as the most power hungry block in the complete platform. All the aforementioned components of the wearable systems are required to be designed and fitted into an integrated system where the area and the power requirements are stringent. Therefore, hardware complexity and power dissipation of each functional component are crucial aspects while designing and implementing a wearable monitoring platform. The work undertaken focuses on reducing the hardware complexity of a biosignal DSP and presents low hardware complexity solutions that can be employed in the aforementioned wearable platforms. A typical state-of-the-art system utilizes Sigma Delta (Σ∆) ADCs incorporating a Σ∆ modulator and a decimation filter whereas the state-of-the-art decimation filters employ linear phase Finite-Impulse-Response (FIR) filters with high orders that increase the hardware complexity [1–5].
In this thesis, the novel use of minimum phase Infinite-Impulse-Response (IIR) decimators is proposed where the hardware complexity is massively reduced compared to the conventional FIR decimators. In addition, the non-linear phase effects of these filters are also investigated since phase non-linearity may distort the time domain representation of the signal being filtered, which is an undesirable effect for biopotential signals especially when the fiducial characteristics carry diagnostic importance. In the case of ECG monitoring systems the effect of the IIR filter phase non-linearity is minimal and does not affect the diagnostic accuracy of the signals. The work undertaken also proposes two methods for reducing the hardware complexity of the popular biosignal processing tool, Discrete Wavelet Transform (DWT). General purpose multipliers are known to be hardware and power hungry in terms of the number of addition operations or their underlying building blocks like full adders or half adders required. A higher number of adders leads to an increase in the power consumption which is directly proportional to the clock frequency, supply voltage, switching activity and the resources utilized. A typical Field-Programmable-Gate-Array’s (FPGA) resources are Look-up Tables (LUTs) whereas a custom Digital Signal Processor’s (DSP) are gate-level cells of standard cell libraries that are used to build adders [6]. One of the proposed methods is the replacement of the hardware and power hungry general purpose multipliers and the coefficient memories with reconfigurable multiplier blocks that are composed of simple shift-add networks and multiplexers. This method substantially reduces the resource utilization as well as the power consumption of the system. The second proposed method is the design and implementation of the DWT filter banks using IIR filters which employ fewer arithmetic operations compared to the state-of-the-art FIR wavelets.
This reduces the hardware complexity of the analysis filter bank of the DWT and can be employed in applications where the reconstruction is not required. However, the synthesis filter bank for the IIR wavelet transform has a higher computational complexity compared to the conventional FIR wavelet synthesis filter banks since re-indexing of the filtered data sequence is required that can only be achieved via the use of extra registers. Therefore, this led to the proposal of a novel design which replaces the complex IIR based synthesis filter banks with FIR filters which are the approximations of the associated IIR filters. Finally, a comparative study is presented where the hybrid IIR/FIR and FIR/FIR wavelet filter banks are deployed in a typical noise reduction scenario using the wavelet thresholding techniques. It is concluded that the proposed hybrid IIR/FIR wavelet filter banks provide better denoising performance, reduced computational complexity and power consumption in comparison to their IIR/IIR and FIR/FIR counterparts.
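
The abstract mentions replacing general purpose multipliers and coefficient memories with reconfigurable multiplier blocks built from shift-add networks and multiplexers. The Python sketch below models that idea in software only, as a rough illustration. It assumes non-negative integer coefficients and uses the plain binary expansion of each coefficient, whereas the thesis may well use a more economical decomposition. All function names are hypothetical.

```python
def shift_add_terms(coefficient: int) -> list[int]:
    """Return the shift amounts for a coefficient, one per set bit.

    Assumption: plain binary expansion of a non-negative constant.
    """
    if coefficient < 0:
        raise ValueError("this sketch only handles non-negative coefficients")
    return [i for i in range(coefficient.bit_length()) if (coefficient >> i) & 1]


def multiply_by_constant(x: int, coefficient: int) -> int:
    """Multiply x by a fixed coefficient using only shifts and additions."""
    return sum(x << shift for shift in shift_add_terms(coefficient))


def reconfigurable_multiply(x: int, coefficients: list[int], select: int) -> int:
    """Model a multiplexer choosing one of several shift-add networks.

    `select` plays the role of the multiplexer control input, so no
    coefficient memory or general purpose multiplier is involved.
    """
    return multiply_by_constant(x, coefficients[select])


if __name__ == "__main__":
    print(shift_add_terms(10))                     # 10 = 2 + 8
    print(multiply_by_constant(7, 10))             # 7*2 + 7*8
    print(reconfigurable_multiply(3, [5, 12], 1))  # selects coefficient 12
```

Each set bit of the coefficient costs one adder input in this model, which is why coefficients with few non-zero digits are cheap to implement.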

WestminsterResearch

Techniques for Efficient Implementation of FIR and Particle Filtering

Publication venue: 'Linkoping University Electronic Press'

Crossref

System on fabrics utilising distributed computing

Author: Partheepan Kandaswamy (7202843)
Publication date: 01/01/2018

The main vision of wearable computing is to make electronic systems an important part of everyday clothing in the future, where they will serve as intelligent personal assistants. Wearable devices have the potential to be wearable computers and not mere input/output devices for the human body. The present thesis focuses on introducing a new wearable computing paradigm, where the processing elements are closely coupled with the sensors that are distributed using Instruction Systolic Array (ISA) architecture. The thesis describes a novel, multiple sensor, multiple processor system architecture prototype based on the Instruction Systolic Array paradigm for distributed computing on fabrics. The thesis introduces a new programming model to implement the distributed computer on fabrics. The implementation of the concept has been validated using parallel algorithms. A real-time shape sensing and reconstruction application has been implemented on this architecture and has demonstrated a physical design for a wearable system based on the ISA concept constructed from off-the-shelf microcontrollers and sensors. Results demonstrate that the real time application executes on the prototype ISA implementation, thus confirming the viability of the proposed architecture for fabric-resident computing devices.

Loughborough University Institutional Repository

Intelligent Sensor Networks

Publication venue: 'Informa UK Limited'

In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

OAPEN Library

The Fifth NASA Symposium on VLSI Design

The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design

NASA Technical Reports Server

Esprit '89. Proceedings of the 6th annual Esprit conference. Brussels, 27 November - 1 December 1989. EUR 12512 EN

Publication date: 01/01/1989

Archive of European Integration

Adaptive Coded Modulation Classification and Spectrum Sensing for Cognitive Radio Systems. Adaptive Coded Modulation Techniques for Cognitive Radio Using Kalman Filter and Interacting Multiple Model Methods

Author: Al-Juboori Ahmed O.A.S.
Publication venue: Faculty of Engineering and Informatics
Publication date: 01/01/2018

The current and future trends of modern wireless communication systems place heavy demands on fast data transmissions in order to satisfy end users’ requirements anytime, anywhere. Such demands are obvious in recent applications such as smart phones, long term evolution (LTE), fourth and fifth generation (4G and 5G), and worldwide interoperability for microwave access (WiMAX) platforms, where robust coding and modulations are essential especially in streaming on-line video material, social media and gaming. This eventually resulted in extreme exhaustion imposed on the frequency spectrum as a rare natural resource due to stagnation in current spectrum management policies. Since its advent in the late 1990s, cognitive radio (CR) has been conceived as an enabling technology aiming at the efficient utilisation of frequency spectrum that can lead to potential direct spectrum access (DSA) management. This is mainly attributed to its internal capabilities inherited from the concept of software defined radio (SDR) to sniff its surroundings, learn and adapt its operational parameters accordingly. CR systems (CRs) may commonly comprise one or all of the following core engines that characterise their architectures; namely, adaptive coded modulation (ACM), automatic modulation classification (AMC) and spectrum sensing (SS). Motivated by the above challenges, this programme of research is primarily aimed at the design and development of new paradigms to help improve the adaptability of CRs and thereby achieve the desirable signal processing tasks at the physical layer of the above core engines. Approximate modelling of Rayleigh and finite state Markov channels (FSMC) with a new concept borrowed from econometric studies has been approached. Then insightful channel estimation by using Kalman filter (KF) augmented with interacting multiple model (IMM) has been examined for the purpose of robust adaptability, which is applied for the first time in wireless communication systems.
This new IMM-KF combination has been facilitated in the feedback channel between wireless transmitter and receiver to adjust the transmitted power, by using a water-filling (WF) technique, and constellation pattern and rate in the ACM algorithm. The AMC has also benefited from such IMM-KF integration to boost the performance against conventional parametric estimation methods such as maximum likelihood estimate (MLE) for channel interrogation, with the estimated parameters of both inserted into the ML classification algorithm. Expectation-maximisation (EM) has been applied to examine unknown transmitted modulation sequences and channel parameters in tandem. Finally, the non-parametric multitaper method (MTM) has been thoroughly examined for spectrum estimation (SE) and SS, by relying on Neyman-Pearson (NP) detection principle for hypothesis test, to allow licensed primary users (PUs) to coexist with opportunistic unlicensed secondary users (SUs) in the same frequency bands of interest without harmful effects. The performance of the above newly suggested paradigms has been simulated and assessed under various transmission settings and revealed substantial improvements.
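
The abstract refers to adjusting the transmitted power with a water-filling (WF) technique. As a rough illustration of the textbook form of that technique, and not of the thesis's own algorithm, the Python sketch below splits a power budget across parallel channels so that each active channel satisfies power + noise = a common water level. The function name, the noise-level inputs and the example numbers are hypothetical.

```python
def water_filling(noise_levels: list[float], total_power: float) -> list[float]:
    """Allocate total_power across channels by classical water-filling.

    Each channel i receives max(0, mu - noise_levels[i]), where the water
    level mu is chosen so that the allocations sum to total_power.
    Assumption: noise_levels are effective noise-to-gain ratios, lower = better.
    """
    if not noise_levels:
        raise ValueError("at least one channel is required")
    if total_power < 0:
        raise ValueError("total_power must be non-negative")

    levels = sorted(noise_levels)
    mu = levels[0]
    # Try filling the k best channels, dropping the worst one each time,
    # until the water level sits above the worst channel still in use.
    for k in range(len(levels), 0, -1):
        mu = (total_power + sum(levels[:k])) / k
        if mu > levels[k - 1]:
            break
    return [max(0.0, mu - n) for n in noise_levels]


if __name__ == "__main__":
    allocation = water_filling([1.0, 2.0, 5.0], total_power=3.0)
    print(allocation, "sum:", sum(allocation))
```

In this model the noisiest channel receives no power when the budget is too small to raise the water level above its noise floor.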

Bradford Scholars

Compiler techniques for scalable performance of stream programs on multicore architectures

Author: Gordon Michael I. (Michael Ian), 1978-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2010

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 211-222). Given the ubiquity of multicore processors, there is an acute need to enable the development of scalable parallel applications without unduly burdening programmers. Currently, programmers are asked not only to explicitly expose parallelism but also concern themselves with issues of granularity, load-balancing, synchronization, and communication. This thesis demonstrates that when algorithmic parallelism is expressed in the form of a stream program, a compiler can effectively and automatically manage the parallelism. Our compiler assumes responsibility for low-level architectural details, transforming implicit algorithmic parallelism into a mapping that achieves scalable parallel performance for a given multicore target. Stream programming is characterized by regular processing of sequences of data, and it is a natural expression of algorithms in the areas of audio, video, digital signal processing, networking, and encryption. Streaming computation is represented as a graph of independent computation nodes that communicate explicitly over data channels. Our techniques operate on contiguous regions of the stream graph where the input and output rates of the nodes are statically determinable. Within a static region, the compiler first automatically adjusts the granularity and then exploits data, task, and pipeline parallelism in a holistic fashion. We introduce techniques that data-parallelize nodes that operate on overlapping sliding windows of their input, translating serializing state into minimal and parametrized inter-core communication. Finally, for nodes that cannot be data-parallelized due to state, we are the first to automatically apply software-pipelining techniques at a coarse granularity to exploit pipeline parallelism between stateful nodes.
Our framework is evaluated in the context of the StreamIt programming language. StreamIt is a high-level stream programming language that has been shown to improve programmer productivity in implementing streaming algorithms. We employ the StreamIt Core benchmark suite of 12 real-world applications to demonstrate the effectiveness of our techniques for varying multicore architectures. For a 16-core distributed memory multicore, we achieve a 14.9x mean speedup. For benchmarks that include sliding-window computation, our sliding-window data-parallelization techniques are required to enable scalable performance for a 16-core SMP multicore (14x mean speedup) and a 64-core distributed shared memory multicore (52x mean speedup). by Michael I. Gordon. Ph.D.

DSpace@MIT