
1,077 research outputs found

Algorithms and programming tools for image processing on the MPP:3

Author: Reeves Anthony P.

This is the third and final report on the work done for NASA Grant 5-403 on Algorithms and Programming Tools for Image Processing on the MPP:3. All the work done for this grant is summarized in the introduction. Work done since August 1986 is reported in detail. Research for this grant falls under the following headings: (1) fundamental algorithms for the MPP; (2) programming utilities for the MPP; (3) the Parallel Pascal Development System; and (4) performance analysis. In this report, the results of two efforts are reported: region growing, and performance analysis of important characteristic algorithms. In each case, timing results from MPP implementations are included. A paper is included in which parallel algorithms for region growing on the MPP are discussed. These algorithms permit different sized regions to be merged in parallel. Details on the implementation and performance of several important MPP algorithms are given. These include a number of standard permutations, the FFT, convolution, arbitrary data mappings, image warping, and pyramid operations, all of which have been implemented on the MPP. The permutation and image warping functions have been included in the standard development system library.

NASA Technical Reports Server

High-Throughput DTW accelerator with minimum area in AMD FPGA by HLS.

Author: Hormigo Aguilar Francisco Javier
Hormigo-Jimenez Marco
Publication date: 01/01/2023

Dynamic Time Warping (DTW) is a dynamic programming algorithm that is known to be one of the best methods to measure the similarity between two signals, even when they vary in speed. It is extensively used in many machine learning algorithms, especially for pattern recognition and classification. Unfortunately, it has quadratic complexity, which results in very high computational costs. Furthermore, its data dependencies also make it very difficult to parallelize. Special attention has been paid to computing DTW on the edge, as a way to reduce the communication load in Internet-of-Things applications. In this work, we propose a minimum-area implementation of the DTW algorithm in AMD FPGAs with optimal use of the resources. That is achieved by maximizing the use time of the resources and taking advantage of the inner structure of the AMD FPGAs. This architecture could be used in small devices or as a base for a multi-core implementation with very high throughput.

Funding: MCIN/AEI/10.13039/501100011033 and European Union Next Generation EU/PRTR under Project TED2021-131527B-I00; Fondo Europeo de Desarrollo Regional (UMA20-FEDERJA-059); AMD™ (Xilinx™) University Program. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
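
The quadratic cost and the data dependency mentioned in the abstract can be seen in a plain software version of DTW. The sketch below is a textbook formulation in Python, not the FPGA architecture proposed by the authors; the function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):          # n * m cells in total: quadratic work
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Each cell depends on its left, upper, and upper-left neighbours,
            # which is the data dependency that complicates parallelization.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` evaluates to `0.0`, because the repeated sample in the second sequence is absorbed by the warping path.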

Repositorio Institucional Universidad de Málaga

A programmable display layer for virtual reality system architectures

Author: Fröhlich B. (Bernd)
Liere R. (Robert) van
Smit F.A. (Ferdi)
Publication venue: I.E.E.E. Computer Society Press
Publication date: 01/01/2010

CWI's Institutional Repository

System-level design of energy-efficient sensor-based human activity recognition systems: a model-based approach

Author: Grützmacher Florian (gnd: 1246041081)
Publication venue: Universität Rostock Rostock

This thesis contributes an evaluation of state-of-the-art dataflow models of computation regarding their suitability for a model-based design and analysis of human activity recognition systems, in terms of expressiveness and analyzability, as well as model accuracy. Different aspects of state-of-the-art human activity recognition systems have been modeled and analyzed. Based on existing methods, novel analysis approaches have been developed to acquire extra-functional properties like processor utilization, data communication rates, and finally energy consumption of the system

Rostocker Dokumentenserver

An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays

Author: Sledevič Tomyslav
Publication date: 05/05/2016

The implementation efficiency of electronic systems is a combination of conflicting requirements: increasing volumes of computation, faster data exchange, and at the same time rising energy consumption force researchers not only to optimize the algorithm but also to implement it quickly in specialized hardware. Therefore, this work tackles the problem of efficient and straightforward implementation of real-time electronic intelligent systems on a field-programmable gate array (FPGA). The object of research is specialized FPGA intellectual property (IP) cores that operate in real time. The thesis investigates the following main aspects of the research object: implementation criteria and techniques. The aim of the thesis is to optimize the FPGA implementation process of a selected class of dynamic artificial neural networks. To solve the stated problem and reach the goal, the following main tasks of the thesis are formulated: rationalize the selection of a class of Lattice-Ladder Multi-Layer Perceptron (LLMLP) and its electronic intelligent system test-bed, a speaker-dependent Lithuanian speech recognizer, to be created and investigated; develop a dedicated technique for implementing the LLMLP class on FPGA, based on specialized efficiency criteria for circuitry synthesis; and develop and experimentally confirm the efficiency of the optimized FPGA IP cores used in the Lithuanian speech recognizer. The dissertation contains an introduction, four chapters, and general conclusions. The first chapter reviews the fundamental knowledge on computer-aided design, artificial neural networks, and speech recognition implementation on FPGA. In the second chapter, the efficiency criteria and technique for implementing LLMLP IP cores are proposed in order to perform multi-objective optimization of throughput, LLMLP complexity, and resource utilization. Data flow graphs are applied to optimize LLMLP computations. The optimized neuron processing element is proposed. The IP cores for feature extraction and comparison are developed for the Lithuanian speech recognizer and analyzed in the third chapter. The fourth chapter is devoted to experimental verification of the numerous developed LLMLP IP cores. Experiments on isolated word recognition accuracy and speed were performed for different speakers, signal-to-noise ratios, feature extraction methods, and accelerated comparison methods. The main results of the thesis were published in 12 scientific publications: eight in peer-reviewed scientific journals (four of them in a Thomson Reuters Web of Science database) and four in conference proceedings. The results were presented at 17 scientific conferences.

Vilniaus Gedimino Technikos Universitetas: VGTU Talpykla / Vilnius Gediminas Technical University: VGTU Repository

Advanced avionics concepts: Autonomous spacecraft control


A large increase in space operations activities is expected because of Space Station Freedom (SSF) and long range Lunar base missions and Mars exploration. Space operations will also increase as a result of space commercialization (especially the increase in satellite networks). It is anticipated that the level of satellite servicing operations will grow tenfold from the current level within the next 20 years. This growth can be sustained only if the cost effectiveness of space operations is improved. Cost effectiveness is operational efficiency with proper effectiveness. A concept is presented of advanced avionics, autonomous spacecraft control, that will enable the desired growth, as well as maintain the cost effectiveness (operational efficiency) in satellite servicing operations. The concept of advanced avionics that allows autonomous spacecraft control is described along with a brief description of each component. Some of the benefits of autonomous operations are also described. A technology utilization breakdown is provided in terms of applications

NASA Technical Reports Server

Interactive space-variant image filtering

Author: Moore Kevin W
Publication date: 18/07/2018

The Australian National University

Multi-transputer based isolated word speech recognition system.

Publication venue: Department of Cultural and Religious Studies, The Chinese University of Hong Kong
Publication date: 01/01/1996

by Francis Cho-yiu Chik. Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. Includes bibliographical references (leaves 129-135).

Chapter 1 --- Introduction
Chapter 1.1 --- Automatic speech recognition and its applications
Chapter 1.1.1 --- Artificial Neural Network (ANN) approach
Chapter 1.2 --- Motivation
Chapter 1.3 --- Background
Chapter 1.3.1 --- Speech recognition
Chapter 1.3.2 --- Parallel processing
Chapter 1.3.3 --- Parallel architectures
Chapter 1.3.4 --- Transputer
Chapter 1.4 --- Thesis outline
Chapter 2 --- Speech Signal Pre-processing
Chapter 2.1 --- Determine useful signal
Chapter 2.1.1 --- End point detection using energy
Chapter 2.1.2 --- End point detection enhancement using zero crossing rate
Chapter 2.2 --- Pre-emphasis filter
Chapter 2.3 --- Feature extraction
Chapter 2.3.1 --- Filter-bank spectrum analysis model
Chapter 2.3.2 --- Linear Predictive Coding (LPC) coefficients
Chapter 2.3.3 --- Cepstral coefficients
Chapter 2.3.4 --- Zero crossing rate and energy
Chapter 2.3.5 --- Pitch (fundamental frequency) detection
Chapter 2.4 --- Discussions
Chapter 3 --- Speech Recognition Methods
Chapter 3.1 --- Template matching using Dynamic Time Warping (DTW)
Chapter 3.2 --- Hidden Markov Model (HMM)
Chapter 3.2.1 --- Vector Quantization (VQ)
Chapter 3.2.2 --- Description of a discrete HMM
Chapter 3.2.3 --- Probability evaluation
Chapter 3.2.4 --- Estimation technique for model parameters
Chapter 3.2.5 --- State sequence for the observation sequence
Chapter 3.3 --- 2-dimensional Hidden Markov Model (2dHMM)
Chapter 3.3.1 --- Calculation for a 2dHMM
Chapter 3.4 --- Discussions
Chapter 4 --- Implementation
Chapter 4.1 --- Transputer based multiprocessor system
Chapter 4.1.1 --- Transputer Development System (TDS)
Chapter 4.1.2 --- System architecture
Chapter 4.1.3 --- Transtech TMB16 mother board
Chapter 4.1.4 --- Farming technique
Chapter 4.2 --- Farming technique on extracting spectral amplitude feature
Chapter 4.3 --- Feature extraction for LPC
Chapter 4.4 --- DTW based recognition
Chapter 4.4.1 --- Feature extraction
Chapter 4.4.2 --- Training and matching
Chapter 4.5 --- HMM based recognition
Chapter 4.5.1 --- Feature extraction
Chapter 4.5.2 --- Model training and matching
Chapter 4.6 --- 2dHMM based recognition
Chapter 4.6.1 --- Feature extraction
Chapter 4.6.2 --- Training
Chapter 4.6.3 --- Recognition
Chapter 4.7 --- Training convergence in HMM and 2dHMM
Chapter 4.8 --- Discussions
Chapter 5 --- Experimental Results
Chapter 5.1 --- Comparison of DTW, HMM and 2dHMM
Chapter 5.2 --- Comparison between HMM and 2dHMM
Chapter 5.2.1 --- Recognition test on 20 English words
Chapter 5.2.2 --- Recognition test on 10 Cantonese syllables
Chapter 5.3 --- Recognition test on 80 Cantonese syllables
Chapter 5.4 --- Speed matching
Chapter 5.5 --- Computational performance
Chapter 5.5.1 --- Training performance
Chapter 5.5.2 --- Recognition performance
Chapter 6 --- Discussions and Conclusions
Bibliography
Chapter A --- An ANN Model for Speech Recognition
Chapter B --- A Speech Signal Represented in Frequency Domain (Spectrogram)
Chapter C --- Dynamic Programming
Chapter D --- Markov Process
Chapter E --- Maximum Likelihood (ML)
Chapter F --- Multiple Training
Chapter F.1 --- HMM
Chapter F.2 --- 2dHMM
Chapter G --- IMS T800 Transputer
Chapter G.1 --- IMS T800 architecture
Chapter G.2 --- Instruction encoding
Chapter G.3 --- Floating point instructions
Chapter G.4 --- Optimizing use of the stack
Chapter G.5 --- Concurrent operation of FPU and CPU

CUHK Digital Repository

Investigation into the wafer-scale integration of fine-grain parallel processing computer systems

Author: Jones Simon Richard
Publication venue: Brunel University
Publication date: 01/01/1986

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. This thesis investigates the potential of wafer-scale integration (WSI) for the implementation of low-cost fine-grain parallel processing computer systems. As WSI is a relatively new subject, there was little work on which to base investigations. Indeed, most WSI architectures existed only as untried and sometimes vague proposals. Accordingly, the research strategy approached this problem by identifying a representative WSI structure and architecture on which to base investigations. An analysis of architectural proposals identified associative memory to be a general-purpose parallel processing component used in a wide range of WSI architectures. Furthermore, this analysis provided a set of WSI-level design requirements to evaluate the sustainability of different architectures as research vehicles. The WSI-ASP (WASP) device, which has a large associative memory as its main component, is shown to meet these requirements and hence was chosen as the research vehicle. Consequently, this thesis addresses WSI potential through an in-depth investigation into the feasibility of implementing a large associative memory for the WASP device that meets the demanding technological constraints of WSI. Overall, the thesis concludes that WSI offers significant potential for the implementation of low-cost fine-grain parallel processing computer systems. However, due to the dual constraints of thermal management and the area required for the power distribution network, power density is a major design constraint in WSI. Indeed, it is shown that WSI power densities need to be an order of magnitude lower than VLSI power densities. The thesis demonstrates that, for associative memories at least, VLSI designs are unsuited to implementation in WSI. Rather, it is shown that WSI circuits must be closely matched to the operational environment to assure suitable power densities. These circuits are significantly larger than their VLSI equivalents. Nonetheless, the thesis demonstrates that by concentrating on the most power-intensive circuits, it is possible to achieve acceptable power densities with only a modest increase in area overheads. SER

Brunel University Research Archive