1,077 research outputs found
Algorithms and programming tools for image processing on the MPP:3
This is the third and final report on the work done for NASA Grant 5-403 on Algorithms and Programming Tools for Image Processing on the MPP:3. All the work done for this grant is summarized in the introduction. Work done since August 1986 is reported in detail. Research for this grant falls under the following headings: (1) fundamental algorithms for the MPP; (2) programming utilities for the MPP; (3) the Parallel Pascal Development System; and (4) performance analysis. In this report, the results of two efforts are reported: region growing, and performance analysis of important characteristic algorithms. In each case, timing results from MPP implementations are included. A paper is included in which parallel algorithms for region growing on the MPP are discussed. These algorithms permit different sized regions to be merged in parallel. Details on the implementation and performance of several important MPP algorithms are given. These include a number of standard permutations, the FFT, convolution, arbitrary data mappings, image warping, and pyramid operations, all of which have been implemented on the MPP. The permutation and image warping functions have been included in the standard development system library.
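As a rough illustration of one of the operations named above, the sketch below builds an image pyramid by repeatedly averaging non-overlapping 2x2 blocks. It is not the report's MPP implementation: the function names `reduce_level` and `build_pyramid`, the averaging rule, and the list-of-lists image format are all assumptions made for this example.

```python
def reduce_level(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks.

    `image` is a list of equal-length rows of numbers (an assumed format).
    A trailing odd row or column is dropped.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_pyramid(image):
    """Return the list of levels, from the full image down to the coarsest."""
    levels = [image]
    # Stop once either dimension can no longer be halved.
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(reduce_level(levels[-1]))
    return levels


if __name__ == "__main__":
    demo = [
        [1, 1, 3, 3],
        [1, 1, 3, 3],
        [5, 5, 7, 7],
        [5, 5, 7, 7],
    ]
    for level in build_pyramid(demo):
        print(level)
```

Each output value depends only on its own 2x2 block, so all values of a level could in principle be computed at the same time, which is why this kind of operation suits a parallel array.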
High-Throughput DTW accelerator with minimum area in AMD FPGA by HLS.
Dynamic Time Warping (DTW) is a dynamic programming
algorithm that is known to be one of the best methods
to measure the similarity between two signals, even when
their speeds vary. It is extensively used in many
machine learning algorithms, especially for pattern recognition
and classification. Unfortunately, it has quadratic complexity,
which results in very high computational costs. Furthermore,
its data dependencies make it very difficult to parallelize.
Special attention has been paid to computing DTW on the edge,
as a way to reduce the communication load in Internet-of-Things
applications. In this work, we propose a minimum area
implementation of the DTW algorithm in AMD FPGAs with
optimal use of the resources. That is achieved by maximizing
the use time of the resources and taking advantage of the inner
structure of the AMD FPGAs. This architecture could be used in
small devices or as a base for a multi-core implementation with
very high throughput.
MCIN/AEI/10.13039/501100011033 and European Union Next Generation EU/PRTR under Project TED2021-131527B-I00; by the Fondo Europeo de Desarrollo Regional (UMA20-FEDERJA-059); and by AMD™ (Xilinx™) University Program.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
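The quadratic cost and the data dependency mentioned in the abstract can be seen in a minimal software sketch of the textbook DTW recurrence. This is not the paper's FPGA architecture: the function name `dtw_distance`, the absolute-difference local cost, and the unconstrained warping path are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Return the DTW distance between two numeric sequences.

    Uses the classic recurrence
        D[i][j] = |a[i] - b[j]| + min(D[i-1][j], D[i][j-1], D[i-1][j-1])
    and keeps only two rows of the cost matrix in memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the accumulated cost for the previous row.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Each cell needs its left, upper and upper-left neighbours,
            # which is the data dependency that hinders parallelization.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    # The second signal is a slowed-down copy of the first.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The two nested loops visit every pair of samples once, giving the quadratic complexity, while storing only two rows keeps memory linear in the length of the second sequence.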
System-level design of energy-efficient sensor-based human activity recognition systems: a model-based approach
This thesis contributes an evaluation of state-of-the-art dataflow models of computation regarding their suitability for a model-based design and analysis of human activity recognition systems, in terms of expressiveness and analyzability, as well as model accuracy. Different aspects of state-of-the-art human activity recognition systems have been modeled and analyzed. Based on existing methods, novel analysis approaches have been developed to acquire extra-functional properties like processor utilization, data communication rates, and finally energy consumption of the system
An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays
The implementation efficiency of electronic systems is a combination of conflicting requirements: growing volumes of computation and faster data exchange, together with rising energy consumption, force researchers not only to optimize the algorithm but also to implement it quickly in specialized hardware. This work therefore tackles the problem of efficient and straightforward implementation of real-time electronic intelligent systems on field-programmable gate arrays (FPGAs). The object of research is specialized FPGA intellectual property (IP) cores that operate in real time. The thesis investigates the following main aspects of the research object: implementation criteria and techniques.
The aim of the thesis is to optimize the FPGA implementation process of a selected class of dynamic artificial neural networks. To solve the stated problem and reach this goal, the following main tasks are formulated: rationalize the selection of the Lattice-Ladder Multi-Layer Perceptron (LLMLP) class and of its electronic intelligent system test-bed, a speaker-dependent Lithuanian speech recognizer, to be created and investigated; develop a dedicated technique for implementing the LLMLP class on FPGA, based on specialized efficiency criteria for circuitry synthesis; and develop and experimentally confirm the efficiency of the optimized FPGA IP cores used in the Lithuanian speech recognizer.
The dissertation contains an introduction, four chapters and general conclusions. The first chapter presents the fundamentals of computer-aided design, artificial neural networks and speech recognition implementation on FPGA. In the second chapter, efficiency criteria and a technique for implementing LLMLP IP cores are proposed in order to perform multi-objective optimization of throughput, LLMLP complexity and resource utilization. Data flow graphs are applied to optimize the LLMLP computations, and an optimized neuron processing element is proposed. The IP cores for feature extraction and comparison are developed for the Lithuanian speech recognizer and analyzed in the third chapter. The fourth chapter is devoted to experimental verification of the numerous LLMLP IP cores developed. Experiments on isolated word recognition accuracy and speed were performed for different speakers, signal-to-noise ratios, feature extraction methods and accelerated comparison methods.
The main results of the thesis were published in 12 scientific publications: eight in peer-reviewed scientific journals (four of them indexed in the Thomson Reuters Web of Science database) and four in conference proceedings. The results were presented at 17 scientific conferences.
Advanced avionics concepts: Autonomous spacecraft control
A large increase in space operations activities is expected because of Space Station Freedom (SSF) and long range Lunar base missions and Mars exploration. Space operations will also increase as a result of space commercialization (especially the increase in satellite networks). It is anticipated that the level of satellite servicing operations will grow tenfold from the current level within the next 20 years. This growth can be sustained only if the cost effectiveness of space operations is improved. Cost effectiveness is operational efficiency with proper effectiveness. A concept is presented of advanced avionics, autonomous spacecraft control, that will enable the desired growth, as well as maintain the cost effectiveness (operational efficiency) in satellite servicing operations. The concept of advanced avionics that allows autonomous spacecraft control is described along with a brief description of each component. Some of the benefits of autonomous operations are also described. A technology utilization breakdown is provided in terms of applications
Multi-transputer based isolated word speech recognition system.
by Francis Cho-yiu Chik.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 129-135).
Chapter 1 --- Introduction
Chapter 1.1 --- Automatic speech recognition and its applications
Chapter 1.1.1 --- Artificial Neural Network (ANN) approach
Chapter 1.2 --- Motivation
Chapter 1.3 --- Background
Chapter 1.3.1 --- Speech recognition
Chapter 1.3.2 --- Parallel processing
Chapter 1.3.3 --- Parallel architectures
Chapter 1.3.4 --- Transputer
Chapter 1.4 --- Thesis outline
Chapter 2 --- Speech Signal Pre-processing
Chapter 2.1 --- Determine useful signal
Chapter 2.1.1 --- End point detection using energy
Chapter 2.1.2 --- End point detection enhancement using zero crossing rate
Chapter 2.2 --- Pre-emphasis filter
Chapter 2.3 --- Feature extraction
Chapter 2.3.1 --- Filter-bank spectrum analysis model
Chapter 2.3.2 --- Linear Predictive Coding (LPC) coefficients
Chapter 2.3.3 --- Cepstral coefficients
Chapter 2.3.4 --- Zero crossing rate and energy
Chapter 2.3.5 --- Pitch (fundamental frequency) detection
Chapter 2.4 --- Discussions
Chapter 3 --- Speech Recognition Methods
Chapter 3.1 --- Template matching using Dynamic Time Warping (DTW)
Chapter 3.2 --- Hidden Markov Model (HMM)
Chapter 3.2.1 --- Vector Quantization (VQ)
Chapter 3.2.2 --- Description of a discrete HMM
Chapter 3.2.3 --- Probability evaluation
Chapter 3.2.4 --- Estimation technique for model parameters
Chapter 3.2.5 --- State sequence for the observation sequence
Chapter 3.3 --- 2-dimensional Hidden Markov Model (2dHMM)
Chapter 3.3.1 --- Calculation for a 2dHMM
Chapter 3.4 --- Discussions
Chapter 4 --- Implementation
Chapter 4.1 --- Transputer based multiprocessor system
Chapter 4.1.1 --- Transputer Development System (TDS)
Chapter 4.1.2 --- System architecture
Chapter 4.1.3 --- Transtech TMB16 mother board
Chapter 4.1.4 --- Farming technique
Chapter 4.2 --- Farming technique on extracting spectral amplitude feature
Chapter 4.3 --- Feature extraction for LPC
Chapter 4.4 --- DTW based recognition
Chapter 4.4.1 --- Feature extraction
Chapter 4.4.2 --- Training and matching
Chapter 4.5 --- HMM based recognition
Chapter 4.5.1 --- Feature extraction
Chapter 4.5.2 --- Model training and matching
Chapter 4.6 --- 2dHMM based recognition
Chapter 4.6.1 --- Feature extraction
Chapter 4.6.2 --- Training
Chapter 4.6.3 --- Recognition
Chapter 4.7 --- Training convergence in HMM and 2dHMM
Chapter 4.8 --- Discussions
Chapter 5 --- Experimental Results
Chapter 5.1 --- Comparison of DTW, HMM and 2dHMM
Chapter 5.2 --- Comparison between HMM and 2dHMM
Chapter 5.2.1 --- Recognition test on 20 English words
Chapter 5.2.2 --- Recognition test on 10 Cantonese syllables
Chapter 5.3 --- Recognition test on 80 Cantonese syllables
Chapter 5.4 --- Speed matching
Chapter 5.5 --- Computational performance
Chapter 5.5.1 --- Training performance
Chapter 5.5.2 --- Recognition performance
Chapter 6 --- Discussions and Conclusions
Bibliography
Chapter A --- An ANN Model for Speech Recognition
Chapter B --- A Speech Signal Represented in Frequency Domain (Spectrogram)
Chapter C --- Dynamic Programming
Chapter D --- Markov Process
Chapter E --- Maximum Likelihood (ML)
Chapter F --- Multiple Training
Chapter F.1 --- HMM
Chapter F.2 --- 2dHMM
Chapter G --- IMS T800 Transputer
Chapter G.1 --- IMS T800 architecture
Chapter G.2 --- Instruction encoding
Chapter G.3 --- Floating point instructions
Chapter G.4 --- Optimizing use of the stack
Chapter G.5 --- Concurrent operation of FPU and CPU
Investigation into the wafer-scale integration of fine-grain parallel processing computer systems
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. This thesis investigates the potential of wafer-scale integration (WSI) for the implementation of low-cost fine-grain parallel processing computer systems. As WSI is a relatively new subject, there was little work on which to base investigations. Indeed, most WSI architectures existed only as untried and sometimes vague proposals. Accordingly, the research strategy approached this problem by identifying a representative WSI structure and architecture on which to base investigations. An analysis of architectural proposals identified associative memory to be a general-purpose parallel processing component used in a wide range of WSI architectures. Furthermore, this analysis provided a set of WSI-level design requirements to evaluate the suitability of different architectures as research vehicles. The WSI-ASP (WASP) device, which has a large associative memory as its main component, is shown to meet these requirements and hence was chosen as the research vehicle. Consequently, this thesis addresses WSI potential through an in-depth investigation into the feasibility of implementing a large associative memory for the WASP device that meets the demanding technological constraints of WSI. Overall, the thesis concludes that WSI offers significant potential for the implementation of low-cost fine-grain parallel processing computer systems. However, due to the dual constraints of thermal management and the area required for the power distribution network, power density is a major design constraint in WSI. Indeed, it is shown that WSI power densities need to be an order of magnitude lower than VLSI power densities. The thesis demonstrates that, for associative memories at least, VLSI designs are unsuited to implementation in WSI. Rather, it is shown that WSI circuits must be closely matched to the operational environment to assure suitable power densities. These circuits are significantly larger than their VLSI equivalents. Nonetheless, the thesis demonstrates that by concentrating on the most power-intensive circuits, it is possible to achieve acceptable power densities with only a modest increase in area overheads.