492 research outputs found

    Acceleration Techniques for Sparse Recovery Based Plane-wave Decomposition of a Sound Field

    Get PDF
    Plane-wave decomposition by sparse recovery is a reliable and accurate technique for plane-wave decomposition which can be used for source localization, beamforming, etc. In this work, we introduce techniques to accelerate the plane-wave decomposition by sparse recovery. The method consists of two main algorithms which are spherical Fourier transformation (SFT) and sparse recovery. Comparing the two algorithms, the sparse recovery is the most computationally intensive. We implement the SFT on an FPGA and the sparse recovery on a multithreaded computing platform. Then the multithreaded computing platform could be fully utilized for the sparse recovery. On the other hand, implementing the SFT on an FPGA helps to flexibly integrate the microphones and improve the portability of the microphone array. For implementing the SFT on an FPGA, we develop a scalable FPGA design model that enables the quick design of the SFT architecture on FPGAs. The model considers the number of microphones, the number of SFT channels and the cost of the FPGA and provides the design of a resource optimized and cost-effective FPGA architecture as the output. Then we investigate the performance of the sparse recovery algorithm executed on various multithreaded computing platforms (i.e., chip-multiprocessor, multiprocessor, GPU, manycore). Finally, we investigate the influence of modifying the dictionary size on the computational performance and the accuracy of the sparse recovery algorithms. We introduce novel sparse-recovery techniques which use non-uniform dictionaries to improve the performance of the sparse recovery on a parallel architecture

    Real-time sound synthesis on a multi-processor platform

    Get PDF
    Real-time sound synthesis means that the calculation and output of each sound sample for a channel of audio information must be completed within a sample period. At a broadcasting standard, a sampling rate of 32,000 Hz, the maximum period available is 31.25 μsec. Such requirements demand a large amount of data processing power. An effective solution for this problem is a multi-processor platform; a parallel and distributed processing system. The suitability of the MIDI [Music Instrument Digital Interface] standard, published in 1983, as a controller for real-time applications is examined. Many musicians have expressed doubts on the decade old standard's ability for real-time performance. These have been investigated by measuring timing in various musical gestures, and by comparing these with the subjective characteristics of human perception. An implementation and its optimisation of real-time additive synthesis programs on a multi-transputer network are described. A prototype 81-polyphonic-note- organ configuration was implemented. By devising and deploying monitoring processes, the network's performance was measured and enhanced, leading to an efficient usage; the 88-note configuration. Since 88 simultaneous notes are rarely necessary in most performances, a scheduling program for dynamic note allocation was then introduced to achieve further efficiency gains. Considering calculation redundancies still further, a multi-sampling rate approach was applied as a further step to achieve an optimal performance. The theories underlining sound granulation, as a means of constructing complex sounds from grains, and the real-time implementation of this technique are outlined. The idea of sound granulation is quite similar to the quantum-wave theory, "acoustic quanta". Despite the conceptual simplicity, the signal processing requirements set tough demands, providing a challenge for this audio synthesis engine. Three issues arising from the results of the implementations above are discussed; the efficiency of the applications implemented, provisions for new processors and an optimal network architecture for sound synthesis

    Media gateway utilizando um GPU

    Get PDF
    Mestrado em Engenharia de Computadores e Telemátic

    Doctor of Philosophy

    Get PDF
    dissertationThe embedded system space is characterized by a rapid evolution in the complexity and functionality of applications. In addition, the short time-to-market nature of the business motivates the use of programmable devices capable of meeting the conflicting constraints of low-energy, high-performance, and short design times. The keys to achieving these conflicting constraints are specialization and maximally extracting available application parallelism. General purpose processors are flexible but are either too power hungry or lack the necessary performance. Application-specific integrated circuits (ASICS) efficiently meet the performance and power needs but are inflexible. Programmable domain-specific architectures (DSAs) are an attractive middle ground, but their design requires significant time, resources, and expertise in a variety of specialties, which range from application algorithms to architecture and ultimately, circuit design. This dissertation presents CoGenE, a design framework that automates the design of energy-performance-optimal DSAs for embedded systems. For a given application domain and a user-chosen initial architectural specification, CoGenE consists of a a Compiler to generate execution binary, a simulator Generator to collect performance/energy statistics, and an Explorer that modifies the current architecture to improve energy-performance-area characteristics. The above process repeats automatically until the user-specified constraints are achieved. This removes or alleviates the time needed to understand the application, manually design the DSA, and generate object code for the DSA. Thus, CoGenE is a new design methodology that represents a significant improvement in performance, energy dissipation, design time, and resources. This dissertation employs the face recognition domain to showcase a flexible architectural design methodology that creates "ASIC-like" DSAs. The DSAs are instruction set architecture (ISA)-independent and achieve good energy-performance characteristics by coscheduling the often conflicting constraints of data access, data movement, and computation through a flexible interconnect. This represents a significant increase in programming complexity and code generation time. To address this problem, the CoGenE compiler employs integer linear programming (ILP)-based 'interconnect-aware' scheduling techniques for automatic code generation. The CoGenE explorer employs an iterative technique to search the complete design space and select a set of energy-performance-optimal candidates. When compared to manual designs, results demonstrate that CoGenE produces superior designs for three application domains: face recognition, speech recognition and wireless telephony. While CoGenE is well suited to applications that exhibit a streaming behavior, multithreaded applications like ray tracing present a different but important challenge. To demonstrate its generality, CoGenE is evaluated in designing a novel multicore N-wide SIMD architecture, known as StreamRay, for the ray tracing domain. CoGenE is used to synthesize the SIMD execution cores, the compiler that generates the application binary, and the interconnection subsystem. Further, separating address and data computations in space reduces data movement and contention for resources, thereby significantly improving performance compared to existing ray tracing approaches

    34th Midwest Symposium on Circuits and Systems-Final Program

    Get PDF
    Organized by the Naval Postgraduate School Monterey California. Cosponsored by the IEEE Circuits and Systems Society. Symposium Organizing Committee: General Chairman-Sherif Michael, Technical Program-Roberto Cristi, Publications-Michael Soderstrand, Special Sessions- Charles W. Therrien, Publicity: Jeffrey Burl, Finance: Ralph Hippenstiel, and Local Arrangements: Barbara Cristi

    The Role of Computers in Research and Development at Langley Research Center

    Get PDF
    This document is a compilation of presentations given at a workshop on the role cf computers in research and development at the Langley Research Center. The objectives of the workshop were to inform the Langley Research Center community of the current software systems and software practices in use at Langley. The workshop was organized in 10 sessions: Software Engineering; Software Engineering Standards, methods, and CASE tools; Solutions of Equations; Automatic Differentiation; Mosaic and the World Wide Web; Graphics and Image Processing; System Design Integration; CAE Tools; Languages; and Advanced Topics

    High-level synthesis of VLSI circuits

    Get PDF
    • …
    corecore