8,437 research outputs found
Overview of Parallel Platforms for Common High Performance Computing
The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, the methods exploiting the multicores central processing units such as message passing interface and OpenMP are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU based computing methods and with possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and new, fast routines' implementation is proposed as well
Current and Nascent SETI Instruments
Here we describe our ongoing efforts to develop high-performance and
sensitive instrumentation for use in the search for extra-terrestrial
intelligence (SETI). These efforts include our recently deployed Search for
Extraterrestrial Emissions from Nearby Developed Intelligent Populations
Spectrometer (SERENDIP V.v) and two instruments currently under development;
the Heterogeneous Radio SETI Spectrometer (HRSS) for SETI observations in the
radio spectrum and the Optical SETI Fast Photometer (OSFP) for SETI
observations in the optical band. We will discuss the basic SERENDIP V.v
instrument design and initial analysis methodology, along with instrument
architectures and observation strategies for OSFP and HRSS. In addition, we
will demonstrate how these instruments may be built using low-cost, modular
components and programmed and operated by students using common languages, e.g.
ANSI C.Comment: 12 pages, 5 figures, Original version appears as Chapter 2 in "The
Proceedings of SETI Sessions at the 2010 Astrobiology Science Conference:
Communication with Extraterrestrial Intelligence (CETI)," Douglas A. Vakoch,
Edito
An Adaptive Design Methodology for Reduction of Product Development Risk
Embedded systems interaction with environment inherently complicates
understanding of requirements and their correct implementation. However,
product uncertainty is highest during early stages of development. Design
verification is an essential step in the development of any system, especially
for Embedded System. This paper introduces a novel adaptive design methodology,
which incorporates step-wise prototyping and verification. With each adaptive
step product-realization level is enhanced while decreasing the level of
product uncertainty, thereby reducing the overall costs. The back-bone of this
frame-work is the development of Domain Specific Operational (DOP) Model and
the associated Verification Instrumentation for Test and Evaluation, developed
based on the DOP model. Together they generate functionally valid test-sequence
for carrying out prototype evaluation. With the help of a case study 'Multimode
Detection Subsystem' the application of this method is sketched. The design
methodologies can be compared by defining and computing a generic performance
criterion like Average design-cycle Risk. For the case study, by computing
Average design-cycle Risk, it is shown that the adaptive method reduces the
product development risk for a small increase in the total design cycle time.Comment: 21 pages, 9 figure
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged with memory bottlenecks that require many convolution and fully-connected layers demanding large amount of communication for parallel computation. Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded BlockRAM. This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification of CNNs. Both Altera and Xilinx have adopted OpenCL co-design framework from GPU for FPGA designs as a pseudo-automatic development solution. In this paper, a comprehensive evaluation and comparison of Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platforms tools, mature design community and better execution times
Digital signal processing: the impact of convergence on education, society and design flow
Design and development of real-time, memory and processor hungry digital signal processing systems has for decades been accomplished on general-purpose microprocessors. Increasing needs for high-performance DSP systems made these microprocessors unattractive for such implementations. Various attempts to improve the performance of these systems resulted in the use of dedicated digital signal processing devices like DSP processors and the former heavyweight champion of electronics design â Application Specific Integrated Circuits.
The advent of RAM-based Field Programmable Gate Arrays has changed the DSP design flow. Software algorithmic designers can now take their DSP algorithms right from inception to hardware implementation, thanks to the increasing availability of software/hardware design flow or hardware/software co-design. This has led to a demand in the industry for graduates with good skills in both Electrical Engineering and Computer Science. This paper evaluates the impact of technology on DSP-based designs, hardware design languages, and how graduate/undergraduate courses have changed to suit this transition
Time-efficient fault detection and diagnosis system for analog circuits
Time-efficient fault analysis and diagnosis of analog circuits are the most important prerequisites to achieve online health monitoring of electronic equipments, which are involving continuing challenges of ultra-large-scale integration, component tolerance, limited test points but multiple faults. This work reports an FPGA (field programmable gate array)-based analog fault diagnostic system by applying two-dimensional information fusion, two-port network analysis and interval math theory. The proposed system has three advantages over traditional ones. First, it possesses high processing speed and smart circuit size as the embedded algorithms execute parallel on FPGA. Second, the hardware structure has a good compatibility with other diagnostic algorithms. Third, the equipped Ethernet interface enhances its flexibility for remote monitoring and controlling. The experimental results obtained from two realistic example circuits indicate that the proposed methodology had yielded competitive performance in both diagnosis accuracy and time-effectiveness, with about 96% accuracy while within 60 ms computational time.Peer reviewedFinal Published versio
- âŠ