9,227 research outputs found
Coarse-grained reconfigurable array architectures
Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support, and for the manual fine-tuning of source code.
Low Power system Design techniques for mobile computers
Portable products are being used increasingly. Because these systems are battery powered, reducing power consumption is vital. In this report we give the properties of low power design and techniques to exploit them on the architecture of the system. We focus on: minimizing capacitance, avoiding unnecessary and wasteful activity, and reducing voltage and frequency. We review energy reduction techniques in the architecture and design of a hand-held computer and the wireless communication system, including error control, system decomposition, communication and MAC protocols, and low power short range networks.
Design techniques for low-power systems
Portable products are being used increasingly. Because these systems are battery powered, reducing power consumption is vital. In this report we give the properties of low-power design and techniques to exploit them on the architecture of the system. We focus on: minimizing capacitance, avoiding unnecessary and wasteful activity, and reducing voltage and frequency. We review energy reduction techniques in the architecture and design of a hand-held computer and the wireless communication system, including error control, system decomposition, communication and MAC protocols, and low-power short range networks.
Combining local regularity estimation and total variation optimization for scale-free texture segmentation
Texture segmentation constitutes a standard image processing task, crucial to
many applications. The present contribution focuses on the particular subset of
scale-free textures, and its originality resides in the combination of three key
ingredients. First, texture characterization relies on the concept of local
regularity. Second, estimation of local regularity is based on new multiscale
quantities referred to as wavelet leaders. Third, segmentation from local
regularity faces a fundamental bias-variance trade-off: by nature, local
regularity estimation shows high variability that impairs the detection of
changes, while a posteriori smoothing of regularity estimates prevents changes
from being located correctly. Instead, the present contribution proposes several
variational problem formulations based on total variation and proximal
resolutions that effectively circumvent this trade-off. Estimation and
segmentation performance for the proposed procedures are quantified and
compared on synthetic as well as on real-world textures.
Efficient hardware architectures for MPEG-4 core profile
Efficient hardware acceleration architectures are proposed for the most demanding MPEG-4 core profile algorithms, namely texture motion estimation (TME), binary motion estimation (BME), and the shape adaptive discrete cosine transform (SA-DCT). The proposed ME designs may also be used for H.264, since both architectures can handle variable block sizes. Both ME architectures employ early termination techniques that reduce latency and save needless memory accesses and power consumption. They also use a pixel subsampling technique to facilitate parallelism while balancing the computational load. The BME datapath also saves operations by using Run Length Coded (RLC) pixel addressing. The SA-DCT module has a re-configuring multiplier-less serial datapath using adders and multiplexers only to improve area and power. The SA-DCT packing steps are done using a minimal switching addressing scheme with guarded evaluation. All three modules have been synthesised targeting the WildCard-II FPGA benchmarking platform adopted by the MPEG-4 Part 9 reference hardware group.
Automatic Nested Loop Acceleration on FPGAs Using Soft CGRA Overlay
Session 1: HLS Tooling (postprint)
Wavelets: mathematics and applications
The notion of wavelets is defined. It is briefly described what wavelets
are, how to use them, when they are needed, why they are
preferred, and where they have been applied. Then one proceeds to the
multiresolution analysis and fast wavelet transform as a standard procedure for
dealing with discrete wavelets. It is shown which specific features of signals
(functions) can be revealed by this analysis but cannot be found by other
methods (e.g., by the Fourier expansion). Finally, some examples of practical
application are given (in particular, to analysis of multiparticle production).
Rigorous proofs of mathematical statements are omitted, and the reader is
referred to the corresponding literature. Comment: 16 pages, 5 figures, LaTeX, Phys. Atom. Nuc