16,500 research outputs found
Recommended from our members
Branch prediction apparatus, systems, and methods
An apparatus and a system, as well as a method and article, may operate to predict a branch within a first operating context, such as a user context, using a first strategy; and to predict a branch within a second operating context, such as an operating system context, using a second strategy. In some embodiments, apparatus and systems may comprise one or more first storage locations to store branch history information associated with a first operating context, and one ore more second storage locations to store branch history information associated with a second operating context.Board of Regents, University of Texas Syste
The SDP-3 - A computer for use on board small scientific spacecraft
SDP three computer for use aboard small scientific spacecraf
A Study of Energy and Locality Effects using Space-filling Curves
The cost of energy is becoming an increasingly important driver for the
operating cost of HPC systems, adding yet another facet to the challenge of
producing efficient code. In this paper, we investigate the energy implications
of trading computation for locality using Hilbert and Morton space-filling
curves with dense matrix-matrix multiplication. The advantage of these curves
is that they exhibit an inherent tiling effect without requiring specific
architecture tuning. By accessing the matrices in the order determined by the
space-filling curves, we can trade computation for locality. The index
computation overhead of the Morton curve is found to be balanced against its
locality and energy efficiency, while the overhead of the Hilbert curve
outweighs its improvements on our test system.Comment: Proceedings of the 2014 IEEE International Parallel & Distributed
Processing Symposium Workshops (IPDPSW
A VLSI architecture of JPEG2000 encoder
Copyright @ 2004 IEEEThis paper proposes a VLSI architecture of JPEG2000 encoder, which functionally consists of two parts: discrete wavelet transform (DWT) and embedded block coding with optimized truncation (EBCOT). For DWT, a spatial combinative lifting algorithm (SCLA)-based scheme with both 5/3 reversible and 9/7 irreversible filters is adopted to reduce 50% and 42% multiplication computations, respectively, compared with the conventional lifting-based implementation (LBI). For EBCOT, a dynamic memory control (DMC) strategy of Tier-1 encoding is adopted to reduce 60% scale of the on-chip wavelet coefficient storage and a subband parallel-processing method is employed to speed up the EBCOT context formation (CF) process; an architecture of Tier-2 encoding is presented to reduce the scale of on-chip bitstream buffering from full-tile size down to three-code-block size and considerably eliminate the iterations of the rate-distortion (RD) truncation.This work was supported in part by the China National High Technologies Research Program (863) under Grant 2002AA1Z142
STV-based Video Feature Processing for Action Recognition
In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end
- …