116 research outputs found
New Motion Estimation Algorithm and Its Block-Matching Criteria Using Low-Resolution Quantization
We propose a new motion estimation algorithm
and its block-matching criteria using low-resolution
quantization. The proposed algorithm reduces both the huge
computational cost of the full search algorithm and the
performance degradation of the fast algorithms by matching
the low-resolution images. Two search steps, called the low-resolution
search and the full-resolution search, are employed.
Simulation results show that the PSNR of the proposed
algorithm is superior to that of the 4:1 alternate subsampling
algorithm, at a lower computational cost. Its computational cost
is 1/38.1 of that of the full search algorithm.
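The two-step idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the quantizer or the matching criteria, so everything here is an assumption made for illustration: the quantizer keeps the top `bits` bits of an 8-bit pixel, both steps use the sum of absolute differences, and the names `quantize`, `sad`, `two_step_search`, and the `keep` parameter are hypothetical.

```python
def quantize(frame, bits=2):
    """Assumed quantizer: keep only the top `bits` bits of each 8-bit pixel."""
    shift = 8 - bits
    return [[p >> shift for p in row] for row in frame]


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def two_step_search(cur, ref, bx, by, n=4, search=2, keep=3, bits=2):
    """Return the motion vector (dx, dy) chosen by a low-resolution search
    followed by a full-resolution search over the surviving candidates."""
    h, w = len(ref), len(ref[0])
    # All displacements in the search window that stay inside the frame.
    cands = [(dx, dy)
             for dy in range(-search, search + 1)
             for dx in range(-search, search + 1)
             if 0 <= bx + dx and bx + dx + n <= w
             and 0 <= by + dy and by + dy + n <= h]
    cur_q, ref_q = quantize(cur, bits), quantize(ref, bits)
    # Step 1 (low-resolution search): rank every candidate on quantized pixels.
    ranked = sorted(cands, key=lambda d: sad(cur_q, ref_q, bx, by, d[0], d[1], n))
    # Step 2 (full-resolution search): exact SAD on the best `keep` candidates.
    return min(ranked[:keep], key=lambda d: sad(cur, ref, bx, by, d[0], d[1], n))
```

In this sketch the cost saving comes from evaluating the full-resolution criterion on only `keep` candidates; the low-resolution step operates on a few bits per pixel, which is what makes it cheap in hardware.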
Hardware implementation of inter-processor communication in MPSoCs for multimedia applications
In this paper we present a scalable and flexible architecture
that implements inter-processor communication (IPC) synchronization
among FIFO channels for multimedia applications. We also compare it
to the simple mail-box architecture, especially for tasks of finer
granularity. Experimental results confirm that the proposed
architecture is suitable for various cases, including a Motion JPEG
example.
Implementation of an OpenVG Rasterizer with Configurable Anti-aliasing and Multi-window Scissoring
This paper describes an OpenVG-compliant hardware
rasterizer with configurable anti-aliasing and multi-window
scissoring. This rasterizer requires 129K logic gates with
2KB on-chip SRAM and provides satisfactory image quality
with a reasonable rasterizer speed at the operational
frequency of 100MHz. In this paper, we propose an optimized
scanline algorithm, which provides better performance
than the conventional scanline algorithm with supersampling
while maintaining the flexibility and the hardware
simplicity. We also propose a fast LUT-based scissoring algorithm,
which has zero latency in most cases. The
hardware implementation of this rasterizer is explained in
detail.
A C/C++-based functional verification framework using the SystemC verification library
This paper describes SoCBase-VL, which is a C/C++
based integrated framework for SoC functional verification.
It has a layered architecture which provides easier testbench
description, automatic verification of bus interfaces
and seamless testbench migration. This framework does not
require verification engineers to learn other verification
languages as long as they have sufficient knowledge of
both C/C++ and SystemC. We have confirmed its usefulness
by applying it to the verification of a TFT-LCD controller.
Cache Optimization for H.264/AVC Motion Compensation
In this letter, we propose a cache organization that substantially
reduces the memory bandwidth of motion compensation (MC) in
H.264/AVC decoders. To reduce duplicated memory accesses to P and
B pictures, we employ a four-way set-associative cache whose index
bits are composed of horizontal and vertical address bits of the frame buffer
and in which each line stores an 8 × 2 block of pixel data from the reference frames. Moreover,
we alleviate the data fragmentation problem by selecting a line size that
equals the minimum access size of the DDR SDRAM. Averaged over five QCIF
IBBP image sequences, the optimized cache requires
only 129% of the essential bandwidth of H.264/AVC MC.
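A minimal sketch of the cache organization described in this abstract follows. The abstract states the 8 × 2 line shape, the four ways, and that the index mixes horizontal and vertical address bits; the number of index bits per direction (`X_BITS`, `Y_BITS`), the LRU replacement policy, and the names `split_address` and `McCache` are assumptions made for illustration only.

```python
from collections import OrderedDict

LINE_W, LINE_H = 8, 2   # each cache line holds an 8 x 2 block of pixels
X_BITS, Y_BITS = 3, 3   # assumed number of index bits taken from x and from y
WAYS = 4                # four-way set-associative


def split_address(x, y):
    """Map a pixel coordinate to (set index, tag).

    The index concatenates the low bits of the horizontal and vertical
    line coordinates, so neighbouring lines in both directions fall into
    different sets.
    """
    lx, ly = x // LINE_W, y // LINE_H
    index = ((ly & ((1 << Y_BITS) - 1)) << X_BITS) | (lx & ((1 << X_BITS) - 1))
    tag = (lx >> X_BITS, ly >> Y_BITS)
    return index, tag


class McCache:
    """Toy model that only counts hits and misses (LRU is an assumption)."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(1 << (X_BITS + Y_BITS))]
        self.hits = 0
        self.misses = 0

    def access(self, x, y):
        index, tag = split_address(x, y)
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)          # mark as most recently used
            self.hits += 1
        else:
            if len(lines) == WAYS:
                lines.popitem(last=False)   # evict the least recently used line
            lines[tag] = True
            self.misses += 1

    def fetch_block(self, x0, y0, w, h):
        """Touch every pixel of a w x h reference block."""
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                self.access(x, y)
```

Each miss in this model corresponds to one line fill from external memory, so counting misses over overlapping reference blocks gives a rough feel for how much duplicated traffic a cache of this shape can absorb.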
High performance IPC hardware accelerator and communication network for MPSoCs
In this paper, we explain a configurable IPC module
for multimedia MPSoCs, which was implemented in an MPW chip
that includes three ARM7 CPU cores. According to the test results
for an M-JPEG and an H.264 decoder, its IPC synchronization
overheads are not more than 1% when the synchronization
period is about 5000 cycles.
This work was supported by the IC Design Education
Center (IDEC) in KAIST, and the Seoul R&BD Program.
Fast design of reduced complexity nearest-neighbor classifiers using triangular inequality
In this paper, we propose a method of designing a reduced complexity nearest-neighbor (RCNN) classifier with near-minimal computational complexity from a given nearest-neighbor classifier that has high input dimensionality and a large number of class vectors. We applied our method to the classification problem of handwritten numerals in the NIST database. If the complexity of the RCNN classifier is normalized to that of the given classifier, the complexity of the derived classifier is 62 percent, 2 percent higher than that of the optimal classifier. This was found using exhaustive search.
Institute of Information Technology Assessment (IITA), Korea, under research grant 96060-IT2-I2
Partial Bus-Invert Coding for Power Optimization of Application-Specific Systems
This paper presents two bus coding schemes for power optimization
of application-specific systems: Partial Bus-Invert coding and its
extension to Multiway Partial Bus-Invert coding. In the first scheme, only
a selected subgroup of bus lines is encoded to avoid unnecessary inversion
of relatively inactive and/or uncorrelated bus lines which are not included
in the subgroup. In the extended scheme, we partition a bus into multiple
subbuses by clustering highly correlated bus lines and then encode each
subbus independently. We describe a heuristic algorithm for partitioning a
bus into subbuses for each encoding scheme. Experimental results for various
examples indicate that both encoding schemes are highly efficient for
application-specific systems.
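The first scheme in this abstract can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes the subgroup is given as a bit mask, that all bus lines start at 0, and that inversion is decided by a plain majority of toggling lines within the subgroup. The names `pbi_encode`, `pbi_decode`, and `transitions` are hypothetical, and the heuristic that selects the subgroup is not modelled.

```python
def popcount(v):
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")


def pbi_encode(words, mask):
    """Encode a word sequence; only bus lines selected by `mask` may be inverted.

    Returns a list of (bus_value, invert_line) pairs.
    """
    k = popcount(mask)
    prev = 0                      # assumption: every bus line starts at 0
    out = []
    for w in words:
        flips = popcount((prev ^ w) & mask)   # toggles inside the subgroup
        invert = 2 * flips > k                # more than half would toggle
        bus = w ^ mask if invert else w       # lines outside the mask pass through
        out.append((bus, int(invert)))
        prev = bus
    return out


def pbi_decode(encoded, mask):
    """Recover the original words from (bus_value, invert_line) pairs."""
    return [bus ^ mask if inv else bus for bus, inv in encoded]


def transitions(encoded):
    """Total line toggles, counting the extra invert line as well."""
    total, prev_bus, prev_inv = 0, 0, 0
    for bus, inv in encoded:
        total += popcount(prev_bus ^ bus) + (prev_inv ^ inv)
        prev_bus, prev_inv = bus, inv
    return total
```

Calling `pbi_encode` with `mask=0` leaves the stream unencoded, which gives a baseline for `transitions`. Under the same assumptions, the multiway extension would apply the same decision independently to each subbus mask, with one invert line per subbus.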
Reusable Component IP Design using Refinement-based Design Environment
We propose a method of enhancing the reusability of
the component IPs by separating communication and
computation for a system function. In this approach, we assume
that the component designers describe mainly the computation
part of the component, and the system designer can construct
the communication part by using our refinement-based design
environment. Moreover, we introduce the concept of the
Communication Architecture Template Tree (CATree), which
helps IP designers to effectively separate computation and
communication for a system function. We confirmed that this
approach is effective by applying it to an H.264 decoder design.
A mixed-level virtual prototyping environment for refinement-based design environment
The Communication Architecture Template Tree (CATtree)
is an abstraction of the specific range of communication
functions and architectures, which can facilitate system
function capture and communication architecture refinement.
In this paper, we explain a TLM-RTL-SW mixed-level
simulation environment that is useful for the functional
verification of partially refined system models. We
employed SystemC, GNU GDB and an HDL simulator for the
simulation of CATtree-based TLM, SW and HW, respectively.
We also employed a new operating system, DEOS, so
that each SystemC-based TLM can be cross-compiled and
executed as a software model on the target processors.
We evaluated the flexibility and simulation performance of
the virtual simulation environment with an H.264 decoder
design example.