1,526 research outputs found
Hardware/Software Co-design Applied to Reed-Solomon Decoding for the DMB Standard
This paper addresses the implementation of Reed-
Solomon decoding for battery-powered wireless
devices. The scope of this paper is constrained by the
Digital Media Broadcasting (DMB). The most critical
element of the Reed-Solomon algorithm is implemented
on two different reconfigurable hardware
architectures: an FPGA and a coarse-grained
architecture: the Montium, The remaining parts are
executed on an ARM processor. The results of this
research show that a co-design of the ARM together
with an FPGA or a Montium leads to a substantial
decrease in energy consumption. The energy
consumption of syndrome calculation of the Reed-
Solomon decoding algorithm is estimated for an FPGA
and a Montium by means of simulations. The Montium
proves to be more efficient
Functional diagnosability and recovery from massive faults in digital systems Quarterly progress reports, 17 May - 16 Nov. 1970 /final/
Diagnosability and recovery from massive faults in digital system
A Microprocessor based hybrid system for digital error correction
The design of a microprocessor based hybrid system for digital error correction is presented. It is shown that such a system allows for implementation of several cyclic codes at a variety of throughput rates providing variable degrees of error correction depending on current user requirements. The theoretical basis for encoding and decoding of binary BCH codes is reviewed. Design and implementation of system hardware and software are described. A method for injection of independent bit errors with controllable statistics into the system is developed, and its accuracy verified by computer simulation. This method of controllable error injection is used to test performance of the designed system. In analysis, these results demonstrate the flexibility of operation provided by the hybrid nature of the system. Finally, potential applications and modifications are presented to reinforce the wide applicability of the system described in this thesis
VLSI architecture design approaches for real-time video processing
This paper discusses the programmable and dedicated approaches for real-time video processing applications. Various VLSI architecture including the design examples of both approaches are reviewed. Finally, discussions of several practical designs in real-time video processing applications are then considered in VLSI architectures to provide significant guidelines to VLSI designers for any further real-time video processing design works
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization
OpenForensics:a digital forensics GPU pattern matching approach for the 21st century
Pattern matching is a crucial component employed in many digital forensic (DF) analysis techniques, such as file-carving. The capacity of storage available on modern consumer devices has increased substantially in the past century, making pattern matching approaches of current generation DF tools increasingly ineffective in performing timely analyses on data seized in a DF investigation. As pattern matching is a trivally parallelisable problem, general purpose programming on graphic processing units (GPGPU) is a natural fit for this problem. This paper presents a pattern matching framework - OpenForensics - that demonstrates substantial performance improvements from the use of modern parallelisable algorithms and graphic processing units (GPUs) to search for patterns within forensic images and local storage devices
Homogeneous and heterogeneous MPSoC architectures with network-on-chip connectivity for low-power and real-time multimedia signal processing
Two multiprocessor system-on-chip (MPSoC) architectures are proposed and compared in the paper with reference to audio and video processing applications. One architecture exploits a homogeneous topology; it consists of 8 identical tiles, each made of a 32-bit RISC core enhanced by a 64-bit DSP coprocessor with local memory. The other MPSoC architecture exploits a heterogeneous-tile topology with on-chip distributed memory resources; the tiles act as application specific processors supporting a different class of algorithms. In both architectures, the multiple tiles are interconnected by a network-on-chip (NoC) infrastructure, through network interfaces and routers, which allows parallel operations of the multiple tiles. The functional performances and the implementation complexity of the NoC-based MPSoC architectures are assessed by synthesis results in submicron CMOS technology. Among the large set of supported algorithms, two case studies are considered: the real-time implementation of an H.264/MPEG AVC video codec and of a low-distortion digital audio amplifier. The heterogeneous architecture ensures a higher power efficiency and a smaller area occupation and is more suited for low-power multimedia processing, such as in mobile devices. The homogeneous scheme allows for a higher flexibility and easier system scalability and is more suited for general-purpose DSP tasks in power-supplied devices
Astrophysical Supercomputing with GPUs: Critical Decisions for Early Adopters
General purpose computing on graphics processing units (GPGPU) is
dramatically changing the landscape of high performance computing in astronomy.
In this paper, we identify and investigate several key decision areas, with a
goal of simplyfing the early adoption of GPGPU in astronomy. We consider the
merits of OpenCL as an open standard in order to reduce risks associated with
coding in a native, vendor-specific programming environment, and present a GPU
programming philosophy based on using brute force solutions. We assert that
effective use of new GPU-based supercomputing facilities will require a change
in approach from astronomers. This will likely include improved programming
training, an increased need for software development best-practice through the
use of profiling and related optimisation tools, and a greater reliance on
third-party code libraries. As with any new technology, those willing to take
the risks, and make the investment of time and effort to become early adopters
of GPGPU in astronomy, stand to reap great benefits.Comment: 13 pages, 5 figures, accepted for publication in PAS
- …