1,868 research outputs found

    Algorithms & implementation of advanced video coding standards

    Get PDF
    Advanced video coding standards have become widely deployed coding techniques used in numerous products, such as broadcast, video conference, mobile television and blu-ray disc, etc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, the trend also has brought many problems, such as, dramatically increased computational complexity, co-existing multiple standards and gradually increased development time. To solve the above problems, this thesis intends to investigate efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of H.264/AVC standard are inspected in this thesis: (1) Speeding up intra4x4 prediction with parallel architecture. (2) Applying an efficient rate control algorithm based on deviation measure to intra frame. Another aim of this thesis is to work on low-complexity algorithms for MPEG-2 to H.264/AVC transcoder. Three main mapping algorithms and a computational complexity reduction algorithm are focused by this thesis: motion vector mapping, block mapping, field-frame mapping and efficient modes ranking algorithms. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of MPEG-4 simple profile with the RVC framework. A key technique of automatically generating variable length decoder table is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is further modeled by RVC framework. Consequently, besides the available MPEG-4 simple profile and China audio/video standard, a new member is therefore added into the RVC framework family. A part of the research work presented in this thesis is targeted algorithms and implementation of video coding standards. In the wide topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage

    Domain-specific and reconfigurable instruction cells based architectures for low-power SoC

    Get PDF

    Algorithm/Architecture Co-Exploration of Visual Computing: Overview and Future Perspectives

    Get PDF
    Concurrently exploring both algorithmic and architectural optimizations is a new design paradigm. This survey paper addresses the latest research and future perspectives on the simultaneous development of video coding, processing, and computing algorithms with emerging platforms that have multiple cores and reconfigurable architecture. As the algorithms in forthcoming visual systems become increasingly complex, many applications must have different profiles with different levels of performance. Hence, with expectations that the visual experience in the future will become continuously better, it is critical that advanced platforms provide higher performance, better flexibility, and lower power consumption. To achieve these goals, algorithm and architecture co-design is significant for characterizing the algorithmic complexity used to optimize targeted architecture. This paper shows that seamless weaving of the development of previously autonomous visual computing algorithms and multicore or reconfigurable architectures will unavoidably become the leading trend in the future of video technology

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Get PDF
    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

    Exploring Processor and Memory Architectures for Multimedia

    Get PDF
    Multimedia has become one of the cornerstones of our 21st century society and, when combined with mobility, has enabled a tremendous evolution of our society. However, joining these two concepts introduces many technical challenges. These range from having sufficient performance for handling multimedia content to having the battery stamina for acceptable mobile usage. When taking a projection of where we are heading, we see these issues becoming ever more challenging by increased mobility as well as advancements in multimedia content, such as introduction of stereoscopic 3D and augmented reality. The increased performance needs for handling multimedia come not only from an ongoing step-up in resolution going from QVGA (320x240) to Full HD (1920x1080) a 27x increase in less than half a decade. On top of this, there is also codec evolution (MPEG-2 to H.264 AVC) that adds to the computational load increase. To meet these performance challenges there has been processing and memory architecture advances (SIMD, out-of-order superscalarity, multicore processing and heterogeneous multilevel memories) in the mobile domain, in conjunction with ever increasing operating frequencies (200MHz to 2GHz) and on-chip memory sizes (128KB to 2-3MB). At the same time there is an increase in requirements for mobility, placing higher demands on battery-powered systems despite the steady increase in battery capacity (500 to 2000mAh). This leaves negative net result in-terms of battery capacity versus performance advances. In order to make optimal use of these architectural advances and to meet the power limitations in mobile systems, there is a need for taking an overall approach on how to best utilize these systems. The right trade-off between performance and power is crucial. On top of these constraints, the flexibility aspects of the system need to be addressed. All this makes it very important to reach the right architectural balance in the system. The first goal for this thesis is to examine multimedia applications and propose a flexible solution that can meet the architectural requirements in a mobile system. Secondly, propose an automated methodology of optimally mapping multimedia data and instructions to a heterogeneous multilevel memory subsystem. The proposed methodology uses constraint programming for solving a multidimensional optimization problem. Results from this work indicate that using today’s most advanced mobile processor technology together with a multi-level heterogeneous on-chip memory subsystem can meet the performance requirements for handling multimedia. By utilizing the automated optimal memory mapping method presented in this thesis lower total power consumption can be achieved, whilst performance for multimedia applications is improved, by employing enhanced memory management. This is achieved through reduced external accesses and better reuse of memory objects. This automatic method shows high accuracy, up to 90%, for predicting multimedia memory accesses for a given architecture

    High-Level Synthesis Based VLSI Architectures for Video Coding

    Get PDF
    High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified

    Baseband analog front-end and digital back-end for reconfigurable multi-standard terminals

    Get PDF
    Multimedia applications are driving wireless network operators to add high-speed data services such as Edge (E-GPRS), WCDMA (UMTS) and WLAN (IEEE 802.11a,b,g) to the existing GSM network. This creates the need for multi-mode cellular handsets that support a wide range of communication standards, each with a different RF frequency, signal bandwidth, modulation scheme etc. This in turn generates several design challenges for the analog and digital building blocks of the physical layer. In addition to the above-mentioned protocols, mobile devices often include Bluetooth, GPS, FM-radio and TV services that can work concurrently with data and voice communication. Multi-mode, multi-band, and multi-standard mobile terminals must satisfy all these different requirements. Sharing and/or switching transceiver building blocks in these handsets is mandatory in order to extend battery life and/or reduce cost. Only adaptive circuits that are able to reconfigure themselves within the handover time can meet the design requirements of a single receiver or transmitter covering all the different standards while ensuring seamless inter-interoperability. This paper presents analog and digital base-band circuits that are able to support GSM (with Edge), WCDMA (UMTS), WLAN and Bluetooth using reconfigurable building blocks. The blocks can trade off power consumption for performance on the fly, depending on the standard to be supported and the required QoS (Quality of Service) leve
    • …
    corecore