734 research outputs found

    Towards flexible hardware/software encoding using H.264

    As the electronics world continues to expand, bringing smaller and more portable devices to consumers, demands for media access continue to rise. Consumers are seeking the ability to view the wealth of information available on the Internet from devices such as smart phones, tablets, and music players. In addition to Internet browsing, smart phones and tablets in particular look to reinvent phone communication by adding video chat through services such as Skype and FaceTime. Bringing video to mobile platforms requires trade-offs between size, channel capacity, hardware cost, quality, loading times, and power consumption. H.264, the current standard for video encoding, specifies multiple profiles to support different modes of operation and environments. Creating an H.264 video encoder for a mobile platform requires a proper balance between the aforementioned trade-offs while maintaining flexibility in a real-time environment such as video chatting. The goal of this thesis was to investigate the trade-offs of implementing the H.264 Baseline encoding process, specifically at low bit rates, in hardware and software using Field Programmable Gate Array (FPGA) reconfigurable resources with an embedded processor core on the same chip. To further preserve encoding flexibility, existing encoding parameters were left intact. The Joint Model (JM) Reference encoder, modified to include only the Baseline Profile, was used as an initial reference point to evaluate the efficacy of the finished encoder. To improve upon the initial software implementation, major software bottlenecks were identified and hardware accelerators were designed to produce a speedup sufficient to encode 176x144, or Quarter Common Intermediate Format (QCIF), videos in real time at 24 Frames Per Second (FPS) or greater. Finally, the hardware/software implementation was analyzed in comparison with the original JM Reference software encoder. This analysis included FPS, bit rate, encoding time, luminance Peak Signal-to-Noise Ratio (Y-PSNR), and associated hardware costs.
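
    For readers unfamiliar with the Y-PSNR metric named in the abstract above, the following is a minimal sketch of how luminance PSNR is commonly computed for 8-bit video. It is not taken from the thesis: the function name `y_psnr`, the flat-list plane representation, and the synthetic QCIF-sized test planes are all assumptions made for illustration.

```python
import math

# Hypothetical helper, not from the thesis: luminance PSNR in dB between
# two equally sized 8-bit Y planes given as flat sequences of samples.
def y_psnr(ref, test, peak=255):
    if len(ref) != len(test):
        raise ValueError("planes must have the same number of samples")
    if len(ref) == 0:
        raise ValueError("planes must not be empty")
    # Mean squared error over all luminance samples.
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak ** 2 / mse)

# Synthetic QCIF-sized (176x144) planes, assumed here purely for illustration.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144
reference = [128] * (QCIF_WIDTH * QCIF_HEIGHT)
decoded = [129] * (QCIF_WIDTH * QCIF_HEIGHT)  # every sample off by one

print(round(y_psnr(reference, decoded), 2))
```

    With every sample off by exactly one, the mean squared error is 1, so the result reduces to 20·log10(255), about 48.13 dB. Real encoder evaluations average this value over all frames of a sequence.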

    ANALOG SIGNAL PROCESSING SOLUTIONS AND DESIGN OF MEMRISTOR-CMOS ANALOG CO-PROCESSOR FOR ACCELERATION OF HIGH-PERFORMANCE COMPUTING APPLICATIONS

    Emerging applications in the fields of machine vision, deep learning, and scientific simulation require high computational speed and are run on platforms that are size, weight, and power constrained. With transistor scaling coming to an end, existing digital hardware architectures will not be able to meet these ever-increasing demands. Analog computation, with its rich set of primitives and inherently parallel architecture, can be faster, more efficient, and more compact for some of these applications. The major contribution of this work is to show that analog processing can be a viable solution to this problem. This is demonstrated in the three parts of the dissertation. In the first part of the dissertation, we demonstrate that analog processing can be used to solve the problem of stereo correspondence. Novel modifications to the algorithms are proposed that improve the computational speed and make them efficiently implementable in analog hardware. The analog domain implementation provides further speedup in computation and has lower power consumption than a digital implementation. In the second part of the dissertation, a prototype of an analog processor was developed using commercially available off-the-shelf components. The focus was on providing experimental results that demonstrate functionality and show that the performance of the prototype for low-level and mid-level image processing tasks is equivalent to a digital implementation. To demonstrate improvement in speed and power consumption, an integrated circuit design of the analog processor was proposed, and it was shown that such an analog processor would be faster than state-of-the-art digital and other analog processors. In the third part of the dissertation, a memristor-CMOS analog co-processor that can perform floating point vector matrix multiplication (VMM) is proposed. VMM computation underlies some of the major applications. To demonstrate the working of the analog co-processor at a system level, a new tool called PSpice Systems Option is used. It is shown that the analog co-processor has superior performance when compared to the projected performances of digital and analog processors. Using the new tool, various application simulations for image processing and solutions to partial differential equations are performed on the co-processor model.

    Video Processing Acceleration using Reconfigurable Logic and Graphics Processors

    A vexing question is 'which architecture will prevail as the core feature of the next state of the art video processing system?' This thesis examines the substitutive and collaborative use of the two alternatives of the reconfigurable logic and graphics processor architectures. A structured approach to executing architecture comparison is presented; this includes a proposed 'Three Axes of Algorithm Characterisation' scheme and a formulation of performance drivers. The approach is an appealing platform for clearly defining the problem, assumptions and results of a comparison. In this work it is used to resolve the advantageous factors of the graphics processor and reconfigurable logic for video processing, and the conditions determining which one is superior. The comparison results prompt the exploration of the customisable options for the graphics processor architecture. To clearly define the architectural design space, the graphics processor is first identified as part of a wider scope of homogeneous multi-processing element (HoMPE) architectures. A novel exploration tool is described which is suited to the investigation of the customisable options of HoMPE architectures. The tool adopts a systematic exploration approach and a high-level parameterisable system model, and is used to explore pre- and post-fabrication customisable options for the graphics processor. A positive result of the exploration is the proposal of a reconfigurable engine for data access (REDA) to optimise graphics processor performance for video processing-specific memory access patterns. REDA demonstrates the viability of the use of reconfigurable logic as collaborative 'glue logic' in the graphics processor architecture.