6,238 research outputs found

    Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

    Get PDF
    Real-time and high-quality video coding is gaining a wide interest in the research and industrial community for different applications. H.264/AVC, a recent standard for high performance video coding, can be successfully exploited in several scenarios including digital video broadcasting, high-definition TV and DVD-based systems, which require to sustain up to tens of Mbits/s. To that purpose this paper proposes optimized architectures for H.264/AVC most critical tasks, Motion estimation and context adaptive binary arithmetic coding. Post synthesis results on sub-micron CMOS standard-cells technologies show that the proposed architectures can actually process in real-time 720 × 480 video sequences at 30 frames/s and grant more than 50 Mbits/s. The achieved circuit complexity and power consumption budgets are suitable for their integration in complex VLSI multimedia systems based either on AHB bus centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for MPSoC (Multi-Processor System on Chip

    Complexity Analysis Of Next-Generation VVC Encoding and Decoding

    Full text link
    While the next generation video compression standard, Versatile Video Coding (VVC), provides a superior compression efficiency, its computational complexity dramatically increases. This paper thoroughly analyzes this complexity for both encoder and decoder of VVC Test Model 6, by quantifying the complexity break-down for each coding tool and measuring the complexity and memory requirements for VVC encoding/decoding. These extensive analyses are performed for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD), Random-Access (RA), and All-Intra (AI) conditions (a total of 320 encoding/decoding). Results indicate that the VVC encoder and decoder are 5x and 1.5x more complex compared to HEVC in LD, and 31x and 1.8x in AI, respectively. Detailed analysis of coding tools reveals that in LD on average, motion estimation tools with 53%, transformation and quantization with 22%, and entropy coding with 7% dominate the encoding complexity. In decoding, loop filters with 30%, motion compensation with 20%, and entropy decoding with 16%, are the most complex modules. Moreover, the required memory bandwidth for VVC encoding/decoding are measured through memory profiling, which are 30x and 3x of HEVC. The reported results and insights are a guide for future research and implementations of energy-efficient VVC encoder/decoder.Comment: IEEE ICIP 202

    Efficient hardware architectures for MPEG-4 core profile

    Get PDF
    Efficient hardware acceleration architectures are proposed for the most demandingMPEG-4 core profile algorithms, namely; texture motion estimation (TME), binary motion estimation (BME)and the shape adaptive discrete cosine transform (SA-DCT). The proposed ME designs may also be used for H.264, since both architectures can handle variable block sizes. Both ME architectures employ early termination techniques that reduce latency and save needless memory accesses and power consumption. They also use a pixel subsampling technique to facilitate parallelism, while balancing the computational load. The BME datapath also saves operations by using Run Length Coded (RLC) pixel addressing. The SA-DCT module has a re-configuring multiplier-less serial datapath using adders and multiplexers only to improve area and power. The SA-DCT packing steps are done using a minimal switching addressing scheme with guarded evaluation. All three modules have been synthesised targeting the WildCard-II FPGA benchmarking platform adopted by the MPEG-4 Part9 reference hardware group

    Joint source-channel multistream coding and optical network adapter design for video over IP

    Full text link

    Energy-efficient acceleration of MPEG-4 compression tools

    Get PDF
    We propose novel hardware accelerator architectures for the most computationally demanding algorithms of the MPEG-4 video compression standard-motion estimation, binary motion estimation (for shape coding), and the forward/inverse discrete cosine transforms (incorporating shape adaptive modes). These accelerators have been designed using general low-energy design philosophies at the algorithmic/architectural abstraction levels. The themes of these philosophies are avoiding waste and trading area/performance for power and energy gains. Each core has been synthesised targeting TSMC 0.09 μm TCBN90LP technology, and the experimental results presented in this paper show that the proposed cores improve upon the prior art

    Hybrid switching : converging packet and TDM flows in a single platform

    Get PDF
    Optical fibers have brought fast and reliable data transmission to today’s network. The immense fiber build-out over the last few years has generated a wide array of new access technologies, transport and network protocols, and next-generation services in the Local Area Network (LAN), Metropolitan Area Network (MAN), and Wide Area Network (WAN). All these different technologies, protocols, and services were introduced to address particular telecommunication needs. To remain competitive in the market, the service providers must offer most of these services, while maintaining their own profitability. However, offering a large variety of equipment, protocols, and services posses a big challenge for service carriers because it requires a huge investment in different technology platforms, lots of training of staff, and the management of all these networks. In today’s network, service providers use SONET (Synchronous Optical NETwork) as a basic TDM (Time Division Multiplexing) transport network. SONET was primarily designed to carry voice traffic from telephone networks. However, with the explosion of traffic in the Internet, the same SONET based TDM network is optimized to support increasing demand for packet based Internet network services (data, voice, video, teleconference etc.) at access networks and LANs. Therefore the service providers need to support their Internet Protocol (IP) infrastructure as well as in the legacy telephony infrastructure. Supporting both TDM and packet services in the present condition needs multilayer operations which is complex, expensive, and difficult to manage. A hybrid switch is a novel architecture that combines packets (IP) and TDM switching in a unified access platform and provides seamless integration of access networks and LANs with MAN/WAN networks. The ability to fully integrate these two capabilities in a single chassis will allow service providers to deploy a more cost effective and flexible architecture that can support a variety of different services. This thesis develops a hybrid switch which is capable of offering bundled services for TDM switching and packet routing. This is done by dividing the switch’s bandwidth into VT1.5 (Virtual Tributary -1.5) channels and providing SONET based signaling for routing the data and controlling the switch’s resources. The switch is a TDM based architecture which allows each switch’s port to be independently configured for any mixture of packet and TDM traffic, including 100% packet and 100% TDM. This switch allows service providers to simplify their edge networks by consolidating the number of separate boxes needed to provide fast and reliable access. This switch also reduces the number of network management systems needed, and decreases the resources needed to install, provision and maintain the network because of its ability to “collapse” two network layers into one platform. The scope of this thesis includes system architecture, logic implementation, and verification testing, and performance evaluation of the hybrid switch. The architecture consists of ingress/egress ports, an arbiter and a crossbar. Data from ingress ports is carried to the egress ports via VT1.5 channels which are switched at the cross point of the crossbar. The crossbar setup and channel assignments at ingress port are done by the arbiter. The design was tested by simulation and the hardware cost was estimated. The performance results showed that the switch is non-blocking, provide differentiated service, and has an overall effective throughput of 80%. This result is a significant step towards the goal of building a switch that can support multiprotocol and provide different network capabilities into one platform. The long-term goal of this project is to develop a prototype of the hybrid switch with broadband capability

    Energy efficient hardware acceleration of multimedia processing tools

    Get PDF
    The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Get PDF
    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art
    corecore