40 research outputs found

    High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

    This article presents two highly efficient parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory access dependence, and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times is obtained on STORM and of more than 50 times on the GPU. The STORM implementation achieves real-time processing of 1080p at 30 fps, and the GPU-based version satisfies the requirements of real-time 720p encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.
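
    As a rough illustration of the scan stage mentioned in the abstract above, the following sketch zigzag-scans one 4x4 block of quantized coefficients and derives three of the symbols that CAVLC codes in H.264 (total coefficients, trailing ones, total zeros). It is a plain sequential sketch, not the paper's SIMD or SIMT design; the function names and the sample block are assumptions made for illustration.

```python
# Zigzag order for a frame-coded 4x4 block, as raster indices (row * 4 + col).
ZIGZAG_4X4 = (0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15)


def zigzag_scan(block):
    """Reorder a 4x4 block (list of 4 rows) into zigzag order."""
    flat = [value for row in block for value in row]
    return [flat[i] for i in ZIGZAG_4X4]


def cavlc_symbols(block):
    """Return (total_coeff, trailing_ones, total_zeros) for one 4x4 block."""
    scanned = zigzag_scan(block)
    nonzero_positions = [i for i, v in enumerate(scanned) if v != 0]
    total_coeff = len(nonzero_positions)

    # Trailing ones: up to three consecutive +/-1 values, counted from the
    # highest-frequency nonzero coefficient backwards.
    trailing_ones = 0
    for i in reversed(nonzero_positions):
        if abs(scanned[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Total zeros: zeros that precede the last nonzero coefficient.
    if nonzero_positions:
        total_zeros = nonzero_positions[-1] + 1 - total_coeff
    else:
        total_zeros = 0
    return total_coeff, trailing_ones, total_zeros


# Hypothetical sample block of quantized coefficients.
SAMPLE_BLOCK = [
    [0, 3, -1, 0],
    [0, -1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
```

    Because each 4x4 block can be scanned independently of its neighbours, this stage is a natural candidate for the data-level parallelism the abstract refers to.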

    Parallel HEVC Decoding on Multi- and Many-core Architectures : A Power and Performance Analysis

    The Joint Collaborative Team on Video Coding is developing a new standard named High Efficiency Video Coding (HEVC) that aims at reducing the bitrate of H.264/AVC by another 50%. In order to fulfill the computational demands of the new standard, in particular for high resolutions and at low power budgets, exploiting parallelism is no longer an option but a requirement. Therefore, HEVC includes several coding tools that allow each picture to be divided into several partitions that can be processed in parallel without degrading either quality or bitrate. In this paper we adapt one of these approaches, Wavefront Parallel Processing (WPP), and show how it can be implemented on multi- and many-core processors. Our approach, named Overlapped Wavefront (OWF), processes several partitions as well as several pictures in parallel. This has the advantage that the amount of (thread-level) parallelism stays constant during execution. In addition, performance and power results are provided for three platforms: a server Intel CPU with 8 cores, a laptop Intel CPU with 4 cores, and a TILE-Gx36 with 36 cores from Tilera. The results show that our parallel HEVC decoder is capable of achieving an average frame rate of 116 fps for 4k resolution on a standard multicore CPU. The results also demonstrate that exploiting more parallelism by increasing the number of cores can substantially improve energy efficiency, measured in Joules per frame.
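
    To make the wavefront idea in the abstract above concrete, the sketch below computes an idealized WPP schedule for a single picture. It assumes that every coding tree unit (CTU) takes one time step, that a thread is available per CTU row, and that each row may start once the first two CTUs of the row above are finished. The function names are hypothetical, and the model covers plain WPP only, not the authors' OWF scheme.

```python
from collections import Counter


def wavefront_schedule(rows, cols):
    """Earliest time step at which each CTU can be processed under plain WPP.

    CTU (r, c) needs its left neighbour (r, c - 1) and the top-right CTU
    (r - 1, c + 1), which gives step = c + 2 * r with unit-time CTUs.
    """
    return [[c + 2 * r for c in range(cols)] for r in range(rows)]


def parallelism_profile(rows, cols):
    """Number of CTUs processed concurrently at each time step."""
    counts = Counter(step for row in wavefront_schedule(rows, cols) for step in row)
    return [counts[t] for t in range(max(counts) + 1)]
```

    For a picture of 3 CTU rows by 4 CTU columns the profile is `[1, 1, 2, 2, 2, 2, 1, 1]`: parallelism ramps up at the start of the picture and down at the end. Overlapping consecutive pictures, as the abstract describes, is one way to fill those ramps.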

    A Cost Shared Quantization Algorithm and its Implementation for Multi-Standard Video CODECS

    The current trend of digital convergence creates the need for video encoder and decoder systems (codecs) that support multiple video standards on a single platform. In a modern video codec, quantization is a key unit used for video compression. In this thesis, a generalized quantization algorithm and its hardware implementation are presented to compute quantized coefficients for six different video codecs, including High Efficiency Video Coding (HEVC), a codec still under development. HEVC, the successor to H.264/MPEG-4 AVC, aims to substantially improve coding efficiency compared to AVC High Profile. The thesis presents a high-performance, circuit-shared architecture that can perform the quantization operation for HEVC, H.264/AVC, AVS, VC-1, MPEG-2/4 and Motion JPEG (MJPEG). Since HEVC is still in the drafting stage, the architecture was designed in such a way that any final changes can be accommodated in the design. The proposed quantizer architecture is completely division-free, as the division operation is replaced by multiplication, shift and addition operations. The design was implemented on an FPGA and later synthesized in 0.18 μm CMOS technology. The results show that the proposed design satisfies the requirements of all codecs, with a maximum decoding capability of 60 fps for 1080p HD video at 187.3 MHz on a Xilinx Virtex4 LX60 FPGA. The scheme is also suitable for low-cost implementation in modern multi-codec systems.
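
    The division-free idea described in the abstract above can be sketched generically: instead of dividing each coefficient by the quantization step, multiply by a precomputed fixed-point reciprocal, add a rounding offset, and shift right. The sketch below is an assumption-laden illustration, not the thesis's architecture; the 16-bit precision, the function names and the rounding offset are all hypothetical choices.

```python
def make_multiplier(step, shift=16):
    """Fixed-point reciprocal of the quantization step.

    This is the only place a division occurs; in hardware the result would
    typically come from a small precomputed table (an assumption here).
    """
    return ((1 << shift) + step // 2) // step


def quantize(coef, multiplier, shift=16):
    """Quantize one coefficient using multiply, add and shift only."""
    rounding = 1 << (shift - 1)          # adds 0.5 before truncation
    magnitude = (abs(coef) * multiplier + rounding) >> shift
    return -magnitude if coef < 0 else magnitude
```

    With a step of 10, `make_multiplier(10)` gives 6554, and `quantize(100, 6554)` returns 10 while `quantize(-37, 6554)` returns -4, matching rounded division for these inputs. A reciprocal of finite precision can differ from exact division for some large inputs, so the shift width has to be chosen for the coefficient range in use.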

    Motion estimation algorithm and its hardware architecture for HEVC (Algoritmo de estimação de movimento e sua arquitetura de hardware para HEVC)

    Doctorate in Electrical Engineering. Video coding has been used in applications such as video surveillance, video conferencing, video streaming, video broadcasting and video storage. In a typical video coding standard, many algorithms are combined to compress a video. Of those algorithms, motion estimation is the most complex task. Hence, it is necessary to implement this task in real time using appropriate VLSI architectures. This thesis proposes a new fast motion estimation algorithm and its real-time implementation. The results show that the proposed algorithm and its motion estimation hardware architecture outperform the state of the art. The proposed architecture operates at a maximum frequency of 241.6 MHz and is able to process 1080p@60Hz video with all the variable block sizes specified in the HEVC standard, as well as with a motion vector search range of up to ±64 pixels.

    A camera trap to reveal the obscure world of the arctic subnivean ecology

    Subnivean life is an important part of the Arctic ecosystem, but it has been little explored. Long, harsh winters and remoteness have made direct studies in these hard-to-reach areas very expensive and extremely difficult. To tackle this problem, a low-power autonomous camera system (called ArcÇav) was developed for monitoring small mammals beneath the snow in the Canadian Arctic. ArcÇav is composed of several components, including a digital camera, a single-board computer, a microcontroller board, and a motion detection sensor. A limited energy source, very cold temperatures, darkness, and a very long recording period (several months) are the major challenges that ArcÇav is designed to deal with. The performance of the developed system was evaluated in a real situation in the High Arctic. The field results show that ArcÇav can function well for an extended period of time on a battery at very low temperatures during the Arctic winter. To the best of our knowledge, this is the first time that life under snow has been filmed by a camera trap in the Arctic during winter. ArcÇav equips ecologists with a new means to explore and study subnivean life remotely. These observations can provide a foundation for answering some of the questions that have puzzled animal ecologists for decades.

    Complexity- and energy-scalable video coding (Codificação de vídeo escalonável em complexidade e em energia)

    Thesis (Doctorate), Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2012. Digital video communications have benefited greatly from advances in technology and in industrial processes. Falling prices of acquisition devices and the evolution of signal processing have made digital video a ubiquitous technology. Digital video encoders are the cornerstone of the popularity of video technologies, and their state of the art is represented by the H.264/AVC standard. Its myriad coding tools make H.264/AVC a massively complex application, imposing challenges on hardware and software designers when realizing commercial appliances. This thesis analyses the complexity of H.264/AVC when implemented in software and executed on personal computers. The first contribution is an on-line optimization method for the prediction stage that constrains the computational complexity to a given budget. The approach uses mode ranking and yields a substantial complexity reduction. The second contribution extends the RD optimization framework by adding a third analysis axis, the complexity axis C. Two high-performance implementations were studied and RDC-optimized. From off-line training, we derived two encoder configurations that compress video at controlled speeds within ranges of practical interest, with minor performance penalties. Finally, the RDC optimization framework was modified by adding the energy demand E as an axis of the optimization. We provide a real-time RDE-optimized scheme that is capable of scaling the energy demand over a significant range while only slightly impacting RD performance. This third contribution is a true example of green computing, where the same task is accomplished on the same hardware system with much less energy consumption, incurring only small performance penalties. Since we can provide settings that meet the rate and distortion targets, as well as the maximum encoding speed, using less energy, we hope to contribute towards a "greener" system, reducing the carbon footprint of video compression servers.
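
    Rate-distortion optimization is commonly expressed as choosing, for each block, the coding mode that minimizes a Lagrangian cost J = D + λR. One way to picture the extra axis described in the abstract above is to add a weighted complexity or energy term to that cost. The sketch below assumes this formulation, which the abstract does not spell out; the `Mode` type, the weights and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    distortion: float  # e.g. sum of squared errors
    rate: float        # bits needed to code the block in this mode
    cost: float        # complexity or energy estimate (arbitrary units)


def pick_mode(modes, lambda_rate, lambda_cost=0.0):
    """Pick the mode minimizing D + lambda_rate * R + lambda_cost * C."""
    return min(
        modes,
        key=lambda m: m.distortion + lambda_rate * m.rate + lambda_cost * m.cost,
    )


# Hypothetical candidates for a single block.
CANDIDATES = [
    Mode("skip", distortion=900.0, rate=2.0, cost=1.0),
    Mode("inter16x16", distortion=400.0, rate=40.0, cost=10.0),
    Mode("inter8x8", distortion=250.0, rate=90.0, cost=40.0),
]
```

    With `lambda_rate=1.0` and no cost weight, the most expensive mode (`inter8x8`) wins; raising `lambda_cost` to 5.0 shifts the choice to `inter16x16`, and to 60.0 shifts it to `skip`. This mirrors the trade-off the abstract describes: less computation or energy in exchange for a small RD penalty.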

    High speed 802.11ad wireless video streaming

    The aim of this thesis is to investigate, both theoretically and experimentally, the capability of IEEE 802.11ad devices (known as WiGig, after the Wireless Gigabit Alliance), which operate in the 60 GHz band, to handle the rise in data traffic associated with high-speed data transmission such as bulk data transfer and wireless video streaming. According to Cisco and others, it is estimated that in 2020 internet video traffic will account for 82% of all consumer internet traffic. This research evaluated the feasibility of the 60 GHz band providing a minimum data rate of about 970 Mbps over an Ethernet link limited (clamped) to 1 Gbps. This translates to 97% efficiency with respect to the IEEE 802.11ad system performance. For the first time, the author proposes enhancing millimetre-wave propagation through the use of specular reflection in non-line-of-sight environments, providing at least 94% bandwidth utilization. Additional investigation of the IEEE 802.11ad device in real live streaming of 4k ultra-high-definition (UHD) video shows the feasibility of aggressive frequency reuse in the absence of co-channel interference. Moreover, using a heuristic approach, this work compared material absorption and signal reception at 60 GHz, and the results show better performance than the theoretical values. Finally, this thesis proposes a framework for 802.11ad wireless H.264 video streaming over the 60 GHz band. The work describes the potential and efficiency of WiGig devices in streaming high-definition (HD) video with a high temporal index (TI) and 4k UHD video with no retransmission. A caching point established at the re-transmitter increases coverage and caches multimedia data. The results in this thesis show the growing potential of millimetre-wave technology, in particular WiGig, for very high-speed bulk data transfer and live video streaming.