226 research outputs found

    PEA265: Perceptual Assessment of Video Compression Artifacts

    The most widely used video encoders share a common hybrid coding framework that includes block-based motion estimation/compensation and block-based transform coding. Despite their high coding efficiency, the encoded videos often exhibit visually annoying artifacts, denoted as Perceivable Encoding Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience (QoE) of end users. To monitor and improve visual QoE, it is crucial to develop subjective and objective measures that can identify and quantify various types of PEAs. In this work, we make the first attempt to build a large-scale subject-labelled database composed of H.265/HEVC compressed videos containing various PEAs. The database, namely the PEA265 database, includes 4 types of spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types of temporal PEAs (i.e. flickering and floating), each containing at least 60,000 image or video patches with positive and negative labels. To objectively identify these PEAs, we train Convolutional Neural Networks (CNNs) using the PEA265 database. It appears that the state-of-the-art ResNeXt is capable of identifying each type of PEA with high accuracy. Furthermore, we define PEA pattern and PEA intensity measures to quantify the PEA levels of a compressed video sequence. We believe that the PEA265 database and our findings will benefit the future development of video quality assessment methods and perceptually motivated video encoders. Comment: 10 pages, 15 figures, 4 tables

    Comparative analysis of DIRAC PRO-VC-2, H.264 AVC and AVS CHINA-P7

    A video codec compresses the input video source to reduce storage and transmission bandwidth requirements while maintaining quality. It is an essential technology for applications such as digital television, DVD-Video, mobile TV, videoconferencing and internet video streaming, to name a few. Different video codecs are used in the industry today, and understanding their operation in order to target particular video applications is the key to optimization. The latest advanced video codec standards have become of great importance in the multimedia industries; they provide cost-effective encoding and decoding of video and contribute to high compression efficiency. Currently, H.264 AVC, AVS, and DIRAC are used in the industry to compress video. The H.264 codec standard was developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG). The Audio Video coding Standard (AVS) is a working group for audio and video coding standards in China. VC-2, also known as Dirac Pro and developed by the BBC, is a royalty-free technology that anyone can use and has been standardized through the SMPTE as VC-2. H.264 AVC, Dirac Pro, Dirac and AVS-P2 are dedicated to High Definition video, while AVS-P7 targets mobile video. Out of many standards, this work performs a comparative analysis of the H.264 AVC, DIRAC PRO/SMPTE VC-2 and AVS-P7 standards in the low-bitrate and high-bitrate regions. Bitrate control and constant QP are the methods employed for the analysis. Evaluation parameters such as compression ratio, PSNR and SSIM are used for quality comparison. Depending on the target application and available bitrate, an order of performance is given to show the preferred codec.
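
    Two of the evaluation parameters named in this abstract, compression ratio and PSNR, have standard definitions that can be sketched briefly. The following is a minimal illustration, not the authors' evaluation code: it assumes 8-bit samples (peak value 255) passed as flat sequences, and the function names are hypothetical.

```python
import math

def mse(ref, test):
    """Mean squared error between two equal-length sample sequences."""
    if len(ref) != len(test) or len(ref) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)

def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / err)

def compression_ratio(raw_bytes, coded_bytes):
    """Ratio of uncompressed size to compressed size (higher is better)."""
    if coded_bytes <= 0:
        raise ValueError("coded size must be positive")
    return raw_bytes / coded_bytes
```

    In practice PSNR is usually computed per frame (often on the luma plane) and averaged over the sequence; SSIM requires windowed local statistics and is omitted from this sketch.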

    Video Stream Adaptation In Computer Vision Systems

    Computer Vision (CV) has recently been deployed in a wide range of applications, including the surveillance and automotive industries. According to a recent report, the market for CV technologies will grow to $33.3 billion by 2019. The surveillance and automotive industries account for over 20% of this market. This dissertation considers the design of real-time CV systems with live video streaming, especially those over wireless and mobile networks. Such systems include video cameras/sensors and monitoring stations. The cameras should adapt their captured videos based on the events and/or the available resources and time requirements. The monitoring station receives video streams from all cameras and runs CV algorithms for decisions, warnings, control, and/or other actions. Real-time CV systems have constraints on power, computational, and communication resources. Most video adaptation techniques have considered video distortion as the primary metric. In CV systems, however, the main objective is enhancing the event/object detection/recognition/tracking accuracy. The accuracy can essentially be thought of as the quality perceived by machines, as opposed to the human perceptual quality. High-Efficiency Video Coding (HEVC) is a recent encoding standard that seeks to address the limited communication bandwidth problem arising from the popularity of High Definition (HD) videos. Unfortunately, HEVC adopts algorithms that greatly slow down the encoding process and thus complicate its use in real-time systems. This dissertation presents a method for adapting live video streams to limited and varying network bandwidth and energy resources. It analyzes and compares the rate-accuracy and rate-energy characteristics of various video stream adaptation techniques in CV systems. We model the video capturing, encoding, and transmission aspects and then provide an overall model of the power consumed by the video cameras and/or sensors. In addition to modeling the power consumption, we model the achieved bitrate of video encoding. We validate and analyze the power consumption models of each phase as well as the aggregate power consumption model through extensive experiments. The analysis includes examining individual parameters separately and examining the impact of changing more than one parameter at a time. For HEVC, we develop an algorithm that predicts the size of the block without iterating through the exhaustive Rate Distortion Optimization (RDO) method. We demonstrate the effectiveness of the proposed algorithm in comparison with existing algorithms. The proposed algorithm achieves approximately 5 times the encoding speed of the RDO algorithm and 1.42 times the encoding speed of the fastest analyzed algorithm.

    Lightweight super resolution network for point cloud geometry compression

    This paper presents an approach for compressing point cloud geometry by leveraging a lightweight super-resolution network. The proposed method involves decomposing a point cloud into a base point cloud and the interpolation patterns for reconstructing the original point cloud. While the base point cloud can be efficiently compressed using any lossless codec, such as Geometry-based Point Cloud Compression, a distinct strategy is employed for handling the interpolation patterns. Rather than directly compressing the interpolation patterns, a lightweight super-resolution network is utilized to learn this information through overfitting. Subsequently, the network parameters are transmitted to assist in point cloud reconstruction at the decoder side. Notably, our approach differentiates itself from lookup table-based methods, allowing us to obtain more accurate interpolation patterns by accessing a broader range of neighboring voxels at an acceptable computational cost. Experiments on MPEG Cat1 (Solid) and Cat2 datasets demonstrate the remarkable compression performance achieved by our method. Comment: 10 pages, 3 figures, 2 tables, and 27 references
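
    The decomposition step described in this abstract can be illustrated with a small sketch. It is an assumption-laden toy, not the paper's method: it assumes non-negative integer voxel coordinates, a fixed downsampling factor of 2, and stores each interpolation pattern as an explicit 8-bit child-occupancy mask, whereas the paper learns the patterns with an overfitted network instead of storing them. The function names `decompose` and `reconstruct` are hypothetical.

```python
from collections import defaultdict

def decompose(points):
    """Split voxelized points into base voxels and 8-bit occupancy patterns.

    Each base voxel is the point's coordinates halved; the pattern records
    which of its 8 child positions are occupied (one bit per child).
    Assumes non-negative integer coordinates.
    """
    patterns = defaultdict(int)
    for x, y, z in points:
        base = (x >> 1, y >> 1, z >> 1)
        child = ((x & 1) << 2) | ((y & 1) << 1) | (z & 1)
        patterns[base] |= 1 << child
    return dict(patterns)

def reconstruct(patterns):
    """Rebuild the full-resolution point set from base voxels and patterns."""
    points = set()
    for (bx, by, bz), mask in patterns.items():
        for child in range(8):
            if mask & (1 << child):
                points.add((
                    2 * bx + ((child >> 2) & 1),
                    2 * by + ((child >> 1) & 1),
                    2 * bz + (child & 1),
                ))
    return points
```

    The keys of the returned dictionary play the role of the base point cloud and the masks play the role of the interpolation patterns; with exact masks the round trip is lossless, which is the behaviour a learned predictor would aim to approximate.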

    A toolset for the analysis and optimization of motion estimation algorithms and processors

    Low-complexity video compression algorithm and video encoder LSI design

    System: New; Report number: Kō No. 2876; Degree type: Doctor of Engineering; Date conferred: 2009/9/15; Waseda University diploma number: New 510

    MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

    The rapid advancement of artificial intelligence (AI) technology has led to the prioritization of standardizing the processing, coding, and transmission of video using neural networks. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to reduce the number of bits required to represent high-fidelity video data by utilizing data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also provides a comparison and report on some primary efforts, such as the coding efficiency of the reference model. Additionally, it discusses emerging activities, such as learned Unmanned-Aerial-Vehicle (UAV) video coding, which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further points of reference. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of perceptual evaluation metrics.

    LOW-DELAY WINDOW-BASED RATE CONTROL SCHEME FOR VIDEO QUALITY OPTIMIZATION IN VIDEO ENCODER

    Abstract: Consistent video quality and the encoding latency caused by buffering are two important aspects in designing a rate control scheme for real-time video coding systems. To balance these two conflicting objectives, we first analyze the buffer latency constraint and the definition of "consistent" video quality. A window-based rate control scheme is then proposed, with one window for controlling the rate and latency and another window for optimizing video quality. By applying a low-complexity frame-level rate-distortion model to the test sequences, our proposed method shows excellent performance in balancing the encoder buffer latency and optimized video quality. In addition, this one-pass rate control scheme is highly practical for real-time video coding applications. Keywords: rate control, buffer latency, video quality, window-based, video encoder