    Deep Video Compression

    Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture

    Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimise encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of different texture types, and investigate models that describe them mathematically. For all considered theoretical models, we employ machine-learning regression to predict the rate-quality curves based solely on selected spatio-temporal features extracted from the uncompressed content. All experiments were performed on homogeneous video textures to ensure the validity of the observations. The regression results indicate that an exponential model predicts the expected rate-quality curve more accurately (with a mean Bjøntegaard Delta rate of 0.46% over the considered dataset) while maintaining low relative complexity. This is expected to benefit in-loop processes that require fast encoding decisions, such as rate-distortion optimisation, adaptive quantisation, and partitioning.
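    The two-stage idea above can be sketched as follows: fit an exponential rate-quality model per sequence, then train a regressor that maps spatio-temporal features of the uncompressed content to the fitted model parameters. The model form Q(R) = a + b·exp(c·R), the placeholder feature vectors, and the random-forest regressor below are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: exponential rate-quality model plus feature-based regression.
# Model form, features, and regressor choice are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.ensemble import RandomForestRegressor

def rq_model(rate, a, b, c):
    """Exponential rate-quality model Q(R) = a + b * exp(c * R)."""
    return a + b * np.exp(c * rate)

# Stage 1: fit the model per sequence from measured (rate, quality) points,
# e.g. obtained by encoding the texture at several QPs.
rates = np.array([0.5, 1.0, 2.0, 4.0, 8.0])          # Mbit/s (example values)
quality = np.array([30.1, 33.4, 36.2, 38.9, 41.0])   # PSNR in dB (example values)
params, _ = curve_fit(rq_model, rates, quality, p0=(45.0, -20.0, -0.5))

# Stage 2: learn a mapping from uncompressed spatio-temporal features to the
# fitted (a, b, c), so curves can be predicted without any trial encodes.
X_train = np.random.rand(100, 4)   # placeholder feature vectors
y_train = np.random.rand(100, 3)   # placeholder fitted model parameters
reg = RandomForestRegressor(n_estimators=100).fit(X_train, y_train)

a_hat, b_hat, c_hat = reg.predict(np.random.rand(1, 4))[0]
predicted_curve = rq_model(rates, a_hat, b_hat, c_hat)
```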

    Semantic-aware video compression for automotive cameras

    Assisted and automated driving functions in vehicles exploit sensor data to build situational awareness; however, the amount of data required by these functions might exceed the bandwidth of current wired vehicle communication networks. Consequently, sensor data reduction and automotive camera video compression need investigation. Conventional video compression schemes, such as H.264 and H.265, have however been optimised mainly for human vision. In this paper, we propose a semantic-aware (SA) video compression (SAC) framework that separately and simultaneously compresses the region-of-interest (ROI) and the region-out-of-interest of automotive camera video frames before transmitting them to the processing unit(s), where the data are used for perception tasks such as object detection and semantic segmentation. Using our newly proposed technique, the ROI, which encapsulates most of the road stakeholders, retains higher quality through a lower compression ratio. The experimental results show that, under the same overall compression ratio, our proposed SAC scheme maintains similar or better image quality, measured according to both traditional metrics and our newly proposed semantic-aware metrics. The proposed metrics, namely SA-PSNR, SA-SSIM, and iIoU, give more weight to ROI quality, which has an immediate impact on the planning and decisions of assisted and automated driving functions. Using our SA-X264 compression, SA-PSNR and SA-SSIM increase by 2.864 dB and 0.008 respectively compared to traditional H.264, with higher ROI quality at the same compression ratio. Finally, a segmentation-based perception algorithm was used to compare reconstructed frames, demonstrating a 2.7% mIoU improvement when using the proposed SAC method versus traditional compression techniques.
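    The abstract does not spell out the SA-PSNR formula; one plausible reading, sketched below, is a PSNR computed on an ROI-weighted mean squared error. The roi_weight value and the simple two-term weighting are assumptions for illustration only.

```python
# Hedged sketch of a semantic-aware PSNR: the ROI error is up-weighted
# relative to the background. The weighting scheme is an assumption; the
# paper's exact SA-PSNR definition is not given in the abstract.
import numpy as np

def sa_psnr(ref, rec, roi_mask, roi_weight=0.8, max_val=255.0):
    """Weighted PSNR emphasising the region-of-interest.

    ref, rec : arrays of identical shape (H, W) or (H, W, C)
    roi_mask : boolean array (H, W), True inside the ROI
    """
    err = (ref.astype(np.float64) - rec.astype(np.float64)) ** 2
    if err.ndim == 3:
        err = err.mean(axis=2)   # average over colour channels
    mse = roi_weight * err[roi_mask].mean() + (1.0 - roi_weight) * err[~roi_mask].mean()
    return 10.0 * np.log10(max_val ** 2 / mse)
```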

    Image and Video Coding Techniques for Ultra-low Latency

    The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, and autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their trade-offs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches, such as still-image, mezzanine, or texture compression, in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we identified inter-device memory transfers and the lack of sub-frame coding as limitations of current full-system and software-programmable implementations.
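    The sub-frame coding point can be made concrete with a back-of-the-envelope latency budget. The stage delays below are illustrative examples, not measurements from the survey; the calculation only shows why slice-level pipelining shortens the glass-to-glass path.

```python
# Illustrative arithmetic for sub-frame (slice-based) coding. With frame-level
# coding, every stage waits for the whole frame; with N independently coded
# slices, encode/transmit/decode of earlier slices overlap with the capture of
# later ones, so only one slice's worth of each stage stays on the critical path.
FPS = 60
frame_time_ms = 1000.0 / FPS                        # ~16.7 ms to capture a frame
encode_ms, transmit_ms, decode_ms = 8.0, 4.0, 8.0   # per-frame stage delays (examples)

frame_level = frame_time_ms + encode_ms + transmit_ms + decode_ms

N = 8  # slices per frame
slice_level = frame_time_ms + (encode_ms + transmit_ms + decode_ms) / N

print(f"frame-level glass-to-glass ~ {frame_level:.1f} ms")   # ~36.7 ms
print(f"{N}-slice glass-to-glass   ~ {slice_level:.1f} ms")   # ~19.2 ms
```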

    Multi-Resolution Feature Embedded Level Set Model for Crosshatched Texture Segmentation

    In image processing applications, texture is the most important element utilized by human visual systems for distinguishing dissimilar objects in a scene. In this research article, a variational model based on the level set is implemented for crosshatched texture segmentation, and its performance is validated on the Brodatz texture dataset. Segmenting crosshatched textures in low-resolution images is difficult due to the computational and memory requirements. This issue is resolved by implementing a level-set-based variational model that enables efficient segmentation in both low- and high-resolution images, with automatic selection of the filter size. In the proposed model, the multi-resolution features obtained from frequency-domain filters enhance the dissimilarity between regions of crosshatched textures that have low intensity variations. The resultant images are then integrated with a level-set-based active contour model that addresses the segmentation of crosshatched texture images. Noise introduced during the segmentation process is eliminated by morphological processing. Experiments conducted on the Brodatz texture dataset demonstrate the effectiveness of the proposed model, and the results are validated in terms of the Intersection over Union (IoU) index, accuracy, precision, F1-score, and recall. The extensive experimental investigation shows that the proposed model effectively segments the region of interest in close correspondence with the original image. The proposed segmentation model with a multi-support vector machine achieved a classification accuracy of 99.82%, which is superior to the comparative model (a modified convolutional neural network with a whale optimization algorithm), an improvement of approximately 0.11% in classification accuracy relative to the existing model.
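    A minimal sketch of the pipeline outlined above, assuming Gabor filters as the frequency-domain multi-resolution features and scikit-image's morphological Chan-Vese as the level-set active contour; the filter frequencies, iteration count, and post-processing radius are illustrative choices rather than the paper's tuned values.

```python
# Hedged sketch: multi-resolution frequency-domain features feeding a
# level-set active contour, with morphological cleanup. Gabor filters and
# morphological Chan-Vese stand in for the paper's exact components.
import numpy as np
from skimage import data, filters, morphology
from skimage.segmentation import morphological_chan_vese

image = data.brick().astype(np.float64) / 255.0   # stand-in texture image

# Multi-resolution feature map: accumulate Gabor magnitude responses over
# several spatial frequencies to amplify contrast between texture regions.
feature = np.zeros_like(image)
for freq in (0.1, 0.2, 0.4):
    real, imag = filters.gabor(image, frequency=freq)
    feature += np.hypot(real, imag)
feature /= feature.max()

# Level-set evolution (morphological Chan-Vese, 100 iterations) on the
# feature map rather than on the raw intensities.
mask = morphological_chan_vese(feature, 100, init_level_set="checkerboard")

# Morphological post-processing to suppress segmentation noise.
mask = morphology.binary_opening(mask.astype(bool), morphology.disk(3))
```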