    Deep learning-based switchable network for in-loop filtering in high efficiency video coding

    Video codecs are undergoing a smart transition in this era. One area of research that has not yet been fully investigated is the effect of deep learning on video compression. The paper's goal is to reduce the ringing and other artifacts associated with in-loop filtering in high-efficiency video compression. Although much research has been done to lessen this effect, there is still considerable room for improvement. In this paper we focus on an intelligent solution for improving in-loop filtering in High Efficiency Video Coding (HEVC) using a deep convolutional neural network (CNN). The paper proposes the design and implementation of deep CNN-based loop filtering using a series of 15 CNN networks followed by a combine-and-squeeze network that improves feature extraction. The resulting output is free from double enhancement, and the peak signal-to-noise ratio (PSNR) is improved by 0.5 dB compared to existing techniques. The experiments further demonstrate that pipelining this network into the current network and using it at higher quantization parameters (QP) improves coding efficiency more effectively than using it separately. Coding efficiency is improved by an average of 8.3% with the switching-based deep CNN in-loop filtering.
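The switching idea described above can be illustrated with a minimal sketch. This is not the paper's actual network: it assumes a hypothetical encoder-side decision that keeps a CNN-filtered block only when it is closer to the original frame than the unfiltered reconstruction, which is one simple way to avoid the "double enhancement" the abstract mentions.

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def switchable_filter(original, reconstructed, cnn_filtered):
    """Hypothetical switching rule: apply the CNN in-loop filter only when it
    raises PSNR against the original; otherwise keep the plain reconstruction.
    Returns the chosen block and a flag the encoder would signal to the decoder."""
    if psnr(original, cnn_filtered) > psnr(original, reconstructed):
        return cnn_filtered, True   # CNN filter switched on for this block
    return reconstructed, False     # CNN filter switched off for this block
```

In a real codec the on/off decision would be signalled per block (or per QP range, as the paper suggests); the sketch only captures the rate-free version of that choice.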

    INTERPRETING AND PRUNING COMPUTER VISION BASED NEURAL NETWORKS

    Computer vision is a complex subject entailing tasks such as object detection and recognition, image segmentation, super-resolution, image restoration, generated artwork, and many others. The application of these tasks is becoming more fundamental to our everyday lives, and beyond the complexity of such systems, their accuracy has become critical. In this context, the ability to decentralise the computation of the neural networks behind cutting-edge computer vision systems has become essential. However, this is not always possible: models are getting larger, which makes them harder, or potentially impossible, to use on consumer hardware. This thesis develops a pruning methodology called "Weight Action Pruning" to reduce the complexity of computer vision neural networks; the method combines sparsity pruning and structured pruning. Sparsity pruning highlights the importance of specific neurons and weights, and structured pruning is then used to remove any redundancies. This process is repeated multiple times and results in a significant decrease in the computing power required to deploy a neural network, reducing inference times and memory requirements. Weight Action Pruning is first applied to deblocking neural networks used in video coding, where it allowed for large computational reductions without significant impacts on accuracy. To further test its validity on multiple datasets and different network architectures, Weight Action Pruning was tested on the generative adversarial U-Net used in a seminal paper in the field. This work showed that the ability to prune a neural network depends not only on the network's architecture but also on the dataset used to train the model. Weight Action Pruning was then applied to the image-recognition networks VGG-16 and ResNet-50, allowing it to be evaluated directly against other state-of-the-art pruning methods. It was found that models pruned to a set size had higher accuracies than models trained from scratch at the same size. Finally, the impact of pruning a neural network is investigated by analysing weight distributions, saliency maps, and other visualisations. It must be noted that Weight Action Pruning comes at a cost in training time, due to the retraining required. Additionally, pruning may cause networks to become less robust, as they are optimised by removing the learnt "edge cases".
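The alternation the abstract describes (sparsity pruning to expose unimportant weights, then structured pruning to remove the resulting dead neurons, repeated over several rounds) can be sketched as follows. This is an illustrative reading of that two-step loop, not the thesis's actual "Weight Action Pruning" algorithm; the retraining between rounds, and the saliency criteria the thesis may use, are omitted. All function names are placeholders.

```python
import numpy as np

def sparsity_prune(w, keep_ratio):
    """Unstructured step: zero out all but the largest-magnitude weights.
    `w` is a (neurons x inputs) weight matrix for one layer."""
    flat = np.abs(w).ravel()
    k = int(flat.size * keep_ratio)
    if k == 0:
        return np.zeros_like(w)
    thresh = np.partition(flat, -k)[-k]     # k-th largest magnitude
    return np.where(np.abs(w) >= thresh, w, 0.0)

def structured_prune(w):
    """Structured step: drop output neurons (rows) whose weights are all zero,
    shrinking the layer and the actual compute cost, not just the density."""
    keep = ~np.all(w == 0.0, axis=1)
    return w[keep]

def weight_action_prune_sketch(w, keep_ratio, rounds=3):
    """Hypothetical sketch of the iterated loop: sparsify, then remove the
    redundancies the sparsification exposed. Real use would retrain between rounds."""
    for _ in range(rounds):
        w = sparsity_prune(w, keep_ratio)
        w = structured_prune(w)
    return w
```

The point of the structured step is that a merely sparse matrix saves little on dense hardware; removing whole rows is what actually reduces inference time and memory, which matches the deployment motivation given in the abstract.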