    DEEP LEARNING FOR IMAGE RESTORATION AND ROBOTIC VISION

    Traditional model-based approaches require formulating a mathematical model, and that model often has limited performance. Image quality may degrade for a variety of reasons: the scene may be affected by weather conditions such as haze, rain, and snow, or noise may be introduced during image processing or transmission (e.g., compression artifacts). The goal of image restoration is to restore the image to a desirable quality, both subjectively and objectively. Agricultural robotics is gaining interest because most agricultural work is lengthy and repetitive, and computer vision is crucial to robots, especially autonomous ones. However, it is challenging to build a precise mathematical model of these problems. Compared with the traditional approach, a learning-based approach has an edge because it does not require an explicit model of the problem. Moreover, learning-based approaches now deliver best-in-class performance on most vision problems, such as image dehazing, super-resolution, and image recognition. In this dissertation, we address image restoration and robotic vision with deep learning. The two problems are closely related from a network architecture perspective: it is essential to select appropriate networks for different problems. Specifically, we address single image dehazing, High Efficiency Video Coding (HEVC) loop filtering and super-resolution, and computer vision for an autonomous robot. Our technical contributions are threefold. First, we reformulate haze as signal-dependent noise, which allows us to uncover it by learning a structural residual. Based on this reformulation, we solve dehazing with a recursive deep residual network and a generative adversarial network, which emphasize objective and perceptual quality, respectively. Second, we replace traditional filters in HEVC with a Convolutional Neural Network (CNN) filter. We show that our CNN filter can achieve a 7% BD-rate saving compared with traditional filters such as the bilateral and deblocking filters. We also incorporate a multi-scale CNN super-resolution module into HEVC; this post-processing module can improve visual quality under extremely low bandwidth. Third, we implement a transfer learning technique to support the vision and autonomous decision making of a precision pollination robot. Good experimental results are reported on real-world data.
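
The idea of treating haze as signal-dependent noise can be illustrated with a small sketch. It assumes the standard atmospheric scattering model (hazy = clear * t + A * (1 - t)), which the abstract does not state explicitly. The function names, the toy image, the transmission value `t`, and the airlight `A` are all hypothetical. This is not the dissertation's network, only a demonstration that the haze residual depends on the clean signal itself.

```python
import numpy as np

def add_haze(clear, transmission, airlight):
    """Synthesize a hazy image with the (assumed) atmospheric scattering model:
    hazy = clear * t + airlight * (1 - t)."""
    return clear * transmission + airlight * (1.0 - transmission)

def haze_residual(clear, transmission, airlight):
    """Residual r = hazy - clear = (airlight - clear) * (1 - t).
    The residual contains the clean signal, so the 'noise' is signal-dependent."""
    return (airlight - clear) * (1.0 - transmission)

rng = np.random.default_rng(0)
clear = rng.uniform(0.0, 1.0, size=(4, 4, 3))  # hypothetical clean image in [0, 1]
t = np.full((4, 4, 1), 0.6)                    # hypothetical uniform transmission map
A = 0.9                                        # hypothetical global airlight

hazy = add_haze(clear, t, A)
residual = haze_residual(clear, t, A)

# Subtracting the true residual recovers the clean image exactly;
# a learned model would only approximate this residual from the hazy input.
restored = hazy - residual
```

In this toy setting the residual is computed from the ground truth. A residual-learning approach would instead train a network to predict it from the hazy input alone.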

    Neural Image Compression with a Diffusion-Based Decoder

    Diffusion probabilistic models have recently achieved remarkable success in generating high-quality image and video data. In this work, we build on this class of generative models and introduce a method for lossy compression of high-resolution images. The resulting codec, which we call DIffusion-based Residual Augmentation Codec (DIRAC), is the first neural codec to allow smooth traversal of the rate-distortion-perception tradeoff at test time, while obtaining competitive performance with GAN-based methods in perceptual quality. Furthermore, while sampling from diffusion probabilistic models is notoriously expensive, we show that in the compression setting the number of steps can be drastically reduced.

    Artificial Intelligence in the Creative Industries: A Review

    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided, including Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post-production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the 'creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human-centric, that is, where it is designed to augment, rather than replace, human creativity.