
    Artificial Intelligence in the Creative Industries: A Review

    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided, including Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post-production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the 'creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human-centric -- where it is designed to augment, rather than replace, human creativity.
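    As a concrete illustration of the generative techniques surveyed above, the snippet below shows one minimal, generic GAN training step in PyTorch. It is only a sketch: the network sizes, placeholder data and hyperparameters are assumptions for illustration and are not taken from any system reviewed in the paper.

```python
# Minimal, generic GAN sketch (illustrative assumptions only; not from the reviewed paper).
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 64, 784, 32   # placeholder sizes, e.g. flattened 28x28 images

# Generator: maps random noise to synthetic samples in [-1, 1].
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
# Discriminator: outputs a real/fake logit per sample.
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(batch, data_dim) * 2 - 1   # stand-in for a batch of real training data
z = torch.randn(batch, latent_dim)
fake = G(z)

# Discriminator step: push real samples towards label 1, generated samples towards 0.
d_loss = bce(D(real), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator label generated samples as real.
g_loss = bce(D(fake), torch.ones(batch, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```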

    Flow prediction meets flow learning: combining different learning strategies for computing the optical flow

    Optical flow estimation is an important topic in computer vision. The goal is to compute the inter-frame displacement field between two consecutive frames of an image sequence. In practice, optical flow estimation plays a significant role in multiple application domains, including autonomous driving and medical imaging. Different categories of methods exist for solving the optical flow problem. The most common technique is based on a variational framework, in which an energy functional is designed and minimized in order to calculate the optical flow. Recently, other approaches, such as pipeline-based and learning-based methods, have also attracted much attention. Despite the great advances achieved by these algorithms, it is still difficult to find one that performs well under all challenges, e.g. lighting changes, large displacements, and occlusions. Hence, it is worth combining different algorithms to create a new approach that unites their advantages. Inspired by this idea, in this thesis we select two top-performing algorithms, PWC-Net and ProFlow, as candidate approaches and combine them. While PWC-Net generally performs well in non-occluded areas, ProFlow provides a particularly accurate estimation for occluded areas. We therefore expect that the combination of these two algorithms will yield an algorithm that performs well in both occluded and non-occluded areas. Since ProFlow is a pipeline approach, we first integrate PWC-Net into the ProFlow pipeline, then evaluate the newly created pipeline, PWC-ProFlow, on the MPI Sintel and KITTI 2015 benchmarks. Contrary to our expectations, the newly created algorithm does not exceed the candidate methods PWC-Net and ProFlow on either benchmark. Through an analysis of the evaluation results, we identify the problems hidden in the PWC-ProFlow pipeline that can lead to its underperformance, and derive several modification ideas. Based on these ideas, we propose six new pipelines with the purpose of improving the estimation accuracy of PWC-ProFlow. All newly generated pipelines are also evaluated on the Sintel and KITTI benchmarks. The experimental results demonstrate that all modifications achieve considerable improvements on both datasets compared to PWC-ProFlow. Further, all of them also outperform the ProFlow pipeline on both benchmarks. Compared to PWC-Net, one modification exceeds it on the KITTI dataset, whereas all of our modifications achieve better performance on the Sintel dataset; in particular, one modification yields a significant improvement, with a more than 10% lower average endpoint error on Sintel.
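    The variational framework mentioned above can be illustrated by the classical Horn-Schunck formulation; the specific energy functionals used by PWC-Net and ProFlow differ, so this is only a representative example of an energy that is minimized to obtain the flow field (u, v).

```latex
% Classical Horn--Schunck energy (representative of the variational framework;
% not the specific functional used by PWC-Net or ProFlow).
% I_x, I_y, I_t are spatial and temporal image derivatives; \alpha weights the smoothness term.
E(u, v) = \int_{\Omega} \big( I_x u + I_y v + I_t \big)^2
        + \alpha \left( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \right) \,\mathrm{d}x\,\mathrm{d}y
```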

    Sparse Cost Volume for Efficient Stereo Matching

    Stereo matching has been solved as a supervised learning task with convolutional neural networks (CNNs). However, CNN-based approaches generally require a huge amount of memory. In addition, it is still challenging to find correct correspondences between images in ill-posed regions, such as dim areas and regions affected by sensor noise. To solve these problems, we propose Sparse Cost Volume Net (SCV-Net), which achieves high accuracy, low memory cost and fast computation. The idea of the cost volume for stereo matching was initially proposed in GC-Net. In our work, by making the cost volume compact and proposing an efficient similarity evaluation for the volume, we achieve faster stereo matching while improving accuracy. Moreover, we propose to use weight normalization instead of the commonly used batch normalization for stereo matching tasks. This improves robustness not only to sensor noise in images but also to the batch size used during training. We evaluated our proposed network on the Scene Flow and KITTI 2015 datasets; its overall performance surpasses that of GC-Net. Compared with GC-Net, our SCV-Net (1) reduces GPU memory cost by 73.08%, (2) reduces processing time by 61.11%, and (3) improves the 3PE from 2.87% to 2.61% on the KITTI 2015 dataset.
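    As a rough sketch of the normalization choice described above, the snippet below swaps batch normalization for PyTorch's weight normalization in a 3D convolution block of the kind used to filter cost volumes. The layer sizes and block structure are illustrative assumptions, not the authors' SCV-Net implementation.

```python
# Illustrative 3D conv block with weight normalization (assumption: not the SCV-Net code).
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

def conv3d_block(in_ch, out_ch, use_weight_norm=True):
    """3x3x3 convolution followed by ReLU; the normalization scheme is configurable."""
    conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
    if use_weight_norm:
        # Reparameterizes the kernel as w = g * v / ||v||, decoupling its magnitude
        # from its direction; unlike batch norm, it does not depend on batch statistics.
        return nn.Sequential(weight_norm(conv), nn.ReLU(inplace=True))
    # Batch-norm variant for comparison (as commonly used in GC-Net-style networks).
    return nn.Sequential(conv, nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True))

# Example: filter a (batch, feature, disparity, height, width) cost volume.
cost_volume = torch.randn(1, 32, 24, 64, 128)
block = conv3d_block(32, 32)
out = block(cost_volume)
print(out.shape)  # torch.Size([1, 32, 24, 64, 128])
```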