7,516 research outputs found

    Optimising Spatial and Tonal Data for PDE-based Inpainting

    Full text link
    Some recent methods for lossy signal and image compression store only a few selected pixels and fill in the missing structures by inpainting with a partial differential equation (PDE). Suitable operators include the Laplacian, the biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The quality of such approaches depends substantially on the selection of the data that is kept. Optimising this data in the domain and codomain gives rise to challenging mathematical problems that shall be addressed in our work. In the 1D case, we prove results that provide insights into the difficulty of this problem, and we give evidence that a splitting into spatial and tonal (i.e. function value) optimisation does hardly deteriorate the results. In the 2D setting, we present generic algorithms that achieve a high reconstruction quality even if the specified data is very sparse. To optimise the spatial data, we use a probabilistic sparsification, followed by a nonlocal pixel exchange that avoids getting trapped in bad local optima. After this spatial optimisation we perform a tonal optimisation that modifies the function values in order to reduce the global reconstruction error. For homogeneous diffusion inpainting, this comes down to a least squares problem for which we prove that it has a unique solution. We demonstrate that it can be found efficiently with a gradient descent approach that is accelerated with fast explicit diffusion (FED) cycles. Our framework allows to specify the desired density of the inpainting mask a priori. Moreover, is more generic than other data optimisation approaches for the sparse inpainting problem, since it can also be extended to nonlinear inpainting operators such as EED. This is exploited to achieve reconstructions with state-of-the-art quality. We also give an extensive literature survey on PDE-based image compression methods

    Side information in robust principal component analysis: algorithms and applications

    Get PDF
    Dimensionality reduction and noise removal are fundamental machine learning tasks that are vital to artificial intelligence applications. Principal component analysis has long been utilised in computer vision to achieve the above mentioned goals. Recently, it has been enhanced in terms of robustness to outliers in robust principal component analysis. Both convex and non-convex programs have been developed to solve this new formulation, some with exact convergence guarantees. Its effectiveness can be witnessed in image and video applications ranging from image denoising and alignment to background separation and face recognition. However, robust principal component analysis is by no means perfect. This dissertation identifies its limitations, explores various promising options for improvement and validates the proposed algorithms on both synthetic and real-world datasets. Common algorithms approximate the NP-hard formulation of robust principal component analysis with convex envelopes. Though under certain assumptions exact recovery can be guaranteed, the relaxation margin is too big to be squandered. In this work, we propose to apply gradient descent on the Burer-Monteiro bilinear matrix factorisation to squeeze this margin given available subspaces. This non-convex approach improves upon conventional convex approaches both in terms of accuracy and speed. On the other hand, oftentimes there is accompanying side information when an observation is made. The ability to assimilate such auxiliary sources of data can ameliorate the recovery process. In this work, we investigate in-depth such possibilities for incorporating side information in restoring the true underlining low-rank component from gross sparse noise. Lastly, tensors, also known as multi-dimensional arrays, represent real-world data more naturally than matrices. It is thus advantageous to adapt robust principal component analysis to tensors. Since there is no exact equivalence between tensor rank and matrix rank, we employ the notions of Tucker rank and CP rank as our optimisation objectives. Overall, this dissertation carefully defines the problems when facing real-world computer vision challenges, extensively and impartially evaluates the state-of-the-art approaches, proposes novel solutions and provides sufficient validations on both simulated data and popular real-world datasets for various mainstream computer vision tasks.Open Acces

    Linear Shape Deformation Models with Local Support Using Graph-based Structured Matrix Factorisation

    Get PDF
    Representing 3D shape deformations by linear models in high-dimensional space has many applications in computer vision and medical imaging, such as shape-based interpolation or segmentation. Commonly, using Principal Components Analysis a low-dimensional (affine) subspace of the high-dimensional shape space is determined. However, the resulting factors (the most dominant eigenvectors of the covariance matrix) have global support, i.e. changing the coefficient of a single factor deforms the entire shape. In this paper, a method to obtain deformation factors with local support is presented. The benefits of such models include better flexibility and interpretability as well as the possibility of interactively deforming shapes locally. For that, based on a well-grounded theoretical motivation, we formulate a matrix factorisation problem employing sparsity and graph-based regularisation terms. We demonstrate that for brain shapes our method outperforms the state of the art in local support models with respect to generalisation ability and sparse shape reconstruction, whereas for human body shapes our method gives more realistic deformations.Comment: Please cite CVPR 2016 versio

    Robotic Cameraman for Augmented Reality based Broadcast and Demonstration

    Get PDF
    In recent years, a number of large enterprises have gradually begun to use vari-ous Augmented Reality technologies to prominently improve the audiences’ view oftheir products. Among them, the creation of an immersive virtual interactive scenethrough the projection has received extensive attention, and this technique refers toprojection SAR, which is short for projection spatial augmented reality. However,as the existing projection-SAR systems have immobility and limited working range,they have a huge difficulty to be accepted and used in human daily life. Therefore,this thesis research has proposed a technically feasible optimization scheme so thatit can be practically applied to AR broadcasting and demonstrations. Based on three main techniques required by state-of-art projection SAR applica-tions, this thesis has created a novel mobile projection SAR cameraman for ARbroadcasting and demonstration. Firstly, by combining the CNN scene parsingmodel and multiple contour extractors, the proposed contour extraction pipelinecan always detect the optimal contour information in non-HD or blurred images.This algorithm reduces the dependency on high quality visual sensors and solves theproblems of low contour extraction accuracy in motion blurred images. Secondly, aplane-based visual mapping algorithm is introduced to solve the difficulties of visualmapping in these low-texture scenarios. Finally, a complete process of designing theprojection SAR cameraman robot is introduced. This part has solved three mainproblems in mobile projection-SAR applications: (i) a new method for marking con-tour on projection model is proposed to replace the model rendering process. Bycombining contour features and geometric features, users can identify objects oncolourless model easily. (ii) a camera initial pose estimation method is developedbased on visual tracking algorithms, which can register the start pose of robot to thewhole scene in Unity3D. (iii) a novel data transmission approach is introduced to establishes a link between external robot and the robot in Unity3D simulation work-space. This makes the robotic cameraman can simulate its trajectory in Unity3D simulation work-space and project correct virtual content. Our proposed mobile projection SAR system has made outstanding contributionsto the academic value and practicality of the existing projection SAR technique. Itfirstly solves the problem of limited working range. When the system is running ina large indoor scene, it can follow the user and project dynamic interactive virtualcontent automatically instead of increasing the number of visual sensors. Then,it creates a more immersive experience for audience since it supports the user hasmore body gestures and richer virtual-real interactive plays. Lastly, a mobile systemdoes not require up-front frameworks and cheaper and has provided the public aninnovative choice for indoor broadcasting and exhibitions

    Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation

    Full text link
    Generative models for 3D geometric data arise in many important applications in 3D computer vision and graphics. In this paper, we focus on 3D deformable shapes that share a common topological structure, such as human faces and bodies. Morphable Models and their variants, despite their linear formulation, have been widely used for shape representation, while most of the recently proposed nonlinear approaches resort to intermediate representations, such as 3D voxel grids or 2D views. In this work, we introduce a novel graph convolutional operator, acting directly on the 3D mesh, that explicitly models the inductive bias of the fixed underlying graph. This is achieved by enforcing consistent local orderings of the vertices of the graph, through the spiral operator, thus breaking the permutation invariance property that is adopted by all the prior work on Graph Neural Networks. Our operator comes by construction with desirable properties (anisotropic, topology-aware, lightweight, easy-to-optimise), and by using it as a building block for traditional deep generative architectures, we demonstrate state-of-the-art results on a variety of 3D shape datasets compared to the linear Morphable Model and other graph convolutional operators.Comment: to appear at ICCV 201

    Local wavelet features for statistical object classification and localisation

    Get PDF
    This article presents a system for texture-based probabilistic classification and localisation of 3D objects in 2D digital images and discusses selected applications. The objects are described by local feature vectors computed using the wavelet transform. In the training phase, object features are statistically modelled as normal density functions. In the recognition phase, a maximisation algorithm compares the learned density functions with the feature vectors extracted from a real scene and yields the classes and poses of objects found in it. Experiments carried out on a real dataset of over 40000 images demonstrate the robustness of the system in terms of classification and localisation accuracy. Finally, two important application scenarios are discussed, namely classification of museum artefacts and classification of metallography images
    corecore