5,061 research outputs found

    Steered mixture-of-experts for light field images and video : representation and coding

    Get PDF
    Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution

    Image Segmentation with Eigenfunctions of an Anisotropic Diffusion Operator

    Full text link
    We propose the eigenvalue problem of an anisotropic diffusion operator for image segmentation. The diffusion matrix is defined based on the input image. The eigenfunctions and the projection of the input image in some eigenspace capture key features of the input image. An important property of the model is that for many input images, the first few eigenfunctions are close to being piecewise constant, which makes them useful as the basis for a variety of applications such as image segmentation and edge detection. The eigenvalue problem is shown to be related to the algebraic eigenvalue problems resulting from several commonly used discrete spectral clustering models. The relation provides a better understanding and helps developing more efficient numerical implementation and rigorous numerical analysis for discrete spectral segmentation methods. The new continuous model is also different from energy-minimization methods such as geodesic active contour in that no initial guess is required for in the current model. The multi-scale feature is a natural consequence of the anisotropic diffusion operator so there is no need to solve the eigenvalue problem at multiple levels. A numerical implementation based on a finite element method with an anisotropic mesh adaptation strategy is presented. It is shown that the numerical scheme gives much more accurate results on eigenfunctions than uniform meshes. Several interesting features of the model are examined in numerical examples and possible applications are discussed

    SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes

    Full text link
    The objective of this paper is 3D shape understanding from single and multiple images. To this end, we introduce a new deep-learning architecture and loss function, SilNet, that can handle multiple views in an order-agnostic manner. The architecture is fully convolutional, and for training we use a proxy task of silhouette prediction, rather than directly learning a mapping from 2D images to 3D shape as has been the target in most recent work. We demonstrate that with the SilNet architecture there is generalisation over the number of views -- for example, SilNet trained on 2 views can be used with 3 or 4 views at test-time; and performance improves with more views. We introduce two new synthetics datasets: a blobby object dataset useful for pre-training, and a challenging and realistic sculpture dataset; and demonstrate on these datasets that SilNet has indeed learnt 3D shape. Finally, we show that SilNet exceeds the state of the art on the ShapeNet benchmark dataset, and use SilNet to generate novel views of the sculpture dataset.Comment: BMVC 2017; Best Poste

    Knowledge-based systems and geological survey

    Get PDF
    This personal and pragmatic review of the philosophy underpinning methods of geological surveying suggests that important influences of information technology have yet to make their impact. Early approaches took existing systems as metaphors, retaining the separation of maps, map explanations and information archives, organised around map sheets of fixed boundaries, scale and content. But system design should look ahead: a computer-based knowledge system for the same purpose can be built around hierarchies of spatial objects and their relationships, with maps as one means of visualisation, and information types linked as hypermedia and integrated in mark-up languages. The system framework and ontology, derived from the general geoscience model, could support consistent representation of the underlying concepts and maintain reference information on object classes and their behaviour. Models of processes and historical configurations could clarify the reasoning at any level of object detail and introduce new concepts such as complex systems. The up-to-date interpretation might centre on spatial models, constructed with explicit geological reasoning and evaluation of uncertainties. Assuming (at a future time) full computer support, the field survey results could be collected in real time as a multimedia stream, hyperlinked to and interacting with the other parts of the system as appropriate. Throughout, the knowledge is seen as human knowledge, with interactive computer support for recording and storing the information and processing it by such means as interpolating, correlating, browsing, selecting, retrieving, manipulating, calculating, analysing, generalising, filtering, visualising and delivering the results. Responsibilities may have to be reconsidered for various aspects of the system, such as: field surveying; spatial models and interpretation; geological processes, past configurations and reasoning; standard setting, system framework and ontology maintenance; training; storage, preservation, and dissemination of digital records

    OctNetFusion: Learning Depth Fusion from Data

    Full text link
    In this paper, we present a learning based approach to depth fusion, i.e., dense 3D reconstruction from multiple depth images. The most common approach to depth fusion is based on averaging truncated signed distance functions, which was originally proposed by Curless and Levoy in 1996. While this method is simple and provides great results, it is not able to reconstruct (partially) occluded surfaces and requires a large number frames to filter out sensor noise and outliers. Motivated by the availability of large 3D model repositories and recent advances in deep learning, we present a novel 3D CNN architecture that learns to predict an implicit surface representation from the input depth maps. Our learning based method significantly outperforms the traditional volumetric fusion approach in terms of noise reduction and outlier suppression. By learning the structure of real world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill in gaps in the reconstruction. We demonstrate that our learning based approach outperforms both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric fusion. Further, we demonstrate state-of-the-art 3D shape completion results.Comment: 3DV 2017, https://github.com/griegler/octnetfusio
    • 

    corecore