969 research outputs found

    Playable Environments: {V}ideo Manipulation in Space and Time

    Get PDF
    We present Playable Environments-a new representation for interactive video generation and manipulation in space and time. With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a video by providing a sequence of desired actions. The actions are learnt in an unsupervised manner. The camera can be controlled to get the desired viewpoint. Our method builds an environment state for each frame, which can be manipulated by our proposed action mod-ule and decoded back to the image space with volumetric rendering. To support diverse appearances of objects, we extend neural radiance fields with style-based modulation. Our method trains on a collection of various monocular videos requiring only the estimated camera parameters and 2D object locations. To set a challenging benchmark, we in-troduce two large scale video datasets with significant cam-era movements. As evidenced by our experiments, playable environments enable several creative applications not at-tainable by prior video synthesis works, including playable 3D video generation, stylization and manipulation 1 1 willi-menapace.github.io/playable-environments-website

    Learning shape correspondence with anisotropic convolutional neural networks

    Get PDF
    Establishing correspondence between shapes is a fundamental problem in geometry processing, arising in a wide variety of applications. The problem is especially difficult in the setting of non-isometric deformations, as well as in the presence of topological noise and missing parts, mainly due to the limited capability to model such deformations axiomatically. Several recent works showed that invariance to complex shape transformations can be learned from examples. In this paper, we introduce an intrinsic convolutional neural network architecture based on anisotropic diffusion kernels, which we term Anisotropic Convolutional Neural Network (ACNN). In our construction, we generalize convolutions to non-Euclidean domains by constructing a set of oriented anisotropic diffusion kernels, creating in this way a local intrinsic polar representation of the data (`patch'), which is then correlated with a filter. Several cascades of such filters, linear, and non-linear operators are stacked to form a deep neural network whose parameters are learned by minimizing a task-specific cost. We use ACNNs to effectively learn intrinsic dense correspondences between deformable shapes in very challenging settings, achieving state-of-the-art results on some of the most difficult recent correspondence benchmarks

    Crowdsourcing in Computer Vision

    Full text link
    Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

    An Automatic Level Set Based Liver Segmentation from MRI Data Sets

    Get PDF
    A fast and accurate liver segmentation method is a challenging work in medical image analysis area. Liver segmentation is an important process for computer-assisted diagnosis, pre-evaluation of liver transplantation and therapy planning of liver tumors. There are several advantages of magnetic resonance imaging such as free form ionizing radiation and good contrast visualization of soft tissue. Also, innovations in recent technology and image acquisition techniques have made magnetic resonance imaging a major tool in modern medicine. However, the use of magnetic resonance images for liver segmentation has been slow when we compare applications with the central nervous systems and musculoskeletal. The reasons are irregular shape, size and position of the liver, contrast agent effects and similarities of the gray values of neighbor organs. Therefore, in this study, we present a fully automatic liver segmentation method by using an approximation of the level set based contour evolution from T2 weighted magnetic resonance data sets. The method avoids solving partial differential equations and applies only integer operations with a two-cycle segmentation algorithm. The efficiency of the proposed approach is achieved by applying the algorithm to all slices with a constant number of iteration and performing the contour evolution without any user defined initial contour. The obtained results are evaluated with four different similarity measures and they show that the automatic segmentation approach gives successful results
    corecore