
    Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks

    Bilateral filters are widely used because of their edge-preserving properties. The common practice is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we generalize this parametrization and, in particular, derive a gradient descent algorithm so that the filter parameters can be learned from data. This derivation allows us to learn high-dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be exploited in several diverse applications. First, we demonstrate its use in applications where a single filter application is desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for processing high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the use of general forms of filters.
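    The core idea, that a bilateral-style filter's parameters need not be fixed to a Gaussian but can be fitted by gradient descent, can be illustrated with a small sketch. The following is not the paper's permutohedral-lattice implementation; it is a toy 1-D version in which both the spatial taps and a quantized range kernel are learnable parameters, and the signal, supervision target, and hyperparameters are made-up assumptions.

    import torch

    torch.manual_seed(0)

    signal = torch.rand(256)     # toy 1-D input
    target = signal.clone()      # stand-in supervision target (assumed)

    R = 4                        # spatial half-window
    BINS = 16                    # quantization bins for intensity differences

    # Learnable parameters, initialized to the classic hand-chosen Gaussians.
    spatial = torch.nn.Parameter(
        torch.exp(-torch.arange(-R, R + 1.0) ** 2 / (2 * 2.0 ** 2)))
    range_lut = torch.nn.Parameter(
        torch.exp(-torch.linspace(0, 1, BINS) ** 2 / (2 * 0.1 ** 2)))

    def filter_signal(x):
        """Bilateral-style filter: weights depend on offset and intensity difference."""
        pad = torch.nn.functional.pad(x[None, None], (R, R), mode='replicate')[0, 0]
        out = torch.zeros_like(x)
        norm = torch.zeros_like(x)
        for k in range(2 * R + 1):
            shifted = pad[k:k + x.numel()]
            bins = ((shifted - x).abs().clamp(0, 1) * (BINS - 1)).long()
            w = spatial[k] * range_lut[bins]   # learned spatial tap * learned range weight
            out = out + w * shifted
            norm = norm + w
        return out / (norm + 1e-8)

    # Gradient descent on the filter parameters themselves.
    opt = torch.optim.Adam([spatial, range_lut], lr=1e-2)
    for step in range(200):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(filter_signal(signal), target)
        loss.backward()
        opt.step()
    print('final loss:', loss.item())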

    Three dimensional reconstruction of plant roots via low energy x-ray computed tomography

    Plant roots are vital organs for water and nutrient uptake. The structure and spatial distribution of roots in the soil affect a plant's physiological functions such as soil-based resource acquisition, yield, and its ability to survive abiotic stress. Visualizing and quantifying root configuration below the ground can help identify the phenotypic traits responsible for these functions. Existing efforts have successfully employed X-ray computed tomography to visualize plant roots in three dimensions and to quantify their complexity in a non-invasive and non-destructive manner; however, they used expensive and less accessible industrial or medical tomographic systems. This research uses an inexpensive, lab-built X-ray computed tomography (CT) system, operating at lower energy levels (30 kV-40 kV), to obtain two-dimensional projections of a plant root from different viewpoints. I propose image processing pipelines to segment roots and generate a three-dimensional model of the root system architecture from the two-dimensional projections. Observing that a Gaussian-shaped curve can approximate the cross-sectional intensity profile of a root segment, I propose a novel multi-scale matched filtering with a two-dimensional Gaussian kernel to enhance the root system. The filter assumes different orientations to highlight root segments grown in different directions. The roots are isolated from the background by manual thresholding, followed by mathematical morphological processing to reduce spurious noise. The segmented images are then filtered back-projected to generate a three-dimensional model of the plant root system. The results show that the proposed method yields a structurally consistent three-dimensional model for the plant root image set obtained in air, whereas alternative methods could not process this image set. For plant root images collected in air, the three-dimensional model generated from the proposed matched-guided filtering and filtered back projection has a better contrast measure (0.0036) than the contrast measure (0.099) of the three-dimensional model created from raw images. For plant root images captured in soil, the proposed multi-scale matched filtering produced better receiver operating characteristic curves than the raw images. Compared to Otsu's thresholding, multi-scale root enhancement and thresholding reduced the average false positive rate from 0.344 to 0.042 and improved the average F1 score from 0.4 to 0.775. Experimental results show that the proposed root enhancement methods are robust to the number of orientational filters chosen and sensitive to the selected filter length. Small filters are preferred, since increasing the filter length increases the number of false positives around root segments. Includes bibliographical references.
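    As a rough illustration of the enhancement step (not the thesis' implementation), the sketch below builds zero-mean kernels whose cross-section is Gaussian and whose length direction is constant, rotates them over several orientations and scales, and keeps the maximum response per pixel. The kernel sizes, sigmas, orientation count, and the synthetic test image are all assumptions.

    import numpy as np
    from scipy.ndimage import convolve, rotate

    def matched_kernel(sigma, length):
        """Kernel: Gaussian across the root's cross-section, constant along its length."""
        half = int(3 * sigma)
        x = np.arange(-half, half + 1)
        profile = np.exp(-x**2 / (2 * sigma**2))    # Gaussian cross-section
        kernel = np.tile(profile, (length, 1))      # extend along the root axis
        return kernel - kernel.mean()               # zero mean: flat regions respond with ~0

    def enhance_roots(image, sigmas=(1.0, 2.0, 3.0), length=15, n_orientations=12):
        """Max response over scales and orientations approximates a root-likeness map."""
        response = np.full(image.shape, -np.inf)
        for sigma in sigmas:
            base = matched_kernel(sigma, length)
            for angle in np.linspace(0, 180, n_orientations, endpoint=False):
                k = rotate(base, angle, reshape=True, order=1)
                response = np.maximum(response, convolve(image, k, mode='nearest'))
        return response

    # Usage with a synthetic projection image (a faint vertical "root" on noise):
    img = np.random.rand(128, 128) * 0.1
    img[:, 60:63] += 0.8
    root_map = enhance_roots(img)
    mask = root_map > np.percentile(root_map, 99)   # stand-in for manual thresholding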

    Efficient data structures for piecewise-smooth video processing

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 95-102). A number of useful image and video processing techniques, ranging from low-level operations such as denoising and detail enhancement to higher-level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edge-aware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized onto modern graphics hardware for real-time processing of high-definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and to propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D-to-3D video conversion. By Jiawen Chen. Ph.D.
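    To make the bilateral grid idea concrete, here is a minimal sketch of the splat-blur-slice pipeline it enables, using nearest-neighbor splatting and slicing instead of the trilinear interpolation and GPU implementation described in the thesis; the grid resolutions, sigmas, and test image are assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def bilateral_grid_filter(image, sigma_s=8, sigma_r=0.1):
        """image: 2-D float array with values in [0, 1]."""
        h, w = image.shape
        gh, gw = int(np.ceil(h / sigma_s)) + 1, int(np.ceil(w / sigma_s)) + 1
        gr = int(np.ceil(1.0 / sigma_r)) + 1

        # Splat: accumulate (value, weight) homogeneous pairs into a coarse
        # 3-D grid whose axes are downsampled x, y, and intensity.
        grid = np.zeros((gh, gw, gr, 2))
        ys, xs = np.mgrid[0:h, 0:w]
        gi = (ys / sigma_s).round().astype(int)
        gj = (xs / sigma_s).round().astype(int)
        gk = (image / sigma_r).round().astype(int)
        np.add.at(grid, (gi, gj, gk, 0), image)
        np.add.at(grid, (gi, gj, gk, 1), 1.0)

        # Blur: a small Gaussian over the grid approximates the bilateral kernel.
        grid = gaussian_filter(grid, sigma=(1, 1, 1, 0))

        # Slice: read the blurred grid back at each pixel's (space, range)
        # location and normalize by the weight channel.
        num = grid[gi, gj, gk, 0]
        den = grid[gi, gj, gk, 1]
        return num / np.maximum(den, 1e-8)

    # Usage: edge-aware smoothing of a noisy step image.
    img = np.clip(np.tile(np.linspace(0, 1, 64) > 0.5, (64, 1)).astype(float)
                  + 0.05 * np.random.randn(64, 64), 0, 1)
    smoothed = bilateral_grid_filter(img)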

    Geometric Structure Extraction and Reconstruction

    Geometric structure extraction and reconstruction is a long-standing problem in research communities including computer graphics, computer vision, and machine learning. Within different communities, it can be interpreted as different subproblems, such as skeleton extraction from point clouds, surface reconstruction from multi-view images, or manifold learning from high-dimensional data. All of these subproblems are building blocks of many modern applications, such as scene reconstruction for AR/VR, object recognition for robotic vision, and structural analysis for big data. Despite its importance, extracting and reconstructing geometric structure from real-world data is ill-posed; the main challenges lie in the incompleteness, noise, and inconsistency of the raw input data. To address these challenges, three studies are conducted in this thesis: i) a new point set representation for shape completion, ii) a structure-aware data consolidation method, and iii) a data-driven deep learning technique for multi-view consistency. In addition to these theoretical contributions, the proposed algorithms significantly improve the performance of several state-of-the-art geometric structure extraction and reconstruction approaches, as validated by extensive experimental results.

    ํŒ”์ง„ํŠธ๋ฆฌ์ƒ์—์„œ ์‚ฐ์žฌํ•œ ์ ๊ตฐ์œผ๋กœ๋ถ€ํ„ฐ์˜ ์Œํ•จ์ˆ˜ ๊ณก๋ฉด ์žฌ๊ตฌ์„ฑ๊ณผ ์Œํ•จ์ˆ˜ ๊ณก๋ฉด์˜ ํŠน์ง• ํƒ์ง€

    Thesis (Ph.D.)--Seoul National University Graduate School, Department of Mathematical Sciences, February 2013. Advisor: 강명주. In this thesis, we are concerned with the reverse engineering process using implicit surfaces represented by level sets. We consider two methods: one reconstructs an implicit surface from scattered point data on an octree, and the other detects features such as edges and corners on the implicit surface. Our surface reconstruction method is based on the level set method using an octree, i.e., a kind of adaptive grid. We start with the surface reconstruction model proposed in Ye's work, which treats surface reconstruction as an elliptic problem, whereas most previous methods employed a time-marching process from an initial surface toward the point cloud. However, since their method is implemented on a uniform grid, it suffers from inefficiencies such as high memory cost. We improve it by adapting an octree data structure to our problem and by introducing a new redistancing algorithm that differs from the existing one. We also address feature detection from 3D CT images, which are a form of implicit surface. While a laser scanner is accurate and introduces little noise, it cannot examine the inside of an object, so CT scanners are becoming popular for non-destructive inspection of mechanical parts. For reverse engineering, however, we must transform the 3D image data into B-spline surface data in order to use it in CAD software, that is, convert the implicit surface into a parametric surface. That process requires feature detection for surface parametrization, but CT data contains more artifacts, such as noise and blur, than laser-scanned data, so preprocessing to reduce artifacts is required. We apply existing denoising algorithms to the CT image data and then extract edges and corners with our feature detection method.
    Contents:
    1. Introduction
    2. Surface Reconstruction Method from Scattered Point Data on Octree
       2.1 Previous work (2.1.1 History; 2.1.2 Fast sweeping method; 2.1.3 Basic finite difference methods on octree; 2.1.4 Biconjugate gradient stabilized (BiCGSTAB) algorithm)
       2.2 Mathematical models
       2.3 Numerical method (2.3.1 Tree generation and splitting condition; 2.3.2 Distance function; 2.3.3 Initial guess of signed distance function; 2.3.4 Numerical discretization of model (2.2.6) on octree)
       2.4 Results (2.4.1 Five-leafed clover; 2.4.2 Bunny, Dragon, Happy Buddha)
    3. Feature Detection on Implicit Surface
       3.1 Related work and background (3.1.1 Segmentation with the level set method; 3.1.2 Signed distance function; 3.1.3 Nonlocal means filtering)
       3.2 Corner and sharp edge detection (3.2.1 Corner detection; 3.2.2 Sharp edge detection; 3.2.3 False feature removal)
       3.3 Results
    4. Conclusion and Further Work
    Doctor
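    The adaptive-grid idea above can be illustrated with a small, self-contained sketch (my own illustration, not the thesis' code): an octree that is refined only in cells containing scattered points, storing at each leaf the unsigned distance from the cell center to the nearest input point as a crude stand-in for the initial signed distance guess. The splitting rule, maximum depth, and sample point cloud are assumptions.

    import numpy as np

    class OctreeNode:
        def __init__(self, center, half, depth):
            self.center = np.asarray(center, float)
            self.half, self.depth = half, depth
            self.children = []        # empty for leaves
            self.distance = None      # filled in for leaves

    def build(node, points, max_depth):
        """Split a cell while it contains points and has not reached max_depth."""
        inside = points[np.all(np.abs(points - node.center) <= node.half, axis=1)]
        if len(inside) == 0 or node.depth == max_depth:
            # Leaf: unsigned distance from the cell center to the nearest point,
            # a crude stand-in for the initial signed distance guess.
            node.distance = np.min(np.linalg.norm(points - node.center, axis=1))
            return
        q = node.half / 2.0
        for dx in (-q, q):
            for dy in (-q, q):
                for dz in (-q, q):
                    child = OctreeNode(node.center + [dx, dy, dz], q, node.depth + 1)
                    build(child, inside, max_depth)
                    node.children.append(child)

    def leaves(node):
        return [node] if not node.children else [l for c in node.children for l in leaves(c)]

    # Usage: points sampled on a sphere of radius 0.5; the tree refines near its surface.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(500, 3))
    pts = 0.5 * pts / np.linalg.norm(pts, axis=1, keepdims=True)
    root = OctreeNode(center=(0, 0, 0), half=1.0, depth=0)
    build(root, pts, max_depth=5)
    print(len(leaves(root)), "leaf cells")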
    • โ€ฆ
    corecore