Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
Bilateral filters are in widespread use due to their edge-preserving properties. The common use case is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we generalize this parametrization and, in particular, derive a gradient descent algorithm so that the filter parameters can be learned from data. This derivation allows learning of high-dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be used in several diverse applications. First, we demonstrate their use in applications where a single filter application is desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for use on high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the usage of general forms of filters.
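The classical bilateral filter that this paper generalizes weights each neighbour by a spatial Gaussian times a range (intensity) Gaussian, so pixels across a strong edge contribute almost nothing. A brute-force grayscale sketch (our own minimal illustration; parameter names and defaults are not the paper's):

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.1, radius=4):
    """Brute-force edge-preserving bilateral filter on a 2D grayscale image."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            # Clip the window to the image bounds.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = img[y0:y1, x0:x1]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            # Spatial Gaussian on pixel distance, range Gaussian on intensity difference.
            spatial = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            rng = np.exp(-((patch - img[y, x]) ** 2) / (2 * sigma_r ** 2))
            weights = spatial * rng
            out[y, x] = (weights * patch).sum() / weights.sum()
    return out
```

Replacing the fixed Gaussian weights with learned filter values, and the (x, y, intensity) coordinates with general high-dimensional features on a permutohedral lattice, is the generalization the abstract describes.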
Cardiac Motion Analysis Based on Optical Flow on Real-Time Three-Dimensional Ultrasound Data
With relatively high frame rates and the ability to acquire volume data sets with a stationary transducer, 3D ultrasound systems based on matrix phased-array transducers provide valuable three-dimensional information from which quantitative measures of cardiac function can be extracted. Such analyses require segmentation and visual tracking of the left ventricular endocardial border. Due to the large size of the volumetric data sets, manual tracing of the endocardial border is tedious and impractical for clinical applications. Therefore, the development of automatic methods for tracking three-dimensional endocardial motion is essential. In this study, we evaluate a four-dimensional optical flow motion tracking algorithm to determine its capability to follow the endocardial border through time in three-dimensional ultrasound data. The four-dimensional optical flow method was implemented using three-dimensional correlation. We tested the algorithm on an experimental open-chest dog data set and a clinical data set acquired with a Philips iE33 three-dimensional ultrasound machine. Initialized with left ventricular endocardial data points obtained from manual tracing at end-diastole, the algorithm automatically tracked these points frame by frame through the whole cardiac cycle. Finite element surfaces were fitted through the data points obtained by both optical flow tracking and manual tracing by an experienced observer for quantitative comparison of the results. Parameterization of the finite element surfaces was performed, and maps displaying relative differences between the manual and semi-automatic methods were compared. The results showed good consistency, with less than 10% difference between manual tracing and optical flow estimation over 73% of the entire surface.
In addition, the optical flow motion tracking algorithm greatly reduced the processing time (by about 94% per cardiac cycle compared to manual analysis) for analyzing cardiac function in three-dimensional ultrasound data sets. A displacement field was computed from the optical flow output, and a framework for the computation of dynamic cardiac information is introduced. The method was applied to a clinical data set from a heart transplant patient, and the dynamic measurements agreed with known physiology as well as with experimental results.
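The correlation-based tracking described above can be illustrated with a minimal single-point displacement search between two volumes (an exhaustive-search sketch under our own block and search-window sizes, not the study's implementation):

```python
import numpy as np

def track_point(vol0, vol1, p, block=3, search=4):
    """Estimate the integer displacement of point p between two 3D volumes
    by maximizing normalized cross-correlation of a local (2*block+1)^3
    neighbourhood over a (2*search+1)^3 search window."""
    z, y, x = p
    ref = vol0[z - block:z + block + 1,
               y - block:y + block + 1,
               x - block:x + block + 1].astype(float)
    ref = ref - ref.mean()
    best_score, best_d = -np.inf, (0, 0, 0)
    for dz in range(-search, search + 1):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                cand = vol1[z + dz - block:z + dz + block + 1,
                            y + dy - block:y + dy + block + 1,
                            x + dx - block:x + dx + block + 1].astype(float)
                if cand.shape != ref.shape:  # window fell off the volume
                    continue
                cand = cand - cand.mean()
                denom = np.sqrt((ref ** 2).sum() * (cand ** 2).sum())
                if denom == 0:
                    continue
                score = (ref * cand).sum() / denom
                if score > best_score:
                    best_score, best_d = score, (dz, dy, dx)
    return best_d
```

Applied to every traced endocardial point and every consecutive frame pair, such a search yields the frame-to-frame displacement field from which dynamic measurements can be computed.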
Three-dimensional reconstruction of plant roots via low-energy X-ray computed tomography
Plant roots are vital organs for water and nutrient uptake. The structure and spatial distribution of plant roots in the soil affect a plant's physiological functions such as soil-based resource acquisition, yield, and its ability to survive abiotic stress. Visualizing and quantifying the roots' configuration below the ground can help in identifying the phenotypic traits responsible for a plant's physiological functions. Existing efforts have successfully employed X-ray computed tomography to visualize plant roots in three dimensions and to quantify their complexity in a non-invasive and non-destructive manner. However, they used expensive and less accessible industrial or medical tomographic systems. This research uses an inexpensive, lab-built X-ray computed tomography (CT) system, operating at lower energy levels (30-40 kV), to obtain two-dimensional projections of a plant root from different viewpoints. I propose image processing pipelines to segment roots and generate a three-dimensional model of the root system architecture from the two-dimensional projections. Observing that a Gaussian-shaped curve can approximate the cross-sectional intensity profile of a root segment, I propose a novel multi-scale matched filtering with a two-dimensional Gaussian kernel to enhance the root system. The filter assumes different orientations to highlight root segments grown in different directions. The roots are isolated from the background by manual thresholding, followed by a mathematical morphological process to reduce spurious noise. The segmented images are then filtered back-projected to generate a three-dimensional model of the plant root system. The results show that the proposed method yields a structurally consistent three-dimensional model for the plant root image set obtained in air, whereas alternative methods could not process this image set.
For plant root images collected in air, the three-dimensional model generated from the proposed matched-guided filtering and filtered back projection has a better contrast measure (0.0036) than the contrast measure (0.099) of the three-dimensional model created from the raw images. For plant root images captured in soil, the proposed multi-scale matched filtering resulted in better receiver operating characteristic curves than the raw images. Compared to Otsu's thresholding, multi-scale root enhancement and thresholding reduced the average false positive rate from 0.344 to 0.042 and improved the average F1 score from 0.4 to 0.775. Experimental results show that the proposed root enhancement methods are robust to the number of orientational filters chosen, and are sensitive to the filter length selected. Small filters are preferred, since increasing the filter length increases the number of false positives around root segments. Includes bibliographical references.
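A single-scale sketch of the oriented matched-filtering idea: a zero-mean kernel that is Gaussian across the root's cross-section and flat along its axis, swept over several orientations, with the maximum response kept per pixel (the kernel size, sigma, and FFT-based correlation are our illustrative choices; the thesis additionally varies the scale):

```python
import numpy as np

def oriented_matched_kernel(length=9, sigma=1.5, theta=0.0):
    """Zero-mean kernel: Gaussian across the root cross-section, flat along
    the root axis, rotated by theta radians."""
    half = length // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    v = -x * np.sin(theta) + y * np.cos(theta)  # coordinate across the root
    k = np.exp(-v ** 2 / (2 * sigma ** 2))
    return k - k.mean()  # zero mean: flat background yields zero response

def multi_orientation_response(img, n_orient=8, **kw):
    """Pixel-wise maximum correlation over equally spaced orientations."""
    best = np.full(img.shape, -np.inf)
    for i in range(n_orient):
        k = oriented_matched_kernel(theta=np.pi * i / n_orient, **kw)
        kpad = np.zeros_like(img)
        kh, kw_ = k.shape
        kpad[:kh, :kw_] = k
        kpad = np.roll(kpad, (-(kh // 2), -(kw_ // 2)), axis=(0, 1))
        # Circular cross-correlation via the FFT, kernel centred at the origin.
        resp = np.real(np.fft.ifft2(np.fft.fft2(img) * np.conj(np.fft.fft2(kpad))))
        best = np.maximum(best, resp)
    return best
```

Thresholding the response map (manually, as in the abstract) then isolates the root segments before filtered back projection.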
Efficient data structures for piecewise-smooth video processing
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Includes bibliographical references (p. 95-102). A number of useful image and video processing techniques, ranging from low-level operations such as denoising and detail enhancement to higher-level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edge-aware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized on modern graphics hardware for real-time processing of high-definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time.
The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and to propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D-to-3D video conversion. By Jiawen Chen. Ph.D.
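The bilateral grid's splat-blur-slice pipeline can be sketched for grayscale data as follows (a nearest-neighbour sketch with a simple 1-2-1 blur; the full method uses smoother splatting/slicing and GPU parallelism, and the grid resolutions here are illustrative):

```python
import numpy as np

def bilateral_grid_filter(img, sigma_s=4, sigma_r=0.1):
    """Bilateral filtering via the bilateral grid: splat the image into a
    coarse (y, x, intensity) grid, blur the grid, then slice back out."""
    h, w = img.shape
    gh, gw = int(h / sigma_s) + 2, int(w / sigma_s) + 2
    gr = int(1.0 / sigma_r) + 2          # intensity assumed in [0, 1]
    data = np.zeros((gh, gw, gr))        # accumulated intensity
    weight = np.zeros((gh, gw, gr))      # homogeneous weight channel
    ys, xs = np.mgrid[0:h, 0:w]
    gy = np.round(ys / sigma_s).astype(int)
    gx = np.round(xs / sigma_s).astype(int)
    gz = np.round(img / sigma_r).astype(int)
    # Splat: scatter-add every pixel into its grid cell.
    np.add.at(data, (gy, gx, gz), img)
    np.add.at(weight, (gy, gx, gz), 1.0)
    # Blur: separable 1-2-1 kernel along each grid axis.
    for axis in range(3):
        for arr in (data, weight):
            arr[...] = (np.roll(arr, 1, axis) + 2 * arr + np.roll(arr, -1, axis)) / 4
    # Slice: read back at each pixel's grid position (nearest neighbour).
    out = data[gy, gx, gz] / np.maximum(weight[gy, gx, gz], 1e-8)
    return out
```

Because the grid is far coarser than the image, the blur touches many fewer cells than a brute-force bilateral filter touches pixels, which is what makes real-time high-definition processing feasible.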
Geometric Structure Extraction and Reconstruction
Geometric structure extraction and reconstruction is a long-standing problem in research communities including computer graphics, computer vision, and machine learning. Within different communities, it can be interpreted as different subproblems, such as skeleton extraction from point clouds, surface reconstruction from multi-view images, or manifold learning from high-dimensional data. All these subproblems are building blocks of many modern applications, such as scene reconstruction for AR/VR, object recognition for robotic vision, and structural analysis for big data. Despite its importance, the extraction and reconstruction of geometric structure from real-world data is ill-posed, with the main challenges lying in the incompleteness, noise, and inconsistency of the raw input data. To address these challenges, three studies are conducted in this thesis: i) a new point set representation for shape completion, ii) a structure-aware data consolidation method, and iii) a data-driven deep learning technique for multi-view consistency. In addition to the theoretical contributions, the proposed algorithms significantly improve the performance of several state-of-the-art geometric structure extraction and reconstruction approaches, as validated by extensive experimental results.
Implicit Surface Reconstruction from Scattered Point Clouds on Octrees and Feature Detection on Implicit Surfaces
Thesis (Ph.D.)--Seoul National University Graduate School, Dept. of Mathematical Sciences, February 2013. In this thesis, we are concerned with a reverse engineering process using implicit surfaces represented by level sets. We consider two methods: one reconstructs an implicit surface from scattered point data on an octree, and the other detects features such as edges and corners on the implicit surface.
Our surface reconstruction method is based on the level set method on an octree, i.e., a kind of adaptive grid. We start from the surface reconstruction model proposed in Ye's work, where the surface reconstruction process is treated as an elliptic problem, while most previous methods employed a time-marching process from an initial surface toward the point cloud. However, since their method is implemented on a uniform grid, it suffers from inefficiencies such as high memory cost. We improve it by adapting the octree data structure to our problem and by introducing a new redistancing algorithm that differs from the existing one.
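On a uniform grid, the redistancing step that recomputes a distance function is classically done with the fast sweeping method; a minimal 2D sketch follows (unsigned distance from a point source, our own illustration; the thesis's contribution is the octree adaptation, which this sketch does not attempt):

```python
import numpy as np

def fast_sweep(dist, h=1.0, n_sweeps=4):
    """Solve the eikonal equation |grad d| = 1 on a uniform 2D grid by
    Gauss-Seidel sweeps in four alternating orderings (classical fast
    sweeping). Initialize dist to 0 on the interface and inf elsewhere."""
    ny, nx = dist.shape
    orders = [(range(ny), range(nx)),
              (range(ny), range(nx - 1, -1, -1)),
              (range(ny - 1, -1, -1), range(nx)),
              (range(ny - 1, -1, -1), range(nx - 1, -1, -1))]
    for _ in range(n_sweeps):
        for ys, xs in orders:
            for i in ys:
                for j in xs:
                    # Smallest upwind neighbour in each direction.
                    a = min(dist[i - 1, j] if i > 0 else np.inf,
                            dist[i + 1, j] if i < ny - 1 else np.inf)
                    b = min(dist[i, j - 1] if j > 0 else np.inf,
                            dist[i, j + 1] if j < nx - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue
                    if abs(a - b) >= h:          # update from one side only
                        d_new = min(a, b) + h
                    else:                        # two-sided quadratic update
                        d_new = 0.5 * (a + b + np.sqrt(2 * h ** 2 - (a - b) ** 2))
                    dist[i, j] = min(dist[i, j], d_new)
    return dist
```

The octree variant in the thesis replaces the fixed spacing h by cell-dependent spacings, but the monotone upwind update is the same idea.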
We also address feature detection from 3D CT images, which are a form of implicit surface. While a laser scanner is accurate and has little noise, it cannot examine the inside of an object, so CT scanners have recently become popular for the non-destructive inspection of mechanical parts. For reverse engineering, however, we must transform the 3D image data into B-spline surface data in order to use it in CAD software, that is, convert the implicit surface into a parametric surface. In that process, we need feature detection for the parametrization of the surface. However, CT data have more artifacts, such as noise and blur, than laser scans, so preprocessing to reduce these artifacts is required. We apply some existing denoising algorithms to the CT image data and then extract edges and corners with our feature detection method.
1. Introduction
2. Surface Reconstruction Method from Scattered Point Data on Octree
2.1 Previous work
2.1.1 History
2.1.2 Fast sweeping method
2.1.3 Basic finite difference methods on octree
2.1.4 Biconjugate gradient stabilized (BiCGSTAB) algorithm
2.2 Mathematical models
2.3 Numerical method
2.3.1 Tree generation and splitting condition
2.3.2 Distance function
2.3.3 Initial guess of signed distance function
2.3.4 Numerical discretization of model (2.2.6) on octree
2.4 Results
2.4.1 Five-leafed clover
2.4.2 Bunny, Dragon, Happy Buddha
3. Feature Detection on Implicit Surface
3.1 Related work and background
3.1.1 Segmentation with the level set method
3.1.2 Signed distance function
3.1.3 Nonlocal means filtering
3.2 Corner and sharp edge detection
3.2.1 Corner detection
3.2.2 Sharp edge detection
3.2.3 False feature removal
3.3 Results
4. Conclusion and Further Work