Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
Bilateral filters have widespread use due to their edge-preserving
properties. The common use case is to manually choose a parametric filter type,
usually a Gaussian filter. In this paper, we generalize the
parametrization and, in particular, derive a gradient descent algorithm so the
filter parameters can be learned from data. This derivation allows us to learn
high dimensional linear filters that operate in sparsely populated feature
spaces. We build on the permutohedral lattice construction for efficient
filtering. The ability to learn more general forms of high-dimensional filters
can be used in several diverse applications. First, we demonstrate its use in
settings where a single filter application is desired for runtime reasons.
Further, we show how this algorithm can be used to learn the pairwise
potentials in densely connected conditional random fields and apply these to
different image segmentation tasks. Finally, we introduce layers of bilateral
filters in CNNs and propose bilateral neural networks for use on
high-dimensional, sparse data. This view provides new ways to encode model
structure into network architectures. A diverse set of experiments empirically
validates the use of general forms of filters.
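As a point of reference for the hand-chosen parametric case that the paper generalizes, a classic Gaussian bilateral filter on a 1D signal can be sketched as follows. This is an illustrative sketch only; the function name and parameter values are our own, not the paper's, and the paper's contribution is precisely to replace these fixed Gaussian weights with weights learned from data:

```python
import numpy as np

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.2, radius=5):
    """Classic Gaussian bilateral filter on a 1D signal.

    Each output sample is a weighted average of its neighbours, where the
    weight combines spatial proximity (sigma_s) with range/intensity
    similarity (sigma_r) -- the manually chosen parametric filter that the
    paper generalizes by learning the weights.
    """
    out = np.empty_like(signal, dtype=float)
    n = len(signal)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        idx = np.arange(lo, hi)
        spatial = np.exp(-((idx - i) ** 2) / (2 * sigma_s ** 2))
        rng_w = np.exp(-((signal[idx] - signal[i]) ** 2) / (2 * sigma_r ** 2))
        w = spatial * rng_w
        out[i] = np.dot(w, signal[idx]) / w.sum()
    return out

# A noisy step edge: the filter smooths the flat parts but keeps the edge,
# because the range term suppresses weights across the intensity jump.
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(50), np.ones(50)]) + 0.05 * rng.standard_normal(100)
y = bilateral_filter_1d(x)
```

The same splat-blur-slice machinery on the permutohedral lattice lets this kind of filter run efficiently in high-dimensional, sparsely populated feature spaces.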
Learning Task-Specific Generalized Convolutions in the Permutohedral Lattice
Dense prediction tasks typically employ encoder-decoder architectures, but
the prevalent convolutions in the decoder are not image-adaptive and can lead
to boundary artifacts. Different generalized convolution operations have been
introduced to counteract this. We go beyond these by leveraging guidance data
to redefine their inherent notion of proximity. Our proposed network layer
builds on the permutohedral lattice, which performs sparse convolutions in a
high-dimensional space allowing for powerful non-local operations despite small
filters. Multiple features with different characteristics span this
permutohedral space. In contrast to prior work, we learn these features in a
task-specific manner by generalizing the basic permutohedral operations to
learnt feature representations. As the resulting objective is complex, a
carefully designed framework and learning procedure are introduced, yielding
rich feature embeddings in practice. We demonstrate the general applicability
of our approach in different joint upsampling tasks. When adding our network
layer to state-of-the-art networks for optical flow and semantic segmentation,
boundary artifacts are removed and the accuracy is improved.
Comment: To appear at GCPR 201
Deep Learning Applications in Medical Image and Shape Analysis
Deep learning is one of the most rapidly growing fields in computer and data science of the past few years. It has been widely used for feature extraction and recognition in various applications. The training process, treated as a black box, utilizes deep neural networks, whose parameters are adjusted by minimizing the difference between the predicted feedback and the labeled data (the so-called training dataset). The trained model is then applied to unknown inputs to predict results that mimic human decision-making. This technology has found tremendous success in many fields involving data analysis, such as images, shapes, texts, and audio and video signals. In medical applications, images are regularly used by physicians to diagnose diseases, make treatment plans, and track the progress of patient treatment. One of the most challenging and common problems in image processing is the segmentation of features of interest, so-called feature extraction. To this end, we aim to develop a deep learning framework in the current thesis to extract regions of interest in wound images. In addition, we investigate deep learning approaches for the segmentation of 3D surface shapes as a potential tool for surface analysis in our future work. Experiments are presented and discussed for both 2D image and 3D shape analysis using deep learning networks.
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
We present a novel deep neural network architecture for end-to-end scene flow
estimation that directly operates on large-scale 3D point clouds. Inspired by
Bilateral Convolutional Layers (BCL), we propose novel DownBCL, UpBCL, and
CorrBCL operations that restore structural information from unstructured point
clouds, and fuse information from two consecutive point clouds. Operating on
discrete and sparse permutohedral lattice points, our architectural design is
parsimonious in computational cost. Our model can efficiently process a pair of
point cloud frames at once with a maximum of 86K points per frame. Our approach
achieves state-of-the-art performance on the FlyingThings3D and KITTI Scene
Flow 2015 datasets. Moreover, trained on synthetic data, our approach shows
great generalization ability on real-world data and on different point
densities without fine-tuning.
Segmentation-Aware Convolutional Networks Using Local Attention Masks
We introduce an approach to integrate segmentation information within a
convolutional neural network (CNN). This counteracts the tendency of CNNs to
smooth information across regions and increases their spatial precision. To
obtain segmentation information, we set up a CNN to provide an embedding space
where region co-membership can be estimated based on Euclidean distance. We use
these embeddings to compute a local attention mask relative to every neuron
position. We incorporate such masks in CNNs and replace the convolution
operation with a "segmentation-aware" variant that allows a neuron to
selectively attend to inputs coming from its own region. We call the resulting
network a segmentation-aware CNN because it adapts its filters at each image
point according to local segmentation cues. We demonstrate the merit of our
method on two widely different dense prediction tasks that involve
classification (semantic segmentation) and regression (optical flow). Our
results show that in semantic segmentation we can match the performance of
DenseCRFs while being faster and simpler, and in optical flow we obtain clearly
sharper responses than networks that do not use local attention masks. In both
cases, segmentation-aware convolution yields systematic improvements over
strong baselines. Source code for this work is available online at
http://cs.cmu.edu/~aharley/segaware
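The core idea of the abstract above can be made concrete with a toy sketch: neighbours are weighted by the Euclidean distance between per-pixel embeddings, so each pixel attends mostly to inputs from its own region. This is our own simplified illustration (a local average rather than the paper's learned convolution), and the function name and `alpha` parameter are assumptions, not the paper's API:

```python
import numpy as np

def segmentation_aware_average(image, embeddings, radius=1, alpha=1.0):
    """Toy segmentation-aware local average.

    For each pixel (i, j), neighbours (ni, nj) are weighted by
    exp(-alpha * ||e[i,j] - e[ni,nj]||), where e holds per-pixel embedding
    vectors, so averaging does not leak across region boundaries.
    """
    H, W = image.shape
    out = np.zeros_like(image, dtype=float)
    for i in range(H):
        for j in range(W):
            num, den = 0.0, 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        d = np.linalg.norm(embeddings[i, j] - embeddings[ni, nj])
                        w = np.exp(-alpha * d)  # attention mask from embeddings
                        num += w * image[ni, nj]
                        den += w
            out[i, j] = num / den
    return out
```

With embeddings that separate two regions, the output stays sharp at the boundary even though every pixel averages over its full spatial neighbourhood.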
Permutohedral Attention Module for Efficient Non-Local Neural Networks
Medical image processing tasks such as segmentation often require capturing
non-local information. As organs, bones, and tissues share common
characteristics such as intensity, shape, and texture, the contextual
information plays a critical role in correctly labeling them. Segmentation and
labeling are now typically done with convolutional neural networks (CNNs), but
the context of the CNN is limited by the receptive field which itself is
limited by memory requirements and other properties. In this paper, we propose
a new attention module, which we call the Permutohedral Attention Module (PAM), to
efficiently capture non-local characteristics of the image. The proposed method
is both memory and computationally efficient. We provide a GPU implementation
of this module suitable for 3D medical imaging problems. We demonstrate the
efficiency and scalability of our module with the challenging task of vertebrae
segmentation and labeling where context plays a crucial role because of the
very similar appearance of different vertebrae.
Comment: Accepted at MICCAI-201
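To make the memory bottleneck that motivates PAM concrete, here is a minimal sketch of dense non-local attention, the quadratic-cost operation that the module approximates with permutohedral lattice filtering. This is our own illustration (shared query/key/value projections for simplicity), not the paper's implementation:

```python
import numpy as np

def nonlocal_attention(features):
    """Dense non-local attention over N feature vectors of shape (N, C).

    The N x N pairwise weight matrix built here is what becomes
    prohibitively large for 3D medical volumes, where N is the number of
    voxels -- the cost PAM avoids by filtering on a sparse lattice.
    """
    q = k = v = features                       # shared projection for simplicity
    logits = q @ k.T / np.sqrt(q.shape[1])     # N x N pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over neighbours
    return weights @ v                         # attention-weighted aggregation
```

For a volume of N voxels the weight matrix alone needs O(N^2) memory, which is why an efficient approximation matters for vertebrae segmentation.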