Learning to Generate 3D Training Data
Human-level visual 3D perception has long been pursued by researchers in computer vision, computer graphics, and robotics. Recent years have seen an emerging line of work that uses synthetic images to train deep networks for single-image 3D perception. Synthetic images rendered by graphics engines are a promising source of training data because they come with perfect 3D ground truth for free. However, the 3D shapes and scenes to be rendered are still largely created manually. Moreover, it is challenging to ensure that synthetic images collected this way help a deep network perform well on real images, because graphics generation pipelines involve numerous design decisions, such as the selection of 3D shapes and the placement of the camera.
In this dissertation, we propose automatic generation pipelines for synthetic data that aim to improve the task performance of a trained network. We explore both supervised and unsupervised directions for the automatic optimization of 3D decisions. For supervised learning, we demonstrate how to optimize 3D parameters so that a trained network generalizes well to real images. We first show that we can construct a purely synthetic 3D shape that achieves state-of-the-art performance on a shape-from-shading benchmark. We then parameterize the generation decisions as a vector and propose a hybrid gradient approach to efficiently optimize the vector toward usefulness for training; this hybrid gradient outperforms classic black-box approaches on a wide selection of 3D perception tasks. For unsupervised learning, we propose a novelty metric for 3D parameter evolution based on deep autoregressive models. We show that, without any extrinsic motivation, the novelty computed from autoregressive models alone is a helpful signal: it consistently encourages a random synthetic generator to produce more useful training data for downstream 3D perception tasks.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163240/1/ydawei_1.pd
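As a rough, self-contained sketch of the novelty idea above: score a candidate 3D-parameter vector by its negative log-likelihood under an autoregressive model fitted to previously generated vectors, so that unlikely (novel) vectors score high. The linear-Gaussian model and the names fit_ar_model and novelty_score below are hypothetical stand-ins for the dissertation's deep autoregressive models, not its actual implementation.

    # Hypothetical sketch: novelty of a parameter vector under a Gaussian
    # autoregressive model fitted to previously generated parameter vectors.
    import numpy as np

    rng = np.random.default_rng(0)

    def fit_ar_model(history):
        """Fit p(x_i | x_<i) as a linear-Gaussian model per coordinate."""
        n, d = history.shape
        models = []
        for i in range(d):
            X = np.hstack([history[:, :i], np.ones((n, 1))])  # past coords + bias
            w, *_ = np.linalg.lstsq(X, history[:, i], rcond=None)
            resid = history[:, i] - X @ w
            models.append((w, max(resid.var(), 1e-6)))
        return models

    def novelty_score(x, models):
        """Negative log-likelihood of x under the AR model: high = novel."""
        nll = 0.0
        for i, (w, var) in enumerate(models):
            mu = np.append(x[:i], 1.0) @ w
            nll += 0.5 * (np.log(2 * np.pi * var) + (x[i] - mu) ** 2 / var)
        return nll

    history = rng.normal(size=(500, 8))   # past 3D-parameter vectors
    models = fit_ar_model(history)
    candidate = rng.normal(size=8)
    print("novelty:", novelty_score(candidate, models))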
Deep Unrolling for Magnetic Resonance Fingerprinting
Magnetic Resonance Fingerprinting (MRF) has emerged as a promising quantitative MR imaging approach. Deep learning methods have been proposed for MRF and have demonstrated improved performance over classical compressed sensing algorithms. However, many of these end-to-end models are physics-free, while consistency of the predictions with respect to the physical forward model is crucial for reliably solving inverse problems. To address this, [1] recently proposed a proximal gradient descent framework that directly incorporates the forward acquisition and Bloch dynamic models within an unrolled learning mechanism. However, [1] evaluated the unrolled model only on synthetic data with Cartesian sampling trajectories. In this paper, as a complement to [1], we investigate other choices of encoders for building the proximal neural network and evaluate the deep unrolling algorithm on real accelerated MRF scans with non-Cartesian k-space sampling trajectories.
Comment: Tech report. arXiv admin note: substantial text overlap with arXiv:2006.1527
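For intuition, the following minimal sketch shows the generic structure of one such unrolled proximal gradient scheme: a data-consistency gradient step through a (here, toy linear) forward operator, followed by a proximal mapping. Soft-thresholding stands in for the learned proximal network; none of the names or choices below reflect the actual MRF model of [1], which involves the Bloch dynamics and non-Cartesian sampling.

    # Minimal sketch of an unrolled proximal gradient scheme: gradient step
    # on the data-consistency term, then a proximal mapping. A is a toy
    # linear acquisition model; soft-thresholding is a placeholder for the
    # learned proximal network. All names are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 64, 128
    A = rng.normal(size=(m, n)) / np.sqrt(m)    # toy forward model
    x_true = np.zeros(n)
    x_true[rng.choice(n, 10, replace=False)] = 1.0
    y = A @ x_true                               # measurements

    def prox_step(z, tau):
        """Placeholder for the learned proximal network D_theta."""
        return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

    eta = 1.0 / np.linalg.norm(A, 2) ** 2        # step size from Lipschitz bound
    x = np.zeros(n)
    for _ in range(10):                          # 10 unrolled iterations
        x = prox_step(x - eta * A.T @ (A @ x - y), tau=0.05)

    print("reconstruction error:", np.linalg.norm(x - x_true))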
Deep Decomposition Learning for Inverse Imaging Problems
Deep learning is emerging as a new paradigm for solving inverse imaging problems. However, deep learning methods often lack the guarantees of traditional physics-based methods because physical information is not taken into account when the networks are trained and deployed. Appropriate supervision and explicit calibration with information from the physical model can enhance neural network learning and its practical performance. In this paper, inspired by the fact that the data can be decomposed into two components, one in the null-space of the forward operator and one in the range of its pseudo-inverse, we train neural networks to learn the two components and thereby learn the decomposition; that is, we explicitly reformulate the neural network layers as learning range-nullspace decomposition functions with reference to the layer inputs, instead of learning unreferenced functions. We empirically show that the proposed framework demonstrates superior performance over recent deep residual learning, unrolled learning, and nullspace learning on tasks including compressive sensing medical imaging and natural image super-resolution. Our code is available at https://github.com/edongdongchen/DDN.
Comment: To appear in ECCV 202
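To make the decomposition concrete, the sketch below assumes a linear forward operator A and writes an estimate as a fixed range-space part A^+ y plus a learned null-space part (I - A^+ A) g(.), where g is a placeholder for the network; any estimate of this form is data-consistent by construction. The code is an illustrative toy, not the DDN implementation.

    # Range-nullspace decomposition sketch for a linear inverse problem:
    # x_hat = A^+ y + (I - A^+ A) g(x), with g standing in for the network.
    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 32, 64
    A = rng.normal(size=(m, n))
    A_pinv = np.linalg.pinv(A)                   # pseudo-inverse A^+
    x_true = rng.normal(size=n)
    y = A @ x_true

    def g(x):
        """Placeholder for the learned component (identity here)."""
        return x

    x_range = A_pinv @ y                         # range part, fixed by the data
    x_null = (np.eye(n) - A_pinv @ A) @ g(x_range)  # null-space part, learned
    x_hat = x_range + x_null

    # By construction the estimate is data-consistent: A @ x_hat == y.
    print("data consistency residual:", np.linalg.norm(A @ x_hat - y))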
Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing
Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience and remote sensing (RS). In the past decade, enormous effort has gone into processing and analyzing these hyperspectral (HS) products, mainly by seasoned experts. However, with the ever-growing volume of data, the cost in manpower and material resources poses new challenges for reducing the burden of manual labor and improving efficiency. It is therefore urgent to develop more intelligent and automatic approaches for various HS RS applications. Machine learning (ML) tools with convex optimization have successfully undertaken numerous artificial intelligence (AI)-related tasks. However, their ability to handle complex practical problems remains limited, particularly for HS data, owing to the various spectral variabilities introduced in the HS imaging process and to the complexity and redundancy of high-dimensional HS signals. Compared to convex models, non-convex modeling, which can characterize more complex real scenes and provide model interpretability both technically and theoretically, has proven to be a feasible way to narrow the gap between challenging HS vision tasks and currently advanced intelligent data processing models.
Image Processing and Machine Learning for Hyperspectral Unmixing: An Overview and the HySUPP Python Package
Spectral pixels are often a mixture of the pure spectra of the materials in the scene, called endmembers, owing to the low spatial resolution of hyperspectral sensors, double scattering, and intimate mixtures of materials. Unmixing estimates the fractional abundances of the endmembers within each pixel. Depending on the prior knowledge of the endmembers, linear unmixing can be divided into three main groups: supervised, semi-supervised, and unsupervised (blind) linear unmixing. Advances in image processing and machine learning have substantially influenced unmixing. This paper provides an overview of advanced and conventional unmixing approaches and draws a critical comparison between techniques from the three categories. We compare the performance of the unmixing techniques on three simulated and two real datasets; the experimental results reveal the advantages of the different unmixing categories in different unmixing scenarios. Moreover, we provide an open-source Python-based package, available at https://github.com/BehnoodRasti/HySUPP, to reproduce the results.
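As a concrete instance of the supervised (known-endmember) setting described above, the sketch below solves fully constrained least squares unmixing, nonnegative abundances that sum to one, by appending a weighted sum-to-one row to a nonnegative least squares system. It illustrates the linear mixing model only and is not HySUPP's API.

    # Fully constrained least squares unmixing of one pixel: y ~= E a,
    # with a >= 0 and sum(a) = 1, via an augmented NNLS system.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(2)
    bands, n_end = 100, 3
    E = rng.uniform(0.0, 1.0, size=(bands, n_end))   # endmember spectra
    a_true = np.array([0.6, 0.3, 0.1])               # true abundances
    y = E @ a_true + 0.01 * rng.normal(size=bands)   # observed mixed pixel

    delta = 10.0                                     # weight on sum-to-one row
    E_aug = np.vstack([E, delta * np.ones((1, n_end))])
    y_aug = np.append(y, delta)
    a_hat, _ = nnls(E_aug, y_aug)

    print("estimated abundances:", np.round(a_hat, 3))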