Numerical Coordinate Regression with Convolutional Neural Networks
We study deep learning approaches to inferring numerical coordinates for
points of interest in an input image. Existing convolutional neural
network-based solutions to this problem either take a heatmap matching approach
or regress to coordinates with a fully connected output layer. Neither of these
approaches is ideal, since the former is not entirely differentiable, and the
latter lacks inherent spatial generalization. We propose our differentiable
spatial to numerical transform (DSNT) to fill this gap. The DSNT layer adds no
trainable parameters, is fully differentiable, and exhibits good spatial
generalization. Unlike heatmap matching, DSNT works well with low heatmap
resolutions, so it can be dropped in as an output layer for a wide range of
existing fully convolutional architectures. Consequently, DSNT offers a better
trade-off between inference speed and prediction accuracy compared to existing
techniques. When used to replace the popular heatmap matching approach used in
almost all state-of-the-art methods for pose estimation, DSNT gives better
prediction accuracy for all model architectures tested.
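The abstract above does not spell out the transform itself, so the following is only a minimal sketch of one way a differentiable, parameter-free coordinate layer can be built: normalise the heatmap and take the expected pixel position. The function name `dsnt`, the softmax normalisation, and the (-1, 1) coordinate convention are assumptions made for illustration, not details confirmed by this listing.

```python
import math


def dsnt(heatmap):
    """Return the expected (x, y) location of a 2D score map in (-1, 1) coordinates.

    Illustrative sketch only: softmax normalisation is an assumed choice.
    """
    h, w = len(heatmap), len(heatmap[0])
    # Softmax over all pixels so the weights are positive and sum to 1.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Pixel centres mapped to (-1, 1): cell j of n sits at (2j + 1) / n - 1.
    xs = [(2 * j + 1) / w - 1 for j in range(w)]
    ys = [(2 * i + 1) / h - 1 for i in range(h)]
    # Expected coordinate under the normalised heatmap; no trainable parameters.
    x = sum(weights[i][j] * xs[j] for i in range(h) for j in range(w)) / total
    y = sum(weights[i][j] * ys[i] for i in range(h) for j in range(w)) / total
    return x, y


if __name__ == "__main__":
    # A 3x3 map with a strong response in the top-right cell.
    print(dsnt([[0.0, 0.0, 50.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]))
```

Because the output is a weighted average rather than an argmax, it varies smoothly with the heatmap values, which is what allows gradients to flow back through the coordinates even at low heatmap resolution.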
PFCNN: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames
Surface meshes are widely used shape representations and capture finer
geometry data than point clouds or volumetric grids, but applying CNNs to them
directly is challenging because of their non-Euclidean structure. We use parallel
frames on surfaces to define PFCNNs that enable effective feature learning on
surface meshes by mimicking standard convolutions faithfully. In particular,
the convolution of PFCNN not only maps local surface patches onto flat tangent
planes, but also aligns the tangent planes such that they locally form a flat
Euclidean structure, thus enabling recovery of standard convolutions. The
alignment is achieved by the tool of locally flat connections borrowed from
discrete differential geometry, which can be efficiently encoded and computed
by parallel frame fields. In addition, the lack of a canonical axis on a surface is
handled by sampling with the frame directions. Experiments show that for tasks
including classification, segmentation and registration on deformable geometric
domains, as well as semantic scene segmentation on rigid domains, PFCNNs
achieve more robust and superior performance than state-of-the-art
surface-based CNNs, without using sophisticated input features.
Comment: 15 pages, 18 figures. CVPR 2020. Project page:
https://haopan.github.io/surfacecnn.htm
Convolutional Neural Network for Transition Modeling Based on Linear Stability Theory
Transition prediction is an important aspect of aerodynamic design because of
its impact on skin friction and potential coupling with flow separation
characteristics. Traditionally, the modeling of transition has relied on
correlation-based empirical formulas based on integral quantities such as the
shape factor of the boundary layer. However, in many applications of
computational fluid dynamics, the shape factor is not straightforwardly
available or not well-defined. We propose using the complete velocity profile
along with other quantities (e.g., frequency, Reynolds number) to predict the
perturbation amplification factor. While this can be achieved with regression
models based on a classical fully connected neural network, such a model can be
computationally more demanding. We propose a novel convolutional neural network
inspired by the underlying physics as described by the stability equations.
Specifically, convolutional layers are first used to extract integral
quantities from the velocity profiles, and then fully connected layers are used
to map the extracted integral quantities, along with frequency and Reynolds
number, to the output (amplification ratio). Numerical tests on classical
boundary layers clearly demonstrate the merits of the proposed method. More
importantly, we demonstrate that, for Tollmien-Schlichting instabilities in
two-dimensional, low-speed boundary layers, the proposed network encodes
information in the boundary layer profiles into an integral quantity that is
strongly correlated to a well-known, physically defined parameter -- the shape
factor.
Comment: 15 pages, 7 figures, submitted to Physical Review Fluids journal
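For readers unfamiliar with the shape factor mentioned in this abstract, the sketch below computes it from a sampled velocity profile using the standard definitions: displacement thickness, momentum thickness, and their ratio. The function name `shape_factor`, the trapezoidal integration, and the assumption that `u` is already normalised by the edge velocity are illustrative choices, not part of the paper's network.

```python
def shape_factor(y, u):
    """Shape factor H = delta* / theta for a profile u(y), with u scaled by the edge velocity.

    y: wall-normal sample positions (increasing); u: normalised velocities at those positions.
    """
    def trapz(f):
        # Trapezoidal rule on the (possibly non-uniform) grid y.
        return sum(0.5 * (f[k] + f[k + 1]) * (y[k + 1] - y[k]) for k in range(len(y) - 1))

    delta_star = trapz([1.0 - v for v in u])   # displacement thickness
    theta = trapz([v * (1.0 - v) for v in u])  # momentum thickness
    return delta_star / theta


if __name__ == "__main__":
    # Linear profile u = y on [0, 1]; analytically delta* = 1/2, theta = 1/6, so H = 3.
    n = 1000
    y = [k / n for k in range(n + 1)]
    print(shape_factor(y, y))
```

Both thicknesses are integrals over the whole profile, which is why the abstract describes the shape factor as an integral quantity that convolutional layers could plausibly learn to extract.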
Deep Learning Seismic Substructure Detection using the Frozen Gaussian Approximation
We propose a deep learning algorithm for seismic interface and pocket
detection with neural networks trained by synthetic high-frequency displacement
data efficiently generated by the frozen Gaussian approximation (FGA). In
seismic imaging, high-frequency data is advantageous since it can provide high
resolution of substructures. However, generation of sufficient synthetic
high-frequency data sets for training neural networks is computationally
challenging. This bottleneck is overcome by a highly scalable computational
platform built upon the FGA, which comes from the semiclassical theory and
approximates the wavefields by a sum of fixed-width (frozen) Gaussian wave
packets. Data is generated from a forward simulation of the elastic wave
equation using the FGA. This data contains accurate traveltime information
(from the ray path) but not exact amplitude information (with asymptotic errors
not shrinking to zero even at extremely fine numerical resolution). Using this
data, we build convolutional neural network models with an open-source API,
GeoSeg, developed using Keras and TensorFlow. On a simple model, networks,
despite only being trained on FGA data, can detect an interface with a high
success rate from displacement data generated by the spectral element method.
Benchmark tests are done for P-waves (acoustic) and P- and S-waves (elastic)
generated using the FGA and a spectral element method. Further, results with a
high accuracy are shown for more complicated geometries, including a
three-layered model and a 2D-pocket model, where the neural networks are
trained on both clean and noisy data.
3D Human Pose Estimation with 2D Marginal Heatmaps
Automatically determining three-dimensional human pose from monocular RGB
image data is a challenging problem. The two-dimensional nature of the input
results in intrinsic ambiguities which make inferring depth particularly
difficult. Recently, researchers have demonstrated that the flexible
statistical modelling capabilities of deep neural networks are sufficient to
make such inferences with reasonable accuracy. However, many of these models
use coordinate output techniques which are memory-intensive, not
differentiable, and/or do not spatially generalise well. We propose
improvements to 3D coordinate prediction which avoid the aforementioned
undesirable traits by predicting 2D marginal heatmaps under an augmented
soft-argmax scheme. Our resulting model, MargiPose, produces visually coherent
heatmaps whilst maintaining differentiability. We are also able to achieve
state-of-the-art accuracy on publicly available 3D human pose estimation data.
Comment: Accepted in WACV 201
Empirical study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models
PROXTONE is a novel and fast method for optimizing large-scale
non-smooth convex problems \cite{shi2015large}. In this work, we apply the
PROXTONE method to large-scale non-smooth non-convex problems,
for example training sparse deep neural networks (sparse DNNs) or sparse
convolutional neural networks (sparse CNNs) for embedded or mobile devices.
PROXTONE converges much faster than first-order methods, while first-order
methods make it easy to derive and control the sparseness of the solutions.
Thus, in some applications, in order to train sparse models quickly, we propose to
combine the merits of both methods: we use PROXTONE in the first
several epochs to reach the neighborhood of an optimal solution, and then use
a first-order method to explore the possibility of sparsity in the following
training. We call this method PROXTONE plus (PROXTONE+). Both PROXTONE and
PROXTONE+ are tested in our experiments, which demonstrate that both methods
improve convergence speed by at least a factor of two on diverse sparse model
learning problems, while at the same time reducing the size of DNN
models to 0.5%. The source of all the algorithms is available upon request.
Comment: arXiv admin note: text overlap with arXiv:1311.2115 by other authors
Gradient Sparsification for Communication-Efficient Distributed Optimization
Modern large scale machine learning applications require stochastic
optimization algorithms to be implemented on distributed computational
architectures. A key bottleneck is the communication overhead for exchanging
information such as stochastic gradients among different workers. In this
paper, to reduce the communication cost we propose a convex optimization
formulation to minimize the coding length of stochastic gradients. To solve the
optimal sparsification problem efficiently, we propose several simple and fast
algorithms that find approximate solutions, with theoretical guarantees on sparseness.
Experiments on regularized logistic regression, support vector
machines, and convolutional neural networks validate our sparsification
approaches.
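As a rough illustration of the idea of gradient sparsification, the sketch below drops coordinates at random and rescales the survivors so that the sparse vector remains an unbiased estimate of the original gradient. The magnitude-proportional keep probabilities, the `keep_fraction` parameter, and the function name `sparsify` are assumptions made for illustration; the paper's convex formulation and its algorithms are not reproduced here.

```python
import random


def sparsify(grad, keep_fraction, rng):
    """Randomly zero gradient entries, rescaling kept entries so the result is unbiased.

    Keep probabilities proportional to |g_i| (capped at 1) are a heuristic assumption.
    """
    total = sum(abs(g) for g in grad)
    if total == 0:
        return list(grad)
    budget = keep_fraction * len(grad)  # rough target for the number of kept entries
    out = []
    for g in grad:
        p = min(1.0, budget * abs(g) / total)
        if p > 0 and rng.random() < p:
            out.append(g / p)  # E[out_i] = p * (g / p) = g
        else:
            out.append(0.0)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print(sparsify([1.0, -3.0, 0.0, 0.5], keep_fraction=0.5, rng=rng))
```

Only the non-zero entries and their indices would need to be exchanged between workers, which is where the communication saving comes from; the price is extra variance in the gradient estimate.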
Design and Analysis of Machine Learning Exchange-Correlation Functionals via Rotationally Invariant Convolutional Descriptors
In this work we explore the potential of a new data-driven approach to the
design of exchange-correlation (XC) functionals. The approach, inspired by
convolutional filters in computer vision and surrogate functions from
optimization, utilizes convolutions of the electron density to form a feature
space to represent local electronic environments and neural networks to map the
features to the exchange-correlation energy density. These features are orbital
free, and provide a systematic route to including information at various length
scales. This work shows that convolutional descriptors are theoretically
capable of an exact representation of the electron density, and proposes
Maxwell-Cartesian spherical harmonic kernels as a class of rotationally
invariant descriptors for the construction of machine-learned functionals. The
approach is demonstrated using data from the B3LYP functional on a number of
small molecules containing C, H, O, and N along with a neural network
regression model. The machine-learned functionals are compared to standard
physical approximations and the accuracy is assessed for the absolute energy of
each molecular system as well as formation energies. The results indicate that
it is possible to reproduce B3LYP formation energies to within chemical
accuracy using orbital-free descriptors with a spatial extent of 0.2 Å. The
findings provide empirical insight into the spatial range of electron exchange,
and suggest that the combination of convolutional descriptors and
machine-learning regression models is a promising new framework for XC
functional design, although challenges remain in obtaining training data and
generating models consistent with pseudopotentials.
A Selective Overview of Deep Learning
Deep learning has arguably achieved tremendous success in recent years. In
simple words, deep learning uses the composition of many nonlinear functions to
model the complex dependency between input features and labels. While neural
networks have a long history, recent advances have greatly improved their
performance in computer vision, natural language processing, etc. From the
statistical and scientific perspective, it is natural to ask: What is deep
learning? What are the new characteristics of deep learning, compared with
classical methods? What are the theoretical foundations of deep learning? To
answer these questions, we introduce common neural network models (e.g.,
convolutional neural nets, recurrent neural nets, generative adversarial nets)
and training techniques (e.g., stochastic gradient descent, dropout, batch
normalization) from a statistical point of view. Along the way, we highlight
new characteristics of deep learning (including depth and over-parametrization)
and explain their practical and theoretical benefits. We also sample recent
results on theories of deep learning, many of which are only suggestive. While
a complete understanding of deep learning remains elusive, we hope that our
perspectives and discussions serve as a stimulus for new statistical research.
Leveraging Heteroscedastic Aleatoric Uncertainties for Robust Real-Time LiDAR 3D Object Detection
We present a robust real-time LiDAR 3D object detector that leverages
heteroscedastic aleatoric uncertainties to significantly improve its detection
performance. A multi-loss function is designed to incorporate uncertainty
estimations predicted by auxiliary output layers. Using our proposed method,
the network learns to ignore noisy samples and focuses more on
informative ones. We validate our method on the KITTI object detection
benchmark. Our method surpasses the baseline method which does not explicitly
estimate uncertainties by up to nearly 9% in terms of Average Precision (AP).
It also produces state-of-the-art results compared to other methods while
running with an inference time of only 72 ms. In addition, we conduct extensive
experiments to understand how aleatoric uncertainties behave. Extracting
aleatoric uncertainties brings almost no additional computation cost during the
deployment, making our method highly desirable for autonomous driving
applications.
Comment: 30th IEEE Intelligent Vehicles Symposium
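To make the role of a predicted uncertainty in the loss more concrete, here is a minimal sketch of a commonly used attenuated regression loss, in which an auxiliary output predicts a log-variance per sample. This particular form and the name `attenuated_l2` are assumptions for illustration; the abstract does not state the exact multi-loss function used by the detector.

```python
import math


def attenuated_l2(pred, target, log_var):
    """Squared error down-weighted by a predicted log-variance, plus a penalty on that variance.

    Assumed form: 0.5 * exp(-s) * (pred - target)^2 + 0.5 * s, with s = log(sigma^2).
    """
    return 0.5 * math.exp(-log_var) * (pred - target) ** 2 + 0.5 * log_var


if __name__ == "__main__":
    # A sample with a large residual: predicting a larger variance lowers its loss,
    # so noisy samples contribute less to training.
    for s in (0.0, 1.0, 2.0):
        print(s, attenuated_l2(4.0, 1.0, s))
```

The second term stops the network from declaring every sample uncertain: for a fixed residual r, the loss is minimised when the predicted variance equals r squared.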