7,637 research outputs found

    Numerical Coordinate Regression with Convolutional Neural Networks

    We study deep learning approaches to inferring numerical coordinates for points of interest in an input image. Existing convolutional neural network-based solutions to this problem either take a heatmap matching approach or regress to coordinates with a fully connected output layer. Neither of these approaches is ideal, since the former is not entirely differentiable, and the latter lacks inherent spatial generalization. We propose our differentiable spatial to numerical transform (DSNT) to fill this gap. The DSNT layer adds no trainable parameters, is fully differentiable, and exhibits good spatial generalization. Unlike heatmap matching, DSNT works well with low heatmap resolutions, so it can be dropped in as an output layer for a wide range of existing fully convolutional architectures. Consequently, DSNT offers a better trade-off between inference speed and prediction accuracy compared to existing techniques. When used to replace the popular heatmap matching approach used in almost all state-of-the-art methods for pose estimation, DSNT gives better prediction accuracy for all model architectures tested.
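
As a rough illustration of the idea in the abstract above, a parameter-free, differentiable coordinate layer can be sketched as the expected pixel coordinate under a normalized heatmap. This is a minimal NumPy sketch, not the authors' implementation: the function name `soft_argmax_2d`, the softmax normalization, and the coordinate convention (pixel centres mapped into (-1, 1)) are assumptions made for the example.

```python
import numpy as np

def soft_argmax_2d(logits):
    """Return (x, y) as the expected coordinate under a softmax-normalized heatmap.

    Hypothetical helper for illustration; normalization and coordinate
    range are assumptions, not taken from the listed paper.
    """
    h, w = logits.shape
    # Softmax over all pixels turns the raw map into a probability distribution.
    z = logits - logits.max()
    p = np.exp(z)
    p /= p.sum()
    # Pixel-centre coordinates scaled into the open interval (-1, 1).
    xs = (2.0 * np.arange(1, w + 1) - (w + 1)) / w
    ys = (2.0 * np.arange(1, h + 1) - (h + 1)) / h
    # Expectation of each coordinate under the marginal distributions.
    x = float((p.sum(axis=0) * xs).sum())
    y = float((p.sum(axis=1) * ys).sum())
    return x, y

# A sharply peaked 4x4 heatmap: the output approaches the peak's coordinates.
heat = np.zeros((4, 4))
heat[1, 3] = 50.0
peak_x, peak_y = soft_argmax_2d(heat)
```

Because the output is a weighted average rather than a hard argmax, it varies smoothly with the heatmap values, which is what lets gradients flow back through the layer even at low heatmap resolutions.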

    PFCNN: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames

    Surface meshes are widely used shape representations and capture finer geometry data than point clouds or volumetric grids, but applying CNNs to them directly is challenging because of their non-Euclidean structure. We use parallel frames on surfaces to define PFCNNs that enable effective feature learning on surface meshes by mimicking standard convolutions faithfully. In particular, the convolution of PFCNN not only maps local surface patches onto flat tangent planes, but also aligns the tangent planes such that they locally form a flat Euclidean structure, thus enabling recovery of standard convolutions. The alignment is achieved with locally flat connections borrowed from discrete differential geometry, which can be efficiently encoded and computed by parallel frame fields. In addition, the lack of a canonical axis on a surface is handled by sampling with the frame directions. Experiments show that for tasks including classification, segmentation and registration on deformable geometric domains, as well as semantic scene segmentation on rigid domains, PFCNNs perform robustly and outperform state-of-the-art surface-based CNNs without using sophisticated input features.
    Comment: 15 pages, 18 figures. CVPR 2020. Project page: https://haopan.github.io/surfacecnn.htm

    Convolutional Neural Network for Transition Modeling Based on Linear Stability Theory

    Transition prediction is an important aspect of aerodynamic design because of its impact on skin friction and potential coupling with flow separation characteristics. Traditionally, the modeling of transition has relied on correlation-based empirical formulas based on integral quantities such as the shape factor of the boundary layer. However, in many applications of computational fluid dynamics, the shape factor is not straightforwardly available or not well-defined. We propose using the complete velocity profile along with other quantities (e.g., frequency, Reynolds number) to predict the perturbation amplification factor. While this can be achieved with regression models based on a classical fully connected neural network, such a model can be computationally more demanding. We propose a novel convolutional neural network inspired by the underlying physics as described by the stability equations. Specifically, convolutional layers are first used to extract integral quantities from the velocity profiles, and then fully connected layers are used to map the extracted integral quantities, along with frequency and Reynolds number, to the output (amplification ratio). Numerical tests on classical boundary layers clearly demonstrate the merits of the proposed method. More importantly, we demonstrate that, for Tollmien-Schlichting instabilities in two-dimensional, low-speed boundary layers, the proposed network encodes information in the boundary layer profiles into an integral quantity that is strongly correlated to a well-known, physically defined parameter: the shape factor.
    Comment: 15 pages, 7 figures, submitted to Physical Review Fluids journal

    Deep Learning Seismic Substructure Detection using the Frozen Gaussian Approximation

    We propose a deep learning algorithm for seismic interface and pocket detection with neural networks trained by synthetic high-frequency displacement data efficiently generated by the frozen Gaussian approximation (FGA). In seismic imaging high-frequency data is advantageous since it can provide high resolution of substructures. However, generation of sufficient synthetic high-frequency data sets for training neural networks is computationally challenging. This bottleneck is overcome by a highly scalable computational platform built upon the FGA, which comes from the semiclassical theory and approximates the wavefields by a sum of fixed-width (frozen) Gaussian wave packets. Data is generated from a forward simulation of the elastic wave equation using the FGA. This data contains accurate traveltime information (from the ray path) but not exact amplitude information (with asymptotic errors not shrinking to zero even at extremely fine numerical resolution). Using this data we build convolutional neural network models using an open source API, GeoSeg, developed using Keras and Tensorflow. On a simple model, networks, despite only being trained on FGA data, can detect an interface with a high success rate from displacement data generated by the spectral element method. Benchmark tests are done for P-waves (acoustic) and P- and S-waves (elastic) generated using the FGA and a spectral element method. Further, results with a high accuracy are shown for more complicated geometries including a three layered model, and a 2D-pocket model where the neural networks trained by both clean and noisy data

    3D Human Pose Estimation with 2D Marginal Heatmaps

    Automatically determining three-dimensional human pose from monocular RGB image data is a challenging problem. The two-dimensional nature of the input results in intrinsic ambiguities which make inferring depth particularly difficult. Recently, researchers have demonstrated that the flexible statistical modelling capabilities of deep neural networks are sufficient to make such inferences with reasonable accuracy. However, many of these models use coordinate output techniques which are memory-intensive, not differentiable, and/or do not spatially generalise well. We propose improvements to 3D coordinate prediction which avoid the aforementioned undesirable traits by predicting 2D marginal heatmaps under an augmented soft-argmax scheme. Our resulting model, MargiPose, produces visually coherent heatmaps whilst maintaining differentiability. We are also able to achieve state-of-the-art accuracy on publicly available 3D human pose estimation data.
    Comment: Accepted in WACV 201

    Empirical study of PROXTONE and PROXTONE$^+$ for Fast Learning of Large Scale Sparse Models

    PROXTONE is a novel and fast method for optimizing large-scale non-smooth convex problems \cite{shi2015large}. In this work, we apply the PROXTONE method to large-scale \emph{non-smooth non-convex} problems, for example training a sparse deep neural network (sparse DNN) or sparse convolutional neural network (sparse CNN) for embedded or mobile devices. PROXTONE converges much faster than first-order methods, while first-order methods make it easy to derive and control the sparseness of the solutions. Thus, in some applications, in order to train sparse models quickly, we propose to combine the merits of both methods: we use PROXTONE in the first several epochs to reach the neighborhood of an optimal solution, and then use the first-order method to explore the possibility of sparsity in the remaining training. We call this method PROXTONE plus (PROXTONE$^+$). Both PROXTONE and PROXTONE$^+$ are tested in our experiments, which demonstrate that both methods improve convergence speed by at least a factor of two on diverse sparse model learning problems, and at the same time reduce the size to 0.5\% for DNN models. The source of all the algorithms is available upon request.
    Comment: arXiv admin note: text overlap with arXiv:1311.2115 by other authors

    Gradient Sparsification for Communication-Efficient Distributed Optimization

    Modern large scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information such as stochastic gradients among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding length of stochastic gradients. To solve the optimal sparsification efficiently, several simple and fast algorithms are proposed for approximate solution, with theoretical guarantees for sparseness. Experiments on $\ell_2$ regularized logistic regression, support vector machines, and convolutional neural networks validate our sparsification approaches.
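
To make the notion of sparsifying a stochastic gradient concrete, the sketch below drops coordinates at random and rescales the survivors so that the sparse vector remains an unbiased estimate of the original gradient. It is only an illustration under stated assumptions: the function `sparsify`, the `keep_budget` parameter, and the magnitude-proportional keep probabilities are hypothetical choices, and the abstract's convex formulation for choosing optimal probabilities is not reproduced here.

```python
import numpy as np

def sparsify(grad, keep_budget, rng):
    """Randomly drop coordinates of grad and rescale survivors to stay unbiased.

    Hypothetical helper: keep_budget roughly sets the expected number of
    kept coordinates; it is an assumption of this sketch.
    """
    mag = np.abs(grad)
    total = mag.sum()
    if total == 0.0:
        return np.zeros_like(grad), np.zeros_like(grad)
    # Keep probability grows with magnitude and is capped at 1 (illustrative choice).
    p = np.minimum(1.0, keep_budget * mag / total)
    keep = rng.random(grad.shape) < p
    out = np.zeros_like(grad)
    # Dividing by p makes the expectation of out equal grad wherever p > 0.
    out[keep] = grad[keep] / p[keep]
    return out, p

rng = np.random.default_rng(0)
g = np.array([1.0, -0.5, 0.1, 0.0])
sparse_g, probs = sparsify(g, 2.0, rng)
```

Only the nonzero entries of `sparse_g` (and their indices) would need to be communicated between workers, which is where the saving in coding length comes from.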

    Design and Analysis of Machine Learning Exchange-Correlation Functionals via Rotationally Invariant Convolutional Descriptors

    In this work we explore the potential of a new data-driven approach to the design of exchange-correlation (XC) functionals. The approach, inspired by convolutional filters in computer vision and surrogate functions from optimization, utilizes convolutions of the electron density to form a feature space to represent local electronic environments and neural networks to map the features to the exchange-correlation energy density. These features are orbital free, and provide a systematic route to including information at various length scales. This work shows that convolutional descriptors are theoretically capable of an exact representation of the electron density, and proposes Maxwell-Cartesian spherical harmonic kernels as a class of rotationally invariant descriptors for the construction of machine-learned functionals. The approach is demonstrated using data from the B3LYP functional on a number of small-molecules containing C, H, O, and N along with a neural network regression model. The machine-learned functionals are compared to standard physical approximations and the accuracy is assessed for the absolute energy of each molecular system as well as formation energies. The results indicate that it is possible to reproduce B3LYP formation energies to within chemical accuracy using orbital-free descriptors with a spatial extent of 0.2 A. The findings provide empirical insight into the spatial range of electron exchange, and suggest that the combination of convolutional descriptors and machine-learning regression models is a promising new framework for XC functional design, although challenges remain in obtaining training data and generating models consistent with pseudopotentials

    A Selective Overview of Deep Learning

    Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a long history, recent advances have greatly improved their performance in computer vision, natural language processing, etc. From the statistical and scientific perspective, it is natural to ask: What is deep learning? What are the new characteristics of deep learning, compared with classical methods? What are the theoretical foundations of deep learning? To answer these questions, we introduce common neural network models (e.g., convolutional neural nets, recurrent neural nets, generative adversarial nets) and training techniques (e.g., stochastic gradient descent, dropout, batch normalization) from a statistical point of view. Along the way, we highlight new characteristics of deep learning (including depth and over-parametrization) and explain their practical and theoretical benefits. We also sample recent results on theories of deep learning, many of which are only suggestive. While a complete understanding of deep learning remains elusive, we hope that our perspectives and discussions serve as a stimulus for new statistical research

    Leveraging Heteroscedastic Aleatoric Uncertainties for Robust Real-Time LiDAR 3D Object Detection

    We present a robust real-time LiDAR 3D object detector that leverages heteroscedastic aleatoric uncertainties to significantly improve its detection performance. A multi-loss function is designed to incorporate uncertainty estimations predicted by auxiliary output layers. Using our proposed method, the network learns to ignore noisy samples and focuses more on informative ones. We validate our method on the KITTI object detection benchmark. Our method surpasses the baseline method, which does not explicitly estimate uncertainties, by up to nearly 9% in terms of Average Precision (AP). It also produces state-of-the-art results compared to other methods while running with an inference time of only 72 ms. In addition, we conduct extensive experiments to understand how aleatoric uncertainties behave. Extracting aleatoric uncertainties brings almost no additional computation cost during deployment, making our method highly desirable for autonomous driving applications.
    Comment: 30th IEEE Intelligent Vehicles Symposium
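
One common way to let a network down-weight noisy samples through a predicted uncertainty is a regression loss attenuated by a predicted log-variance. The sketch below shows that generic formulation; it is an assumption for illustration and not necessarily the multi-loss function used in the paper listed above. The name `attenuated_l2` and its arguments are hypothetical.

```python
import numpy as np

def attenuated_l2(pred, target, log_var):
    """Squared error weighted by predicted precision, plus a log-variance penalty.

    Generic heteroscedastic regression loss (an assumption of this sketch):
    0.5 * exp(-s) * (pred - target)^2 + 0.5 * s, with s the predicted log-variance.
    """
    sq_err = (np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)) ** 2
    log_var = np.asarray(log_var, dtype=float)
    # A large predicted variance shrinks the error term but pays a penalty of 0.5 * s.
    return 0.5 * np.exp(-log_var) * sq_err + 0.5 * log_var

# For a sample with a large error, predicting higher variance lowers the loss.
loss_confident = attenuated_l2(4.0, 0.0, 0.0)
loss_uncertain = attenuated_l2(4.0, 0.0, 2.0)
```

The penalty term keeps the network from declaring every sample uncertain: for a small error, raising the predicted variance increases the loss instead of lowering it.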