Robust Depth Estimation from Auto Bracketed Images
As demand for advanced photographic applications on hand-held devices grows,
these devices increasingly require high-quality depth capture. However, under
low-light conditions, most devices still suffer from low imaging quality and
inaccurate depth acquisition. To address the problem, we present a robust depth
estimation method from a short burst shot with varied intensity (i.e., Auto
Bracketing) or strong noise (i.e., High ISO). We introduce a geometric
transformation between flow and depth tailored for burst images, enabling our
learning-based multi-view stereo matching to be performed effectively. We then
describe our depth estimation pipeline that incorporates the geometric
transformation into our residual-flow network. It allows our framework to
produce an accurate depth map even with a bracketed image sequence. We
demonstrate that our method outperforms state-of-the-art methods for various
datasets captured by a smartphone and a DSLR camera. Moreover, we show that the
estimated depth is applicable for image quality enhancement and photographic
editing.
Comment: To appear in CVPR 2018; 9 pages.
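The abstract does not spell out the geometric transformation between flow and depth; the sketch below illustrates the basic triangulation idea under the strong simplifying assumption of a purely translational camera motion between burst frames. The function name and the simplifications are ours, not the paper's.

```python
import numpy as np

def flow_to_depth(flow, K, t):
    """Recover per-pixel depth from optical flow between two burst frames,
    assuming purely translational camera motion (a simplified stand-in for
    the paper's flow-to-depth transformation).

    flow : (H, W, 2) optical flow in pixels
    K    : (3, 3) camera intrinsics
    t    : (3,) inter-frame camera translation in metres
    """
    # For lateral motion, pixel displacement is approximately f * |t| / Z,
    # so depth follows as Z = f * |t| / |flow|.
    f = 0.5 * (K[0, 0] + K[1, 1])          # mean focal length in pixels
    baseline = np.linalg.norm(t)
    mag = np.linalg.norm(flow, axis=2)
    return f * baseline / np.maximum(mag, 1e-6)
```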
Computational ghost imaging using deep learning
Computational ghost imaging (CGI) is a single-pixel imaging technique that
exploits the correlation between known random patterns and the measured
intensity of light transmitted (or reflected) by an object. Although CGI can
obtain two- or three-dimensional images with a single or a few bucket
detectors, the quality of the reconstructed images is degraded by noise
inherent to reconstructing images from random patterns. In this study, we improve
the quality of CGI images using deep learning. A deep neural network is used to
automatically learn the features of noise-contaminated CGI images. After
training, the network is able to predict low-noise images from new
noise-contaminated CGI images.
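A minimal sketch of the two stages described here: a conventional correlation-based CGI reconstruction followed by a small denoising CNN. The network shape is illustrative; the paper's architecture is not given in the abstract.

```python
import numpy as np
import torch.nn as nn

def cgi_reconstruct(patterns, bucket):
    """Conventional CGI estimate: correlate the bucket-detector signal with
    the known random patterns.  patterns: (M, H, W), bucket: (M,)."""
    return np.tensordot(bucket - bucket.mean(),
                        patterns - patterns.mean(axis=0), axes=1) / len(bucket)

class CGIDenoiser(nn.Module):
    """Small CNN trained on (noisy CGI reconstruction, ground truth) pairs
    to predict low-noise images from new noisy reconstructions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)
```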
Classification-based Financial Markets Prediction using Deep Neural Networks
Deep neural networks (DNNs) are powerful types of artificial neural networks
(ANNs) that use several hidden layers. They have recently gained considerable
attention in the speech transcription and image recognition community
(Krizhevsky et al., 2012) for their superior predictive properties including
robustness to overfitting. However, their application to algorithmic trading has
not been previously researched, partly because of their computational
complexity. This paper describes the application of DNNs to predicting
financial market movement directions. In particular, we describe the
configuration and training approach and then demonstrate their application to
backtesting a simple trading strategy over 43 different commodity and FX
futures mid-prices at 5-minute intervals. All results in this paper are
generated using a C++ implementation on the Intel Xeon Phi co-processor, which
is 11.4x faster than the serial version, and a Python strategy-backtesting
environment, both of which are available as open-source code written by the
authors.
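The abstract leaves the network configuration unspecified (and the authors' implementation is in C++); the PyTorch sketch below shows the general shape of such a direction classifier, with layer widths and dropout chosen for illustration only.

```python
import torch.nn as nn

class MarketDNN(nn.Module):
    """Feed-forward DNN that classifies the direction of the next 5-minute
    mid-price move (down / flat / up) from a window of lagged features.
    Layer widths are illustrative, not the paper's configuration."""
    def __init__(self, n_features, hidden=(512, 256, 128), n_classes=3):
        super().__init__()
        layers, d = [], n_features
        for h in hidden:
            layers += [nn.Linear(d, h), nn.ReLU(), nn.Dropout(0.5)]
            d = h
        layers.append(nn.Linear(d, n_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)   # logits; train with nn.CrossEntropyLoss
```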
CodeX: Bit-Flexible Encoding for Streaming-based FPGA Acceleration of DNNs
This paper proposes CodeX, an end-to-end framework that facilitates encoding,
bitwidth customization, fine-tuning, and implementation of neural networks on
FPGA platforms. CodeX incorporates nonlinear encoding into the computation flow
of neural networks to save memory. The encoded features demand significantly
lower storage compared to the raw full-precision activation values; therefore,
the execution flow of the CodeX hardware engine is performed entirely within the
FPGA using on-chip streaming buffers with no access to the off-chip DRAM. We
further propose a fully-automated algorithm inspired by reinforcement learning
which determines the customized encoding bitwidth across network layers. The
CodeX full-stack framework comprises a compiler that takes a high-level Python
description of an arbitrary neural network architecture. The compiler then
instantiates the corresponding elements from CodeX Hardware library for FPGA
implementation. Proof-of-concept evaluations on MNIST, SVHN, and CIFAR-10
datasets demonstrate an average of 4.65x throughput improvement compared to
stand-alone weight encoding. We further compare CodeX with six existing
full-precision DNN accelerators on ImageNet, showing an average of 3.6x and
2.54x improvement in throughput and performance-per-watt, respectively.
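The abstract does not define the nonlinear encoding itself; one plausible reading is a codebook quantization of activation values, sketched below with a simple 1-D k-means. In the paper the per-layer bitwidth comes from a reinforcement-learning-inspired search; here it is a fixed argument.

```python
import numpy as np

def encode_activations(acts, n_levels):
    """Codebook (nonlinear) encoding: cluster activation values into
    n_levels centroids and store small integer codes instead of
    full-precision floats (n_levels <= 256 here, so codes fit in uint8).
    Decode with centroids[codes]."""
    flat = acts.ravel()
    centroids = np.quantile(flat, np.linspace(0.0, 1.0, n_levels))  # init
    for _ in range(10):                                   # Lloyd iterations
        codes = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(n_levels):
            members = flat[codes == k]
            if members.size:
                centroids[k] = members.mean()
    codes = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return codes.reshape(acts.shape).astype(np.uint8), centroids
```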
Fast Initial Access with Deep Learning for Beam Prediction in 5G mmWave Networks
This paper presents DeepIA, a deep learning solution for faster and more
accurate initial access (IA) in 5G millimeter wave (mmWave) networks when
compared to conventional IA. By utilizing a subset of beams in the IA process,
DeepIA removes the need for an exhaustive beam search, thereby reducing the beam
sweep time in IA. A deep neural network (DNN) is trained to learn the complex
mapping from the received signal strengths (RSSs) collected with a reduced
number of beams to the optimal spatial beam of the receiver (among a larger set
of beams). At test time, DeepIA measures RSSs only from a small number of beams
and runs the DNN to predict the best beam for IA. We show that DeepIA reduces
the IA time by sweeping fewer beams and significantly outperforms conventional
IA in beam prediction accuracy in both line-of-sight (LoS) and
non-line-of-sight (NLoS) mmWave channel conditions.
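A minimal sketch of the mapping DeepIA learns, from RSS values on a reduced beam subset to a score over the full codebook; the layer sizes and beam counts are placeholders, not the paper's.

```python
import torch.nn as nn

class DeepIA(nn.Module):
    """Map RSS measurements from a reduced beam sweep to logits over the
    full beam codebook; the argmax is the predicted best beam for IA."""
    def __init__(self, n_swept_beams=8, n_total_beams=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_swept_beams, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_total_beams),
        )

    def forward(self, rss):
        return self.net(rss)

# At test time: sweep only the subset, then take
# best_beam = model(rss_subset).argmax(dim=-1)
```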
Restructuring Batch Normalization to Accelerate CNN Training
Batch Normalization (BN) has become a core design block of modern
Convolutional Neural Networks (CNNs). A typical modern CNN has a large number
of BN layers in its lean and deep architecture. BN requires mean and variance
calculations over each mini-batch during training. Therefore, the existing
memory access reduction techniques, such as fusing multiple CONV layers, are
not effective for accelerating BN due to their inability to optimize mini-batch
related calculations during training. To address this increasingly important
problem, we propose to restructure BN layers by first splitting a BN layer into
two sub-layers (fission) and then combining the first sub-layer with its
preceding CONV layer and the second sub-layer with the following activation and
CONV layers (fusion). The proposed solution can significantly reduce
main-memory accesses while training the latest CNN models, and the experiments
on a chip multiprocessor show that the proposed BN restructuring can improve
the performance of DenseNet-121 by 25.7%.
Comment: 13 pages, 8 figures; to appear in SysML 2019; added ResNet-50 results.
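The fission step can be pictured as splitting the usual BN computation into its statistics/normalization half and its affine half, as in the sketch below. The actual speedup in the paper comes from fusing each half into its neighbouring CONV and activation layers at the memory-access level, which a plain PyTorch module cannot show.

```python
import torch
import torch.nn as nn

class FissionedBN(nn.Module):
    """BatchNorm split into two sub-layers: normalize() would be fused with
    the preceding CONV layer, affine() with the following activation and
    CONV layer (training-mode statistics only, for brevity)."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(channels))
        self.beta = nn.Parameter(torch.zeros(channels))

    def normalize(self, x):          # sub-layer 1: mini-batch statistics
        mean = x.mean(dim=(0, 2, 3), keepdim=True)
        var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
        return (x - mean) / torch.sqrt(var + self.eps)

    def affine(self, x):             # sub-layer 2: learned scale and shift
        return x * self.gamma.view(1, -1, 1, 1) + self.beta.view(1, -1, 1, 1)

    def forward(self, x):
        return self.affine(self.normalize(x))
```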
Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-off
We present a novel dynamic configuration technique for deep neural networks
that permits step-wise energy-accuracy trade-offs during runtime. Our
configuration technique adjusts the number of channels in the network
dynamically depending on response time, power, and accuracy targets. To enable
this dynamic configuration technique, we co-design a new training algorithm,
where the network is incrementally trained such that the weights in channels
trained in earlier steps are fixed. Our technique provides the flexibility of
multiple networks while storing and utilizing one set of weights. We evaluate
our techniques using both an ASIC-based hardware accelerator and a low-power
embedded GPGPU, and show that our approach leads to only a small or
negligible loss in the final network accuracy. We analyze the performance of
our proposed methodology using three well-known networks for MNIST, CIFAR-10,
and SVHN datasets, and we show that we are able to achieve up to 95% energy
reduction with less than 1% accuracy loss across the three benchmarks. In
addition, compared to prior work on dynamic network reconfiguration, we show
that our approach leads to approximately 50% savings in storage requirements,
while achieving similar accuracy.
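One way to realize the incremental training described here is to zero the gradients of already-trained channel groups after each backward pass, as sketched below; the helper name and the slicing convention are ours.

```python
import torch.nn as nn

def freeze_trained_channels(conv: nn.Conv2d, n_frozen):
    """Keep the first n_frozen output channels (trained in earlier steps)
    fixed while newly added channels continue to train.  Call between
    loss.backward() and optimizer.step()."""
    if conv.weight.grad is not None:
        conv.weight.grad[:n_frozen].zero_()
    if conv.bias is not None and conv.bias.grad is not None:
        conv.bias.grad[:n_frozen].zero_()

# At run time, a lower-energy configuration simply evaluates the first k
# channels of each layer, so one stored weight set serves every
# energy-accuracy operating point.
```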
Deep Learning Assisted Heuristic Tree Search for the Container Pre-marshalling Problem
The container pre-marshalling problem (CPMP) is concerned with the
re-ordering of containers in container terminals during off-peak times so that
containers can be quickly retrieved when the port is busy. The problem has
received significant attention in the literature and is addressed by a large
number of exact and heuristic methods. Existing methods for the CPMP heavily
rely on problem-specific components (e.g., proven lower bounds) that need to be
developed by domain experts with knowledge of optimization techniques and a
deep understanding of the problem at hand. With the goal of automating the costly
and time-intensive design of heuristics for the CPMP, we propose a new method
called Deep Learning Heuristic Tree Search (DLTS). It uses deep neural networks
to learn solution strategies and lower bounds customized to the CPMP solely
through analyzing existing (near-) optimal solutions to CPMP instances. The
networks are then integrated into a tree search procedure to decide which
branch to choose next and to prune the search tree. DLTS produces the highest
quality heuristic solutions to the CPMP to date with gaps to optimality below
2% on real-world-sized instances.
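A skeletal version of the search the abstract outlines: a learned policy ranks branches and a learned lower bound prunes subtrees. The state interface (cost, is_goal, moves, apply) and both network calls are assumptions for illustration, not the authors' API.

```python
def dlts(state, policy_net, bound_net, incumbent_cost):
    """Deep Learning Heuristic Tree Search sketch: branch in the order the
    policy network prefers; prune when cost-so-far plus the predicted
    lower bound cannot beat the incumbent."""
    if state.cost + bound_net(state) >= incumbent_cost:
        return None, incumbent_cost                      # prune subtree
    if state.is_goal():
        return state, state.cost                         # new incumbent
    best = None
    for move in sorted(state.moves(), key=lambda m: -policy_net(state, m)):
        sol, incumbent_cost = dlts(state.apply(move), policy_net,
                                   bound_net, incumbent_cost)
        if sol is not None:
            best = sol
    return best, incumbent_cost
```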
BagNet: Berkeley Analog Generator with Layout Optimizer Boosted with Deep Neural Networks
The discrepancy between post-layout and schematic simulation results
continues to widen in analog design, due in part to the dominance of layout
parasitics. This paradigm shift is forcing designers to adopt design
methodologies that seamlessly integrate layout effects into the standard design
flow. Hence, any simulation-based optimization framework should take into
account time-consuming post-layout simulation results. This work presents a
learning framework that reduces the number of simulations required by
evolutionary combinatorial optimizers, using a DNN that screens generated
samples before simulations are run. With this approach, the discriminator
yields at least a two-orders-of-magnitude improvement in sample efficiency on
several large circuit examples, including an optical link receiver layout.
Comment: Accepted at the ICCAD 2019 conference.
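The screening idea can be sketched as an evolutionary loop in which the discriminator vetoes candidates before the expensive post-layout simulation runs; the selection scheme and call signatures below are illustrative.

```python
def evolve_with_discriminator(population, mutate, simulate, disc, n_iters):
    """Evolutionary optimization in which disc(candidate, reference)
    predicts whether a candidate beats the current best design; only
    predicted winners pay for a (post-layout) simulation."""
    scored = [(simulate(c), c) for c in population]      # seed the pool
    for _ in range(n_iters):
        _, best = min(scored, key=lambda t: t[0])
        child = mutate(best)
        if disc(child, best):                            # cheap DNN screen
            scored.append((simulate(child), child))      # costly simulation
        # (the discriminator is fine-tuned on newly simulated samples)
    return min(scored, key=lambda t: t[0])
```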
A holistic approach to computing first-arrival traveltimes using neural networks
Since John Vidale's original 1988 algorithm for numerically solving the
isotropic eikonal equation, there has been tremendous progress on the topic,
addressing an array of challenges including improvement of the solution
accuracy, incorporation of surface topography, adding more accurate physics by
accounting for anisotropy/attenuation in the medium, and speeding up
computations using multiple CPUs and GPUs. Despite these advances, there is no
mechanism in these algorithms to carry information gained by solving one
problem over to the next. Moreover, these approaches may break down for certain
complex forms of the eikonal equation, requiring approximation methods to
estimate the solution. Therefore, we seek an alternative approach that
addresses the challenge holistically, i.e., a method that not only makes it
simpler to incorporate topography, accounts for any level of complexity in the
physics, and benefits from the computational speedup afforded by multiple CPUs
or GPUs, but is also able to transfer knowledge gained from solving one problem
to the next. We develop an algorithm based on the emerging paradigm
of physics-informed neural networks to solve various forms of the eikonal
equation. We show how transfer learning and surrogate modeling can be used to
speed up computations by utilizing information gained from prior solutions. We
also propose a two-stage optimization scheme to expedite the training process
in the presence of sharper heterogeneity in the velocity model. Furthermore, we
demonstrate how the proposed approach makes it simpler to incorporate
additional physics and other features, in contrast to conventional methods that
took years and often decades to make these advances. Such an approach not only
makes the implementation of eikonal solvers much simpler but also puts us on a
much faster path to progress.
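A minimal sketch of a physics-informed eikonal solver of the kind described: a small network T(x, z) is trained so that |grad T| matches 1/v at sampled points. The architecture, the sampling, and the omission of source-term handling are simplifications of ours. Transfer learning, as the abstract suggests, would then amount to initializing this network from a previously solved velocity model rather than from scratch.

```python
import torch
import torch.nn as nn

class EikonalPINN(nn.Module):
    """Network approximating the traveltime field T(x, z)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, 1),
        )

    def forward(self, xz):
        return self.net(xz)

def eikonal_residual(model, xz, v):
    """Squared residual of |grad T|^2 - 1/v^2 at collocation points xz;
    its mean is the physics loss.  v holds the velocity at each point."""
    xz = xz.requires_grad_(True)
    T = model(xz)
    grad = torch.autograd.grad(T.sum(), xz, create_graph=True)[0]
    return ((grad ** 2).sum(dim=1) - 1.0 / v ** 2) ** 2
```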