Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture
Deep neural networks have been applied to a wide range of problems in recent years. In this work, a Convolutional Neural Network (CNN) is applied to the problem of determining depth from a single camera image (monocular depth). Eight different networks are designed to perform depth estimation, each suited to a particular feature level; the feature level each network captures is determined by its pooling size. After the set of networks is designed, the models are combined into a single network topology using graph optimization techniques. This "Semi Parallel Deep Neural Network" (SPDNN) eliminates duplicated common network layers, and can be further optimized by retraining to achieve an improved model compared to the individual topologies. In this study, four SPDNN models are trained and evaluated in two stages on the KITTI dataset.
The ground truth images in the first part of the experiment are provided by the benchmark, while for the second part, the ground truth images are the depth maps produced by applying a state-of-the-art stereo matching method. The results of this evaluation demonstrate that using post-processing techniques to refine the target of the network increases the accuracy of depth estimation on individual mono images. The second evaluation shows that using segmentation data alongside the original data as the input can improve the depth estimation results to a point where performance is comparable with stereo depth estimation. The computational time is also discussed in this study.
Comment: 44 pages, 25 figures
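As an illustration of the semi-parallel idea, the sketch below merges several branches that would otherwise duplicate their early convolutional layers into one shared stem, with each branch applying a different pooling size before the outputs are fused. The layer sizes and the PyTorch framing are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of the semi-parallel idea: branches that would otherwise
# duplicate their early convolutional layers share one stem, then diverge
# into pooling paths of different sizes before being merged into one head.
# All layer sizes here are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

class SPDNNSketch(nn.Module):
    def __init__(self, pool_sizes=(2, 4, 8)):
        super().__init__()
        # Shared "stem": layers common to all individual networks,
        # kept only once after the graph-merging step.
        self.stem = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Parallel branches: each pooling size targets a different feature level.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.MaxPool2d(p),
                nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=p, mode="bilinear", align_corners=False),
            )
            for p in pool_sizes
        ])
        # Merge the branch outputs and regress a single-channel depth map.
        self.head = nn.Conv2d(32 * len(pool_sizes), 1, kernel_size=1)

    def forward(self, x):
        shared = self.stem(x)
        feats = [branch(shared) for branch in self.branches]
        return self.head(torch.cat(feats, dim=1))

depth = SPDNNSketch()(torch.randn(1, 3, 64, 64))  # -> (1, 1, 64, 64)
```

Because the merged model shares the stem, retraining it jointly (as the abstract describes) lets the common layers serve all feature levels at once instead of being learned eight times.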
A Study of Speed of the Boundary Element Method as applied to the Realtime Computational Simulation of Biological Organs
In this work, the possibility of simulating biological organs in realtime using the Boundary Element Method (BEM) is investigated. Biological organs are assumed to follow linear elastostatic material behavior, and constant boundary elements are the element type used. First, a Graphics Processing Unit (GPU) is used to speed up the BEM computations to achieve realtime performance.
Next, instead of the GPU, a computer cluster is used. Results indicate that BEM is fast enough to provide realtime graphics if biological organs are assumed to follow linear elastostatic material behavior. Although the present work does not conduct any simulation using nonlinear material models, the results from the linear elastostatic material model imply that it would be difficult to obtain realtime performance if highly nonlinear material models that properly characterize biological organs are used. Although the use of BEM for the simulation of biological organs is not new, the results presented in this study are not found elsewhere in the literature.
Comment: preprint, draft, 2 tables, 47 references, 7 files. Codes that can solve three dimensional linear elastostatic problems using constant boundary elements (of triangular shape) while ignoring body forces are provided as supplementary files; the codes are distributed under the MIT License in three versions: i) MATLAB version, ii) Fortran 90 version (sequential code), iii) Fortran 90 version (parallel code).
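The timing question the paper studies comes down to the cost profile of constant-element BEM: the method produces a dense N x N system, so assembly is O(N^2) and a direct solve is O(N^3), and those dense kernels are exactly what a GPU or cluster accelerates. The sketch below reproduces only that computational pattern; the influence coefficient is a stand-in, not the elastostatic fundamental solution.

```python
# Cost pattern of constant-element BEM: dense pairwise assembly followed by
# a direct dense solve. The kernel below is a toy stand-in for illustration.
import time
import numpy as np

def assemble_dense_system(centroids):
    """Dense influence matrix from pairwise element interactions (toy kernel)."""
    n = len(centroids)
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    # Regularize the diagonal so the self-interaction term is finite.
    return 1.0 / (4.0 * np.pi * (d + np.eye(n)))

rng = np.random.default_rng(0)
centroids = rng.random((2000, 3))        # 2000 constant elements

t0 = time.perf_counter()
A = assemble_dense_system(centroids)     # O(N^2) assembly
b = rng.random(2000)
x = np.linalg.solve(A, b)                # O(N^3) direct solve
print(f"assembly + solve: {time.perf_counter() - t0:.2f} s")
```

Doubling the element count roughly octuples the solve time in this pattern, which is why realtime rates are attainable for linear elastostatics (where the matrix can be prefactored) but become doubtful once nonlinear models force repeated reassembly and resolution.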
An efficient surrogate model for emulation and physics extraction of large eddy simulations
In the quest for advanced propulsion and power-generation systems,
high-fidelity simulations are too computationally expensive to survey the
desired design space, and a new design methodology is needed that combines
engineering physics, computer simulations and statistical modeling. In this
paper, we propose a new surrogate model that provides efficient prediction and
uncertainty quantification of turbulent flows in swirl injectors with varying
geometries, devices commonly used in many engineering applications. The novelty
of the proposed method lies in the incorporation of known physical properties
of the fluid flow as simplifying assumptions for the statistical model. In
view of the massive simulation data at hand, which is on the order of hundreds
of gigabytes, these assumptions allow for accurate flow predictions in around
an hour of computation time. In contrast, existing flow emulators that forgo such simplifications may require more computation time for training and
prediction than is needed for conducting the simulation itself. Moreover, by
accounting for coupling mechanisms between flow variables, the proposed model
can jointly reduce prediction uncertainty and extract useful flow physics,
which can then be used to guide further investigations.
Comment: Submitted to JASA A&C
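The workflow the abstract describes, fitting a statistical emulator to a handful of expensive simulation runs indexed by geometry and then predicting (with uncertainty) at unseen designs, can be sketched with a standard Gaussian-process regressor. The toy response function and the two geometry parameters below are assumptions for illustration; the paper's model additionally builds known flow physics into the covariance structure and handles coupled flow variables jointly.

```python
# Minimal surrogate-modeling sketch: fit a Gaussian process to a few
# "simulation" runs over a 2-D design space, then predict with uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)
X_train = rng.uniform(0.5, 2.0, size=(30, 2))        # e.g. injector length, slit width
y_train = np.sin(3 * X_train[:, 0]) * X_train[:, 1]  # stand-in for a flow statistic

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)                             # training: seconds, not CPU-years

X_new = rng.uniform(0.5, 2.0, size=(5, 2))           # unseen injector geometries
mean, std = gp.predict(X_new, return_std=True)       # prediction + uncertainty
for m, s in zip(mean, std):
    print(f"predicted flow statistic: {m:+.3f} +/- {2 * s:.3f}")
```

The contrast drawn in the abstract is about what the covariance must represent: a generic emulator over hundreds of gigabytes of flow data is expensive to train, whereas physics-informed simplifications shrink the effective model to something that predicts in about an hour.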
Scalable and Sustainable Deep Learning via Randomized Hashing
Current deep learning architectures are growing larger in order to learn from
complex datasets. These architectures require giant matrix multiplication
operations to train millions of parameters. At the same time, there is a growing trend to bring deep learning to low-power, embedded devices. The matrix
operations, associated with both training and testing of deep networks, are
very expensive from a computational and energy standpoint. We present a novel
hashing based technique to drastically reduce the amount of computation needed
to train and test deep networks. Our approach combines recent ideas from
adaptive dropouts and randomized hashing for maximum inner product search to
select the nodes with the highest activation efficiently. Our new algorithm for
deep learning reduces the overall computational cost of forward and
back-propagation by operating on significantly fewer (sparse) nodes. As a
consequence, our algorithm uses only 5% of the total multiplications, while
keeping on average within 1% of the accuracy of the original model. A unique
property of the proposed hashing based back-propagation is that the updates are
always sparse. Due to the sparse gradient updates, our algorithm is ideally
suited for asynchronous and parallel training, leading to near-linear speedup with an increasing number of cores. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several real datasets.
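A minimal way to see the hashing idea is random-hyperplane (SimHash) bucketing: hash each neuron's weight vector once, hash the input at each forward pass, and compute only the neurons landing in the matching bucket, so both activations and gradient updates stay sparse. This sketch illustrates the principle only; the paper's exact LSH scheme for maximum inner product search and its adaptive-dropout combination differ in detail.

```python
# Sketch of hashing-based node selection: neurons whose SimHash bucket
# matches the input's bucket are treated as the (approximate) highest-
# activation nodes; only those are computed in the forward pass.
import numpy as np

rng = np.random.default_rng(2)
d, n_neurons, n_bits = 128, 1024, 6
W = rng.standard_normal((n_neurons, d))      # layer weight matrix
planes = rng.standard_normal((n_bits, d))    # shared random hyperplanes

def signature(v):
    """SimHash signature: sign pattern of projections, packed into an int."""
    bits = (planes @ v) > 0
    return int(bits.dot(1 << np.arange(n_bits)))

# Pre-hash every neuron's weight vector (redone only when weights change).
buckets = {}
for i, w in enumerate(W):
    buckets.setdefault(signature(w), []).append(i)

x = rng.standard_normal(d)
active = buckets.get(signature(x), [])       # candidate high-activation nodes
out = np.zeros(n_neurons)
out[active] = W[active] @ x                  # sparse forward pass
print(f"active neurons: {len(active)} / {n_neurons}")
```

Because only the `active` indices receive nonzero output, the corresponding backward pass would also update only those rows of W, which is the sparse-gradient property the abstract credits for near-linear parallel speedup.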
Design and Validation of a Software Defined Radio Testbed for DVB-T Transmission
This paper describes the design and validation of a Software Defined Radio (SDR) testbed, which can be used for digital television transmission using the Digital Video Broadcasting - Terrestrial (DVB-T) standard. In order to generate a DVB-T-compliant signal with low computational complexity, we design an SDR architecture that uses the C/C++ language and exploits multithreading and vectorized instructions. Then, we transmit the generated DVB-T signal in real time, using a common PC equipped with multicore central processing units (CPUs) and a commercially available SDR modem board. The proposed SDR architecture has been validated using fixed TV sets and portable receivers. Our results show that the proposed SDR architecture for DVB-T transmission is a low-cost, low-complexity solution that, in the worst case, requires less than 22% CPU load and less than 170 MB of memory on a 3.0 GHz Core i7 processor. In addition, using the same SDR modem board, we design an off-line software receiver that also performs time synchronization as well as carrier frequency offset estimation and compensation.
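The core modulation step such a transmitter must compute in real time is OFDM symbol formation: map payload bits onto QAM subcarriers, take an IFFT, and prepend a cyclic-prefix guard interval. The sketch below uses the 2K-mode sizes from the DVB-T standard (2048-point FFT, 1705 active carriers) but omits channel coding, interleaving, and pilot/TPS carriers, and it is a NumPy illustration rather than the paper's vectorized C/C++ implementation.

```python
# Simplified DVB-T 2K-mode OFDM symbol: 16-QAM mapping, IFFT, cyclic prefix.
# Coding, interleaving, and pilot carriers are omitted for brevity.
import numpy as np

N_FFT, N_ACTIVE, GUARD = 2048, 1705, 1 / 32        # 2K mode, 1/32 guard interval
rng = np.random.default_rng(3)

# 16-QAM mapping of random payload bits (2 Gray-coded bits per I/Q axis).
bits = rng.integers(0, 2, size=(N_ACTIVE, 4))
levels = np.array([-3, -1, 3, 1])                  # Gray-coded 4-PAM levels
symbols = (levels[bits[:, 0] * 2 + bits[:, 1]]
           + 1j * levels[bits[:, 2] * 2 + bits[:, 3]]) / np.sqrt(10)

# Place the active carriers around DC, leave the rest as guard band, and IFFT.
spectrum = np.zeros(N_FFT, dtype=complex)
half = N_ACTIVE // 2
spectrum[1:half + 1] = symbols[:half]              # positive frequencies
spectrum[-half - 1:] = symbols[half:]              # negative frequencies
ofdm_symbol = np.fft.ifft(spectrum) * np.sqrt(N_FFT)

# Cyclic prefix: copy the tail of the symbol to its front.
n_guard = int(N_FFT * GUARD)
tx = np.concatenate([ofdm_symbol[-n_guard:], ofdm_symbol])
print(f"transmitted samples per OFDM symbol: {len(tx)}")
```

In a real-time C/C++ pipeline like the one described, the IFFT and mapping stages are where multithreading and vectorized instructions pay off, since one such symbol must be produced every symbol period.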
Event Stream Processing with Multiple Threads
Current runtime verification tools seldom make use of multi-threading to
speed up the evaluation of a property on a large event trace. In this paper, we
present an extension to the BeepBeep 3 event stream engine that allows the use
of multiple threads during the evaluation of a query. Various parallelization
strategies are presented and described on simple examples. The implementation
of these strategies is then evaluated empirically on a sample of problems.
Compared to the previous, single-threaded version of the BeepBeep engine, the allocation of just a few threads to specific portions of a query provides a dramatic improvement in running time.
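One of the simplest parallelization strategies for trace evaluation is trace slicing: cut the event trace into chunks, evaluate the query on each chunk in its own thread, and merge the partial results. BeepBeep itself is a Java library; the Python sketch below only mirrors that strategy, with a trivial counting query standing in for a BeepBeep processor chain.

```python
# Trace-slicing sketch: evaluate an associative query on chunks of a large
# event trace in parallel threads, then merge the partial results.
# Note: CPython threads share the GIL, so a CPU-bound query would need a
# process pool for real speedup; threads here keep the sketch simple.
from concurrent.futures import ThreadPoolExecutor

def evaluate_slice(events):
    """Evaluate the query on one slice of the trace (here: count errors)."""
    return sum(1 for e in events if e == "error")

trace = ["error" if i % 7 == 0 else "info" for i in range(1_000_000)]

n_threads = 4
chunk = len(trace) // n_threads
slices = [trace[i * chunk:(i + 1) * chunk] for i in range(n_threads - 1)]
slices.append(trace[(n_threads - 1) * chunk:])   # last slice takes the remainder

with ThreadPoolExecutor(max_workers=n_threads) as pool:
    partial_counts = list(pool.map(evaluate_slice, slices))

print("error events:", sum(partial_counts))      # merge step
```

Slicing works directly only for queries whose results merge associatively; stateful queries with cross-slice dependencies need the other strategies the paper describes, such as threading individual portions of the processor chain.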