Comparing Computing Platforms for Deep Learning on a Humanoid Robot
The goal of this study is to test two different computing platforms with
respect to their suitability for running deep networks as part of a humanoid
robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH
and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU
processing. The experiments addressed a number of benchmarking tasks including
pedestrian detection using deep neural networks. Some of the results were
unexpected but demonstrate that platforms exhibit both advantages and
disadvantages when taking computational performance and electrical power
requirements of such a system into account.
Comment: 12 pages, 5 figures
A real-time proximity querying algorithm for haptic-based molecular docking
Intermolecular binding underlies all metabolic and regulatory processes of the cell, and the therapeutic and pharmacological properties of drugs. Molecular docking systems model and simulate these interactions in silico and allow us to study the binding process. Haptic-based docking provides an immersive virtual docking environment where the user can interact with and guide the molecules to their binding pose. Moreover, it allows human perception, intuition and knowledge to assist and accelerate the docking process, and reduces incorrect binding poses. Crucial for interactive docking is the real-time calculation of interaction forces. For smooth and accurate haptic exploration and manipulation, force-feedback cues have to be updated at a rate of 1 kHz. Hence, force calculations must be performed within 1 ms. To achieve this, modern haptic-based docking approaches often utilize pre-computed force grids and linear interpolation. However, such grids are time-consuming to pre-compute (especially for large molecules), are memory-hungry, can induce rough force transitions at cell boundaries, and cannot be applied to flexible docking. Here we propose an efficient proximity querying method for computing intermolecular forces in real time. Our motivation is the eventual development of a haptic-based docking solution that can model molecular flexibility. Uniquely for a haptics application, we use octrees to decompose the 3D search space in order to identify the set of interacting atoms within a cut-off distance. Force calculations are then performed on this set in real time. The implementation constructs the trees dynamically, and computes the interaction forces of large molecular structures (i.e. consisting of thousands of atoms) within haptic refresh rates. We have implemented this method in an immersive, haptic-based, rigid-body, molecular docking application called Haptimol_RD.
The user can use the haptic device to orientate the molecules in space, sense the interaction forces on the device, and guide the molecules to their binding pose. Haptimol_RD is designed to run on consumer-level hardware, i.e. there is no need for specialized/proprietary hardware.
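The octree-based cut-off search described in this abstract can be illustrated with a minimal sketch. It is not the Haptimol_RD implementation: the class name `Octree`, the leaf capacity, the depth limit, the synthetic atom coordinates, and the 8.0 cut-off are all assumptions made for illustration. The sketch assumes the root cube encloses every atom, and it stops at finding the atoms within the cut-off; no force model is included.

```python
import math
import random


class Octree:
    """Point octree over atom coordinates; a leaf holds up to `capacity` atoms."""

    def __init__(self, center, half, capacity=8, depth=0, max_depth=10):
        self.center, self.half = center, half
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []      # (index, (x, y, z)) pairs stored while this is a leaf
        self.children = None  # eight sub-cubes once the leaf has been split

    def insert(self, idx, p):
        if self.children is not None:
            self._child_for(p).insert(idx, p)
            return
        self.points.append((idx, p))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        h = self.half / 2.0
        cx, cy, cz = self.center
        self.children = [
            Octree((cx + dx * h, cy + dy * h, cz + dz * h), h,
                   self.capacity, self.depth + 1, self.max_depth)
            for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)
        ]
        pts, self.points = self.points, []
        for idx, p in pts:
            self._child_for(p).insert(idx, p)

    def _child_for(self, p):
        # Index order matches the dx, dy, dz loops in _split (dx varies slowest).
        i = ((4 if p[0] >= self.center[0] else 0)
             + (2 if p[1] >= self.center[1] else 0)
             + (1 if p[2] >= self.center[2] else 0))
        return self.children[i]

    def query(self, q, cutoff, out):
        """Append to `out` the indices of all atoms within `cutoff` of point q."""
        # Squared distance from q to this cube; prune if the sphere cannot reach it.
        d2 = 0.0
        for a in range(3):
            gap = abs(q[a] - self.center[a]) - self.half
            if gap > 0.0:
                d2 += gap * gap
        if d2 > cutoff * cutoff:
            return out
        if self.children is None:
            for idx, p in self.points:
                if math.dist(p, q) <= cutoff:
                    out.append(idx)
        else:
            for child in self.children:
                child.query(q, cutoff, out)
        return out


# Synthetic "atoms" inside a cube of half-width 20 (hypothetical data).
rng = random.Random(1)
atoms = [tuple(rng.uniform(-20.0, 20.0) for _ in range(3)) for _ in range(2000)]
tree = Octree((0.0, 0.0, 0.0), 20.0)
for i, atom in enumerate(atoms):
    tree.insert(i, atom)

probe = (1.0, -2.0, 3.0)            # e.g. the position of one ligand atom
near = tree.query(probe, 8.0, [])   # indices of atoms within the cut-off
```

Because whole sub-cubes are discarded when the query sphere cannot reach them, only a small fraction of the atoms is usually distance-tested, which is the property that makes a per-frame time budget plausible for large structures.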
Fast Calculation of the Lomb-Scargle Periodogram Using Graphics Processing Units
I introduce a new code for fast calculation of the Lomb-Scargle periodogram
that leverages the computing power of graphics processing units (GPUs). After
establishing a background to the newly emergent field of GPU computing, I
discuss the code design and narrate key parts of its source. Benchmarking
calculations indicate no significant differences in accuracy compared to an
equivalent CPU-based code. However, the differences in performance are
pronounced; running on a low-end GPU, the code can match 8 CPU cores, and on a
high-end GPU it is faster by a factor approaching thirty. Applications of the
code include analysis of long photometric time series obtained by ongoing
satellite missions and upcoming ground-based monitoring facilities; and
Monte-Carlo simulation of periodogram statistical properties.
Comment: Accepted by ApJ. Accompanying program source (updated since
acceptance) can be downloaded from
http://www.astro.wisc.edu/~townsend/resource/download/code/culsp.tar.g
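For readers unfamiliar with the quantity being accelerated, the following is a minimal CPU-only sketch of the classical Lomb-Scargle periodogram, normalized by the sample variance. It is not the code described in the abstract; the function name `lomb_scargle`, the synthetic time series, and the frequency grid are assumptions made for illustration. Each frequency is evaluated independently of the others, which is the kind of structure that tends to map well onto parallel hardware.

```python
import math
import random


def lomb_scargle(times, values, freqs):
    """Classical Lomb-Scargle power at each (positive) frequency in `freqs`."""
    n = len(times)
    mean = sum(values) / n
    y = [v - mean for v in values]              # mean-subtracted data
    var = sum(v * v for v in y) / (n - 1)       # sample variance
    powers = []
    for f in freqs:
        omega = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * omega * t) for t in times)
        c2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * omega)
        cs = [math.cos(omega * (t - tau)) for t in times]
        sn = [math.sin(omega * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, cs))
        ys = sum(a * b for a, b in zip(y, sn))
        cc = sum(c * c for c in cs)
        ss = sum(s * s for s in sn)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return powers


# Unevenly sampled sinusoid with frequency 0.2 (synthetic, noise-free data).
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 50.0) for _ in range(60))
values = [math.sin(2.0 * math.pi * 0.2 * t) for t in times]
freqs = [0.05 + 0.01 * k for k in range(46)]    # 0.05 to 0.50 in steps of 0.01

powers = lomb_scargle(times, values, freqs)
best = freqs[powers.index(max(powers))]         # frequency of the highest peak
```

The cost of this direct evaluation grows with the number of samples times the number of frequencies, which is why long time series and Monte-Carlo studies are the applications where acceleration matters.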
Batch Size Influence on Performance of Graphic and Tensor Processing Units during Training and Inference Phases
The impact of the maximum possible batch size (for better runtime) on the
performance of graphics processing units (GPUs) and tensor processing units
(TPUs) during the training and inference phases is investigated. Numerous runs
of the selected deep neural network (DNN) were performed on the standard MNIST
and Fashion-MNIST datasets. A significant speedup was obtained even for
extremely low-scale usage of Google TPUv2 units (8 cores only) in comparison
with the quite powerful NVIDIA Tesla K80 GPU card: up to 10x for the training
stage (without taking overheads into account) and up to 2x for the prediction
stage (with and without taking overheads into account). The precise speedup
values depend on the utilization level of the TPUv2 units and increase with
the volume of data being processed, but for the datasets used in this work
(MNIST and Fashion-MNIST, with 28x28 images) the speedup was observed for batch
sizes >512 images in the training phase and >40 000 images in the prediction
phase. These results were obtained without detriment to the prediction accuracy
and loss, which were equal for the GPU and TPU runs up to the 3rd significant
digit for the MNIST dataset and up to the 2nd significant digit for the
Fashion-MNIST dataset.
Comment: 10 pages, 7 figures, 2 tables
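One way to see why a speedup might appear only above some batch size is a toy linear cost model, sketched below. This model is an assumption made for illustration and is not taken from the paper: it supposes each batch costs a fixed overhead plus a per-sample time, and every number in it is hypothetical rather than a measurement of any GPU or TPU.

```python
def batch_time(batch_size, overhead_s, per_sample_s):
    """Toy cost model: fixed per-batch overhead plus per-sample work (seconds)."""
    return overhead_s + batch_size * per_sample_s


def speedup(batch_size, device_a, device_b):
    """Time on device_a divided by time on device_b; above 1 means b is faster."""
    return batch_time(batch_size, *device_a) / batch_time(batch_size, *device_b)


def crossover_batch(device_a, device_b):
    """Batch size at which both devices take equal time, or None if there is none."""
    (oa, pa), (ob, pb) = device_a, device_b
    if pa == pb:
        return None
    x = (ob - oa) / (pa - pb)
    return x if x > 0 else None


# Hypothetical (overhead_s, per_sample_s) pairs, chosen only for illustration.
device_a = (0.001, 1e-5)   # low fixed overhead, slower per sample
device_b = (0.005, 2e-6)   # high fixed overhead, faster per sample

threshold = crossover_batch(device_a, device_b)
small = speedup(100, device_a, device_b)      # below the threshold
large = speedup(10_000, device_a, device_b)   # above the threshold
```

Under these assumptions the device with the larger fixed overhead only wins once the batch is large enough to amortize that overhead, and its advantage keeps growing with batch size until per-sample cost dominates.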