Comparing Computing Platforms for Deep Learning on a Humanoid Robot

Author: Abdul Jabbar
D Speck
HA Pierson
Sergey Levine
SK Chalup
T Houliston
Victor W. Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/01/2019

The goal of this study is to test two computing platforms for their suitability for running deep networks as part of a humanoid robot software system. One platform is the CPU-centered Intel NUC7i7BNH; the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks, including pedestrian detection using deep neural networks. Some of the results were unexpected, but they demonstrate that both platforms exhibit advantages and disadvantages when the computational performance and electrical power requirements of such a system are taken into account. Comment: 12 pages, 5 figures

arXiv.org e-Print Archive

Crossref

A real-time proximity querying algorithm for haptic-based molecular docking

Author: Bayazit
Berman
Brooks
Brooks Jr
Chen
Daunay
Davies
Delano
Eisenstein
Ferrin
Férey
Férey
Georgios Iakovou
Hou
Humphrey
Jackins
Krenek
Lai-Yuen
Leach
Lee
Lin
Moitessier
Nagata
Otaduy
Pattabiraman
Pettersen
Rizzo
Salisbury
Sauer
Sayle
Schmid
Schneidman-Duhovny
Schuttelkopf
Sourina
Stephen Laycock
Steven Hayward
Stocks
Stocks
Stone
Subasi
Teschner
Weiner
Wollacott
Yuriev
Zonta
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2014

Intermolecular binding underlies all metabolic and regulatory processes of the cell, as well as the therapeutic and pharmacological properties of drugs. Molecular docking systems model and simulate these interactions in silico and allow us to study the binding process. Haptic-based docking provides an immersive virtual docking environment where the user can interact with and guide the molecules to their binding pose. Moreover, it allows human perception, intuition and knowledge to assist and accelerate the docking process, and reduces incorrect binding poses. Crucial for interactive docking is the real-time calculation of interaction forces. For smooth and accurate haptic exploration and manipulation, force-feedback cues have to be updated at a rate of 1 kHz. Hence, force calculations must be performed within 1 ms. To achieve this, modern haptic-based docking approaches often utilize pre-computed force grids and linear interpolation. However, such grids are time-consuming to pre-compute (especially for large molecules), are memory hungry, can induce rough force transitions at cell boundaries and cannot be applied to flexible docking. Here we propose an efficient proximity querying method for computing intermolecular forces in real time. Our motivation is the eventual development of a haptic-based docking solution that can model molecular flexibility. Uniquely in a haptics application, we use octrees to decompose the 3D search space in order to identify the set of interacting atoms within a cut-off distance. Force calculations are then performed on this set in real time. The implementation constructs the trees dynamically, and computes the interaction forces of large molecular structures (i.e. consisting of thousands of atoms) within haptic refresh rates. We have implemented this method in an immersive, haptic-based, rigid-body, molecular docking application called Haptimol_RD.
The user can use the haptic device to orientate the molecules in space, sense the interaction forces on the device, and guide the molecules to their binding pose. Haptimol_RD is designed to run on consumer-level hardware, i.e. there is no need for specialized/proprietary hardware.
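
The cut-off query described in this abstract can be illustrated with a small sketch. The code below is not taken from Haptimol_RD; it is a minimal point octree written for illustration, and the class name `Octree`, the leaf `capacity`, and the `max_depth` limit are all assumptions. It also assumes every atom position lies inside the root cube.

```python
from math import dist


class Octree:
    """Minimal point octree (illustrative; names and defaults are assumptions)."""

    def __init__(self, center, half, capacity=8, depth=0, max_depth=12):
        self.center, self.half = center, half      # cube centre and half edge length
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []                           # atoms stored in this leaf
        self.children = None                       # eight sub-cubes once split

    def _child_index(self, p):
        # One bit per axis: set when the point lies on the positive side.
        return sum((p[i] >= self.center[i]) << i for i in range(3))

    def insert(self, p):
        if self.children is not None:
            self.children[self._child_index(p)].insert(p)
            return
        self.points.append(p)
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        h = self.half / 2
        self.children = []
        for k in range(8):
            c = tuple(self.center[i] + (h if (k >> i) & 1 else -h) for i in range(3))
            self.children.append(
                Octree(c, h, self.capacity, self.depth + 1, self.max_depth))
        pts, self.points = self.points, []
        for p in pts:
            self.children[self._child_index(p)].insert(p)

    def query(self, q, cutoff, out=None):
        """Collect all stored points within `cutoff` of the probe position `q`."""
        out = [] if out is None else out
        # Squared distance from q to this cube; prune the cell if it is too far.
        d2 = sum(max(abs(q[i] - self.center[i]) - self.half, 0.0) ** 2
                 for i in range(3))
        if d2 > cutoff ** 2:
            return out
        if self.children is None:
            out.extend(p for p in self.points if dist(p, q) <= cutoff)
        else:
            for child in self.children:
                child.query(q, cutoff, out)
        return out


# Usage: index a few atom positions, then find those near a probe atom.
tree = Octree(center=(0.0, 0.0, 0.0), half=10.0)
for atom in [(1.0, 1.0, 1.0), (2.0, 1.5, 0.5), (-8.0, 7.0, 3.0)]:
    tree.insert(atom)
neighbours = tree.query((1.2, 1.0, 0.8), cutoff=2.0)
```

Only the atoms returned by `query` would then enter the force calculation, which is what keeps the per-update cost low enough to aim for a 1 ms budget.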

Crossref

University of East Anglia digital repository

Fast Calculation of the Lomb-Scargle Periodogram Using Graphics Processing Units

Author: Aubert
Koch
LSST Science Collaborations & LSST Project
Nguyen
NVIDIA
Owens
Pharr
Press
R. H. D. Townsend
Rani
Rost
Schive
Schwarzenberg-Czerny
Sturrock
Waelkens
Publication venue: 'IOP Publishing'
Publication date: 19/10/2010

I introduce a new code for fast calculation of the Lomb-Scargle periodogram that leverages the computing power of graphics processing units (GPUs). After establishing a background to the newly emergent field of GPU computing, I discuss the code design and narrate key parts of its source. Benchmarking calculations indicate no significant differences in accuracy compared to an equivalent CPU-based code. However, the differences in performance are pronounced: running on a low-end GPU, the code can match 8 CPU cores, and on a high-end GPU it is faster by a factor approaching thirty. Applications of the code include analysis of long photometric time series obtained by ongoing satellite missions and upcoming ground-based monitoring facilities, and Monte Carlo simulation of periodogram statistical properties. Comment: Accepted by ApJ. Accompanying program source (updated since acceptance) can be downloaded from http://www.astro.wisc.edu/~townsend/resource/download/code/culsp.tar.g
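
For readers unfamiliar with the quantity being computed, the sketch below evaluates the classical Lomb-Scargle periodogram on the CPU in plain Python. It is not the author's GPU code: the function name `lomb_scargle` is an assumption, the power is left unnormalized (no division by the data variance), and all frequencies are assumed to be strictly positive. Each frequency is evaluated independently of the others, which is the property that makes the calculation well suited to parallel hardware.

```python
import math


def lomb_scargle(t, y, freqs):
    """Classical (unnormalized) Lomb-Scargle power at each frequency in `freqs`.

    t, y  : sample times and values (unevenly spaced times are allowed)
    freqs : strictly positive frequencies, in cycles per unit of t
    """
    mean = sum(y) / len(y)
    y0 = [v - mean for v in y]                  # work with mean-subtracted data
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * w * ti) for ti in t)
        c2 = sum(math.cos(2.0 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2.0 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum(a * b for a, b in zip(y0, c))
        ys = sum(a * b for a, b in zip(y0, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        power.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return power


# Usage: unevenly sampled sinusoid, scanned over a grid of trial frequencies.
times = [i + 0.3 * math.sin(1.7 * i) for i in range(100)]
values = [math.sin(2.0 * math.pi * 0.2 * ti) for ti in times]
grid = [0.01 * k for k in range(1, 50)]
spectrum = lomb_scargle(times, values, grid)
best_frequency = grid[spectrum.index(max(spectrum))]
```

The loop over `freqs` costs one pass over the data per trial frequency, so the total work grows with the product of the number of samples and the number of frequencies.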

arXiv.org e-Print Archive

Crossref

Batch Size Influence on Performance of Graphic and Tensor Processing Units during Training and Inference Phases

Author: Alienin Oleg
Gordienko Nikita
Gordienko Yuri
Kochura Yuriy
Rokovyi Alexandr
Stirenko Sergii
Taran Vlad
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/12/2018

The impact of the maximum possible batch size (for better runtime) on the performance of graphics processing units (GPU) and tensor processing units (TPU) during the training and inference phases is investigated. Numerous runs of the selected deep neural network (DNN) were performed on the standard MNIST and Fashion-MNIST datasets. A significant speedup was obtained even for extremely low-scale usage of Google TPUv2 units (8 cores only) in comparison to the quite powerful NVIDIA Tesla K80 GPU card: up to 10x for the training stage (without taking the overheads into account) and up to 2x for the prediction stage (with and without taking the overheads into account). The precise speedup values depend on the utilization level of the TPUv2 units and increase with the volume of data being processed, but for the datasets used in this work (MNIST and Fashion-MNIST, with images of size 28x28) the speedup was observed for batch sizes >512 images in the training phase and >40 000 images in the prediction phase. It should be noted that these results were obtained without detriment to the prediction accuracy and loss, which were equal for the GPU and TPU runs up to the 3rd significant digit for the MNIST dataset and up to the 2nd significant digit for the Fashion-MNIST dataset. Comment: 10 pages, 7 figures, 2 tables

arXiv.org e-Print Archive

Crossref