Enabling pulsar and fast transient searches using coherent dedispersion
We present an implementation of the coherent dedispersion algorithm capable
of dedispersing high-time-resolution radio observations to many different
dispersion measures (DMs). This approach allows the removal of the dispersive
effects of the interstellar medium and enables searches for pulsed emission
from pulsars and other millisecond-duration transients at low observing
frequencies and/or high DMs where time broadening of the signal due to
dispersive smearing would otherwise severely reduce the sensitivity. The
implementation, called 'cdmt', for Coherent Dispersion Measure Trials, exploits
the parallel processing capability of general-purpose graphics processing units
to accelerate the computations. We describe the coherent dedispersion algorithm
and detail how cdmt implements the algorithm to efficiently compute many
coherent DM trials. We present the concept of a semi-coherent dedispersion
search, where coherently dedispersed trials at coarsely separated DMs are
subsequently incoherently dedispersed at finer steps in DM. The software is
used in an ongoing LOFAR pilot survey to test the feasibility of performing
semi-coherent dedispersion searches for millisecond pulsars at 135 MHz. This
pilot survey has led to the discovery of a radio millisecond pulsar -- the
first at these low frequencies. This is the first time that such a broad and
comprehensive search in DM-space has been done using coherent dedispersion, and
we argue that future low-frequency pulsar searches using this approach are both
scientifically compelling and feasible. Finally, we compare the performance of
cdmt with other available alternatives. Comment: 8 pages, 7 figures, submitted to Astronomy and Computin
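To make the dispersive smearing in the abstract above concrete, here is a minimal Python sketch of the standard cold-plasma dispersion delay. It is not taken from cdmt. The function names, the 135 MHz centre frequency used in the example, and the channel width are illustrative assumptions. The second function estimates the smearing left inside one channel when the coherent DM trial is offset from the true DM, which is the quantity a semi-coherent search keeps small by spacing its coherent trials appropriately.

```python
# Standard dispersion constant, in MHz^2 pc^-1 cm^3 s.
K_DM = 4.148808e3


def dispersion_delay(dm, f_lo_mhz, f_hi_mhz):
    """Arrival-time delay (seconds) of f_lo relative to f_hi for a given DM.

    dm is in pc cm^-3; frequencies are in MHz.
    """
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


def residual_smearing(dm_offset, f_center_mhz, chan_bw_mhz):
    """Smearing (seconds) across one channel for a DM error of dm_offset.

    Hypothetical helper: treats the channel as the band
    [f_center - bw/2, f_center + bw/2] and applies dispersion_delay to it.
    """
    f_lo = f_center_mhz - chan_bw_mhz / 2.0
    f_hi = f_center_mhz + chan_bw_mhz / 2.0
    return dispersion_delay(abs(dm_offset), f_lo, f_hi)


if __name__ == "__main__":
    # Illustrative numbers only: a 0.195 MHz channel centred at 135 MHz.
    for offset in (0.0, 0.5, 1.0):
        smear = residual_smearing(offset, 135.0, 0.195)
        print(f"DM offset {offset:4.1f} pc/cm^3 -> {smear * 1e3:.3f} ms")
```

Because the delay scales as the inverse square of frequency, the same DM error smears the signal far more at low observing frequencies, which is why coherent trials matter most there.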
Data acquisition system for the MuLan muon lifetime experiment
We describe the data acquisition system for the MuLan muon lifetime
experiment at Paul Scherrer Institute. The system was designed to record muon
decays at rates up to 1 MHz and acquire data at rates up to 60 MB/sec. The
system employed a parallel network of dual-processor machines and repeating
acquisition cycles of deadtime-free time segments in order to reach the design
goals. The system incorporated a versatile scheme for control and diagnostics
and a custom web interface for monitoring experimental conditions. Comment: 19 pages, 8 figures, submitted to Nuclear Instruments and Methods
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for deep
learning applications in both academia and industry. In training deep neural
networks (DNNs), there are many standard processes or algorithms, such as
convolution and stochastic gradient descent (SGD), but the running performance
of different frameworks can differ even when running the same deep model on
the same GPU hardware. In this study, we evaluate the running performance of
four state-of-the-art distributed deep learning frameworks (i.e., Caffe-MPI,
CNTK, MXNet, and TensorFlow) over single-GPU, multi-GPU, and multi-node
environments. We first build performance models of the standard processes in
training DNNs with SGD, and then benchmark the running performance of these
frameworks with three popular convolutional neural networks (i.e., AlexNet,
GoogleNet, and ResNet-50). Finally, we analyze which factors cause
the performance gap among these four frameworks. Through both analytical and
experimental analysis, we identify bottlenecks and overheads which could be
further optimized. The main contribution is that the proposed performance
models and the analysis provide further optimization directions in both
algorithmic design and system configuration. Comment: Published at DataCom'201
Security Through Amnesia: A Software-Based Solution to the Cold Boot Attack on Disk Encryption
Disk encryption has become an important security measure for a multitude of
clients, including governments, corporations, activists, security-conscious
professionals, and privacy-conscious individuals. Unfortunately, recent
research has discovered an effective side channel attack against any disk
mounted by a running machine\cite{princetonattack}. This attack, known as the
cold boot attack, is effective against any mounted volume using
state-of-the-art disk encryption, is relatively simple to perform for an
attacker with even rudimentary technical knowledge and training, and is
applicable to exactly the scenario against which disk encryption is primarily
supposed to defend: an adversary with physical access. To our knowledge, no
effective software-based countermeasure to this attack supporting multiple
encryption keys has yet been articulated in the literature. Moreover, since no
proposed solution has been implemented in publicly available software, all
general-purpose machines using disk encryption remain vulnerable. We present
Loop-Amnesia, a kernel-based disk encryption mechanism implementing a novel
technique to eliminate vulnerability to the cold boot attack. We offer
theoretical justification of Loop-Amnesia's invulnerability to the attack,
verify that our implementation is not vulnerable in practice, and present
measurements showing our impact on I/O accesses to the encrypted disk is
limited to a slowdown of approximately 2x. Loop-Amnesia is written for x86-64,
but our technique is applicable to other register-based architectures. We base
our work on loop-AES, a state-of-the-art open source disk encryption package
for Linux. Comment: 13 pages, 4 figure