Highly accelerated simulations of glassy dynamics using GPUs: caveats on limited floating-point precision
Modern graphics processing units (GPUs) provide impressive computing
resources, which can be accessed conveniently through the CUDA programming
interface. We describe how GPUs can be used to considerably speed up molecular
dynamics (MD) simulations for system sizes ranging up to about 1 million
particles. Particular emphasis is put on the numerical long-time stability in
terms of energy and momentum conservation, and caveats on limited
floating-point precision are issued. Strict energy conservation over 10^8 MD
steps is obtained by double-single emulation of the floating-point arithmetic
in accuracy-critical parts of the algorithm. For the slow dynamics of a
supercooled binary Lennard-Jones mixture, we demonstrate that the use of
single-precision floating-point arithmetic may result in quantitatively and even
physically wrong results. For simulations of a Lennard-Jones fluid, the
described implementation shows speedup factors of up to 80 compared to a serial
implementation for the CPU, and a single GPU was found to compare with a
parallelised MD simulation using 64 distributed cores.
Comment: 12 pages, 7 figures, to appear in Comp. Phys. Comm., HALMD package
licensed under the GPL, see http://research.colberg.org/projects/halm
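To illustrate the double-single idea mentioned in this abstract, here is a minimal sketch in Python. It is not the HALMD implementation: the helper names (`f32`, `two_sum`, `ds_add`, `accumulate`) are hypothetical, float32 arithmetic is emulated by rounding through `ctypes.c_float`, and the workload (repeatedly adding one small time step) is an assumed stand-in for an accuracy-critical accumulation in an MD integrator.

```python
import ctypes


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return ctypes.c_float(x).value


def two_sum(a, b):
    """Knuth's two-sum in float32: returns (s, e) with a + b == s + e exactly."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e


def ds_add(hi, lo, x):
    """Add a float32 value x to the double-single number hi + lo."""
    s, e = two_sum(hi, x)          # exact sum of the leading parts
    e = f32(e + lo)                # fold in the previous low-order part
    hi_new = f32(s + e)            # renormalise so that |lo_new| << |hi_new|
    lo_new = f32(e - f32(hi_new - s))
    return hi_new, lo_new


def accumulate(n_steps, dt):
    """Sum dt n_steps times: plain float32 vs. double-single vs. reference."""
    dt = f32(dt)
    naive = 0.0
    hi, lo = 0.0, 0.0
    for _ in range(n_steps):
        naive = f32(naive + dt)    # every addition is rounded to float32
        hi, lo = ds_add(hi, lo, dt)
    reference = n_steps * dt       # computed in binary64
    return naive, hi + lo, reference


if __name__ == "__main__":
    naive, ds, ref = accumulate(100_000, 1e-3)
    print("float32 error:      ", abs(naive - ref))
    print("double-single error:", abs(ds - ref))
```

The pair `(hi, lo)` carries the rounding error that a single float32 would discard, which is why the accumulated drift stays far below that of the plain float32 sum.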
Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing
This paper presents a general technique for optimally transforming any
dynamic data structure that operates on atomic and indivisible keys by
constant-time comparisons, into a data structure that handles unbounded-length
keys whose comparison cost is not a constant. Examples of these keys are
strings, multi-dimensional points, multiple-precision numbers, multi-key data
(e.g. records), XML paths, URL addresses, etc. The technique is more general
than what has been done in previous work as no particular exploitation of the
underlying structure of the keys is required. The only requirement is that the insertion
of a key must identify its predecessor or its successor.
Using the proposed technique, an online suffix tree can be constructed in worst
case time per input symbol (as opposed to amortized
time per symbol, achieved by previously known algorithms). To our knowledge,
our algorithm is the first that achieves worst case time per input
symbol. Searching for a pattern of length in the resulting suffix tree
takes time, where is the
number of occurrences of the pattern. The paper also describes more
applications and shows how to obtain alternative methods for dealing with suffix
sorting, dynamic lowest common ancestors and order maintenance.
Self-Improving Algorithms
We investigate ways in which an algorithm can improve its expected
performance by fine-tuning itself automatically with respect to an unknown
input distribution D. We assume here that D is of product type. More precisely,
suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1,
x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently
from some arbitrary, unknown distribution D_i. The goal is to design an
algorithm for these inputs so that eventually the expected running time will be
optimal for the input distribution D = D_1 * D_2 * ... * D_n.
We give such self-improving algorithms for two problems: (i) sorting a
sequence of numbers and (ii) computing the Delaunay triangulation of a planar
point set. Both algorithms achieve optimal expected limiting complexity. The
algorithms begin with a training phase during which they collect information
about the input distribution, followed by a stationary regime in which the
algorithms settle to their optimized incarnations.
Comment: 26 pages, 8 figures, preliminary versions appeared at SODA 2006 and
SoCG 2008. Thorough revision to improve the presentation of the paper
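The two-phase structure described in this abstract (a training phase followed by a stationary regime) can be sketched for sorting as below. This is a deliberately simplified illustration, not the authors' algorithm: the class name `SelfImprovingSorter` and its methods are hypothetical, and bucket lookup uses plain binary search rather than per-position search structures tuned to each D_i, so the sketch does not claim the optimal expected running time.

```python
import bisect
import random


class SelfImprovingSorter:
    """Toy two-phase sorter: learn bucket boundaries, then bucket-sort inputs."""

    def __init__(self, n):
        self.n = n               # fixed input length
        self.boundaries = []     # learned during training

    def train(self, samples):
        """Training phase: pool sample inputs and keep evenly spaced values
        of the pooled sorted list as bucket boundaries."""
        pooled = sorted(x for instance in samples for x in instance)
        step = max(1, len(pooled) // self.n)
        self.boundaries = pooled[step::step]

    def sort(self, instance):
        """Stationary regime: place each element in its learned bucket,
        then sort the (hopefully small) buckets and concatenate them."""
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for x in instance:
            buckets[bisect.bisect_right(self.boundaries, x)].append(x)
        out = []
        for bucket in buckets:
            bucket.sort()
            out.extend(bucket)
        return out


if __name__ == "__main__":
    rng = random.Random(0)
    n = 8

    def draw():
        # Assumed product distribution: x_i is Gaussian centred at i.
        return [rng.gauss(i, 0.3) for i in range(n)]

    sorter = SelfImprovingSorter(n)
    sorter.train([draw() for _ in range(50)])
    print(sorter.sort(draw()))
```

Because the buckets are ordered by the learned boundaries, concatenating the sorted buckets always yields a sorted output; training only affects how evenly the elements spread across buckets.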
Efficient Algorithms for the Closest Pair Problem and Applications
The closest pair problem (CPP) is one of the well-studied and fundamental
problems in computing. Given a set of points in a metric space, the problem is
to identify the pair of closest points. Another closely related problem is the
fixed radius nearest neighbors problem (FRNNP). Given a set of points and a
radius, the problem is, for every input point, to identify all the other
input points that lie within that radius of it. A naive
deterministic algorithm can solve these problems in quadratic time. CPP as well
as FRNNP play a vital role in computational biology, computational finance,
share market analysis, weather prediction, entomology, electrocardiography,
N-body simulations, molecular simulations, etc. As a result, any improvements
made in solving CPP and FRNNP will have immediate implications for the solution
of numerous problems in these domains. We live in an era of big data, and
processing these data takes large amounts of time. Speeding up data processing
algorithms is thus more essential now than ever before. In this paper we
present algorithms for CPP and FRNNP that improve (in theory and/or practice)
the best-known algorithms reported in the literature for CPP and FRNNP. These
algorithms also improve the best-known algorithms for related applications
including time series motif mining and the two locus problem in Genome Wide
Association Studies (GWAS).
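The two problem statements above can be made concrete with a short sketch. It shows the naive quadratic closest-pair search mentioned in the abstract and a standard uniform-grid approach to fixed-radius neighbors in the plane. Neither is the algorithm proposed in the paper; the function names and the restriction to 2D Euclidean points are assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def closest_pair_naive(points):
    """Quadratic-time baseline: examine every pair of points."""
    return min(combinations(points, 2), key=lambda pq: math.dist(*pq))


def fixed_radius_neighbors(points, r):
    """For each point index, list the indices of all other points within r.

    Uses a uniform grid with cell side r, so any neighbor of a point lies
    in the point's own cell or in one of the 8 surrounding cells.
    """
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / r), math.floor(y / r))].append(i)

    result = {i: [] for i in range(len(points))}
    for i, (x, y) in enumerate(points):
        cx, cy = math.floor(x / r), math.floor(y / r)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if j != i and math.dist(points[i], points[j]) <= r:
                        result[i].append(j)
    return result


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (1.2, 0.1)]
    print(closest_pair_naive(pts))
    print(fixed_radius_neighbors(pts, 1.5))
```

When points are spread roughly evenly, each grid cell holds few points and the grid version avoids most of the pairwise distance checks that the naive approach performs.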
SAFE: Self-Attentive Function Embeddings for Binary Similarity
The binary similarity problem consists in determining if two functions are
similar by only considering their compiled form. Advanced techniques for binary
similarity recently gained momentum as they can be applied in several fields,
such as copyright disputes, malware analysis, vulnerability detection, etc.,
and thus have an immediate practical impact. Current solutions compare
functions by first transforming their binary code into multi-dimensional vector
representations (embeddings), and then comparing vectors through simple and
efficient geometric operations. However, embeddings are usually derived from
binary code using manual feature extraction, which may fail to consider
important function characteristics, or may consider features that are not
important for the binary similarity problem. In this paper we propose SAFE, a
novel architecture for the embedding of functions based on a self-attentive
neural network. SAFE works directly on disassembled binary functions, does not
require manual feature extraction, is computationally more efficient than
existing solutions (i.e., it does not incur the computational overhead of
building or manipulating control flow graphs), and is more general as it works
on stripped binaries and on multiple architectures. We report the results from
a quantitative and qualitative analysis that show how SAFE provides a
noticeable performance improvement with respect to previous solutions.
Furthermore, we show how clusters of our embedding vectors are closely related
to the semantics of the implemented algorithms, paving the way for further
interesting applications (e.g. semantic-based binary function search).
Comment: Published in International Conference on Detection of Intrusions and
Malware, and Vulnerability Assessment (DIMVA) 201
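The abstract says that embeddings are compared "through simple and efficient geometric operations". One common choice is cosine similarity, sketched below. Whether SAFE uses exactly this measure is an assumption here, and the function names and toy vectors are hypothetical; they are not outputs of the SAFE model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query, corpus):
    """Return the name in corpus whose embedding is closest to query."""
    return max(corpus, key=lambda name: cosine_similarity(query, corpus[name]))


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings", invented purely for illustration.
    corpus = {
        "func_a": [0.9, 0.1, 0.0],
        "func_b": [0.0, 1.0, 0.2],
    }
    print(most_similar([1.0, 0.0, 0.1], corpus))
```

Once every function is reduced to a fixed-length vector, a similarity query costs one dot product per candidate, with no need to build or match control flow graphs.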