Optimization by Record Dynamics
Large dynamical changes in thermalizing glassy systems are triggered by
trajectories crossing record sized barriers, a behavior revealing the presence
of a hierarchical structure in configuration space. The observation is here
turned into a novel local search optimization algorithm dubbed Record Dynamics
Optimization, or RDO. RDO uses the Metropolis rule to accept or reject
candidate solutions depending on the value of a parameter akin to the
temperature, and minimizes the cost function of the problem at hand through
cycles where its 'temperature' is raised and subsequently decreased in order to
expediently generate record high (and low) values of the cost function. Below,
RDO is introduced and then tested by searching the ground state of the
Edwards-Anderson spin-glass model, in two and three spatial dimensions. A
popular and highly efficient optimization algorithm, Parallel Tempering (PT), is
applied to the same problem as a benchmark. RDO and PT turn out to produce
solutions of similar quality for similar numerical effort, but RDO is simpler to
program and additionally yields geometrical information on the system's
configuration space which is of interest in many applications. In particular,
the effectiveness of RDO strongly indicates the presence of the above mentioned
hierarchically organized configuration space, with metastable regions indexed
by the cost (or energy) of the transition states connecting them. Comment: 14 pages, 12 figures
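The Metropolis acceptance rule mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the RDO algorithm itself: the function names (`metropolis_accept`, `energy`, `sweep`), the one-dimensional ring of +/-1 spins with random couplings, and the use of a single fixed temperature are all assumptions made for the example. The abstract does not give the schedule by which RDO raises and lowers its temperature, so none is shown here.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng):
    """Accept a move with probability min(1, exp(-delta_e / T))."""
    if delta_e <= 0:
        return True   # moves that do not raise the cost are always accepted
    if temperature <= 0:
        return False  # at zero temperature, uphill moves are never accepted
    return rng.random() < math.exp(-delta_e / temperature)

def energy(spins, couplings):
    """Energy of a ring of +/-1 spins: E = -sum_i J_i * s_i * s_{i+1}."""
    n = len(spins)
    return -sum(couplings[i] * spins[i] * spins[(i + 1) % n] for i in range(n))

def sweep(spins, couplings, temperature, rng):
    """Attempt len(spins) single-spin flips, updating `spins` in place."""
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        # Only the two bonds touching spin i change when it is flipped.
        left = couplings[(i - 1) % n] * spins[(i - 1) % n]
        right = couplings[i] * spins[(i + 1) % n]
        delta_e = 2 * spins[i] * (left + right)
        if metropolis_accept(delta_e, temperature, rng):
            spins[i] = -spins[i]
```

Calling `sweep` repeatedly while varying `temperature` gives a basic thermal local search; how the temperature should be cycled to generate record values of the cost is the subject of the paper and is left open in this sketch.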
Regular Expression Search on Compressed Text
We present an algorithm for searching regular expression matches in
compressed text. The algorithm reports the number of matching lines in the
uncompressed text in time linear in the size of its compressed version. We
define efficient data structures that yield nearly optimal complexity bounds
and provide a sequential implementation, zearch, that requires up to 25% less
time than the state of the art. Comment: 10 pages, published in Data Compression Conference (DCC'19)
Multi-engine packet classification hardware accelerator
As line rates increase, the task of designing high performance architectures with reduced power consumption for the processing of router traffic remains important. In this paper, we present a multi-engine packet classification hardware accelerator, which gives increased performance and reduced power consumption. It follows the basic idea of decision-tree based packet classification algorithms, such as HiCuts and HyperCuts, in which the hyperspace represented by the ruleset is recursively divided into smaller subspaces according to some heuristics. Each classification engine consists of a Trie Traverser, which is responsible for finding the leaf node corresponding to the incoming packet, and a Leaf Node Searcher, which reports the matching rule in the leaf node. The packet classification engine uses the ultra-wide memory words provided by FPGA block RAM to store the decision tree data structure, in an attempt to reduce the number of memory accesses needed for the classification. Since the clock rate of an individual engine cannot catch up to that of the internal memory, multiple classification engines are used to increase the throughput. The implementations in two different FPGAs show that this architecture can reach a searching speed of 169 million packets per second (mpps) with synthesized ACL, FW and IPC rulesets. Further analysis reveals that, compared to state-of-the-art TCAM solutions, power savings of up to 72% and an increase in throughput of up to 27% can be achieved.
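The two-stage lookup described above (traverse the decision tree to a leaf, then search the rules stored in that leaf) can be sketched in software as follows. This is only an illustration under stated assumptions: the names `Rule`, `Node`, `traverse`, `search_leaf`, `classify` and `build_example_tree` are hypothetical, the tree is built by hand instead of by the HiCuts or HyperCuts heuristics, packets have two 8-bit fields, and nothing here models the FPGA memory layout or the multi-engine arrangement.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Rule:
    rule_id: int
    ranges: Tuple[Tuple[int, int], ...]  # inclusive (low, high) per field

    def matches(self, packet) -> bool:
        return all(lo <= v <= hi for v, (lo, hi) in zip(packet, self.ranges))

@dataclass
class Node:
    # Internal node: cut field `dim` over [low, high] into equal-width slices,
    # one slice per child. A node with no children is a leaf.
    dim: int = 0
    low: int = 0
    high: int = 0
    children: List["Node"] = field(default_factory=list)
    rules: List[Rule] = field(default_factory=list)  # leaf rules, by priority

def traverse(root: Node, packet) -> Node:
    """Stage 1: follow the cuts down to the leaf covering the packet."""
    node = root
    while node.children:
        width = (node.high - node.low + 1) // len(node.children)
        index = min((packet[node.dim] - node.low) // width,
                    len(node.children) - 1)
        node = node.children[index]
    return node

def search_leaf(leaf: Node, packet) -> Optional[int]:
    """Stage 2: linear search of the leaf; the first match has top priority."""
    for rule in leaf.rules:
        if rule.matches(packet):
            return rule.rule_id
    return None

def classify(root: Node, packet) -> Optional[int]:
    return search_leaf(traverse(root, packet), packet)

def build_example_tree() -> Node:
    """Hand-built tree: one cut of field 0 into four slices of width 64."""
    r0 = Rule(0, ((0, 63), (0, 255)))
    r1 = Rule(1, ((128, 255), (0, 127)))
    r2 = Rule(2, ((0, 255), (0, 255)))  # default rule
    leaves = [Node(rules=[r0, r2]), Node(rules=[r2]),
              Node(rules=[r1, r2]), Node(rules=[r1, r2])]
    return Node(dim=0, low=0, high=255, children=leaves)
```

For example, `classify(build_example_tree(), (200, 100))` walks to the fourth slice of field 0 and returns rule 1, while `(200, 200)` falls through to the default rule 2.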
B-LOG: A branch and bound methodology for the parallel execution of logic programs
We propose a computational methodology, "B-LOG", which offers the potential for an effective implementation of Logic Programming on a parallel computer. We also propose a weighting scheme to guide the search process through the graph and we apply the concepts of parallel "branch and bound" algorithms in order to perform a "best-first" search using an information theoretic bound. The concept of "session" is used to speed up the search process in a succession of similar queries. Within a session, we strongly modify the bounds in a local database, while bounds kept in a global database are weakly modified to provide a better initial condition for other sessions. We also propose an implementation scheme based on a database machine using "semantic paging", and the "B-LOG processor" based on a scoreboard driven controller.
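A generic, sequential best-first branch and bound search of the kind the abstract builds on can be sketched as below. It is not the B-LOG method: the function name `best_first_branch_and_bound`, the `children` and `is_solution` callbacks, and the use of accumulated non-negative arc weights as the bound are assumptions for illustration. The information theoretic bound, the session mechanism and the parallel execution are not modelled, and the search space is assumed to be a finite tree.

```python
import heapq

def best_first_branch_and_bound(root, children, is_solution):
    """Return (cost, node) of the cheapest solution, or (inf, None).

    children(node) yields (weight, child) pairs with weight >= 0.
    The bound of a node is the sum of weights on the path from the root.
    """
    best_cost, best_node = float("inf"), None
    counter = 0  # tie-breaker so the heap never compares nodes directly
    frontier = [(0.0, counter, root)]
    while frontier:
        bound, _, node = heapq.heappop(frontier)
        if bound >= best_cost:
            break  # all remaining nodes have bounds at least this large
        if is_solution(node):
            best_cost, best_node = bound, node
            continue
        for weight, child in children(node):
            child_bound = bound + weight
            if child_bound < best_cost:  # prune branches that cannot improve
                counter += 1
                heapq.heappush(frontier, (child_bound, counter, child))
    return best_cost, best_node

# Hypothetical search tree: "q" is the query, "s1" and "s2" are solutions.
EXAMPLE_TREE = {
    "q": [(2.0, "a"), (1.0, "b")],
    "a": [(1.0, "s1")],
    "b": [(3.0, "s2"), (5.0, "c")],
}

def example_children(node):
    return EXAMPLE_TREE.get(node, [])

def example_is_solution(node):
    return node in ("s1", "s2")
```

On the example tree the search expands `q`, `b` and `a`, reaches `s1` at cost 3.0, and stops without expanding `c`, because every node left in the frontier has a bound of at least 4.0.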
Automating Fault Tolerance in High-Performance Computational Biological Jobs Using Multi-Agent Approaches
Background: Large-scale biological jobs on high-performance computing systems
require manual intervention if one or more computing cores on which they
execute fail. This places not only a cost on the maintenance of the job, but
also a cost on the time taken for reinstating the job and the risk of losing
data and execution accomplished by the job before it failed. Approaches which
can proactively detect computing core failures and take action to relocate the
computing core's job onto reliable cores can make a significant step towards
automating fault tolerance.
Method: This paper describes an experimental investigation into the use of
multi-agent approaches for fault tolerance. Two approaches are studied, the
first at the job level and the second at the core level. The approaches are
investigated for single core failure scenarios that can occur in the execution
of parallel reduction algorithms on computer clusters. A third approach is
proposed that incorporates multi-agent technology both at the job and core
level. Experiments are pursued in the context of genome searching, a popular
computational biology application.
Result: The key conclusion is that the approaches proposed are feasible for
automating fault tolerance in high-performance computing systems with minimal
human intervention. In a typical experiment in which the fault tolerance is
studied, centralised and decentralised checkpointing approaches on average
add 90% to the actual time for executing the job. On the other hand, in the
same experiment the multi-agent approaches add only 10% to the overall
execution time. Comment: Computers in Biology and Medicine
A trivariate interpolation algorithm using a cube-partition searching procedure
In this paper we propose a fast algorithm for trivariate interpolation, which
is based on the partition of unity method for constructing a global interpolant
by blending local radial basis function interpolants and using locally
supported weight functions. The partition of unity algorithm is efficiently
implemented and optimized by connecting the method with an effective
cube-partition searching procedure. More precisely, we construct a cube
structure, which partitions the domain and strictly depends on the size of its
subdomains, so that the new searching procedure and, accordingly, the resulting
algorithm enable us to efficiently deal with a large number of nodes.
Complexity analysis and numerical experiments show high efficiency and accuracy
of the proposed interpolation algorithm.
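The idea of tying a cube partition to the subdomain size can be sketched as follows. This is a hedged illustration of a standard cell-list search and not the authors' procedure: the names `build_cubes` and `neighbours_within` are hypothetical, and the sketch assumes that the cube side is at least the subdomain radius, so that every node within that radius of a centre lies in the centre's cube or in one of its 26 neighbours.

```python
import math
from collections import defaultdict
from itertools import product

def cube_index(point, side):
    """Integer (i, j, k) index of the cube of edge `side` containing `point`."""
    return tuple(int(math.floor(c / side)) for c in point)

def build_cubes(points, side):
    """Map each cube index to the list of indices of the points inside it."""
    cubes = defaultdict(list)
    for idx, p in enumerate(points):
        cubes[cube_index(p, side)].append(idx)
    return cubes

def neighbours_within(points, cubes, side, centre, radius):
    """Sorted indices of points within `radius` of `centre`.

    Requires radius <= side, so only the 27 cubes around the centre
    need to be examined instead of the whole point set.
    """
    if radius > side:
        raise ValueError("radius must not exceed the cube side")
    home = cube_index(centre, side)
    found = []
    for offset in product((-1, 0, 1), repeat=3):
        key = tuple(h + o for h, o in zip(home, offset))
        for idx in cubes.get(key, ()):
            if math.dist(points[idx], centre) <= radius:
                found.append(idx)
    return sorted(found)
```

With roughly uniform nodes, each query touches only the points in 27 cubes, which is what keeps the cost per subdomain independent of the total number of nodes in this kind of scheme.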