241,966 research outputs found

    Optimization by Record Dynamics

    Full text link
    Large dynamical changes in thermalizing glassy systems are triggered by trajectories crossing record-sized barriers, a behavior revealing the presence of a hierarchical structure in configuration space. This observation is here turned into a novel local search optimization algorithm dubbed Record Dynamics Optimization, or RDO. RDO uses the Metropolis rule to accept or reject candidate solutions depending on the value of a parameter akin to a temperature, and minimizes the cost function of the problem at hand through cycles in which this 'temperature' is raised and subsequently decreased in order to expediently generate record high (and low) values of the cost function. Below, RDO is introduced and then tested by searching for the ground state of the Edwards-Anderson spin-glass model in two and three spatial dimensions. Parallel Tempering (PT), a popular and highly efficient optimization algorithm, is applied to the same problem as a benchmark. RDO and PT turn out to produce solutions of similar quality for similar numerical effort, but RDO is simpler to program and additionally yields geometrical information on the system's configuration space, which is of interest in many applications. In particular, the effectiveness of RDO strongly indicates the presence of the above-mentioned hierarchically organized configuration space, with metastable regions indexed by the cost (or energy) of the transition states connecting them. Comment: 14 pages, 12 figures.
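
    The cycling protocol described above is straightforward to prototype. The sketch below is a minimal, illustrative Python rendition: the linear temperature ramps, the leg lengths, and the generic cost/neighbor interface are our assumptions for demonstration, not the paper's exact protocol.

```python
import math
import random

def rdo_minimize(cost, neighbor, x0, t_low=0.1, t_high=2.0,
                 n_cycles=100, steps_per_leg=500, seed=0):
    """Illustrative sketch of a Record-Dynamics-style local search.

    cost(x) evaluates a candidate solution; neighbor(x, rng) proposes a
    small random modification. The schedule below (linear ramps, fixed
    leg length) is an assumption, not the published protocol.
    """
    rng = random.Random(seed)
    x, e = x0, cost(x0)
    best_x, best_e = x, e
    for _ in range(n_cycles):
        # One cycle: ramp the temperature up, then back down, so the
        # walker first climbs over record-high barriers and then relaxes
        # into (possibly new) record-low cost states.
        legs = ([t_low + (t_high - t_low) * i / steps_per_leg
                 for i in range(steps_per_leg)] +
                [t_high - (t_high - t_low) * i / steps_per_leg
                 for i in range(steps_per_leg)])
        for t in legs:
            y = neighbor(x, rng)
            de = cost(y) - e
            # Standard Metropolis acceptance at temperature t.
            if de <= 0 or rng.random() < math.exp(-de / t):
                x, e = y, e + de
                if e < best_e:
                    best_x, best_e = x, e
    return best_x, best_e
```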

    Regular Expression Search on Compressed Text

    Full text link
    We present an algorithm for searching for regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data structures that yield nearly optimal complexity bounds and provide a sequential implementation, zearch, which requires up to 25% less time than the state of the art. Comment: 10 pages, published in the Data Compression Conference (DCC'19).
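
    The core trick behind linear-time scanning of compressed text can be sketched as follows: if the text is given as a grammar (a straight-line program), a DFA can be folded over it by composing per-rule transition functions, touching each grammar rule only once. The Python sketch below shows only that composition step; zearch's actual data structures, which additionally count matching lines, are richer, and all names here are ours.

```python
def compose(f, g):
    # f, g: DFA transition functions as tuples (state -> state).
    # Returns "apply f, then g".
    return tuple(g[f[q]] for q in range(len(f)))

def run_on_slp(rules, root, char_fn):
    """Fold a DFA over grammar-compressed text.

    rules maps a nonterminal either to a terminal character or to a
    pair (left, right) of nonterminals; char_fn(c) gives the DFA's
    transition function on character c. Memoisation makes the total
    work linear in the number of grammar rules.
    """
    memo = {}
    def visit(sym):
        if sym not in memo:
            body = rules[sym]
            if isinstance(body, str):              # terminal rule
                memo[sym] = char_fn(body)
            else:                                  # binary rule X -> Y Z
                memo[sym] = compose(visit(body[0]), visit(body[1]))
        return memo[sym]
    return visit(root)

# Toy usage: the text "abab" as an SLP, with a 3-state DFA tracking
# progress towards the pattern "ab" (0 = start, 1 = saw 'a', 2 = saw "ab").
rules = {"A": "a", "B": "b", "X": ("A", "B"), "S": ("X", "X")}
delta = {"a": (1, 1, 1), "b": (0, 2, 0)}
print(run_on_slp(rules, "S", delta.__getitem__)[0])   # prints 2
```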

    Multi-engine packet classification hardware accelerator

    Get PDF
    As line rates increase, designing high-performance architectures with reduced power consumption for the processing of router traffic remains an important task. In this paper, we present a multi-engine packet classification hardware accelerator, which gives increased performance and reduced power consumption. It follows the basic idea of decision-tree-based packet classification algorithms, such as HiCuts and HyperCuts, in which the hyperspace represented by the ruleset is recursively divided into smaller subspaces according to some heuristics. Each classification engine consists of a Trie Traverser, which is responsible for finding the leaf node corresponding to the incoming packet, and a Leaf Node Searcher, which reports the matching rule in the leaf node. The packet classification engine exploits the ultra-wide memory words provided by FPGA block RAM to store the decision tree data structure, in an attempt to reduce the number of memory accesses needed for classification. Since the clock rate of an individual engine cannot match that of the internal memory, multiple classification engines are used to increase throughput. Implementations on two different FPGAs show that this architecture can reach a searching speed of 169 million packets per second (Mpps) with synthesized ACL, FW, and IPC rulesets. Further analysis reveals that, compared to state-of-the-art TCAM solutions, a power savings of up to 72% and an increase in throughput of up to 27% can be achieved.
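
    For readers unfamiliar with HiCuts/HyperCuts-style classification, the following software sketch mirrors the Trie Traverser / Leaf Node Searcher split in Python. The fixed cut count, the round-robin choice of cut dimension, and the leaf threshold are simplifying assumptions of ours; the hardware design above makes these choices with its own heuristics.

```python
BINTH = 4   # max rules per leaf before we stop cutting (assumed threshold)
CUTS = 4    # equal-sized cuts per internal node (assumed heuristic)

class Node:
    """Decision-tree node over rules given as per-field (lo, hi) ranges."""
    def __init__(self, rules, bounds, depth=0):
        self.rules, self.children = rules, None
        if len(rules) > BINTH and depth < 16:
            dim = depth % len(bounds)            # round-robin cut dimension
            lo, hi = bounds[dim]
            self.dim, self.lo, self.step = dim, lo, (hi - lo) / CUTS
            self.children = []
            for i in range(CUTS):
                sub = (lo + i * self.step, lo + (i + 1) * self.step)
                kept = [r for r in rules
                        if r[dim][0] <= sub[1] and r[dim][1] >= sub[0]]
                nb = list(bounds)
                nb[dim] = sub
                self.children.append(Node(kept, nb, depth + 1))

    def classify(self, pkt):
        node = self
        while node.children:                     # "Trie Traverser"
            i = min(int((pkt[node.dim] - node.lo) / node.step), CUTS - 1)
            node = node.children[i]
        for r in node.rules:                     # "Leaf Node Searcher"
            if all(f[0] <= v <= f[1] for v, f in zip(pkt, r)):
                return r                         # rules assumed priority-ordered
        return None
```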

    B-LOG: A branch and bound methodology for the parallel execution of logic programs

    Get PDF
    We propose a computational methodology, "B-LOG", which offers the potential for an effective implementation of Logic Programming on a parallel computer. We also propose a weighting scheme to guide the search process through the graph, and we apply the concepts of parallel "branch and bound" algorithms in order to perform a "best-first" search using an information-theoretic bound. The concept of a "session" is used to speed up the search process over a succession of similar queries. Within a session, we strongly modify the bounds in a local database, while bounds kept in a global database are weakly modified to provide a better initial condition for other sessions. We also propose an implementation scheme based on a database machine using "semantic paging", and a "B-LOG processor" based on a scoreboard-driven controller.
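
    Stripped of the session machinery and the logic-programming specifics, the best-first branch-and-bound core can be sketched generically. The priority-queue formulation and the bound interface below are our assumptions; the paper's information-theoretic weighting is only stood in for by the bound() callback.

```python
import heapq

def best_first_search(start, expand, is_goal, bound):
    """Generic best-first branch and bound.

    expand(node) yields child nodes; bound(node) is a heuristic merit
    estimate (lower is better), a stand-in for B-LOG's
    information-theoretic weights.
    """
    frontier = [(bound(start), 0, start)]
    tick = 1                       # tie-breaker so heapq never compares nodes
    seen = set()
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if is_goal(node):
            return node
        if node in seen:
            continue
        seen.add(node)
        for child in expand(node):
            heapq.heappush(frontier, (bound(child), tick, child))
            tick += 1
    return None                    # search space exhausted
```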

    Automating Fault Tolerance in High-Performance Computational Biological Jobs Using Multi-Agent Approaches

    Get PDF
    Background: Large-scale biological jobs on high-performance computing systems require manual intervention if one or more of the computing cores on which they execute fail. This imposes not only the cost of maintaining the job, but also the cost of the time taken to reinstate it and the risk of losing data and work accomplished by the job before it failed. Approaches that can proactively detect computing core failures and take action to relocate a core's job onto reliable cores can make a significant step towards automating fault tolerance. Method: This paper describes an experimental investigation into the use of multi-agent approaches for fault tolerance. Two approaches are studied, the first at the job level and the second at the core level. The approaches are investigated for single-core failure scenarios that can occur in the execution of parallel reduction algorithms on computer clusters. A third approach is proposed that incorporates multi-agent technology at both the job and core levels. Experiments are pursued in the context of genome searching, a popular computational biology application. Result: The key conclusion is that the proposed approaches are feasible for automating fault tolerance in high-performance computing systems with minimal human intervention. In a typical experiment in which fault tolerance is studied, centralised and decentralised checkpointing approaches on average add 90% to the actual time for executing the job, whereas in the same experiment the multi-agent approaches add only 10% to the overall execution time. Comment: Computers in Biology and Medicine.
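
    The core-level idea, proactive detection followed by relocation rather than a restart, reduces to a small watchdog loop. The heartbeat/relocate interface below is a hypothetical stand-in for the paper's agent machinery, included only to make the control flow concrete.

```python
import time

def watchdog(cores, heartbeat, relocate, rounds=60, interval=1.0, timeout=3.0):
    """Toy core-level monitor in the spirit of the core-level agents.

    heartbeat(core) -> timestamp of the core's last sign of life;
    relocate(core)  -> move the core's task onto a reliable core.
    Both callbacks are assumptions for illustration.
    """
    for _ in range(rounds):
        now = time.time()
        for core in cores:
            if now - heartbeat(core) > timeout:
                relocate(core)     # act before the whole job must be restarted
        time.sleep(interval)
```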

    A trivariate interpolation algorithm using a cube-partition searching procedure

    Get PDF
    In this paper we propose a fast algorithm for trivariate interpolation, based on the partition of unity method for constructing a global interpolant by blending local radial basis function interpolants and using locally supported weight functions. The partition of unity algorithm is efficiently implemented and optimized by coupling the method with an effective cube-partition searching procedure. More precisely, we construct a cube structure that partitions the domain and strictly depends on the size of its subdomains, so that the new searching procedure and, accordingly, the resulting algorithm enable us to deal efficiently with a large number of nodes. Complexity analysis and numerical experiments show the high efficiency and accuracy of the proposed interpolation algorithm.
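
    The cube-partition search itself is simple to sketch. The Python fragment below indexes nodes by cube, gathers candidates from the 27 surrounding cubes, and fits a single local Wendland C2 RBF interpolant per query. This simplification stands in for the paper's full partition-of-unity blend of subdomain interpolants; every name and constant here is our assumption.

```python
import numpy as np

def cube_index(pts, origin, h):
    # Integer cube coordinates of 3-D points for cubes of side h.
    return np.floor((pts - origin) / h).astype(int)

def build_cells(nodes, origin, h):
    # Hash map: cube coordinate -> indices of the nodes it contains.
    cells = {}
    for i, c in enumerate(map(tuple, cube_index(nodes, origin, h))):
        cells.setdefault(c, []).append(i)
    return cells

def neighbours(cells, c):
    # Node indices in the 3 x 3 x 3 block of cubes around cell c.
    return [i for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
            for i in cells.get((c[0] + dx, c[1] + dy, c[2] + dz), [])]

def wendland(r):
    # Wendland C2 function, compactly supported on [0, 1].
    return np.where(r < 1, (1 - r) ** 4 * (4 * r + 1), 0.0)

def interpolate(x, nodes, values, cells, origin, h):
    idx = neighbours(cells, tuple(cube_index(x[None, :], origin, h)[0]))
    if not idx:
        return np.nan                      # no nodes near the query point
    pts, f = nodes[idx], values[idx]
    # Local RBF fit over the cube neighbourhood (support radius 2h).
    r = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1) / (2 * h)
    coef = np.linalg.solve(wendland(r) + 1e-12 * np.eye(len(idx)), f)
    return wendland(np.linalg.norm(x - pts, axis=-1) / (2 * h)) @ coef
```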