2,072 research outputs found
Conflict-free star-access in parallel memory systems
We study conflict-free data distribution schemes in parallel memories in multiprocessor system architectures. Given a host graph G, the problem is to map the nodes of G into memory modules such that any instance of a template type T in G can be accessed without memory conflicts. A conflict occurs if two or more nodes of T are mapped to the same memory module. The mapping algorithm should: (i) be fast in terms of data access (possibly mapping each node in constant time); (ii) minimize the required number of memory modules for accessing any instance in G of the given template type; and (iii) guarantee load balancing on the modules. In this paper, we consider conflict-free access to star templates. i.e., to any node of G along with all of its neighbors. Such a template type arises in many classical algorithms like breadth-first search in a graph, message broadcasting in networks, and nearest neighbor based approximation in numerical computation. We consider the star-template access problem on two specific host graphs-tori and hypercubes-that are also popular interconnection network topologies. The proposed conflict-free mappings on these graphs are fast, use an optimal or provably good number of memory modules, and guarantee load balancing. (C) 2006 Elsevier Inc. All rights reserved
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU
A tabu search heuristic for the Equitable Coloring Problem
The Equitable Coloring Problem is a variant of the Graph Coloring Problem
where the sizes of two arbitrary color classes differ in at most one unit. This
additional condition, called equity constraints, arises naturally in several
applications. Due to the hardness of the problem, current exact algorithms can
not solve large-sized instances. Such instances must be addressed only via
heuristic methods. In this paper we present a tabu search heuristic for the
Equitable Coloring Problem. This algorithm is an adaptation of the dynamic
TabuCol version of Galinier and Hao. In order to satisfy equity constraints,
new local search criteria are given. Computational experiments are carried out
in order to find the best combination of parameters involved in the dynamic
tenure of the heuristic. Finally, we show the good performance of our heuristic
over known benchmark instances
Some Optimally Adaptive Parallel Graph Algorithms on EREW PRAM Model
The study of graph algorithms is an important area of research in computer science, since graphs offer useful tools to model many real-world situations. The commercial availability of parallel computers have led to the development of efficient parallel graph algorithms.
Using an exclusive-read and exclusive-write (EREW) parallel random access machine (PRAM) as the computation model with a fixed number of processors, we design and analyze parallel algorithms for seven undirected graph problems, such as, connected components, spanning forest, fundamental cycle set, bridges, bipartiteness, assignment problems, and approximate vertex coloring. For all but the last two problems, the input data structure is an unordered list of edges, and divide-and-conquer is the paradigm for designing algorithms. One of the algorithms to solve the assignment problem makes use of an appropriate variant of dynamic programming strategy. An elegant data structure, called the adjacency list matrix, used in a vertex-coloring algorithm avoids the sequential nature of linked adjacency lists.
Each of the proposed algorithms achieves optimal speedup, choosing an optimal granularity (thus exploiting maximum parallelism) which depends on the density or the number of vertices of the given graph. The processor-(time)2 product has been identified as a useful parameter to measure the cost-effectiveness of a parallel algorithm. We derive a lower bound on this measure for each of our algorithms
Massively parallel hybrid search for the partial Latin square extension problem
The partial Latin square extension problem is to fill as many as possible
empty cells of a partially filled Latin square. This problem is a useful model
for a wide range of relevant applications in diverse domains. This paper
presents the first massively parallel hybrid search algorithm for this
computationally challenging problem based on a transformation of the problem to
partial graph coloring. The algorithm features the following original elements.
Based on a very large population (with more than individuals) and modern
graphical processing units, the algorithm performs many local searches in
parallel to ensure an intensified exploitation of the search space. It employs
a dedicated crossover with a specific parent matching strategy to create a
large number of diversified and information-preserving offspring at each
generation. Extensive experiments on 1800 benchmark instances show a high
competitiveness of the algorithm compared with the current best performing
methods. Competitive results are also reported on the related Latin square
completion problem. Analyses are performed to shed lights on the understanding
of the main algorithmic components. The code of the algorithm will be made
publicly available
Expanding the expressive power of Monadic Second-Order logic on restricted graph classes
We combine integer linear programming and recent advances in Monadic
Second-Order model checking to obtain two new algorithmic meta-theorems for
graphs of bounded vertex-cover. The first shows that cardMSO1, an extension of
the well-known Monadic Second-Order logic by the addition of cardinality
constraints, can be solved in FPT time parameterized by vertex cover. The
second meta-theorem shows that the MSO partitioning problems introduced by Rao
can also be solved in FPT time with the same parameter. The significance of our
contribution stems from the fact that these formalisms can describe problems
which are W[1]-hard and even NP-hard on graphs of bounded tree-width.
Additionally, our algorithms have only an elementary dependence on the
parameter and formula. We also show that both results are easily extended from
vertex cover to neighborhood diversity.Comment: Accepted for IWOCA 201
A composable approach to design of newer techniques for large-scale denial-of-service attack attribution
Since its early days, the Internet has witnessed not only a phenomenal growth, but also a large number of security attacks, and in recent years, denial-of-service (DoS) attacks have emerged as one of the top threats. The stateless and destination-oriented Internet routing combined with the ability to harness a large number of compromised machines and the relative ease and low costs of launching such attacks has made this a hard problem to address. Additionally, the myriad requirements of scalability, incremental deployment, adequate user privacy protections, and appropriate economic incentives has further complicated the design of DDoS defense mechanisms. While the many research proposals to date have focussed differently on prevention, mitigation, or traceback of DDoS attacks, the lack of a comprehensive approach satisfying the different design criteria for successful attack attribution is indeed disturbing.
Our first contribution here has been the design of a composable data model that has helped us represent the various dimensions of the attack attribution problem, particularly the performance attributes of accuracy, effectiveness, speed and overhead, as orthogonal and mutually independent design considerations. We have then designed custom optimizations along each of these dimensions, and have further integrated them into a single composite model, to provide strong performance guarantees. Thus, the proposed model has given us a single framework that can not only address the individual shortcomings of the various known attack attribution techniques, but also provide a more wholesome counter-measure against DDoS attacks.
Our second contribution here has been a concrete implementation based on the proposed composable data model, having adopted a graph-theoretic approach to identify and subsequently stitch together individual edge fragments in the Internet graph to reveal the true routing path of any network data packet. The proposed approach has been analyzed through theoretical and experimental evaluation across multiple metrics, including scalability, incremental deployment, speed and efficiency of the distributed algorithm, and finally the total overhead associated with its deployment. We have thereby shown that it is realistically feasible to provide strong performance and scalability guarantees for Internet-wide attack attribution.
Our third contribution here has further advanced the state of the art by directly identifying individual path fragments in the Internet graph, having adopted a distributed divide-and-conquer approach employing simple recurrence relations as individual building blocks. A detailed analysis of the proposed approach on real-life Internet topologies with respect to network storage and traffic overhead, has provided a more realistic characterization. Thus, not only does the proposed approach lend well for simplified operations at scale but can also provide robust network-wide performance and security guarantees for Internet-wide attack attribution.
Our final contribution here has introduced the notion of anonymity in the overall attack attribution process to significantly broaden its scope. The highly invasive nature of wide-spread data gathering for network traceback continues to violate one of the key principles of Internet use today - the ability to stay anonymous and operate freely without retribution. In this regard, we have successfully reconciled these mutually divergent requirements to make it not only economically feasible and politically viable but also socially acceptable.
This work opens up several directions for future research - analysis of existing attack attribution techniques to identify further scope for improvements, incorporation of newer attributes into the design framework of the composable data model abstraction, and finally design of newer attack attribution techniques that comprehensively integrate the various attack prevention, mitigation and traceback techniques in an efficient manner
- …