An Iterative Vertex Enumeration Method for Objective Space Based Vector Optimization Algorithms
An application area of the vertex enumeration problem (VEP) is its use within
objective space based linear/convex vector optimization algorithms whose aim
is to generate (an approximation of) the Pareto frontier. In such algorithms,
the VEP, which is defined in the objective space, is solved in each iteration
and has a special structure: the recession cone of the polyhedron to be
generated is the ordering cone. We consider and give a detailed description
of a vertex enumeration procedure, which iterates by calling a modified
'double description (DD) method' that works for such unbounded polyhedra. We
employ this procedure as a subroutine of an existing objective space based
vector optimization algorithm (Algorithm 1), and test its performance on
randomly generated linear multiobjective optimization problems. We compare
the efficiency of this procedure with another existing DD method as well as
with the current vertex enumeration subroutine of Algorithm 1. We observe that
the modified procedure outperforms the others, especially as the dimension of
the vertex enumeration problem (the number of objectives of the corresponding
multiobjective problem) increases.
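To make the underlying problem concrete, the following is a minimal brute-force sketch of vertex enumeration for a bounded polyhedron {x : Ax <= b}. It is not the paper's modified double description method (which handles unbounded polyhedra with a prescribed recession cone); it only illustrates the VEP itself: every vertex is the solution of d linearly independent active constraints that also satisfies the remaining inequalities. The function name and tolerances are illustrative choices, not from the paper.

```python
# Brute-force vertex enumeration for a bounded polyhedron {x : A x <= b}.
# Illustrative only; real solvers use incremental methods such as the
# double description (DD) method rather than trying all constraint subsets.
from itertools import combinations
import numpy as np

def enumerate_vertices(A, b, tol=1e-9):
    A, b = np.asarray(A, float), np.asarray(b, float)
    m, d = A.shape
    vertices = []
    for idx in combinations(range(m), d):
        sub_A, sub_b = A[list(idx)], b[list(idx)]
        if abs(np.linalg.det(sub_A)) < tol:
            continue  # chosen constraints are not linearly independent
        x = np.linalg.solve(sub_A, sub_b)  # candidate: intersection point
        if np.all(A @ x <= b + tol):  # keep it only if globally feasible
            if not any(np.allclose(x, v, atol=1e-7) for v in vertices):
                vertices.append(x)
    return vertices

# Unit square: 0 <= x <= 1, 0 <= y <= 1
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]
verts = enumerate_vertices(A, b)  # the four corners of the unit square
```

The cost grows combinatorially in the number of constraints, which is precisely why iterative methods like the DD variant described above matter in practice.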
Converting between quadrilateral and standard solution sets in normal surface theory
The enumeration of normal surfaces is a crucial but very slow operation in
algorithmic 3-manifold topology. At the heart of this operation is a polytope
vertex enumeration in a high-dimensional space (standard coordinates).
Tollefson's Q-theory speeds up this operation by using a much smaller space
(quadrilateral coordinates), at the cost of a reduced solution set that might
not always be sufficient for our needs. In this paper we present algorithms for
converting between solution sets in quadrilateral and standard coordinates. As
a consequence we obtain a new algorithm for enumerating all standard vertex
normal surfaces, yielding both the speed of quadrilateral coordinates and the
wider applicability of standard coordinates. Experimentation with the software
package Regina shows this new algorithm to be extremely fast in practice,
improving speed for large cases by factors from thousands up to millions.
Comment: 55 pages, 10 figures; v2: minor fixes only, plus a reformat for the
journal style
High Performance Large Graph Analytics by Enhancing Locality
Graphs are widely used in a variety of domains for representing entities and their relationship to each other. Graph analytics helps to understand, detect, extract and visualize insightful relationships between different entities. Graph analytics has a wide range of applications in various domains including computational biology, commerce, intelligence, health care and transportation. The breadth of problems that require large graph analytics is growing rapidly resulting in a need for fast and efficient graph processing.
One of the major challenges in graph processing is poor locality of reference. Locality of reference refers to the phenomenon of frequently accessing the same memory location or adjacent memory locations. Applications with poor data locality reduce the effectiveness of the cache memory. They result in a large number of cache misses, requiring access to high-latency main memory. Therefore, it is essential to have good locality for good performance. Most graph processing applications have highly random memory access patterns. Coupled with the current large sizes of the graphs, this results in poor cache utilization. Additionally, the computation-to-data-access ratio in many graph processing applications is very low, making it difficult to hide the memory latency with computation. It is also challenging to efficiently parallelize most graph applications. Many real-world graphs have unbalanced degree distributions, and it is difficult to achieve a balanced workload for such graphs. The parallelism in graph applications is generally fine-grained in nature, which calls for efficient synchronization and communication between the processing units.
Techniques for enhancing locality have been well studied in the context of regular applications like linear algebra. Those techniques are in most cases not applicable to graph problems. In this dissertation, we propose two techniques for enhancing locality in graph algorithms: access transformation and task-set reduction. Access transformation improves spatial locality by changing a random access pattern into sequential access; it is applicable to iterative algorithms that process random vertices/edges in each iteration. The task-set reduction technique enhances temporal locality; it is applicable to algorithms that repeatedly access the same data to perform a certain task. Using the two techniques, we propose novel algorithms for three graph problems: k-core decomposition, maximal clique enumeration and triangle listing. We have implemented the algorithms, and the results show that they provide significant improvements in performance and also scale well.
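As a point of reference for one of the three problems named above, here is a minimal k-core decomposition by the standard peeling algorithm in plain Python. This is the textbook baseline, not the dissertation's locality-enhanced variant; its repeated random accesses to vertex degrees and neighbor lists are exactly the kind of pattern the proposed techniques target. The function name and adjacency-dict representation are illustrative assumptions.

```python
# Baseline k-core decomposition by peeling: repeatedly remove a vertex of
# minimum remaining degree. The core number of a vertex is the largest k
# such that it belongs to a subgraph where every vertex has degree >= k.
def core_numbers(adj):
    deg = {v: len(ns) for v, ns in adj.items()}  # remaining degrees
    core = {}
    remaining = set(adj)
    k = 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)  # random-access hot spot
        k = max(k, deg[v])  # core values are non-decreasing during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:  # decrement neighbors still in the graph
            if u in remaining:
                deg[u] -= 1
    return core

# Triangle 0-1-2 with pendant vertex 3 attached to 2:
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
cores = core_numbers(adj)  # triangle vertices get core 2, the pendant gets 1
```

Note that both the `min` scan and the neighbor-degree updates touch scattered memory locations, which is why peeling benefits from the access-transformation and task-set-reduction ideas described above.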
A 2D Parallel Triangle Counting Algorithm for Distributed-Memory Architectures
Triangle counting is a fundamental graph analytic operation that is used
extensively in network science and graph mining. As the size of the graphs that
needs to be analyzed continues to grow, there is a requirement in developing
scalable algorithms for distributed-memory parallel systems. To this end, we
present a distributed-memory triangle counting algorithm, which uses a 2D
cyclic decomposition to balance the computations and reduce the communication
overheads. The algorithm structures its communication and computational steps
such that it reduces its memory overhead and includes key optimizations that
leverage the sparsity of the graph and the way the computations are structured.
Experiments on synthetic and real-world graphs show that our algorithm obtains
an average relative speedup that ranges between 3.24 and 7.22 out of 10.56
across the datasets when using 169 MPI ranks, over the performance achieved by
16 MPI ranks. Moreover, we obtain an average speedup of 10.2 times in
comparison with previously developed distributed-memory parallel algorithms.
Comment: 10 pages, 3 figures, 48th International Conference on Parallel
Processing
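For context, the core operation that such distributed 2D algorithms decompose and balance is plain triangle counting. Below is a minimal serial sketch using the common set-intersection formulation; it is not the paper's 2D cyclic distributed algorithm, and the edge-orientation rule (low id to high id) is an illustrative simplification of the usual degree-based ordering.

```python
# Serial triangle counting: orient each edge from lower to higher vertex id,
# then count, for every directed edge (u, v), the common out-neighbors of
# u and v. Each triangle is counted exactly once under this orientation.
def count_triangles(edges):
    out = {}
    for u, v in edges:
        a, b = (u, v) if u < v else (v, u)  # orient edge low -> high id
        out.setdefault(a, set()).add(b)
        out.setdefault(b, set())  # ensure every vertex has an entry
    total = 0
    for u, nbrs in out.items():
        for v in nbrs:
            total += len(nbrs & out[v])  # common out-neighbors close triangles
    return total

# Complete graph K4 contains exactly 4 triangles:
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
t = count_triangles(k4)  # 4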