GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
From social networks to language modeling, the growing scale and importance
of graph data has driven the development of numerous new graph-parallel systems
(e.g., Pregel, GraphLab). By restricting the computation that can be expressed
and introducing new techniques to partition and distribute the graph, these
systems can efficiently execute iterative graph algorithms orders of magnitude
faster than more general data-parallel systems. However, the same restrictions
that enable the performance gains also make it difficult to express many of the
important stages in a typical graph-analytics pipeline: constructing the graph,
modifying its structure, or expressing computation that spans multiple graphs.
As a consequence, existing graph analytics pipelines compose graph-parallel and
data-parallel systems using external storage systems, leading to extensive data
movement and a complicated programming model.
To address these challenges we introduce GraphX, a distributed graph
computation framework that unifies graph-parallel and data-parallel
computation. GraphX provides a small, core set of graph-parallel operators
expressive enough to implement the Pregel and PowerGraph abstractions, yet
simple enough to be cast in relational algebra. GraphX uses a collection of
query optimization techniques such as automatic join rewrites to efficiently
implement these graph-parallel operators. We evaluate GraphX on real-world
graphs and workloads and demonstrate that GraphX achieves performance
comparable to that of specialized graph computation systems, while outperforming
them in end-to-end graph pipelines. Moreover, GraphX achieves a balance between
expressiveness, performance, and ease of use.
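To illustrate how a Pregel-style vertex program reduces to a join step and a group-by step, here is a minimal single-machine sketch in Python. It is not the GraphX API: the function `pregel` and its parameters `vprog`, `send_msg`, and `merge_msg` are hypothetical names chosen for this example, and the shortest-path workload is an assumed toy input.

```python
import math


def pregel(vertices, edges, initial_msg, vprog, send_msg, merge_msg, max_iters=50):
    """Hypothetical single-machine Pregel loop (illustration only).

    vertices: dict mapping vertex id -> initial value
    edges:    list of (src, dst, weight) triples
    """
    # Superstep 0: every vertex receives the initial message.
    state = {v: vprog(v, val, initial_msg) for v, val in vertices.items()}
    active = set(state)
    for _ in range(max_iters):
        inbox = {}
        # "Join": pair each edge with the current values of its endpoints
        # and let the active source emit messages.
        for src, dst, w in edges:
            if src not in active:
                continue
            for target, msg in send_msg(src, state[src], dst, state[dst], w):
                # "Group by" destination, aggregating with merge_msg.
                inbox[target] = merge_msg(inbox[target], msg) if target in inbox else msg
        if not inbox:
            break  # no messages in flight: the computation has converged
        active = set()
        for v, msg in inbox.items():
            new_val = vprog(v, state[v], msg)
            if new_val != state[v]:
                state[v] = new_val
                active.add(v)  # only changed vertices send in the next superstep
    return state


def sssp_send(src, src_dist, dst, dst_dist, w):
    """Emit a message only if the path through src improves dst's distance."""
    if src_dist + w < dst_dist:
        return [(dst, src_dist + w)]
    return []


# Assumed toy input: single-source shortest paths from vertex "a".
vertices = {"a": 0.0, "b": math.inf, "c": math.inf, "d": math.inf}
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
dist = pregel(vertices, edges, math.inf, lambda v, d, m: min(d, m), sssp_send, min)
print(dist)
```

In this sketch the message-sending loop plays the role of a join between the edge list and the vertex values, and the `inbox` dictionary plays the role of a group-by aggregate keyed on the destination vertex. A distributed system would run the same two steps over partitioned collections.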
Adjusting process count on demand for petascale global optimization
There are many challenges that need to be met before efficient and reliable computation at the
petascale is possible. Many scientific and engineering codes running at the petascale are likely to
be memory intensive, which makes thrashing a serious problem for many petascale applications.
One way to overcome this challenge is to use a dynamic number of processes, so that the total
amount of memory available for the computation can be increased on demand. This paper
describes modifications made to the massively parallel global optimization code pVTdirect in
order to allow for a dynamic number of processes. In particular, the modified version of the
code monitors memory use and spawns new processes if the amount of available memory is
determined to be insufficient. The primary design challenges are discussed, and performance
results are presented and analyzed.
Efficient Global Optimization of Actuator Based on A Surrogate Model Assisted Hybrid Algorithm
Computationally expensive numerical techniques are often involved in the actuator design optimization process, where efficiency is a major issue. Although surrogate-based optimization is a promising solution, the challenge to the optimization efficiency is still considerable. Aiming to address this challenge, a new method, called the parallel adjoint sensitivity and Gaussian process assisted hybrid optimization technique (PAGHO), is presented. The central concept is a new optimization framework employing computationally cheap partial derivatives obtained by the adjoint sensitivity method to tackle computationally expensive infill sampling for surrogate-based optimization. A silicon microactuator and a mathematical benchmark problem with different kinds of challenges are selected as the test cases. Comparison results show that PAGHO can obtain results comparable to those of popular global optimization methods, while at the same time having significant advantages in efficiency compared to standard global optimization methods and state-of-the-art surrogate-based optimization methods.
Tree-based Coarsening and Partitioning of Complex Networks
Many applications produce massive complex networks whose analysis would
benefit from parallel processing. Parallel algorithms, in turn, often require a
suitable network partition. For solving optimization tasks such as graph
partitioning on large networks, multilevel methods are preferred in practice.
Yet, complex networks pose challenges to established multilevel algorithms, in
particular to their coarsening phase.
One way to specify a (recursive) coarsening of a graph is to rate its edges
and then contract the edges as prioritized by the rating. In this paper we (i)
define weights for the edges of a network that express the edges' importance
for connectivity, (ii) compute a minimum weight spanning tree with
respect to these weights, and (iii) rate the network edges based on the
conductance values of this tree's fundamental cuts. To this end, we also (iv)
develop the first optimal linear-time algorithm to compute the conductance
values of all fundamental cuts of a given spanning tree. We integrate
the new edge rating into a leading multilevel graph partitioner and equip the
latter with a new greedy postprocessing for optimizing the maximum
communication volume (MCV). Experiments on bipartitioning frequently used
benchmark networks show that the postprocessing already reduces MCV by 11.3%.
Our new edge rating further reduces MCV by 10.3% compared to the previously
best rating with the postprocessing in place for both ratings. In total, with a
modest increase in running time, our new approach reduces the MCV of complex
network partitions by 20.4%.
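To make the notion of fundamental-cut conductance concrete, here is a small Python sketch. It is not the linear-time algorithm the abstract describes: it uses a naive lowest-common-ancestor walk, so its running time depends on tree depth. It also assumes the standard definition of conductance (cut weight divided by the smaller of the two side volumes), and the function name `fundamental_cut_conductances` and the input layout are hypothetical.

```python
from collections import defaultdict, deque


def fundamental_cut_conductances(n, edges, tree_edges, root=0):
    """Conductance of the fundamental cut of every spanning-tree edge.

    n:          number of vertices, labelled 0..n-1
    edges:      all graph edges as (u, v, weight), tree edges included
    tree_edges: (u, v) pairs forming a spanning tree of the graph
    Returns a dict mapping (parent, child) tree edges to conductance values.
    """
    adj = defaultdict(list)
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree with a breadth-first search.
    parent, depth, order = {root: None}, {root: 0}, []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)

    def lca(a, b):
        # Naive walk to the lowest common ancestor (assumption: fine for a sketch).
        while depth[a] > depth[b]:
            a = parent[a]
        while depth[b] > depth[a]:
            b = parent[b]
        while a != b:
            a, b = parent[a], parent[b]
        return a

    cross = [0.0] * n  # after subtree summation: weight crossing the cut above v
    vol = [0.0] * n    # after subtree summation: total weighted degree below v
    for u, v, w in edges:
        vol[u] += w
        vol[v] += w
        # Edge (u, v) crosses exactly the cuts of tree edges on the u..v tree path.
        a = lca(u, v)
        cross[u] += w
        cross[v] += w
        cross[a] -= 2 * w
    total_vol = sum(vol)

    result = {}
    for v in reversed(order):  # children are processed before their parents
        p = parent[v]
        if p is None:
            continue
        smaller_side = min(vol[v], total_vol - vol[v])
        result[(p, v)] = cross[v] / smaller_side if smaller_side > 0 else float("inf")
        cross[p] += cross[v]
        vol[p] += vol[v]
    return result


# Assumed toy input: a unit-weight 4-cycle with the path 0-1-2-3 as spanning tree.
graph = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
tree = [(0, 1), (1, 2), (2, 3)]
conductance = fundamental_cut_conductances(4, graph, tree)
print(conductance)
```

In the toy input, removing tree edge (1, 2) separates {0, 1} from {2, 3}. Two unit edges cross that cut and each side has volume 4, so its conductance is 0.5. The two outer tree edges each isolate a single vertex of volume 2 and get conductance 1.0.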
Optimization towards Efficiency and Stateful of dispel4py
Scientific workflows bridge scientific challenges with computational
resources. While dispel4py, a stream-based workflow system, offers mappings to
parallel enactment engines like MPI or Multiprocessing, its optimization
primarily focuses on dynamic process-to-task allocation for improved
performance. An efficiency gap persists, particularly with the growing emphasis
on conserving computing resources. Moreover, the existing dynamic optimization
lacks support for stateful applications and grouping operations. To address
these issues, our work introduces a novel hybrid approach for handling stateful
operations and groupings within workflows, leveraging a new Redis mapping. We
also propose an auto-scaling mechanism integrated into dispel4py's dynamic
optimization. Our experiments showcase the effectiveness of auto-scaling
optimization, achieving efficiency while upholding performance. In the best
case, auto-scaling reduces dispel4py's runtime to 87% of the baseline,
using only 76% of process resources. Importantly, our optimized stateful
dispel4py demonstrates a remarkable speedup, requiring just 32% of the runtime
of the contender.
Comment: 13 pages, 13 figures