5,849 research outputs found
Parallel load balancing strategy for Volume-of-Fluid methods on 3-D unstructured meshes
© 2016. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ Volume-of-Fluid (VOF) is one of the methods of choice for reproducing interface motion in simulations of multi-fluid flows. One of its main strengths is its accuracy in capturing sharp interface geometries, although this requires a number of geometric calculations. Under these circumstances, achieving parallel performance on current supercomputers is a must. The main obstacle to parallelization is that the computing costs are concentrated in the discrete elements that lie on the interface between fluids. Consequently, if the interface is not homogeneously distributed throughout the domain, standard domain decomposition (DD) strategies lead to imbalanced workload distributions. In this paper, we present a new parallelization strategy for general unstructured VOF solvers, based on a dynamic load balancing process complementary to the underlying DD. Its parallel efficiency has been analyzed and compared to that of the DD using up to 1024 CPU cores on an Intel Sandy Bridge based supercomputer. The results obtained on several artificially generated test cases show a speedup of up to ~12x with respect to the standard DD, depending on the interface size, the initial distribution, and the number of parallel processes engaged. Moreover, the new parallelization strategy is general purpose, so it could be used to parallelize any VOF solver without requiring changes to the coupled flow solver. Finally, note that although designed for the VOF method, our approach could easily be adapted to other interface-capturing methods, such as the Level-Set, which may present similar workload imbalances. © 2014 Elsevier Inc. All rights reserved. Peer reviewed. Postprint (author's final draft).
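The core idea of the abstract above — redistributing interface cells when they cluster on a few subdomains — can be illustrated with a toy rebalancing step. This is only a sketch of the general dynamic load balancing concept, not the paper's algorithm; the rank loads below are hypothetical interface-cell counts.

```python
# Illustrative sketch (not the paper's scheme): greedy rebalancing of
# interface-cell counts across parallel ranks. Hypothetical data.

def rebalance(loads):
    """Return (plan, new_loads): transfers (src, dst, amount) that even out loads."""
    n = len(loads)
    target = sum(loads) // n
    surplus = [(i, l - target) for i, l in enumerate(loads) if l > target]
    deficit = [(i, target - l) for i, l in enumerate(loads) if l < target]
    plan, loads = [], list(loads)
    si = di = 0
    while si < len(surplus) and di < len(deficit):
        s, s_amt = surplus[si]
        d, d_amt = deficit[di]
        amt = min(s_amt, d_amt)           # move as much as both sides allow
        plan.append((s, d, amt))
        loads[s] -= amt
        loads[d] += amt
        surplus[si] = (s, s_amt - amt)
        deficit[di] = (d, d_amt - amt)
        if surplus[si][1] == 0:
            si += 1
        if deficit[di][1] == 0:
            di += 1
    return plan, loads

# Rank 0 holds almost the whole interface; ranks 1-3 are nearly idle.
plan, balanced = rebalance([900, 40, 30, 30])
print(plan)       # [(0, 1, 210), (0, 2, 220), (0, 3, 220)]
print(balanced)   # [250, 250, 250, 250]
```

In a real solver the transferred items would be interface cells (with their geometry), and the plan would be executed with MPI messages on top of the existing domain decomposition.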
Multi-GPU maximum entropy image synthesis for radio astronomy
The maximum entropy method (MEM) is a well-known deconvolution technique in radio interferometry. It solves a non-linear optimization problem with an entropy regularization term. Other heuristics, such as CLEAN, are faster but highly user-dependent. Nevertheless, MEM has the following advantages: it is unsupervised, it has a statistical basis, and it achieves better resolution and image quality under certain conditions. This work presents a high-performance GPU version of non-gridding MEM, which is tested using real and simulated data. We propose a single-GPU and a multi-GPU implementation for single- and multi-spectral data, respectively. We also make use of the Peer-to-Peer and Unified Virtual Addressing features of newer GPUs, which allow multiple GPUs to be exploited transparently and efficiently. Several ALMA data sets are used to demonstrate the effectiveness in imaging and to evaluate GPU performance. The results show that speedups of 1000x to 5000x over a sequential version can be achieved, depending on data and image size. This allows the HD142527 CO(6-5) short baseline data set to be reconstructed in 2.1 minutes, instead of the 2.5 days taken by a sequential CPU version. Comment: 11 pages, 13 figures
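The shape of the optimization problem MEM solves — a data-misfit term plus an entropy regularizer — can be sketched with plain gradient descent on synthetic data. This is a minimal illustration of the objective's structure, not the paper's non-gridding GPU solver; the matrix A stands in for the measurement operator and all values are synthetic.

```python
# Minimal sketch of entropy-regularized deconvolution in the spirit of MEM:
# minimize ||A x - y||^2 + lam * sum(x * log x) over positive x.
# Synthetic data only; illustrative, not the paper's solver.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))        # stand-in measurement operator
x_true = rng.uniform(0.1, 1.0, 10)       # synthetic "image"
y = A @ x_true                           # noiseless observations

lam, step = 0.1, 1e-3
x = np.full(10, 0.5)                     # flat positive start

def objective(x):
    return np.sum((A @ x - y) ** 2) + lam * np.sum(x * np.log(x))

f0 = objective(x)
for _ in range(500):
    grad = 2 * A.T @ (A @ x - y) + lam * (np.log(x) + 1.0)
    x = np.clip(x - step * grad, 1e-8, None)   # keep pixels positive
f1 = objective(x)
print(f0, ">", f1)   # descent reduces the penalized misfit
```

The GPU implementations in the paper parallelize exactly this kind of per-pixel arithmetic, which is why the method maps so well onto many cores.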
Adaptive query parallelization in multi-core column stores
With the rise of multi-core CPU platforms, their optimal utilization for in-memory OLAP workloads using column store databases has become one of the biggest challenges. Some of the inherent limitations in the achievable query parallelism are due to the dependency of the degree of parallelism on data skew, the overheads incurred by thread coordination, and the hardware resource limits. Finding the right balance between the degree of parallelism and the multi-core utilization …
Achieving Energy Efficiency on Networking Systems with Optimization Algorithms and Compressed Data Structures
To cope with the increasing quantity, capacity, and energy consumption of transmission and routing equipment in the Internet, the energy efficiency of communication networks has attracted more and more attention from researchers around the world. In this dissertation, we propose three methodologies to achieve energy efficiency on networking devices: heuristics for NP-complete optimization problems, compressed data structures, and a combination of the two.
We first consider the problem of achieving energy efficiency in Data Center Networks (DCN). We generalize the energy-efficient networking problem in data centers as an optimal flow assignment problem, which is NP-complete, and then propose a heuristic called CARPO, a correlation-aware power optimization algorithm, that dynamically consolidates traffic flows onto a small set of links and switches in a DCN and then shuts down unused network devices for power savings.
We then achieve energy efficiency on Internet routers by using a compressed data structure. We propose a novel data structure called the Probabilistic Bloom Filter (PBF), which extends the classical Bloom filter in a probabilistic direction so that it can effectively identify heavy hitters with a small memory footprint, reducing the energy consumption of network measurement.
To achieve energy efficiency on Wireless Sensor Networks (WSN), we developed a data collection protocol called EDAL, which stands for Energy-efficient Delay-aware Lifetime-balancing data collection. Based on the Open Vehicle Routing problem, EDAL exploits the topology requirements of Compressive Sensing (CS) and then applies CS to save more energy on sensor nodes.
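The heavy-hitter idea behind the PBF — summarizing many flows in a small, fixed array of counters indexed by hashes — can be sketched with a classical counting sketch. This is a generic counting-Bloom-filter/count-min style illustration of the memory-versus-accuracy trade-off, not the PBF itself; flow names and the threshold are hypothetical.

```python
# Sketch of compact heavy-hitter detection with hashed counters.
# Illustrative of the general technique, not the dissertation's PBF.
import hashlib

class CounterSketch:
    def __init__(self, width=64, hashes=3):
        self.width, self.hashes = width, hashes
        self.counters = [0] * width

    def _slots(self, key):
        # Derive `hashes` independent slots from salted SHA-256 digests.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.width

    def add(self, key, count=1):
        for s in self._slots(key):
            self.counters[s] += count

    def estimate(self, key):
        # The minimum over a key's slots upper-bounds its true count.
        return min(self.counters[s] for s in self._slots(key))

sketch = CounterSketch()
sketch.add("flow-A", 1000)          # one elephant flow
for i in range(50):
    sketch.add(f"flow-{i}")         # many mice flows

threshold = 500
print(sketch.estimate("flow-A") >= threshold)   # True: flagged as heavy hitter
print(sketch.estimate("flow-7") >= threshold)   # likely False (tiny flow)
```

The energy argument in the abstract follows from the footprint: 64 counters summarize arbitrarily many flows, so the measurement state fits in fast, low-power memory.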
Heuristics for optimizing 3D mapping missions over swarm-powered ad hoc clouds
Drones have been getting more and more popular in many sectors of the economy. Both scientific and industrial communities aim at making the impact of drones even more disruptive by empowering collaborative autonomous behaviors -- also known as swarming behaviors -- within fleets of multiple drones. In swarm-powered 3D mapping missions, unmanned aerial vehicles typically collect the aerial pictures of the target area, whereas the 3D reconstruction process is performed in a centralized manner. However, such approaches do not leverage the computational and storage resources of the swarm members. We address the optimization of a swarm-powered distributed 3D mapping mission for a real-life humanitarian emergency response application through the exploitation of a swarm-powered ad hoc cloud. Producing the relevant 3D maps in a timely manner, even when cloud connectivity is not available, is crucial to increase the chances of success of the operation. In this work, we present a mathematical programming heuristic based on decomposition and a variable neighborhood search heuristic to minimize the completion time of the 3D reconstruction process necessary in such missions. Our computational results reveal that the proposed heuristics either quickly reach optimality or improve the best known solutions for almost all tested realistic instances, comprising up to 1000 images and fifteen drones.
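The variable neighborhood search mentioned above can be sketched on a toy version of the problem: assigning image-processing tasks to drones so that the maximum per-drone load (a makespan proxy for completion time) is minimized. Purely illustrative; the paper's model and neighborhoods are far richer, and the task times below are synthetic.

```python
# Toy variable-neighborhood-search sketch for a makespan-style objective.
# Illustrative only; not the paper's heuristic or instance data.
import random

def makespan(assign, times, k):
    loads = [0.0] * k
    for task, drone in enumerate(assign):
        loads[drone] += times[task]
    return max(loads)

def vns(times, k, iters=2000, seed=1):
    rng = random.Random(seed)
    n = len(times)
    best = [rng.randrange(k) for _ in range(n)]   # random initial assignment
    best_cost = makespan(best, times, k)
    for _ in range(iters):
        cand = list(best)
        # Neighborhood 1: relocate one task; neighborhood 2: relocate two.
        for _ in range(rng.choice((1, 2))):
            cand[rng.randrange(n)] = rng.randrange(k)
        cost = makespan(cand, times, k)
        if cost < best_cost:                      # accept improving moves only
            best, best_cost = cand, cost
    return best, best_cost

gen = random.Random(0)
times = [gen.uniform(1, 10) for _ in range(40)]   # synthetic per-image costs
assign, cost = vns(times, k=5)
print(round(cost, 2), "vs lower bound", round(sum(times) / 5, 2))
```

The lower bound printed is the perfectly balanced load; a good heuristic should land close to it while never going below it.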
DROP: Dimensionality Reduction Optimization for Time Series
Dimensionality reduction is a critical step in scaling machine learning pipelines. Principal component analysis (PCA) is a standard tool for dimensionality reduction, but performing PCA over a full dataset can be prohibitively expensive. As a result, theoretical work has studied the effectiveness of iterative, stochastic PCA methods that operate over data samples. However, termination conditions for stochastic PCA either execute for a predetermined number of iterations or run until convergence of the solution, frequently sampling too many or too few datapoints for end-to-end runtime improvements. We show how accounting for downstream analytics operations during dimensionality reduction via PCA allows stochastic methods to terminate efficiently after operating over small (e.g., 1%) subsamples of input data, reducing whole-workload runtime. Leveraging this, we propose DROP, a DR optimizer that enables speedups of up to 5x over Singular-Value-Decomposition-based PCA techniques and exceeds conventional approaches like FFT and PAA by up to 16x in end-to-end workloads.
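The sample-based stochastic PCA that DROP builds on can be sketched with Oja's rule: update a direction estimate one sample at a time and stop early once it stabilizes. This is a generic illustration with a naive tolerance-based stopping rule, not DROP's downstream-aware termination; the data and learning rate are synthetic choices.

```python
# Sketch of stochastic PCA via Oja's rule with a simple early-termination
# check. Illustrative of the family of methods DROP optimizes, not DROP itself.
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data with one dominant principal direction.
direction = np.array([3.0, 1.0]) / np.sqrt(10.0)
X = rng.standard_normal((5000, 2)) * 0.1 + np.outer(rng.standard_normal(5000), direction)

w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for i in rng.integers(0, len(X), 3000):       # visit random samples, not the full data
    x = X[i]
    w_new = w + 0.01 * x * (x @ w)            # Oja update from a single sample
    w_new /= np.linalg.norm(w_new)
    done = np.linalg.norm(w_new - w) < 1e-6   # naive convergence test
    w = w_new
    if done:
        break

print(abs(w @ direction))   # close to 1.0: top principal direction recovered
```

DROP's contribution is replacing the naive `done` test with one that asks whether the downstream workload (e.g., similarity search) would still benefit from further sampling.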
Fuzzy multi-criteria simulated evolution for nurse re-rostering
Abstract: In a fuzzy environment where decision making involves multiple criteria, fuzzy multi-criteria decision making approaches are a viable option. The nurse re-rostering problem is a typical complex problem situation, where scheduling decisions should consider fuzzy human preferences, such as nurse preferences, the decision maker's choices, and patient expectations. For effective nurse schedules, fuzzy theoretic evaluation approaches have to be used to incorporate these fuzzy human preferences and choices. The present study seeks to develop a fuzzy multi-criteria simulated evolution approach for the nurse re-rostering problem. Experimental results show that the fuzzy multi-criteria approach has the potential to solve large-scale problems within reasonable computation times.
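The fuzzy multi-criteria evaluation step can be sketched in Bellman-Zadeh style: score each candidate roster per criterion with a membership value in [0, 1] and take the worst criterion as the overall fitness. This is only an illustration of the scoring idea; the criteria names, scores, and rosters below are hypothetical, and the paper embeds such evaluation inside a simulated evolution loop.

```python
# Minimal fuzzy multi-criteria scoring sketch (min-aggregation).
# Hypothetical rosters and membership values; illustrative only.

def fuzzy_fitness(scores):
    """scores: dict criterion -> membership in [0, 1]; overall = worst criterion."""
    return min(scores.values())

rosters = {
    "roster-1": {"nurse_pref": 0.9, "coverage": 0.6, "fairness": 0.8},
    "roster-2": {"nurse_pref": 0.7, "coverage": 0.75, "fairness": 0.7},
}
best = max(rosters, key=lambda r: fuzzy_fitness(rosters[r]))
print(best)   # roster-2: its worst criterion (0.7) beats roster-1's (0.6)
```

Min-aggregation rewards balanced schedules over ones that excel on some criteria but badly violate another, which matches the intent of incorporating nurse, manager, and patient preferences jointly.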