Toward sustainable data centers: a comprehensive energy management strategy
Data centers are major contributors to the emission of carbon dioxide to the atmosphere, and this contribution is expected to increase in the coming years. This has encouraged the development of techniques to reduce the energy consumption and the environmental footprint of data centers. Whereas some of these techniques have succeeded in reducing the energy consumption of the hardware equipment of data centers (including IT, cooling, and power supply systems), we claim that sustainable data centers will only be possible if the problem is addressed through a holistic approach that includes not only the aforementioned techniques but also intelligent and unifying solutions that enable a synergistic and energy-aware management of data centers.
In this paper, we propose a comprehensive strategy to reduce the carbon footprint of data centers that uses energy as a driver of their management procedures. In addition, we present a holistic management architecture for sustainable data centers that implements the aforementioned strategy, and we propose design guidelines to accomplish each step of the proposed strategy, referring to related achievements and enumerating the main challenges that must still be solved.
Peer Reviewed. Postprint (author's final draft)
Leveraging OpenStack and Ceph for a Controlled-Access Data Cloud
While traditional HPC has satisfied, and continues to satisfy, most workflows, a new
generation of researchers has emerged looking for sophisticated, scalable,
on-demand, and self-service control of compute infrastructure in a cloud-like
environment. Many also seek safe harbors to operate on or store sensitive
and/or controlled-access data in a high capacity environment.
To cater to these modern users, the Minnesota Supercomputing Institute
designed and deployed Stratus, a locally-hosted cloud environment powered by
the OpenStack platform, and backed by Ceph storage. The subscription-based
service complements existing HPC systems by satisfying the following unmet
needs of our users: a) on-demand availability of compute resources, b)
long-running jobs (i.e., days), c) container-based computing with
Docker, and d) adequate security controls to comply with controlled-access data
requirements.
This document provides an in-depth look at the design of Stratus with respect
to security and compliance with the NIH's controlled-access data policy.
Emphasis is placed on lessons learned while integrating OpenStack and Ceph
features into a so-called "walled garden", and how those technologies
influenced the security design. Many features of Stratus, including tiered
secure storage with the introduction of a controlled-access data "cache",
fault-tolerant live-migrations, and fully integrated two-factor authentication,
depend on recent OpenStack and Ceph features.
Comment: 7 pages, 5 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22--26, 2018, Pittsburgh, PA, US
CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
Sparse matrix-vector multiplication (SpMV) is a fundamental building block
for numerous applications. In this paper, we propose CSR5 (Compressed Sparse
Row 5), a new storage format, which offers high-throughput SpMV on various
platforms including CPUs, GPUs and Xeon Phi. First, the CSR5 format is
insensitive to the sparsity structure of the input matrix. Thus the single
format can support an SpMV algorithm that is efficient both for regular
matrices and for irregular matrices. Furthermore, we show that the overhead of
the format conversion from the CSR to the CSR5 can be as low as the cost of a
few SpMV operations. We compare the CSR5-based SpMV algorithm with 11
state-of-the-art formats and algorithms on four mainstream processors using 14
regular and 10 irregular matrices as a benchmark suite. For the 14 regular
matrices in the suite, we achieve comparable or better performance over the
previous work. For the 10 irregular matrices, the CSR5 obtains average
performance improvement of 17.6%, 28.5%, 173.0% and 293.3% (up to 213.3%,
153.6%, 405.1% and 943.3%) over the best existing work on dual-socket Intel
CPUs, an nVidia GPU, an AMD GPU and an Intel Xeon Phi, respectively. For
real-world applications such as a solver with only tens of iterations, the CSR5
format can be more practical because of its low overhead for format conversion.
The source code of this work is downloadable at
https://github.com/bhSPARSE/Benchmark_SpMV_using_CSR5
Comment: 12 pages, 10 figures, In Proceedings of the 29th ACM International
Conference on Supercomputing (ICS '15)
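For readers unfamiliar with the operation this abstract targets, the following is a minimal sketch of SpMV over the plain CSR format that CSR5 is converted from. It does not reproduce the CSR5 layout itself, and the function name, variable names, and the 3x3 example matrix are illustrative assumptions rather than anything taken from the paper or its repository.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in plain CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and numeric values.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # The work done here is proportional to the nonzeros in this row,
        # so rows of very different lengths lead to uneven work per row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix (an assumption for this sketch):
# [[10, 0, 2],
#  [ 0, 3, 0],
#  [ 1, 0, 4]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [10.0, 2.0, 3.0, 1.0, 4.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [16.0, 6.0, 13.0]
```

Because the inner loop length follows the number of nonzeros per row, a row-wise scheme like this is sensitive to the sparsity structure, which is the sensitivity the abstract says CSR5 avoids.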
TaskInsight: Understanding Task Schedules Effects on Memory and Performance
Recent scheduling heuristics for task-based applications have managed to improve their performance by taking into account memory-related properties such as data locality and cache sharing. However, there is still a general lack of tools that can provide insights into why, and where, different schedulers improve memory behavior, and how this is related to the applications' performance.
To address this, we present TaskInsight, a technique to characterize the memory behavior of different task schedulers through the analysis of data reuse between tasks. TaskInsight provides high-level, quantitative information that can be correlated with tasks' performance variation over time to understand data reuse through the caches due to scheduling choices. TaskInsight is useful to diagnose and identify which scheduling decisions affected performance, when they were taken, and why the performance changed, in both single- and multi-threaded executions.
We demonstrate how TaskInsight can diagnose examples where poor scheduling caused a difference of over 10% in performance for tasks of the same type, due to changes in the tasks' data reuse through the private and shared caches, in single- and multi-threaded executions of the same application. This flexible insight is key for optimization in many contexts, including data locality, throughput, memory footprint or even energy efficiency.
We thank the reviewers for their feedback. This work was supported by the Swedish Research Council, the Swedish Foundation for Strategic Research project FFL12-0051 and carried out within the Linnaeus Centre of Excellence UPMARC, Uppsala Programming for Multicore Architectures Research Center. This paper was also published with the support of the HiPEAC network that received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 687698.
Peer Reviewed. Postprint (published version)
The Green500 List: Escapades to Exascale
Energy efficiency is now a top priority. The first
four years of the Green500 have seen the importance of energy
efficiency in supercomputing grow from an afterthought
to the forefront of innovation as we near a point where systems
will be forced to stop drawing more power. Even so,
the landscape of efficiency in supercomputing continues to
shift, with new trends emerging, and unexpected shifts in
previous predictions.
This paper offers an in-depth analysis of the new and
shifting trends in the Green500. In addition, the analysis offers
early indications of the track we are taking toward exascale,
and what an exascale machine in 2018 is likely to look
like. Lastly, we discuss the new efforts and collaborations
toward designing and establishing better metrics, methodologies
and workloads for the measurement and analysis of
energy-efficient supercomputing.
Constraints on radiative dark-matter decay from the cosmic microwave background
If dark matter decays to electromagnetically-interacting particles, it can
inject energy into the baryonic gas and thus affect the processes of
recombination and reionization. This leaves an imprint on the cosmic microwave
background (CMB): the large-scale polarization is enhanced, and the small-scale
temperature fluctuation is damped. We use the WMAP three-year data combined
with galaxy surveys to constrain radiatively decaying dark matter. Our new
limits to the dark-matter decay width are about ten times stronger than
previous limits. For dark-matter lifetimes that exceed the age of the Universe,
a limit of (95% CL) is
derived, where is the efficiency of converting decay energy into
ionization energy. Limits for lifetimes short compared with the age of the
Universe are also derived. We forecast improvements expected from the Planck
satellite.
Comment: replaced with version published on PR
Enhancing speed and scalability of the ParFlow simulation code
Regional hydrology studies are often supported by high resolution simulations
of subsurface flow that require expensive and extensive computations. Efficient
usage of the latest high performance parallel computing systems becomes a
necessity. The simulation software ParFlow has been demonstrated to meet this
requirement and shown to have excellent solver scalability for up to 16,384
processes. In the present work we show that the code requires further
enhancements in order to fully take advantage of current petascale machines. We
identify ParFlow's way of parallelization of the computational mesh as a
central bottleneck. We propose to reorganize this subsystem using fast mesh
partition algorithms provided by the parallel adaptive mesh refinement library
p4est. We realize this in a minimally invasive manner by modifying selected
parts of the code to reinterpret the existing mesh data structures. We evaluate
the scaling performance of the modified version of ParFlow, demonstrating good
weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test
an example application at large scale.
Comment: The final publication is available at link.springer.co
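The weak and strong scaling mentioned in this abstract are commonly summarized as parallel efficiencies. The sketch below uses the standard textbook definitions. The function names and all timings are hypothetical and are not taken from the ParFlow paper.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Efficiency for a fixed total problem size.

    Ideal runtime shrinks in proportion to the core count, so a value
    of 1.0 means t_p == t_base * p_base / p.
    """
    return (t_base * p_base) / (t_p * p)


def weak_scaling_efficiency(t_base, t_p):
    """Efficiency for a fixed problem size per core.

    Ideal runtime stays constant as cores are added, so a value of 1.0
    means t_p == t_base.
    """
    return t_base / t_p


# Hypothetical timings in seconds (assumptions for illustration only).
strong = strong_scaling_efficiency(t_base=100.0, p_base=1024, t_p=30.0, p=4096)
weak = weak_scaling_efficiency(t_base=50.0, t_p=62.5)

print(f"strong scaling efficiency: {strong:.3f}")
print(f"weak scaling efficiency:   {weak:.3f}")
```

Values near 1.0 indicate near-ideal scaling. Strong scaling measures how much faster a fixed problem runs on more cores, and weak scaling measures how well runtime holds steady as problem size and core count grow together.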
Discovery of Five Recycled Pulsars in a High Galactic Latitude Survey
We present five recycled pulsars discovered during a 21-cm survey of
approximately 4,150 deg^2 between 15 deg and 30 deg from the galactic plane
using the Parkes radio telescope. One new pulsar, PSR J1528-3146, has a 61 ms
spin period and a massive white dwarf companion. Like many recycled pulsars
with heavy companions, the orbital eccentricity is relatively high (~0.0002),
consistent with evolutionary models that predict less time for circularization.
The four remaining pulsars have short spin periods (3 ms < P < 6 ms); three of
these have probable white dwarf binary companions and one (PSR J2010-1323) is
isolated. PSR J1600-3053 is relatively bright for its dispersion measure of
52.3 pc cm^-3 and promises good timing precision thanks to an intrinsically
narrow feature in its pulse profile, resolvable through coherent dedispersion.
In this survey, the recycled pulsar discovery rate was one per four days of
telescope time or one per 600 deg^2 of sky. The variability of these sources
implies that there are more millisecond pulsars that might be found by
repeating this survey.
Comment: 15 pages, 3 figures, accepted for publication in Ap