Leveraging OpenStack and Ceph for a Controlled-Access Data Cloud
While traditional HPC has satisfied, and continues to satisfy, most workflows, a new
generation of researchers has emerged looking for sophisticated, scalable,
on-demand, and self-service control of compute infrastructure in a cloud-like
environment. Many also seek safe harbors in which to operate on or store sensitive
and/or controlled-access data in a high-capacity environment.
To cater to these modern users, the Minnesota Supercomputing Institute
designed and deployed Stratus, a locally-hosted cloud environment powered by
the OpenStack platform, and backed by Ceph storage. The subscription-based
service complements existing HPC systems by satisfying the following unmet
needs of our users: a) on-demand availability of compute resources, b)
long-running jobs (i.e., days), c) container-based computing with
Docker, and d) adequate security controls to comply with controlled-access data
requirements.
This document provides an in-depth look at the design of Stratus with respect
to security and compliance with the NIH's controlled-access data policy.
Emphasis is placed on lessons learned while integrating OpenStack and Ceph
features into a so-called "walled garden", and how those technologies
influenced the security design. Many features of Stratus, including tiered
secure storage with the introduction of a controlled-access data "cache",
fault-tolerant live-migrations, and fully integrated two-factor authentication,
depend on recent OpenStack and Ceph features.
Comment: 7 pages, 5 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US
4.45 Pflops Astrophysical N-Body Simulation on K computer -- The Gravitational Trillion-Body Problem
As an entry for the 2012 Gordon-Bell performance prize, we report performance
results of astrophysical N-body simulations of one trillion particles performed
on the full system of K computer. This is the first gravitational trillion-body
simulation in the world. We describe the scientific motivation, the numerical
algorithm, the parallelization strategy, and the performance analysis. Unlike
many previous Gordon-Bell prize winners that used the tree algorithm for
astrophysical N-body simulations, we used the hybrid TreePM method, which achieves a
similar level of accuracy: the short-range force is calculated by the tree
algorithm, and the long-range force is solved by the particle-mesh algorithm.
We developed a highly-tuned gravity kernel for short-range forces, and a novel
communication algorithm for long-range forces. The average performance on 24,576
and 82,944 nodes of the K computer is 1.53 and 4.45 Pflops, respectively,
corresponding to 49% and 42% of the peak speed.
Comment: 10 pages, 6 figures, Proceedings of Supercomputing 2012
(http://sc12.supercomputing.org/), Gordon Bell Prize Winner. Additional
information is http://www.ccs.tsukuba.ac.jp/CCS/eng/gbp201
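The erfc-based force splitting at the heart of a hybrid TreePM scheme can be sketched as follows. This is a minimal illustration, not the authors' tuned gravity kernel; the function names and the splitting scale `r_s` are assumptions. The pairwise 1/r^2 force is divided into a short-range part that the tree algorithm evaluates directly and a smooth long-range remainder that a particle-mesh solver can resolve on a grid.

```python
import math

def short_range_force(r, r_s):
    """Tree part: derivative of the erfc-screened potential erfc(r/2r_s)/r.

    x = r/(2*r_s); the screened force is [erfc(x) + (2x/sqrt(pi)) e^{-x^2}] / r^2.
    """
    x = r / (2.0 * r_s)
    return (math.erfc(x) + (2.0 * x / math.sqrt(math.pi)) * math.exp(-x * x)) / r**2

def long_range_force(r, r_s):
    """PM part: the smooth complement of the short-range force."""
    return 1.0 / r**2 - short_range_force(r, r_s)

# The two parts always sum to the exact Newtonian pair force:
r, r_s = 1.7, 0.5
total = short_range_force(r, r_s) + long_range_force(r, r_s)
assert abs(total - 1.0 / r**2) < 1e-12
```

At separations well below `r_s` the tree term carries essentially all of the force, while far beyond `r_s` the mesh term dominates, which is what makes the split efficient to parallelize.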
2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation
We report on improvements made over the past two decades to our adaptive
treecode N-body method (HOT). A mathematical and computational approach to the
cosmological N-body problem is described, with performance and scalability
measured up to 256k processors. We present error analysis and
scientific application results from a series of more than ten 69-billion-particle
cosmological simulations, and we account for the
floating-point operations performed. These results include the first simulations using
the new constraints on the standard model of cosmology from the Planck
satellite. Our simulations set a new standard for accuracy and scientific
throughput, while meeting or exceeding the computational efficiency of the
latest generation of hybrid TreePM N-body methods.
Comment: 12 pages, 8 figures, 77 references; to appear in Proceedings of SC '1
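The "hashed" part of a hashed oct-tree can be illustrated with a Morton (bit-interleaved) key: each particle's integer grid coordinates are interleaved into a single key, so tree cells become contiguous key ranges that can live in a hash table and be partitioned across processors. This is a hypothetical sketch of the general technique, not code from HOT/2HOT; the 10-bit resolution is an assumed parameter.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of three integer coordinates into one Morton key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key

# The mapping is one-to-one, and sorting by key clusters particles into
# spatially compact blocks (the basis for domain decomposition):
keys = [morton_key(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
assert len(set(keys)) == 64
assert morton_key(0, 0, 0) == 0
```

Because a cell's children occupy consecutive key ranges, walking the tree reduces to key arithmetic plus hash lookups, which is what makes the structure practical to distribute.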
Enhancing speed and scalability of the ParFlow simulation code
Regional hydrology studies are often supported by high resolution simulations
of subsurface flow that require expensive and extensive computations. Efficient
usage of the latest high performance parallel computing systems becomes a
necessity. The simulation software ParFlow has been demonstrated to meet this
requirement and shown to have excellent solver scalability for up to 16,384
processes. In the present work we show that the code requires further
enhancements in order to fully take advantage of current petascale machines. We
identify ParFlow's way of parallelization of the computational mesh as a
central bottleneck. We propose to reorganize this subsystem using fast mesh
partition algorithms provided by the parallel adaptive mesh refinement library
p4est. We realize this in a minimally invasive manner by modifying selected
parts of the code to reinterpret the existing mesh data structures. We evaluate
the scaling performance of the modified version of ParFlow, demonstrating good
weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test
an example application at large scale.
Comment: The final publication is available at link.springer.co
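The weak- and strong-scaling figures quoted for runs like these are judged against simple efficiency definitions, which can be stated in a few lines. The timings below are made up for illustration; they are not ParFlow measurements.

```python
def strong_scaling_efficiency(t1, tp, p):
    """Fixed problem size: ideal time drops as 1/p, so efficiency = t1 / (p * tp)."""
    return t1 / (p * tp)

def weak_scaling_efficiency(t1, tp):
    """Problem size grows with p: ideal time is constant, so efficiency = t1 / tp."""
    return t1 / tp

# Example: a fixed problem takes 100 s on 1 core and 0.5 s on 256 cores.
print(strong_scaling_efficiency(100.0, 0.5, 256))   # 0.78125
```

"Good weak and strong scaling up to 458k cores" thus means both ratios stay close to 1 as the core count grows toward that figure.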
Meson-meson scattering lengths at maximum isospin from lattice QCD
We summarize our lattice QCD determinations of the pion-pion, pion-kaon and
kaon-kaon s-wave scattering lengths at maximal isospin with a particular focus
on the extrapolation to the physical point and the usage of next-to-leading
order chiral perturbation theory to do so. We employ data at three values of
the lattice spacing and pion masses ranging from around 230 MeV to around 450
MeV, applying Luescher's finite volume method to compute the scattering
lengths. We find that leading order chiral perturbation theory is surprisingly
close to our data even in the kaon-kaon case for our entire range of pion
masses.
Comment: 10 pages, 8 figures. Presented at the 9th International Workshop on Chiral Dynamics, Sept. 17-21, 2018, Duke University, Durham, NC, USA; submitted to PoS (C18-09-17.6). Funding acknowledgements added in the v2 replacement; comma added in abstract. In the v3 replacement, corrected a typo in equation 6.2, which was referring to the pion-kaon reduced mass instead of the pion mass.
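Luescher's finite-volume method at threshold relates the energy shift of a two-meson state in a box of size L to the s-wave scattering length. A minimal numeric sketch of the leading relation for two identical mesons of mass m is given below; the numerical inputs are illustrative, not the authors' lattice data, while c1 and c2 are the standard geometric constants of the expansion.

```python
import math

# Geometric constants of Luescher's threshold expansion.
C1 = -2.837297
C2 = 6.375183

def energy_shift(a0, m, L):
    """Leading finite-volume energy shift of two identical mesons:
    dE = -(4*pi*a0)/(m*L**3) * [1 + C1*(a0/L) + C2*(a0/L)**2].
    """
    x = a0 / L
    return -(4.0 * math.pi * a0) / (m * L**3) * (1.0 + C1 * x + C2 * x * x)

# Illustrative dimensionless lattice-style numbers: a repulsive channel
# (negative a0, as for maximal-isospin scattering) raises the energy.
dE = energy_shift(a0=-0.05, m=0.3, L=24.0)
assert dE > 0.0
```

In practice one measures dE on the lattice and inverts this relation for a0; the abstract's chiral-perturbation-theory fits then extrapolate the resulting scattering lengths to the physical pion mass.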
Analyzing and Modeling the Performance of the HemeLB Lattice-Boltzmann Simulation Environment
We investigate the performance of the HemeLB lattice-Boltzmann simulator for
cerebrovascular blood flow, aimed at providing timely and clinically relevant
assistance to neurosurgeons. HemeLB is optimised for sparse geometries,
supports interactive use, and scales well to 32,768 cores for problems with ~81
million lattice sites. We obtain a maximum performance of 29.5 billion site
updates per second, with only an 11% slowdown for highly sparse problems (5%
fluid fraction). We present steering and visualisation performance measurements
and provide a model which allows users to predict the performance, thereby
determining how to run simulations with maximum accuracy within time
constraints.
Comment: Accepted by the Journal of Computational Science. 33 pages, 16 figures, 7 tables
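The headline metric above, site updates per second, and the kind of predictive model the paper provides can be sketched in a few lines. This is an assumed toy form with a flat communication-overhead fraction, not HemeLB's published model; all parameter values are illustrative.

```python
def site_updates_per_second(sites, steps, walltime):
    """Aggregate throughput: total lattice-site updates divided by wall time."""
    return sites * steps / walltime

def predicted_walltime(sites, steps, cores, updates_per_core_per_s, comm_overhead=0.1):
    """Toy model: compute time scaled by a fixed communication-overhead fraction."""
    compute = sites * steps / (cores * updates_per_core_per_s)
    return compute * (1.0 + comm_overhead)

# At the paper's measured aggregate rate of 29.5e9 updates/s, advancing
# ~81e6 sites by 1000 steps takes roughly 2.7 s of wall time:
t = 81e6 * 1000 / 29.5e9
```

A user facing a clinical time budget inverts such a model: given a deadline, solve for the number of steps (i.e., the achievable resolution or simulated duration) that fits within it.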
Performance and Power Analysis of HPC Workloads on Heterogeneous Multi-Node Clusters
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, enabling application optimizations. Due to the increasing interest of the High Performance Computing (HPC) community in energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques are applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU.
The research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially funded by "Contributo 5 per mille assegnato all'Università degli Studi di Ferrara - dichiarazione dei redditi dell'anno 2014". We thank the University of Ferrara and INFN Ferrara for access to the COKA cluster. We warmly thank the BSC tools group for supporting us in the smooth integration and testing of our setup within Extrae and Paraver.
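Why performance and power figures must be read together is captured by energy-to-solution: the product of average power draw and time-to-solution. The numbers below are made up for illustration and are not results from the study; they simply show how a slower low-power node can still win on energy.

```python
def energy_to_solution(avg_power_w, time_s):
    """Energy consumed by a run, in joules: average power times wall time."""
    return avg_power_w * time_s

# Hypothetical figures: a high-end node is 9x faster but draws 20x the power.
high_end = energy_to_solution(avg_power_w=300.0, time_s=100.0)   # 30000 J
low_power = energy_to_solution(avg_power_w=15.0, time_s=900.0)   # 13500 J
assert low_power < high_end
```

Correlating both metrics in one profiling tool is what lets a developer see this trade-off per code region rather than only at whole-run granularity.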
Atmospheric dispersion of airborne pollen evidenced by near-surface and columnar measurements in Barcelona, Spain
Hourly measurements of pollen near-surface concentration and lidar-derived profiles of particle backscatter coefficients and of volume and particle depolarization ratios during a 5-day pollination event observed in Barcelona, Spain, between 27 and 31 March 2015 are presented. Maximum hourly pollen concentrations of 4700 and 1200 m⁻³ h⁻¹ were found for Platanus and Pinus, respectively, which together represented more than 80% of the total pollen. Every day, a clear diurnal cycle caused by the vertical transport of the airborne pollen was visible in the lidar-derived profiles of the backscatter coefficient, with maxima usually reached between 12 and 15 UT. A method based on the lidar polarization capabilities was used to retrieve the contribution of the pollen to the total signal. On average, the diurnal (9-17 UT) pollen aerosol optical depth (AOD) was 0.05, which represented 29% of the total AOD; the volume and particle depolarization ratios in the pollen plume were 0.08 and 0.14, respectively; and the diurnal mean height of the pollen plume was found at 1.24 km.
The dispersion of Platanus and Pinus pollen in the atmosphere was simulated with the Nonhydrostatic Multiscale Meteorological Model on the B grid at the Barcelona Supercomputing Center, with a newly developed Chemical Transport Model (NMMB/BSC-CTM). Modelled near-surface daily concentrations were compared to our observations at two sites: Barcelona and Bellaterra (12 km NE of Barcelona). Modelled hourly concentrations were compared to our observations in Barcelona.
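Lidar-based separation of a pollen component typically follows the standard two-component depolarization technique; the paper's exact retrieval may differ, so the formula and all depolarization values below are hedged assumptions. Given the total particle backscatter and depolarization ratio, plus assumed depolarization ratios for pure pollen and for the pollen-free background, the pollen share of the backscatter coefficient follows from mixing the two components linearly.

```python
def pollen_backscatter(beta_total, d_tot, d_bg, d_pol):
    """Two-component separation:
    beta_pollen = beta_total * (d_tot - d_bg)(1 + d_pol)
                  / ((d_pol - d_bg)(1 + d_tot)).
    d_tot: observed particle depolarization ratio in the layer;
    d_bg, d_pol: assumed ratios for background aerosol and pure pollen.
    """
    num = (d_tot - d_bg) * (1.0 + d_pol)
    den = (d_pol - d_bg) * (1.0 + d_tot)
    return beta_total * num / den

# Taking the in-plume particle depolarization of 0.14 as the pure-pollen
# value and an assumed background of 0.03, an observed 0.10 implies a
# partial pollen contribution to the unit backscatter:
frac = pollen_backscatter(1.0, d_tot=0.10, d_bg=0.03, d_pol=0.14)
assert 0.0 < frac < 1.0
```

Integrating the retrieved pollen backscatter over height, with an assumed lidar ratio, is what yields a pollen AOD of the kind quoted above.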