Benchmarking SciDB Data Import on HPC Systems
SciDB is a scalable, computational database management system that uses an
array model for data storage. The array data model of SciDB makes it ideally
suited for storing and managing large amounts of imaging data. SciDB is
designed to support advanced in-database analytics, reducing the need to
extract data for analysis. It is designed to be massively parallel and can
run on commodity hardware in a high performance computing (HPC) environment. In
this paper, we present the performance of SciDB using simulated image data. The
Dynamic Distributed Dimensional Data Model (D4M) software is used to implement
the benchmark on a cluster running the MIT SuperCloud software stack. A peak
performance of 2.2M database inserts per second was achieved on a single node
of this system. We also show that SciDB and the D4M toolbox provide more
efficient ways to access random sub-volumes of massive datasets compared to the
traditional approaches of reading volumetric data from individual files. This
work describes the D4M and SciDB tools we developed and presents the initial
performance results. This performance was achieved by using parallel inserts,
in-database merging of arrays, and supercomputing techniques such as
distributed arrays and single-program-multiple-data programming.
Comment: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC)
2016, best paper finalist
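The combination of techniques the abstract names can be illustrated in miniature. The sketch below is not SciDB or D4M; it is a hedged Python stand-in in which a dict keyed by (row, col) plays the role of a D4M associative array, worker processes perform SPMD-style parallel inserts on their own shard of the data, and a final reduction plays the role of in-database array merging. All names here are illustrative.

```python
# Hedged sketch of the parallel-insert + merge pattern described in the
# abstract, using Python stand-ins (not the SciDB/D4M implementation).
from multiprocessing import Pool

def insert_shard(rows):
    # Each SPMD worker builds its own associative-array shard:
    # a dict mapping (row, col) -> value stands in for a D4M array.
    shard = {}
    for r, c, v in rows:
        shard[(r, c)] = shard.get((r, c), 0) + v
    return shard

def merge(shards):
    # Stand-in for "in-database merging of arrays": combine the
    # per-worker shards by summing values on key collisions.
    out = {}
    for s in shards:
        for k, v in s.items():
            out[k] = out.get(k, 0) + v
    return out

if __name__ == "__main__":
    # Simulated records, decomposed across 4 workers by striding.
    data = [(i % 10, i % 7, 1) for i in range(1000)]
    nworkers = 4
    chunks = [data[i::nworkers] for i in range(nworkers)]
    with Pool(nworkers) as pool:
        shards = pool.map(insert_shard, chunks)  # parallel inserts
    table = merge(shards)
    print(len(table), sum(table.values()))  # prints: 70 1000
```

The strided decomposition gives each worker an equal share of the inserts; the merge step is associative, so shards can be combined in any order.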
Lessons Learned from a Decade of Providing Interactive, On-Demand High Performance Computing to Scientists and Engineers
For decades, the use of HPC systems was limited to those in the physical
sciences who had mastered their domain in conjunction with a deep understanding
of HPC architectures and algorithms. During these same decades, consumer
computing device advances produced tablets and smartphones that allow millions
of children to interactively develop and share code projects across the globe.
As the HPC community faces the challenges associated with guiding researchers
from disciplines using high productivity interactive tools to effective use of
HPC systems, it seems appropriate to revisit the assumptions surrounding the
necessary skills required for access to large computational systems. For over a
decade, MIT Lincoln Laboratory has been supporting interactive, on-demand high
performance computing by seamlessly integrating familiar high productivity
tools to provide users with an increased number of design turns, rapid
prototyping capability, and faster time to insight. In this paper, we discuss
the lessons learned while supporting interactive, on-demand high performance
computing from the perspectives of the users and the team supporting the users
and the system. Building on these lessons, we present an overview of current
needs and the technical solutions we are building to lower the barrier to entry
for new users from the humanities, social, and biological sciences.
Comment: 15 pages, 3 figures, First Workshop on Interactive High Performance
Computing (WIHPC) 2018, held in conjunction with ISC High Performance 2018 in
Frankfurt, Germany
Achieving 100,000,000 database inserts per second using Accumulo and D4M
The Apache Accumulo database is an open source relaxed consistency database
that is widely used for government applications. Accumulo is designed to
deliver high performance on unstructured data such as graphs of network data.
This paper tests the performance of Accumulo using data from the Graph500
benchmark. The Dynamic Distributed Dimensional Data Model (D4M) software is
used to implement the benchmark on a 216-node cluster running the MIT
SuperCloud software stack. A peak performance of over 100,000,000 database
inserts per second was achieved, which is 100x greater than the highest
previously published value for any other database. The performance scales
linearly with the number of ingest clients, number of database servers, and
data size. The performance was achieved by adapting several supercomputing
techniques to this application: distributed arrays, domain decomposition,
adaptive load balancing, and single-program-multiple-data programming.
Comment: 6 pages; to appear in IEEE High Performance Extreme Computing (HPEC)
201
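Of the techniques listed, domain decomposition is the easiest to show concretely. The sketch below is a hedged stand-in, not Accumulo or D4M: it partitions a Graph500-style edge list across ingest clients by source vertex, so each client writes a disjoint portion of the key space, and it measures the resulting load imbalance. The function names are illustrative.

```python
# Hedged sketch of domain decomposition for parallel ingest
# (illustrative only; not the Accumulo/D4M implementation).
import random

def partition_edges(edges, nservers):
    # Assign each (u, v) edge to a shard by its source vertex, so
    # every ingest client owns a disjoint slice of the key space.
    shards = [[] for _ in range(nservers)]
    for u, v in edges:
        shards[u % nservers].append((u, v))
    return shards

def imbalance(shards):
    # Ratio of the most-loaded shard to the mean load; 1.0 is perfect
    # balance. Adaptive load balancing would re-split hot shards.
    sizes = [len(s) for s in shards]
    return max(sizes) / (sum(sizes) / len(sizes))

if __name__ == "__main__":
    random.seed(0)
    edges = [(random.randrange(1 << 16), random.randrange(1 << 16))
             for _ in range(100_000)]
    shards = partition_edges(edges, 8)
    assert sum(len(s) for s in shards) == 100_000  # nothing lost
    print(round(imbalance(shards), 3))  # close to 1.0 for uniform keys
```

With uniform random keys the simple modulo split balances well; skewed real-world graphs are what motivate the adaptive load balancing the paper mentions.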
Interactive Supercomputing on 40,000 Cores for Machine Learning and Data Analysis
Interactive massively parallel computations are critical for machine learning
and data analysis. These computations are a staple of the MIT Lincoln
Laboratory Supercomputing Center (LLSC) and have required the LLSC to develop
unique interactive supercomputing capabilities. Scaling interactive machine
learning frameworks, such as TensorFlow, and data analysis environments, such
as MATLAB/Octave, to tens of thousands of cores presents many technical
challenges - in particular, rapidly dispatching many tasks through a scheduler,
such as Slurm, and starting many instances of applications with thousands of
dependencies. Careful tuning of launches and prepositioning of applications
overcome these challenges and allow the launching of thousands of tasks in
seconds on a 40,000-core supercomputer. Specifically, this work demonstrates
launching 32,000 TensorFlow processes in 4 seconds and launching 262,000 Octave
processes in 40 seconds. These capabilities allow researchers to rapidly
explore novel machine learning architectures and data analysis algorithms.
Comment: 6 pages, 7 figures, IEEE High Performance Extreme Computing
Conference 201
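The launch pattern can be illustrated at a small scale. The sketch below is a hedged stand-in for the real scheduler-level machinery (Slurm tuning, prepositioned applications): it dispatches many short SPMD-style tasks through a local process pool, batching them so dispatch overhead stays low. Nothing here is LLSC tooling.

```python
# Hedged sketch: rapid dispatch of many short tasks through a local
# process pool, a small-scale stand-in for scheduler-level launches.
from concurrent.futures import ProcessPoolExecutor

def task(rank):
    # SPMD-style: each task receives its rank and would normally
    # use it to select its own slice of the work.
    return rank * rank

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as ex:
        # chunksize batches many tasks per dispatch, cutting
        # per-task launch overhead.
        results = list(ex.map(task, range(1000), chunksize=64))
    print(len(results), results[-1])  # prints: 1000 998001
```

Batching (chunksize) is the local analogue of the launch tuning the paper describes: the cost of starting work is amortized over many tasks instead of paid once per task.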
Parallel MATLAB Techniques
In this chapter, we show why parallel MATLAB is useful, provide a comparison
of the different parallel MATLAB choices, and describe a number of applications
in Signal and Image Processing: Audio Signal Processing, Synthetic Aperture
Radar (SAR) Processing and Superconducting Quantum Interference Filters
(SQIFs). Each of these applications has been parallelized using different
methods (task-parallel and data-parallel techniques). The applications
presented may be considered representative of the types of problems faced by
signal and image processing researchers. This chapter will also strive to serve as a
guide to new signal and image processing parallel programmers, by suggesting a
parallelization strategy that can be employed when developing a general
parallel algorithm. The objective of this chapter is to help signal and image
processing algorithm developers understand the advantages of using parallel
MATLAB to tackle larger problems while staying within the powerful environment
of MATLAB.
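The task-parallel/data-parallel distinction the chapter draws can be sketched compactly. The example below is a hedged Python analogue, not the chapter's MATLAB code: in the task-parallel style each worker processes whole, independent signals; in the data-parallel style one large array is split and partial results are reduced afterwards. All function names are illustrative.

```python
# Hedged Python analogue of the two parallelization strategies the
# chapter compares for MATLAB applications (illustrative names only).
from multiprocessing import Pool

def process_signal(sig):
    # Task parallel: each worker handles an entire independent signal
    # (e.g., one audio file or one SAR pulse per task).
    return sum(sig) / len(sig)

def process_chunk(chunk):
    # Data parallel: one large array is split across workers; each
    # returns a partial result that is reduced afterwards.
    return sum(chunk)

if __name__ == "__main__":
    signals = [[float(i + j) for j in range(100)] for i in range(16)]
    big = [float(i) for i in range(10_000)]
    with Pool(4) as pool:
        means = pool.map(process_signal, signals)               # task parallel
        partials = pool.map(process_chunk,
                            [big[i::4] for i in range(4)])      # data parallel
    total = sum(partials)  # reduction step
    print(len(means), total)  # prints: 16 49995000.0
```

Choosing between the two comes down to the shape of the problem: many independent inputs favor the task-parallel form, while one large array that exceeds a single worker's capacity favors the data-parallel split plus reduction.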
pPython Performance Study
pPython seeks to provide good parallel speed-up without sacrificing the ease
of programming in Python. It does so by implementing partitioned global array
semantics (PGAS) on top of a simple file-based messaging library (PythonMPI)
in pure Python. pPython follows an SPMD (single
program multiple data) model of computation. pPython runs on a single-node
(e.g., a laptop) running Windows, Linux, or MacOS operating systems or on any
combination of heterogeneous systems that support Python, including on a
cluster through a Slurm scheduler interface so that pPython can be executed in
a massively parallel computing environment. Given pPython's unique file-based
messaging implementation, it is interesting to see how its performance
compares with traditional socket-based MPI communication. In
this paper, we present the point-to-point and collective communication
performances of pPython and compare them with those obtained by using mpi4py
with OpenMPI. For large messages, pPython demonstrates performance comparable
to that of mpi4py.
Comment: arXiv admin note: substantial text overlap with arXiv:2208.1490
HPCmatlab: A Framework for Fast Prototyping of Parallel Applications in Matlab
The HPCmatlab framework has been developed for distributed-memory programming in Matlab/Octave using the Message Passing Interface (MPI). The communication routines in the MPI library are implemented using MEX wrappers. Point-to-point, collective, as well as one-sided communication is supported. Benchmarking results show better performance than the MathWorks Distributed Computing Server. HPCmatlab has been used to successfully parallelize and speed up Matlab applications developed for scientific computing. The application results show good scalability, while preserving ease of programmability. HPCmatlab also enables shared-memory programming using Pthreads and parallel I/O using the ADIOS package.