
147 research outputs found

Somoclu: An Efficient Parallel Library for Self-Organizing Maps

Author: Gao Shi Chao
Lim Ik Soo
Wittek Peter
Zhao Li
Publication venue: Foundation for Open Access Statistics
Publication date: 01/06/2017

Somoclu is a massively parallel tool, written in C++, for training self-organizing maps on large data sets. It builds on OpenMP for multicore execution and on MPI for distributing the workload across the nodes in a cluster. It can also accelerate training with CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R and MATLAB interfaces facilitate interactive use. Apart from fast execution, memory use is highly optimized, enabling the training of large emergent maps even on a single computer.
Comment: 26 pages, 9 figures. The code is available at https://peterwittek.github.io/somoclu
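For readers unfamiliar with self-organizing maps, the following is a minimal sketch of one textbook online training update, written in plain Python. It is an illustration only and is not taken from Somoclu: the function names, the Gaussian neighborhood, and the tiny 2x2 map are all assumptions, and the sketch says nothing about how Somoclu's OpenMP, MPI, or CUDA kernels are organized.

```python
import math


def best_matching_unit(weights, sample):
    """Return the index of the map node whose weight vector is closest to sample."""
    def squared_distance(w):
        return sum((wi - si) ** 2 for wi, si in zip(w, sample))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def som_update(weights, grid, sample, learning_rate, radius):
    """Apply one online update in place and return the index of the winning node.

    weights: list of weight vectors, one per map node
    grid:    list of (row, col) positions of the nodes on the map
    """
    bmu = best_matching_unit(weights, sample)
    bmu_row, bmu_col = grid[bmu]
    for i, (row, col) in enumerate(grid):
        # Gaussian neighborhood measured on the map grid, not in data space.
        grid_dist_sq = (row - bmu_row) ** 2 + (col - bmu_col) ** 2
        influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
        weights[i] = [w + learning_rate * influence * (s - w)
                      for w, s in zip(weights[i], sample)]
    return bmu


# Hypothetical 2x2 map with two-dimensional weight vectors.
grid = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
winner = som_update(weights, grid, sample=[0.9, 0.8], learning_rate=0.5, radius=1.0)
print(winner, weights[winner])
```

The winning node moves halfway toward the sample, and its grid neighbors move by a smaller amount that decays with distance on the map.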

arXiv.org e-Print Archive

Directory of Open Access Journals

Journal of Statistical Software

Bangor University Research Portal

A Survey of CUDA-based Multidimensional Scaling on GPU Architecture

Author: Osipyan Hasmik
Publication venue: OASIcs - OpenAccess Series in Informatics. 2015 Imperial College Computing Student Workshop (ICCSW 2015)
Publication date: 01/01/2015

The need to analyze large amounts of multivariate data raises the fundamental problem of dimensionality reduction, which is defined as the process of mapping data from a high-dimensional space into a low-dimensional one. One of the most popular methods for handling this problem is multidimensional scaling. Due to technological advances, the dimensionality of the input data and the amount of processed data are increasing steadily, but processing these data within a reasonable time frame remains an open problem. Recent developments in graphics hardware allow generic parallel computations to be performed on powerful hardware and provide an opportunity to solve many time-constrained problems in both graphical and non-graphical domains. The purpose of this survey is to describe and analyze recent implementations of multidimensional scaling algorithms on graphics processing units and to present the applicability of these algorithms on such architectures, based on experimental results that show a decrease in execution time for multi-level approaches.

Dagstuhl Research Online Publication Server

Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal

Author: Barbosa Jorge
Carretero Pérez Jesús
García Blas Francisco Javier
Morla Ricardo
Publication date: 01/01/2014

Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Algorithms for Advection on Hybrid Parallel Computers

Author: White James Buford, III
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2011

Current climate models have a limited ability to increase spatial resolution because numerical stability requires the time step to decrease. I describe initial experiments with two independent but complementary strategies for attacking this time barrier. First, I describe computational experiments exploring the performance improvements from overlapping computation and communication on hybrid parallel computers. My test case is explicit time integration of linear advection with constant uniform velocity in a three-dimensional periodic domain. I present results for Fortran implementations using various combinations of MPI, OpenMP, and CUDA, with and without overlap of computation and communication. Second, I describe a semi-Lagrangian method for tracer transport that is stable for arbitrary Courant numbers, along with a parallel implementation discretized on the cubed sphere. It shows optimal accuracy at Courant numbers of 10-20, more than an order of magnitude higher than explicit methods. Finally, I describe the development and stability analyses of the time integrators and advection methods I used for my experiments. I develop explicit single-step methods with stability up to Courant numbers of one in each dimension, hybrid explicit-implicit methods with stability for arbitrary Courant numbers, and interpolation operators that enable the arbitrary stability of semi-Lagrangian methods.
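The Courant-number limit mentioned in this abstract can be illustrated with a small sketch. The code below assumes a one-dimensional periodic domain, a positive constant velocity, and a first-order upwind scheme; these are simplifications chosen for illustration and are not the integrators developed in the dissertation. The Courant number is taken here as velocity times time step divided by grid spacing, and all function names are hypothetical.

```python
def upwind_step(u, courant):
    """Advance 1D linear advection by one explicit first-order upwind step."""
    n = len(u)
    # For i == 0, u[i - 1] is u[-1], which supplies the periodic boundary.
    return [u[i] - courant * (u[i] - u[i - 1]) for i in range(n)]


def advect(u0, courant, steps):
    """Apply `steps` upwind steps to a copy of the initial field u0."""
    u = list(u0)
    for _ in range(steps):
        u = upwind_step(u, courant)
    return u


def peak(u):
    """Largest absolute value in the field, used as a crude stability check."""
    return max(abs(x) for x in u)


n_cells = 32
spike = [1.0] + [0.0] * (n_cells - 1)

stable = advect(spike, courant=0.5, steps=20)    # within the explicit limit
unstable = advect(spike, courant=1.5, steps=20)  # beyond the explicit limit
print(peak(stable), peak(unstable))
```

With a Courant number of at most one, each new value is a weighted average of two old values, so the peak cannot grow. Above one, the shortest resolved wave is amplified at every step and the solution blows up, which is the restriction that semi-Lagrangian methods are designed to avoid.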

University of Tennessee, Knoxville: Trace

Doctor of Philosophy in Computing

Author: Grosset Pascal
Publication venue: University of Utah
Publication date: 01/01/2016

The aim of direct volume rendering is to facilitate exploration and understanding of three-dimensional scalar fields referred to as volume datasets. Improving understanding is done by improving depth perception, whereas facilitating exploration is done by speeding up volume rendering. In this dissertation, improving both depth perception and rendering speed is considered. The impact of depth of field (DoF) on depth perception in direct volume rendering is evaluated by conducting a user study in which the test subjects had to choose which of two features, located at different depths, appeared to be in front in a volume-rendered image. Whereas DoF was expected to improve perception in all cases, the user study revealed that DoF reduced depth perception when used on the back feature but produced a marked improvement when used on the front feature. We then worked on improving the speed of volume rendering on distributed memory machines. Distributed volume rendering has three stages: loading, rendering, and compositing. In this dissertation, the focus is on image compositing, more specifically on optimizing communication in image compositing algorithms. For that, we have developed the Task Overlapped Direct Send Tree image compositing algorithm, which works on both CPU- and GPU-accelerated supercomputers and focuses on communication avoidance and overlapping communication with computation; the Dynamically Scheduled Region-Based image compositing algorithm, which uses spatial and temporal awareness to efficiently schedule communication among compositing nodes; and a rendering and compositing pipeline that allows both image compositing and rendering to be done on the GPUs of GPU-accelerated supercomputers. We tested these on CPU- and GPU-accelerated supercomputers and explain how these improvements allow us to obtain better performance than image compositing algorithms that focus on load-balancing and algorithms that have no spatial and temporal awareness of the rendering and compositing stages.

The University of Utah: J. Willard Marriott Digital Library

Ultrafast Error-Bounded Lossy Compression for Scientific Datasets

Author: Cappello Franck
Di Sheng
Liang Xin
Tao Dingwen
Tian Jiannan
Yu Xiaodong
Zhao Kai
Publication venue: Scholars' Mine
Publication date: 27/06/2022

Today's scientific high-performance computing applications and advanced instruments are producing vast volumes of data across a wide range of domains, which impose a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely used in the scientific community because it can not only significantly reduce data volumes but also strictly control data distortion based on a user-specified error bound. Existing lossy compressors, however, cannot offer ultrafast compression speed, which is highly demanded by numerous applications and use cases (such as in-memory compression and online instrument data compression). In this paper, we propose a novel ultrafast error-bounded lossy compressor that can obtain fairly high compression performance on both CPUs and GPUs with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework, called SZx, that achieves ultrafast performance through its novel design comprising only lightweight operations such as bitwise and addition/subtraction operations, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize the performance according to their architectures. (3) We perform a comprehensive evaluation with six real-world production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2 to 16x faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, with respect to both compression and decompression.
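To make the phrase "user-specified error bound" concrete, here is a minimal sketch of an absolute error bound enforced by uniform quantization. This is a generic illustration under stated assumptions and is not the SZx design, which the abstract describes only as built from lightweight bitwise and addition/subtraction operations; the function names and sample data are hypothetical.

```python
def quantize(values, error_bound):
    """Map each value to an integer code using bins of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its quantization bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


def max_abs_error(original, reconstructed):
    """Largest pointwise absolute difference between the two sequences."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


# Hypothetical sample data and bound, for illustration only.
data = [0.0, 0.12, 0.49, 1.7, -3.14159, 2.71828]
error_bound = 0.01

codes = quantize(data, error_bound)
restored = dequantize(codes, error_bound)
print(codes)
print(max_abs_error(data, restored))
```

Rounding to the nearest bin center keeps every reconstructed value within the error bound of the original, up to floating-point rounding. A real compressor would then encode the integer codes compactly, which this sketch leaves out.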

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

On the Porting and Optimisation of Physics Simulations for Heterogeneous Parallel Processors

Author: Martineau Matt J
Publication date: 25/06/2019

Explore Bristol Research

Proceedings, MSVSCC 2015

Author: Old Dominion University Department of Modeling, Simulation & Visualization Engineering
Old Dominion University Virginia Modeling, Analysis & Simulation Center
Publication venue: ODU Digital Commons
Publication date: 16/04/2015

The Virginia Modeling, Analysis and Simulation Center (VMASC) of Old Dominion University hosted the 2015 Modeling, Simulation, & Visualization Student Capstone Conference on April 16th. The Capstone Conference features students in Modeling and Simulation undergraduate and graduate degree programs, and in related fields, from many colleges and universities. Students present their research to an audience of fellow students, faculty, judges, and other distinguished guests. For the students, these presentations afford them the opportunity to impart their innovative research to members of the M&S community from academic, industry, and government backgrounds. Also participating in the conference are faculty and judges who have volunteered their time to impart direct support to their students' research, facilitate the various conference tracks, serve as judges for each of the tracks, and provide overall assistance to this conference. 2015 marks the ninth year of the VMASC Capstone Conference for Modeling, Simulation and Visualization. This year our conference attracted a number of fine student-written papers and presentations, resulting in a total of 51 research works that were presented. This year's conference had record attendance thanks to the support from the various departments at Old Dominion University, other local universities, and the United States Military Academy at West Point. We greatly appreciate all of the work and energy that has gone into this year's conference; it truly was a highly collaborative effort that has resulted in a very successful symposium for the M&S community and all of those involved. Below you will find a brief summary of the best papers and best presentations, with some simple statistics of the overall conference contribution. Following that is a table of contents, broken down by conference track category, with a copy of each included body of work. Thank you again for your time and your contribution, as this conference is designed to continuously evolve and adapt to better suit the authors and M&S supporters. Dr. Yuzhong Shen, Graduate Program Director, MSVE, Capstone Conference Chair; John Shull, Graduate Student, MSVE, Capstone Conference Student Chair

Old Dominion University

Visualisation Studio for the analysis of massive datasets

Author: Tucker Roy Colin
Publication venue: Plymouth University
Publication date: 01/01/2016

This thesis describes the research underpinning and the development of a cross-platform application for the analysis of simultaneously recorded multi-dimensional spike trains. These spike trains are believed to carry the neural code that encodes information in a biological brain. A number of statistical methods already exist to analyse the temporal relationships between the spike trains. Historically, hundreds of spike trains have been simultaneously recorded; however, as a result of technological advances, recording capability has increased. The analysis of thousands of simultaneously recorded spike trains is now a requirement. Effective analysis of large data sets requires software tools that fully exploit the capabilities of modern research computers and effectively manage and present large quantities of data. To be effective, such software tools must be targeted at the field under study, be engineered to exploit the full compute power of research computers, and prevent information overload of the researcher despite presenting a large and complex data set. The Visualisation Studio application produced in this thesis brings together the fields of neuroscience, software engineering and information visualisation to produce a software tool that meets these criteria. A visual programming language for neuroscience is produced that allows for extensive pre-processing of spike train data prior to visualisation. The computational challenges of analysing thousands of spike trains are addressed using parallel processing to fully exploit the modern researcher's computer hardware. In the case of the computationally intensive pairwise cross-correlation analysis, the option to use a high performance compute cluster (HPC) is seamlessly provided. Finally, the principles of information visualisation are applied to key visualisations in neuroscience so that the researcher can effectively manage and visually explore the resulting data sets. The final visualisations can typically represent data sets 10 times larger than previously while remaining highly interactive

Plymouth Electronic Archive and Research Library