Leveraging Grammars For OpenMP Development in Supercomputing Environments
This thesis proposes a solution to streamline the use of supercomputing resources on Southern Methodist University's ManeFrame II supercomputer. A large segment of the research community that uses ManeFrame II belongs outside the computer science department and the Lyle School of Engineering. While these users know how to apply computation to their fields, their knowledge does not necessarily extend to the suite of tools and the operating system required to use ManeFrame II. To address this, the thesis proposes an interface that allows those with little knowledge of Linux and SLURM to use the supercomputing resources that SMU's Center for Scientific Computation provides.
OpenMP is a compiler extension for C, C++, and Fortran that produces multithreaded binaries from in-code directives. With knowledge of OpenMP, researchers are already able to split their code into multiple threads of execution. However, because of the complexity of Linux and SLURM, using OpenMP on the supercomputer can be problematic. This thesis focuses on the use of ANTLR, a programming-language recognition tool. ANTLR allows directives to be inserted into code, from which batch files compatible with the supercomputer's scheduling software, SLURM, are generated. With the batch file, the user can then submit their code to the supercomputer.
Additional tools around this core piece of software provide a usable interface. To make the tool accessible to those without a software background, the proposed forward-facing solution is a web interface through which users upload their code and receive a batch file they can use to run it. This eliminates the need for a new user to download, compile, and run the ANTLR distribution to generate a batch file.
By abstracting away these complexities into a web interface, the solution can generate a batch submission file for the user. Additional tooling assists the user in finding empty nodes for code execution, testing the compilation of their code on the supercomputer, and running a timed sample of their code to ensure that OpenMP leads to a speedup in execution time.
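The core idea of the thesis — recognizing directives in user code and emitting a SLURM batch script — can be sketched as follows. This is an illustration only: the directive syntax (`// @batch key=value`), the defaults, and the generated script are hypothetical, and the actual thesis uses an ANTLR grammar rather than regular expressions.

```python
import re

def generate_batch_script(source: str, job_name: str = "omp_job") -> str:
    """Scan C source for hypothetical '// @batch key=value' directives
    and emit a SLURM batch script (a sketch of the pipeline; the real
    tool parses the source with ANTLR)."""
    # Defaults for a single-node OpenMP job (illustrative values).
    options = {"threads": "1", "time": "00:10:00", "partition": "standard"}
    for match in re.finditer(r"//\s*@batch\s+(\w+)=(\S+)", source):
        options[match.group(1)] = match.group(2)
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH -J {job_name}",
        f"#SBATCH -p {options['partition']}",
        f"#SBATCH -t {options['time']}",
        f"#SBATCH --cpus-per-task={options['threads']}",
        # OpenMP reads the thread count from this environment variable.
        f"export OMP_NUM_THREADS={options['threads']}",
        f"gcc -fopenmp -O2 {job_name}.c -o {job_name}",
        f"srun ./{job_name}",
    ])

example = """
// @batch threads=16
// @batch time=01:00:00
#include <omp.h>
int main(void) { return 0; }
"""
print(generate_batch_script(example))
```

A user who annotates their OpenMP program this way would receive a ready-to-submit script, sparing them from writing SLURM boilerplate by hand.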
The Federal Big Data Research and Development Strategic Plan
This document was developed through the contributions of the NITRD Big Data SSG members and staff. A special thanks and appreciation to the core team of editors, writers, and reviewers: Lida Beninson (NSF), Quincy Brown (NSF), Elizabeth Burrows (NSF), Dana Hunter (NSF), Craig Jolley (USAID), Meredith Lee (DHS), Nishal Mohan (NSF), Chloe Poston (NSF), Renata Rawlings-Goss (NSF), Carly Robinson (DOE Science), Alejandro Suarez (NSF), Martin Wiener (NSF), and Fen Zhao (NSF).
A national Big Data innovation ecosystem is essential to enabling knowledge discovery from and confident action informed by the vast resource of new and diverse datasets that are rapidly becoming available in nearly every aspect of life. Big Data has the potential to radically improve the lives of all Americans. It is now possible to combine disparate, dynamic, and distributed datasets and enable everything from predicting the future behavior of complex systems to precise medical treatments, smart energy usage, and focused educational curricula. Government agency research and public-private partnerships, together with the education and training of future data scientists, will enable applications that directly benefit society and the economy of the Nation.
To derive the greatest benefits from the many, rich sources of Big Data, the Administration announced a “Big Data Research and Development Initiative” on March 29, 2012. Dr. John P. Holdren, Assistant to the President for Science and Technology and Director of the Office of Science and Technology Policy, stated that the initiative “promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security.”
The Federal Big Data Research and Development Strategic Plan (Plan) builds upon the promise and excitement of the myriad applications enabled by Big Data with the objective of guiding Federal agencies as they develop and expand their individual mission-driven programs and investments related to Big Data. The Plan is based on inputs from a series of Federal agency and public activities, and a shared vision: We envision a Big Data innovation ecosystem in which the ability to analyze, extract information from, and make decisions and discoveries based upon large, diverse, and real-time datasets enables new capabilities for Federal agencies and the Nation at large; accelerates the process of scientific discovery and innovation; leads to new fields of research and new areas of inquiry that would otherwise be impossible; educates the next generation of 21st century scientists and engineers; and promotes new economic growth.
The Plan is built around seven strategies that represent key areas of importance for Big Data research and development (R&D). Priorities listed within each strategy highlight the intended outcomes that can be addressed by the missions and research funding of NITRD agencies. These include advancing human understanding in all branches of science, medicine, and security; ensuring the Nation’s continued leadership in research and development; and enhancing the Nation’s ability to address pressing societal and environmental issues facing the Nation and the world through research and development.
Accelerators for Data Processing
The explosive growth in digital data and its growing role in real-time analytics motivate the design of high-performance database management systems (DBMSs). Meanwhile, the slowdown in supply voltage scaling has stymied improvements in core performance and ushered in an era of power-limited chips. These developments motivate the design of software and hardware DBMS accelerators that (1) maximize utility by accelerating the dominant operations, and (2) provide flexibility in the choice of DBMS, data layout, and data types. In this thesis, we identify pointer-intensive data structure operations as a key performance and efficiency bottleneck in data analytics workloads. We observe that data analytics tasks include a large number of independent data structure lookups, each of which is characterized by dependent long-latency memory accesses due to pointer chasing. Unfortunately, exploiting such inter-lookup parallelism to overlap memory accesses from different lookups is not possible within the limited instruction window of modern out-of-order cores. Similarly, software prefetching techniques attempt to exploit inter-lookup parallelism by statically staging independent lookups, and hence break down in the face of irregularity across lookup stages. Based on these observations, we provide a dynamic software acceleration scheme for exploiting inter-lookup parallelism to hide the memory access latency despite the irregularities across lookups. Furthermore, we propose a programmable hardware accelerator to maximize the efficiency of the data structure lookups. As a result, through flexible hardware and software techniques we eliminate a key efficiency and performance bottleneck in data analytics operations.
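The inter-lookup parallelism described above can be illustrated structurally: instead of completing one pointer-chasing traversal at a time, a batch of independent lookups is advanced one pointer step per round, so that (on real hardware, with prefetch instructions) the memory latencies of different lookups overlap. The following Python sketch shows only the control structure of this batching idea; the toy chained hash table and all names are illustrative, not the thesis's actual scheme.

```python
# Toy chained hash table: buckets of linked nodes, so each lookup must
# chase pointers down a chain (the dependent-access pattern identified
# as the bottleneck).
class Node:
    def __init__(self, key, value, nxt=None):
        self.key, self.value, self.nxt = key, value, nxt

def build_table(pairs, nbuckets=4):
    buckets = [None] * nbuckets
    for key, value in pairs:
        slot = hash(key) % nbuckets
        buckets[slot] = Node(key, value, buckets[slot])
    return buckets

def batched_lookup(buckets, keys):
    """Advance every pending lookup one pointer step per round.
    In a real implementation each step would issue a prefetch for the
    next node, overlapping the memory latency of independent lookups,
    and tolerating chains of different lengths (the irregularity that
    defeats statically staged software prefetching)."""
    cursors = {k: buckets[hash(k) % len(buckets)] for k in keys}
    results = {}
    while cursors:
        for key in list(cursors):
            node = cursors[key]
            if node is None:                # chain exhausted: miss
                results[key] = None
                del cursors[key]
            elif node.key == key:           # hit
                results[key] = node.value
                del cursors[key]
            else:                           # one pointer-chase step
                cursors[key] = node.nxt
    return results

table = build_table([("a", 1), ("b", 2), ("c", 3)])
print(batched_lookup(table, ["a", "b", "z"]))
```

Because each lookup keeps its own cursor, a short chain finishing early simply drops out of the batch while longer chains continue, which is how the round-robin structure tolerates irregular chain lengths.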
Design and Implementation of a Domain Specific Language for Deep Learning
Deep Learning (DL) has recently found great success in well-diversified areas such as machine vision, speech recognition, big data analysis, and multimedia understanding. However, the existing state-of-the-art DL frameworks, e.g., Caffe2, Theano, TensorFlow, MXNet, Torch7, and CNTK, are programming libraries with fixed user interfaces, internal representations, and execution environments. Modifying the code of DL layers or data structures is very challenging without an in-depth understanding of the underlying implementation. The optimization of code and execution in these tools is often limited and relies on framework-specific manipulation and scheduling of the DL computation graph, lacking systematic and universal strategies. Furthermore, most of these tools require many dependencies beyond the tool itself and must be built for specific platforms for DL training or inference.

This dissertation presents DeepDSL, a domain specific language (DSL) embedded in Scala that compiles DL networks encoded with DeepDSL to efficient, compact, and portable Java source programs for DL training and inference. DeepDSL represents DL networks as abstract tensor functions, performs symbolic gradient derivation to generate the Intermediate Representation (IR), optimizes the IR expressions, and compiles the optimized IR expressions to cross-platform Java code that is easily modifiable and debuggable. The code runs directly on the GPU without additional dependencies beyond a small set of JNI (Java Native Interface) wrappers for invoking the underlying GPU libraries. Moreover, DeepDSL provides static analysis for memory consumption and error detection.

DeepDSL (our previous results are reported in [zhao2017]; design and implementation details are summarized in [Zhao2018]) has been evaluated with many current state-of-the-art DL networks (e.g., AlexNet, GoogLeNet, VGG, OverFeat, and Deep Residual Networks). While the DSL code is highly compact, with fewer than 100 lines for each network, the Java source code generated by the DeepDSL compiler is highly efficient. Our experiments show that the generated Java source code has very competitive runtime performance and memory efficiency compared to the existing DL frameworks.
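The pipeline DeepDSL describes — symbolic gradient derivation over an expression representation, followed by evaluation of the derived expression — can be miniaturized for scalar expressions. The sketch below is a generic illustration of symbolic differentiation, not DeepDSL's actual Scala IR or its tensor functions.

```python
# Minimal symbolic differentiation over a scalar expression tree,
# miniaturizing the symbolic-gradient step that DeepDSL performs on
# abstract tensor functions (illustration only).
class Expr:
    def __add__(self, other): return Add(self, other)
    def __mul__(self, other): return Mul(self, other)

class Const(Expr):
    def __init__(self, v): self.v = v
    def grad(self, x): return Const(0.0)            # d(c)/dx = 0
    def eval(self, env): return self.v

class Var(Expr):
    def __init__(self, name): self.name = name
    def grad(self, x): return Const(1.0 if x is self else 0.0)
    def eval(self, env): return env[self.name]

class Add(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def grad(self, x):                               # sum rule
        return Add(self.a.grad(x), self.b.grad(x))
    def eval(self, env): return self.a.eval(env) + self.b.eval(env)

class Mul(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def grad(self, x):                               # product rule
        return Add(Mul(self.a.grad(x), self.b), Mul(self.a, self.b.grad(x)))
    def eval(self, env): return self.a.eval(env) * self.b.eval(env)

x = Var("x")
loss = x * x + Const(3.0) * x      # f(x) = x^2 + 3x
dloss = loss.grad(x)               # symbolic expression for f'(x) = 2x + 3
print(dloss.eval({"x": 2.0}))      # evaluates to 7.0
```

In DeepDSL the analogous derivation happens over tensor functions, and the resulting IR is further optimized before being compiled to Java; here the derived expression is simply evaluated as-is.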
Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design
The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface.
Domain Specific Memory Management for Large Scale Data Analytics
Hardware trends over the last several decades have led to shifting priorities with respect to performance bottlenecks in the implementations of dataflows typically present in large-scale data analytics applications. In particular, efficient use of main memory has emerged as a critical aspect of dataflow implementation, due to the proliferation of multi-core architectures as well as the rapid development of faster-than-disk storage media. At the same time, the wealth of static domain-specific information about applications remains an untapped resource when it comes to optimizing the use of memory in a dataflow application.

We propose a compilation-based approach to the synthesis of memory-efficient dataflow implementations, using static analysis to extract and leverage domain-specific information about the application. Our program transformations use the combined results of type, effect, and provenance analyses to infer time- and space-effective placement of primitive memory operations, precluding the need for dynamic memory management and its attendant costs. The experimental evaluation of implementations synthesized with our framework shows both the importance of optimizing for memory performance and the significant benefits of our approach along multiple dimensions.

Finally, we also demonstrate a framework for formally verifying the soundness of these transformations, laying the foundation for their use as a component of a more general implementation synthesis ecosystem.
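The idea of statically placing primitive memory operations can be illustrated with a last-use (liveness) computation over a straight-line dataflow: once a compiler knows where each intermediate value is used for the last time, it can emit an explicit free at that point instead of deferring to dynamic memory management. This is a hypothetical sketch of that general principle, not the dissertation's actual type, effect, and provenance analyses.

```python
# Place explicit 'free' operations after the last use of each value in a
# straight-line dataflow, sketching how static analysis can replace
# dynamic memory management (hypothetical example program).
def place_frees(program):
    """program: list of (output, inputs) pairs in execution order.
    Returns the program interleaved with ('free', value) operations."""
    last_use = {}
    for i, (out, inputs) in enumerate(program):
        for v in inputs:
            last_use[v] = i                 # later uses overwrite earlier ones
    schedule = []
    for i, (out, inputs) in enumerate(program):
        schedule.append(("compute", out, inputs))
        for v in inputs:
            if last_use[v] == i:            # v is dead after this operation
                schedule.append(("free", v))
    return schedule

# A toy dataflow: t1 and t2 feed t3; t1 also feeds t4.
dataflow = [
    ("t1", ["in"]),
    ("t2", ["in"]),
    ("t3", ["t1", "t2"]),
    ("t4", ["t1", "t3"]),
]
for op in place_frees(dataflow):
    print(op)
```

Because every free site is decided at compile time, the synthesized schedule needs no reference counting or garbage collection at run time; the real framework makes the analogous decisions using richer analyses and proves them sound.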
The Role of Distributed Computing in Big Data Science: Case Studies in Forensics and Bioinformatics
2014 - 2015

The era of Big Data is leading to the generation of large amounts of data, which require storage and analysis capabilities that can only be addressed by distributed computing systems. To facilitate large-scale distributed computing, many programming paradigms and frameworks have been proposed, such as MapReduce and Apache Hadoop, which transparently address some issues of distributed systems and hide most of their technical details.

Hadoop is currently the most popular and mature framework supporting the MapReduce paradigm, and it is widely used to store and process Big Data using a cluster of computers. Solutions such as Hadoop are attractive, since they simplify the transformation of an application from a non-parallel to a distributed one by means of general utilities and without requiring much specialized skill. However, without any algorithm engineering activity, some target applications are not altogether fast and efficient, and they can suffer from several problems and drawbacks when executed on a distributed system. In fact, a distributed implementation is a necessary but not sufficient condition for obtaining remarkable performance with respect to a non-parallel counterpart. Therefore, it is necessary to assess how distributed solutions run on a Hadoop cluster, and how their performance can be improved to reduce resource consumption and completion times.

In this dissertation, we show how Hadoop-based implementations can be enhanced through careful algorithm engineering, tuning, profiling, and code improvements. We also analyze how to achieve these goals by working on some critical points, such as data-local computation, input split size, number and granularity of tasks, cluster configuration, and input/output representation.

In particular, to address these issues, we choose case studies from two research areas where the amount of data is rapidly increasing, namely Digital Image Forensics and Bioinformatics. We mainly describe full-fledged implementations to show how to design, engineer, improve, and evaluate Hadoop-based solutions for the Source Camera Identification problem, i.e., recognizing the camera used to take a given digital image, adopting the algorithm by Fridrich et al., and for two of the main problems in Bioinformatics, i.e., alignment-free sequence comparison and the extraction of k-mer cumulative or local statistics.

The results achieved by our improved implementations show that they are substantially faster than the non-parallel counterparts, and remarkably faster than the corresponding naive Hadoop-based implementations. In some cases, for example, our solution for k-mer statistics is approximately 30× faster than our naive Hadoop-based implementation, and about 40× faster than an analogous tool built on Hadoop. In addition, our applications are also scalable, i.e., execution times are (approximately) halved by doubling the computing units. Indeed, algorithm engineering activities based on the implementation of smart improvements, supported by careful profiling and tuning, can lead to much better experimental performance while avoiding potential problems.

We also highlight how the proposed solutions, tips, tricks, and insights can be used in other research areas and problems.

Although Hadoop simplifies some tasks of distributed environments, one must know it thoroughly to achieve remarkable performance. It is not enough to be an expert in the application domain to build Hadoop-based implementations; to achieve good performance, expertise in distributed systems, algorithm engineering, tuning, profiling, etc. is also required. Therefore, the best performance depends heavily on the degree of cooperation between the domain expert and the distributed algorithm engineer.
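The k-mer counting problem mentioned above maps naturally onto the MapReduce paradigm. The following local Python sketch mirrors the map and reduce phases a Hadoop job would run; it is purely illustrative (the dissertation's implementations run on an actual Hadoop cluster, and the value of k is arbitrary here).

```python
from collections import defaultdict

K = 3  # k-mer length (illustrative choice)

def mapper(sequence):
    """Map phase: emit (k-mer, 1) for every k-length substring."""
    for i in range(len(sequence) - K + 1):
        yield sequence[i:i + K], 1

def reducer(pairs):
    """Reduce phase: sum the counts per k-mer. In Hadoop the framework
    shuffles pairs by key between the two phases; here a dict plays
    that role."""
    counts = defaultdict(int)
    for kmer, n in pairs:
        counts[kmer] += n
    return dict(counts)

reads = ["ACGTAC", "GTACGT"]
pairs = (pair for read in reads for pair in mapper(read))
print(reducer(pairs))
```

The naive version of such a job emits one pair per k-mer occurrence; the kind of engineering the dissertation advocates (e.g., in-mapper combining, tuned split sizes) reduces the volume of data shuffled between the phases, which is where much of the speedup comes from.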
Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021
The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing.