Leveraging Grammars For OpenMP Development in Supercomputing Environments
This thesis proposes a solution to streamline the use of supercomputing resources on Southern Methodist University's ManeFrame II supercomputer. A large segment of the research community that uses ManeFrame II belongs outside the computer science department and the Lyle School of Engineering. While these users know how to apply computation to their fields, their knowledge does not necessarily extend to the suite of tools and the operating system required to use ManeFrame II. To address this, the thesis proposes an interface that allows those with little knowledge of Linux and SLURM to use the supercomputing resources that SMU's Center for Scientific Computation provides.
OpenMP is a compiler extension for C, C++, and Fortran that produces multithreaded binaries from in-code directives. With knowledge of OpenMP, researchers are already able to split their code into multiple threads of execution. However, because of the complexity of Linux and SLURM, using OpenMP on the supercomputer can be problematic. This thesis focuses on the use of ANTLR, a programming-language recognition tool. ANTLR allows directives to be inserted into code, from which batch files compatible with the supercomputer's scheduling software, SLURM, are generated. With the batch file, the user can then submit their code to the supercomputer.
Additional tools around this core piece of software provide a usable interface. To make the tool accessible to those without a software background, the proposed forward-facing solution is a web interface through which users upload their code and receive a batch file they can use to run it. This eliminates the need for a new user to download, compile, and run the ANTLR distribution to generate a batch file.
By abstracting away these complexities into a web interface, the solution can generate a batch submission file for the user. Additional tooling assists the user in finding empty nodes for code execution, testing the compilation of their code on the supercomputer, and running a timed sample of their code to ensure that OpenMP leads to a speedup in execution time.
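The core idea of the thesis — recognizing directives in user code and emitting a SLURM batch script — can be sketched as follows. This is an illustration only: the directive syntax (`// @batch key=value`), the defaults, and the generated script are hypothetical, and the actual thesis uses an ANTLR grammar rather than regular expressions.

```python
import re

def generate_batch_script(source: str, job_name: str = "omp_job") -> str:
    """Scan C source for hypothetical '// @batch key=value' directives
    and emit a SLURM batch script (a sketch of the pipeline; the real
    tool parses the source with ANTLR)."""
    # Defaults for a single-node OpenMP job (illustrative values).
    options = {"threads": "1", "time": "00:10:00", "partition": "standard"}
    for match in re.finditer(r"//\s*@batch\s+(\w+)=(\S+)", source):
        options[match.group(1)] = match.group(2)
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH -J {job_name}",
        f"#SBATCH -p {options['partition']}",
        f"#SBATCH -t {options['time']}",
        f"#SBATCH --cpus-per-task={options['threads']}",
        # OpenMP reads the thread count from this environment variable.
        f"export OMP_NUM_THREADS={options['threads']}",
        f"gcc -fopenmp -O2 {job_name}.c -o {job_name}",
        f"srun ./{job_name}",
    ])

example = """
// @batch threads=16
// @batch time=01:00:00
#include <omp.h>
int main(void) { return 0; }
"""
print(generate_batch_script(example))
```

A user who annotates their OpenMP program this way would receive a ready-to-submit script, sparing them from writing SLURM boilerplate by hand.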
The Federal Big Data Research and Development Strategic Plan
This document was developed through the contributions of the NITRD Big Data SSG members and staff. A special thanks and appreciation to the core team of editors, writers, and reviewers: Lida Beninson (NSF), Quincy Brown (NSF), Elizabeth Burrows (NSF), Dana Hunter (NSF), Craig Jolley (USAID), Meredith Lee (DHS), Nishal Mohan (NSF), Chloe Poston (NSF), Renata Rawlings-Goss (NSF), Carly Robinson (DOE Science), Alejandro Suarez (NSF), Martin Wiener (NSF), and Fen Zhao (NSF).
A national Big Data innovation ecosystem is essential to enabling knowledge discovery from and confident action informed by the vast resource of new and diverse datasets that are rapidly becoming available in nearly every aspect of life. Big Data has the potential to radically improve the lives of all Americans. It is now possible to combine disparate, dynamic, and distributed datasets and enable everything from predicting the future behavior of complex systems to precise medical treatments, smart energy usage, and focused educational curricula. Government agency research and public-private partnerships, together with the education and training of future data scientists, will enable applications that directly benefit society and the economy of the Nation.
To derive the greatest benefits from the many, rich sources of Big Data, the Administration announced a “Big Data Research and Development Initiative” on March 29, 2012. Dr. John P. Holdren, Assistant to the President for Science and Technology and Director of the Office of Science and Technology Policy, stated that the initiative “promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security.”
The Federal Big Data Research and Development Strategic Plan (Plan) builds upon the promise and excitement of the myriad applications enabled by Big Data with the objective of guiding Federal agencies as they develop and expand their individual mission-driven programs and investments related to Big Data. The Plan is based on inputs from a series of Federal agency and public activities, and a shared vision: We envision a Big Data innovation ecosystem in which the ability to analyze, extract information from, and make decisions and discoveries based upon large, diverse, and real-time datasets enables new capabilities for Federal agencies and the Nation at large; accelerates the process of scientific discovery and innovation; leads to new fields of research and new areas of inquiry that would otherwise be impossible; educates the next generation of 21st century scientists and engineers; and promotes new economic growth.
The Plan is built around seven strategies that represent key areas of importance for Big Data research and development (R&D). Priorities listed within each strategy highlight the intended outcomes that can be addressed by the missions and research funding of NITRD agencies. These include advancing human understanding in all branches of science, medicine, and security; ensuring the Nation’s continued leadership in research and development; and enhancing the Nation’s ability to address pressing societal and environmental issues facing the Nation and the world through research and development.
Accelerators for Data Processing
The explosive growth in digital data and its growing role in real-time analytics motivate the design of high-performance database management systems (DBMSs). Meanwhile, the slowdown in supply voltage scaling has stymied improvements in core performance and ushered in an era of power-limited chips. These developments motivate the design of software and hardware DBMS accelerators that (1) maximize utility by accelerating the dominant operations, and (2) provide flexibility in the choice of DBMS, data layout, and data types. In this thesis, we identify pointer-intensive data structure operations as a key performance and efficiency bottleneck in data analytics workloads. We observe that data analytics tasks include a large number of independent data structure lookups, each of which is characterized by dependent long-latency memory accesses due to pointer chasing. Unfortunately, exploiting such inter-lookup parallelism to overlap memory accesses from different lookups is not possible within the limited instruction window of modern out-of-order cores. Similarly, software prefetching techniques attempt to exploit inter-lookup parallelism by statically staging independent lookups, and hence break down in the face of irregularity across lookup stages. Based on these observations, we provide a dynamic software acceleration scheme for exploiting inter-lookup parallelism to hide the memory access latency despite the irregularities across lookups. Furthermore, we propose a programmable hardware accelerator to maximize the efficiency of the data structure lookups. As a result, through flexible hardware and software techniques we eliminate a key efficiency and performance bottleneck in data analytics operations.
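The inter-lookup parallelism described above can be illustrated structurally: instead of completing one pointer-chasing traversal at a time, a batch of independent lookups is advanced one pointer step per round, so that (on real hardware, with prefetch instructions) the memory latencies of different lookups overlap. The following Python sketch shows only the control structure of this batching idea; the toy chained hash table and all names are illustrative, not the thesis's actual scheme.

```python
# Toy chained hash table: buckets of linked nodes, so each lookup must
# chase pointers down a chain (the dependent-access pattern identified
# as the bottleneck).
class Node:
    def __init__(self, key, value, nxt=None):
        self.key, self.value, self.nxt = key, value, nxt

def build_table(pairs, nbuckets=4):
    buckets = [None] * nbuckets
    for key, value in pairs:
        slot = hash(key) % nbuckets
        buckets[slot] = Node(key, value, buckets[slot])
    return buckets

def batched_lookup(buckets, keys):
    """Advance every pending lookup one pointer step per round.
    In a real implementation each step would issue a prefetch for the
    next node, overlapping the memory latency of independent lookups,
    and tolerating chains of different lengths (the irregularity that
    defeats statically staged software prefetching)."""
    cursors = {k: buckets[hash(k) % len(buckets)] for k in keys}
    results = {}
    while cursors:
        for key in list(cursors):
            node = cursors[key]
            if node is None:                # chain exhausted: miss
                results[key] = None
                del cursors[key]
            elif node.key == key:           # hit
                results[key] = node.value
                del cursors[key]
            else:                           # one pointer-chase step
                cursors[key] = node.nxt
    return results

table = build_table([("a", 1), ("b", 2), ("c", 3)])
print(batched_lookup(table, ["a", "b", "z"]))
```

Because each lookup keeps its own cursor, a short chain finishing early simply drops out of the batch while longer chains continue, which is how the round-robin structure tolerates irregular chain lengths.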
Design and Implementation of a Domain Specific Language for Deep Learning
Deep Learning (DL) has recently found great success in well-diversified areas such as machine vision, speech recognition, big data analysis, and multimedia understanding. However, the existing state-of-the-art DL frameworks, e.g., Caffe2, Theano, TensorFlow, MXNet, Torch7, and CNTK, are programming libraries with fixed user interfaces, internal representations, and execution environments. Modifying the code of DL layers or data structures is very challenging without an in-depth understanding of the underlying implementation. The optimization of code and execution in these tools is often limited and relies on framework-specific manipulation and scheduling of the DL computation graph, lacking systematic and universal strategies. Furthermore, most of these tools require many dependencies beyond the tool itself and must be built for specific platforms for DL training or inference.

This dissertation presents DeepDSL, a domain specific language (DSL) embedded in Scala that compiles DL networks encoded with DeepDSL to efficient, compact, and portable Java source programs for DL training and inference. DeepDSL represents DL networks as abstract tensor functions, performs symbolic gradient derivation to generate the Intermediate Representation (IR), optimizes the IR expressions, and compiles the optimized IR expressions to cross-platform Java code that is easily modifiable and debuggable. The code runs directly on the GPU without additional dependencies beyond a small set of JNI (Java Native Interface) wrappers for invoking the underlying GPU libraries. Moreover, DeepDSL provides static analysis for memory consumption and error detection.

DeepDSL (our previous results are reported in [zhao2017]; design and implementation details are summarized in [Zhao2018]) has been evaluated with many current state-of-the-art DL networks (e.g., AlexNet, GoogLeNet, VGG, OverFeat, and Deep Residual Networks). While the DSL code is highly compact, with fewer than 100 lines for each network, the Java source code generated by the DeepDSL compiler is highly efficient. Our experiments show that the generated Java source code has very competitive runtime performance and memory efficiency compared to the existing DL frameworks.
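The pipeline DeepDSL describes — symbolic gradient derivation over an expression representation, followed by evaluation of the derived expression — can be miniaturized for scalar expressions. The sketch below is a generic illustration of symbolic differentiation, not DeepDSL's actual Scala IR or its tensor functions.

```python
# Minimal symbolic differentiation over a scalar expression tree,
# miniaturizing the symbolic-gradient step that DeepDSL performs on
# abstract tensor functions (illustration only).
class Expr:
    def __add__(self, other): return Add(self, other)
    def __mul__(self, other): return Mul(self, other)

class Const(Expr):
    def __init__(self, v): self.v = v
    def grad(self, x): return Const(0.0)            # d(c)/dx = 0
    def eval(self, env): return self.v

class Var(Expr):
    def __init__(self, name): self.name = name
    def grad(self, x): return Const(1.0 if x is self else 0.0)
    def eval(self, env): return env[self.name]

class Add(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def grad(self, x):                               # sum rule
        return Add(self.a.grad(x), self.b.grad(x))
    def eval(self, env): return self.a.eval(env) + self.b.eval(env)

class Mul(Expr):
    def __init__(self, a, b): self.a, self.b = a, b
    def grad(self, x):                               # product rule
        return Add(Mul(self.a.grad(x), self.b), Mul(self.a, self.b.grad(x)))
    def eval(self, env): return self.a.eval(env) * self.b.eval(env)

x = Var("x")
loss = x * x + Const(3.0) * x      # f(x) = x^2 + 3x
dloss = loss.grad(x)               # symbolic expression for f'(x) = 2x + 3
print(dloss.eval({"x": 2.0}))      # evaluates to 7.0
```

In DeepDSL the analogous derivation happens over tensor functions, and the resulting IR is further optimized before being compiled to Java; here the derived expression is simply evaluated as-is.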
Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design
The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface.
Domain Specific Memory Management for Large Scale Data Analytics
Hardware trends over the last several decades have led to shifting priorities with respect to performance bottlenecks in the implementations of dataflows typically present in large-scale data analytics applications. In particular, efficient use of main memory has emerged as a critical aspect of dataflow implementation, due to the proliferation of multi-core architectures as well as the rapid development of faster-than-disk storage media. At the same time, the wealth of static domain-specific information about applications remains an untapped resource when it comes to optimizing the use of memory in a dataflow application.

We propose a compilation-based approach to the synthesis of memory-efficient dataflow implementations, using static analysis to extract and leverage domain-specific information about the application. Our program transformations use the combined results of type, effect, and provenance analyses to infer time- and space-effective placement of primitive memory operations, precluding the need for dynamic memory management and its attendant costs. The experimental evaluation of implementations synthesized with our framework shows both the importance of optimizing for memory performance and the significant benefits of our approach along multiple dimensions.

Finally, we also demonstrate a framework for formally verifying the soundness of these transformations, laying the foundation for their use as a component of a more general implementation synthesis ecosystem.
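The idea of statically placing primitive memory operations can be illustrated with a last-use (liveness) computation over a straight-line dataflow: once a compiler knows where each intermediate value is used for the last time, it can emit an explicit free at that point instead of deferring to dynamic memory management. This is a hypothetical sketch of that general principle, not the dissertation's actual type, effect, and provenance analyses.

```python
# Place explicit 'free' operations after the last use of each value in a
# straight-line dataflow, sketching how static analysis can replace
# dynamic memory management (hypothetical example program).
def place_frees(program):
    """program: list of (output, inputs) pairs in execution order.
    Returns the program interleaved with ('free', value) operations."""
    last_use = {}
    for i, (out, inputs) in enumerate(program):
        for v in inputs:
            last_use[v] = i                 # later uses overwrite earlier ones
    schedule = []
    for i, (out, inputs) in enumerate(program):
        schedule.append(("compute", out, inputs))
        for v in inputs:
            if last_use[v] == i:            # v is dead after this operation
                schedule.append(("free", v))
    return schedule

# A toy dataflow: t1 and t2 feed t3; t1 also feeds t4.
dataflow = [
    ("t1", ["in"]),
    ("t2", ["in"]),
    ("t3", ["t1", "t2"]),
    ("t4", ["t1", "t3"]),
]
for op in place_frees(dataflow):
    print(op)
```

Because every free site is decided at compile time, the synthesized schedule needs no reference counting or garbage collection at run time; the real framework makes the analogous decisions using richer analyses and proves them sound.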
The Role of Distributed Computing in Big Data Science: Case Studies in Forensics and Bioinformatics
2014 - 2015

The era of Big Data is leading to the generation of large amounts of data, which require storage and analysis capabilities that can only be addressed by distributed computing systems. To facilitate large-scale distributed computing, many programming paradigms and frameworks have been proposed, such as MapReduce and Apache Hadoop, which transparently address some issues of distributed systems and hide most of their technical details.

Hadoop is currently the most popular and mature framework supporting the MapReduce paradigm, and it is widely used to store and process Big Data using a cluster of computers. Solutions such as Hadoop are attractive, since they simplify the transformation of an application from a non-parallel to a distributed one by means of general utilities and without requiring much specialized skill. However, without any algorithm engineering activity, some target applications are not altogether fast and efficient, and they can suffer from several problems and drawbacks when executed on a distributed system. In fact, a distributed implementation is a necessary but not sufficient condition for obtaining remarkable performance with respect to a non-parallel counterpart. Therefore, it is necessary to assess how distributed solutions run on a Hadoop cluster, and how their performance can be improved to reduce resource consumption and completion times.

In this dissertation, we show how Hadoop-based implementations can be enhanced through careful algorithm engineering, tuning, profiling, and code improvements. We also analyze how to achieve these goals by working on some critical points, such as data-local computation, input split size, number and granularity of tasks, cluster configuration, and input/output representation.

In particular, to address these issues, we choose case studies from two research areas where the amount of data is rapidly increasing, namely Digital Image Forensics and Bioinformatics. We mainly describe full-fledged implementations to show how to design, engineer, improve, and evaluate Hadoop-based solutions for the Source Camera Identification problem, i.e., recognizing the camera used to take a given digital image, adopting the algorithm by Fridrich et al., and for two of the main problems in Bioinformatics, i.e., alignment-free sequence comparison and the extraction of k-mer cumulative or local statistics.

The results achieved by our improved implementations show that they are substantially faster than the non-parallel counterparts, and remarkably faster than the corresponding naive Hadoop-based implementations. In some cases, for example, our solution for k-mer statistics is approximately 30× faster than our naive Hadoop-based implementation, and about 40× faster than an analogous tool built on Hadoop. In addition, our applications are also scalable, i.e., execution times are (approximately) halved by doubling the computing units. Indeed, algorithm engineering activities based on the implementation of smart improvements, supported by careful profiling and tuning, can lead to much better experimental performance while avoiding potential problems.

We also highlight how the proposed solutions, tips, tricks, and insights can be used in other research areas and problems.

Although Hadoop simplifies some tasks of distributed environments, one must know it thoroughly to achieve remarkable performance. It is not enough to be an expert in the application domain to build Hadoop-based implementations; to achieve good performance, expertise in distributed systems, algorithm engineering, tuning, profiling, etc. is also required. Therefore, the best performance depends heavily on the degree of cooperation between the domain expert and the distributed algorithm engineer.
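The k-mer counting problem mentioned above maps naturally onto the MapReduce paradigm. The following local Python sketch mirrors the map and reduce phases a Hadoop job would run; it is purely illustrative (the dissertation's implementations run on an actual Hadoop cluster, and the value of k is arbitrary here).

```python
from collections import defaultdict

K = 3  # k-mer length (illustrative choice)

def mapper(sequence):
    """Map phase: emit (k-mer, 1) for every k-length substring."""
    for i in range(len(sequence) - K + 1):
        yield sequence[i:i + K], 1

def reducer(pairs):
    """Reduce phase: sum the counts per k-mer. In Hadoop the framework
    shuffles pairs by key between the two phases; here a dict plays
    that role."""
    counts = defaultdict(int)
    for kmer, n in pairs:
        counts[kmer] += n
    return dict(counts)

reads = ["ACGTAC", "GTACGT"]
pairs = (pair for read in reads for pair in mapper(read))
print(reducer(pairs))
```

The naive version of such a job emits one pair per k-mer occurrence; the kind of engineering the dissertation advocates (e.g., in-mapper combining, tuned split sizes) reduces the volume of data shuffled between the phases, which is where much of the speedup comes from.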
Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021
The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing.