Deductive Optimization of Relational Data Storage
Optimizing the physical storage of data and optimizing its retrieval are two key
database management problems. In this paper, we propose a language that can
express a wide range of physical database layouts, going well beyond the row-
and column-based methods that are widely used in database management systems.
We use deductive synthesis to turn a high-level relational representation of a
database query into a highly optimized low-level implementation which operates
on a specialized layout of the dataset. We build a compiler for this language
and conduct experiments using a popular database benchmark, which show that
the performance of these specialized queries is competitive with a
state-of-the-art in-memory compiled database system.
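To make the idea of a physical layout concrete, the following is a minimal sketch and not the language proposed in the paper. It stores the same toy relation in a row layout, a column layout, and a layout specialized to one query. The relation, the attribute names, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy relation: (order_id, customer, amount).
rows = [
    (1, "alice", 30),
    (2, "bob", 45),
    (3, "alice", 25),
]


def total_for_customer_rows(rows, customer):
    """Row layout: scan one tuple per record."""
    return sum(amount for _, cust, amount in rows if cust == customer)


# Column layout: one list per attribute, aligned by position.
columns = {
    "order_id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_for_customer_columns(columns, customer):
    """Column layout: read only the two columns the query touches."""
    return sum(
        amount
        for cust, amount in zip(columns["customer"], columns["amount"])
        if cust == customer
    )


def build_grouped_layout(rows):
    """Query-specialized layout: amounts grouped by customer ahead of time."""
    grouped = defaultdict(list)
    for _, cust, amount in rows:
        grouped[cust].append(amount)
    return dict(grouped)


grouped = build_grouped_layout(rows)


def total_for_customer_grouped(grouped, customer):
    """Specialized layout: one lookup, then a sum over the matching amounts."""
    return sum(grouped.get(customer, []))
```

All three functions answer the same query; only the layout they read from differs, which is the degree of freedom the abstract describes.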
AUTOMATING DATA-LAYOUT DECISIONS IN DOMAIN-SPECIFIC LANGUAGES
A long-standing challenge in High-Performance Computing (HPC) is the simultaneous achievement of programmer productivity and hardware computational efficiency. The challenge has been exacerbated by the onset of multi- and many-core CPUs and accelerators. Only a few expert programmers have been able to hand-code domain-specific data transformations and vectorization schemes needed to extract the best possible performance on such architectures. In this research, we examined the possibility of automating these methods by developing a Domain-Specific Language (DSL) framework. Our DSL approach extends C++14 by embedding into it a high-level data-parallel array language, and by using a domain-specific compiler to compile to hybrid-parallel code. We also implemented an array index-space transformation algebra within this high-level array language to manipulate array data-layouts and data-distributions. The compiler introduces a novel method for SIMD auto-vectorization based on array data-layouts. Our new auto-vectorization technique is shown to outperform the default auto-vectorization strategy by up to 40% for stencil computations. The compiler also automates distributed data movement with overlapping of local compute with remote data movement using polyhedral integer set analysis. Along with these main innovations, we developed a new technique using C++ template metaprogramming for developing embedded DSLs using C++. We also proposed a domain-specific compiler intermediate representation that simplifies data flow analysis of abstract DSL constructs. We evaluated our framework by constructing a DSL for the HPC grand-challenge domain of lattice quantum chromodynamics. Our DSL yielded performance gains of up to twice the flop rate over existing production C code for selected kernels. This gain in performance was obtained while using less than one-tenth the lines of code. 
The performance of this DSL was also competitive with the best hand-optimized and hand-vectorized code, and was an order of magnitude better than existing production DSLs. Doctor of Philosoph
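The index-space transformations mentioned in this abstract can be illustrated with plain offset functions. This is a rough sketch under stated assumptions, not the thesis's algebra or its C++ DSL: each layout is modelled as a function from a logical (site, component) index to a physical offset, and the names `aos_offset`, `soa_offset`, `make_aosoa_offset`, and `relayout` are hypothetical. The blocked layout assumes the number of sites is a multiple of the vector width.

```python
def aos_offset(site, comp, n_sites, n_comps):
    """Array-of-structures: all components of one site are adjacent."""
    return site * n_comps + comp


def soa_offset(site, comp, n_sites, n_comps):
    """Structure-of-arrays: all sites of one component are adjacent."""
    return comp * n_sites + site


def make_aosoa_offset(width):
    """Blocked layout: `width` consecutive sites of one component are adjacent.

    Assumes n_sites is a multiple of width.
    """
    def aosoa_offset(site, comp, n_sites, n_comps):
        block, lane = divmod(site, width)
        return (block * n_comps + comp) * width + lane
    return aosoa_offset


def relayout(data, n_sites, n_comps, src_offset, dst_offset):
    """Copy data from one layout to another by composing the two index maps."""
    out = [None] * (n_sites * n_comps)
    for site in range(n_sites):
        for comp in range(n_comps):
            src = src_offset(site, comp, n_sites, n_comps)
            dst = dst_offset(site, comp, n_sites, n_comps)
            out[dst] = data[src]
    return out


# Four sites with two components each, initially stored as array-of-structures.
aos = ["s0c0", "s0c1", "s1c0", "s1c1", "s2c0", "s2c1", "s3c0", "s3c1"]
soa = relayout(aos, 4, 2, aos_offset, soa_offset)
aosoa = relayout(aos, 4, 2, aos_offset, make_aosoa_offset(2))
```

In the blocked layout, the values a SIMD lane group would load together (for example `s0c0` and `s1c0`) sit next to each other, which is the kind of property a layout-driven vectorizer relies on.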
Code Generation for Efficient Query Processing in Managed Runtimes
In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also to use the same querying language to query an application’s in-memory collections. The latter offers further transparency to developers as the query language and all data is represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.
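The contrast between operator-at-a-time evaluation and query compilation can be sketched as follows. This is an illustrative Python analogue, not the paper's C# or LINQ implementation: `run_interpreted` chains generic operators, while `compile_sum_query` (a hypothetical helper) generates source for one fused loop specialized to the query and compiles it once. The generated code is built from trusted strings only; `exec` must not be fed untrusted input.

```python
def run_interpreted(items, predicate, selector):
    """Generic pipeline: one generator per operator, one call per element per step."""
    filtered = (x for x in items if predicate(x))
    projected = (selector(x) for x in filtered)
    return sum(projected)


def compile_sum_query(predicate_src, selector_src):
    """Generate and compile a single fused loop specialized to the query.

    predicate_src and selector_src are trusted Python expressions over `x`.
    """
    source = (
        "def query(items):\n"
        "    total = 0\n"
        "    for x in items:\n"
        f"        if {predicate_src}:\n"
        f"            total += {selector_src}\n"
        "    return total\n"
    )
    namespace = {}
    exec(source, namespace)  # compile once, then reuse the function
    return namespace["query"]


# Hypothetical application objects.
orders = [
    {"qty": 2, "price": 10},
    {"qty": 5, "price": 3},
    {"qty": 1, "price": 100},
]

interpreted = run_interpreted(
    orders,
    lambda x: x["qty"] > 1,
    lambda x: x["qty"] * x["price"],
)
compiled_query = compile_sum_query('x["qty"] > 1', 'x["qty"] * x["price"]')
compiled = compiled_query(orders)
```

Both paths compute the same result; the compiled path removes the per-operator indirection, which is the general effect query compilation aims for.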
Hydrogen turbine power conversion system assessment
A three-part technical study was conducted whereby parametric technical and economic feasibility data were developed on several power conversion systems suitable for the generation of central station electric power through the combustion of hydrogen and the use of the resulting heat energy in turbogenerator equipment. The study assessed potential applications of hydrogen-fueled power conversion systems and identified the three most promising candidates: (1) Ericsson Cycle, (2) gas turbine, and (3) direct steam injection system for fossil fuel as well as nuclear powerplants. A technical and economic evaluation was performed on the three systems, from which the direct injection system (fossil fuel only) was selected for a preliminary conceptual design of an integrated hydrogen-fired power conversion system.
A metadata-enhanced framework for high performance visual effects
This thesis is devoted to reducing the interactive latency of image processing computations in
visual effects. Film and television graphic artists depend upon low-latency feedback to receive
a visual response to changes in effect parameters. We tackle latency with a domain-specific optimising
compiler which leverages high-level program metadata to guide key computational and
memory hierarchy optimisations. This metadata encodes static and dynamic information about
data dependence and patterns of memory access in the algorithms constituting a visual effect –
features that are typically difficult to extract through program analysis – and presents it to the
compiler in an explicit form. By using domain-specific information as a substitute for program
analysis, our compiler is able to target a set of complex source-level optimisations that a vendor
compiler does not attempt, before passing the optimised source to the vendor compiler for
lower-level optimisation.
Three key metadata-supported optimisations are presented. The first is an adaptation of
space and schedule optimisation – based upon well-known compositions of the loop fusion and
array contraction transformations – to the dynamic working sets and schedules of a runtime-parameterised
visual effect. This adaptation sidesteps the costly solution of runtime code generation
by specialising static parameters in an offline process and exploiting dynamic metadata to
adapt the schedule and contracted working sets at runtime to user-tunable parameters. The second
optimisation comprises a set of transformations to generate SIMD ISA-augmented source code.
Our approach differs from autovectorisation by using static metadata to identify parallelism, in
place of data dependence analysis, and runtime metadata to tune the data layout to user-tunable
parameters for optimal aligned memory access. The third optimisation comprises a related set
of transformations to generate code for SIMT architectures, such as GPUs. Static dependence
metadata is exploited to guide large-scale parallelisation for tens of thousands of in-flight threads.
Optimal use of the alignment-sensitive, explicitly managed memory hierarchy is achieved by identifying
inter-thread and intra-core data sharing opportunities in memory access metadata.
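Loop fusion and array contraction, the transformations the first optimisation builds on, can be shown on a two-stage toy pipeline. This is a minimal sketch, assuming a one-dimensional three-point blur followed by a gain; the function names are hypothetical and the code is not taken from the thesis.

```python
def blur_then_scale_unfused(pixels, gain):
    """Two separate loops communicating through a full-size temporary array."""
    n = len(pixels)
    blurred = [0.0] * n
    for i in range(1, n - 1):
        blurred[i] = (pixels[i - 1] + pixels[i] + pixels[i + 1]) / 3.0
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = blurred[i] * gain
    return out


def blur_then_scale_fused(pixels, gain):
    """Fused loop: the temporary array is contracted to a single scalar."""
    n = len(pixels)
    out = [0.0] * n
    for i in range(1, n - 1):
        blurred_i = (pixels[i - 1] + pixels[i] + pixels[i + 1]) / 3.0
        out[i] = blurred_i * gain
    return out
```

The fused version produces the same values while keeping the intermediate result in a scalar, so the working set no longer grows with the image size.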
A detailed performance analysis of these optimisations is presented for two industrially developed
visual effects. In our evaluation we demonstrate up to 8.1x speed-ups on Intel and AMD
multicore CPUs and up to 6.6x speed-ups on NVIDIA GPUs over our best hand-written implementations
of these two effects. Programmability is enhanced by automating the generation of
SIMD and SIMT implementations from a single programmer-managed scalar representation.
Efficient query processing in managed runtimes
This thesis presents strategies to improve the query evaluation performance over
huge volumes of relational-like data that is stored in the memory space of managed
applications. Storing and processing application data in the memory space of managed
applications is motivated by the convergence of two recent trends in data management.
First, dropping DRAM prices have led to memory capacities that allow the entire working
set of an application to fit into main memory and to the emergence of in-memory
database systems (IMDBs). Second, language-integrated query transparently integrates
query processing syntax into programming languages and, therefore, allows complex
queries to be composed in the application. IMDBs typically serve as data stores to applications
written in an object-oriented language running on a managed runtime. In
this thesis, we propose a deeper integration of the two by storing all application data in
the memory space of the application and using language-integrated query, combined
with query compilation techniques, to provide fast query processing.
As a starting point, we look into storing data as runtime-managed objects in collection
types provided by the programming language. Queries are formulated using
language-integrated query and dynamically compiled to specialized functions that produce
the result of the query in a more efficient way by leveraging query compilation
techniques similar to those used in modern database systems. We show that the generated
query functions significantly improve query processing performance compared to
the default execution model for language-integrated query. However, we also identify
additional inefficiencies that can only be addressed by processing queries using low-level
techniques which cannot be applied to runtime-managed objects. To address this,
we introduce a staging phase in the generated code that makes query-relevant managed
data accessible to low-level query code. Our experiments in .NET show an improvement
in query evaluation performance of up to an order of magnitude over the default
language-integrated query implementation.
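The staging idea described above can be approximated in a few lines. This is a hedged Python analogue rather than the thesis's .NET implementation: the query-relevant fields of managed objects are first copied into flat typed arrays, and the query then runs over those arrays only. The `Order` class and the functions `stage` and `revenue_over_threshold` are assumptions made for illustration.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical application object held in an ordinary collection."""
    customer: str
    quantity: int
    price: float


def stage(orders):
    """Staging phase: copy only the fields the query needs into flat arrays."""
    quantities = array("q", (o.quantity for o in orders))  # signed 64-bit ints
    prices = array("d", (o.price for o in orders))         # doubles
    return quantities, prices


def revenue_over_threshold(quantities, prices, min_quantity):
    """Query phase: a tight loop over the staged arrays, no object access."""
    total = 0.0
    for i in range(len(quantities)):
        if quantities[i] >= min_quantity:
            total += quantities[i] * prices[i]
    return total


orders = [
    Order("alice", 2, 10.0),
    Order("bob", 5, 3.0),
    Order("carol", 1, 100.0),
]
quantities, prices = stage(orders)
```

Calling `revenue_over_threshold(quantities, prices, 2)` evaluates the query without touching the `Order` objects again; the `customer` field is never staged because the query does not use it.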
Motivated by additional inefficiencies caused by automatic garbage collection, we
introduce a new collection type, the black-box collection. Black-box collections integrate
the in-memory storage layer of a relational database system to store data and hide
the internal storage layout from the application by employing existing object-relational
mapping techniques (hence, the name black-box). Our experiments show that black-box
collections provide better query performance than runtime-managed collections
by allowing the generated query code to directly access the underlying relational in-memory
data store using low-level techniques. Black-box collections also outperform
a modern commercial database system. By removing huge volumes of collection data
from the managed heap, black-box collections further improve the overall performance
and response time of the application and improve the application’s scalability when
facing huge volumes of collection data.
To enable a deeper integration of the data store with the application, we introduce
self-managed collections. Self-managed collections are a new type of collection for
managed applications that, in contrast to black-box collections, store objects. As the
data elements stored in the collection are objects, they are directly accessible from the
application using references which allows for better integration of the data store with
the application. Self-managed collections manually manage the memory of objects
stored within them in a private heap that is excluded from garbage collection. We introduce
a special collection syntax and a novel type-safe manual memory management
system for this purpose. As was the case for black-box collections, self-managed collections
improve query performance by utilizing a database-inspired data layout and
allowing the use of low-level techniques. By also supporting references between collection
objects, they outperform black-box collections.
Application-Specific Heterogeneous Network-on-Chip Design
As a result of increasing communication demands, application-specific and scalable Networks-on-Chip (NoCs) have emerged to connect processing cores and subsystems in Multiprocessor Systems-on-Chip. A challenge in application-specific NoC design is to find the right balance among different tradeoffs, such as communication latency, power consumption and chip area. We propose a novel approach that generates latency-aware heterogeneous NoC topologies. Experimental results show that our approach improves the total communication latency by up to 27% with modest power consumption. © 2013 The Author 2013. Published by Oxford University Press on behalf of The British Computer Society