Main Memory Adaptive Indexing for Multi-core Systems
Adaptive indexing is a concept that considers index creation in databases as
a by-product of query processing; as opposed to traditional full index creation
where the indexing effort is performed up front before answering any queries.
Adaptive indexing has received a considerable amount of attention, and several
algorithms have been proposed over the past few years, along with a recent
experimental study comparing a large number of existing methods. Until now,
however, most adaptive indexing algorithms have been designed single-threaded,
yet with multi-core systems already well established, the idea of designing
parallel algorithms for adaptive indexing is very natural. In this regard, only
one parallel algorithm for adaptive indexing has recently appeared in the
literature: the parallel version of standard cracking. In this paper we
describe three alternative parallel algorithms for adaptive indexing, including
a second variant of a parallel standard cracking algorithm. Additionally, we
describe a hybrid parallel sorting algorithm, and a NUMA-aware method based on
sorting. We then thoroughly compare all these algorithms experimentally, along
with a variant of a recently published parallel version of radix sort. Parallel
sorting algorithms serve as a realistic baseline for multi-threaded adaptive
indexing techniques. In total we experimentally compare seven parallel
algorithms. Additionally, we extensively profile all considered algorithms. The
initial set of experiments considered in this paper indicates that our parallel
algorithms significantly improve over previously known ones. Our results
suggest that, although adaptive indexing algorithms are a good design choice in
single-threaded environments, the rules change considerably in the parallel
case. That is, in future highly-parallel environments, sorting algorithms could
be serious alternatives to adaptive indexing. Comment: 26 pages, 7 figures
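To make the idea of index creation as a by-product of query processing concrete, here is a minimal single-threaded sketch of standard cracking. It is an illustration under assumptions, not the paper's code: the class name `CrackedColumn` and its methods are hypothetical, and the partitioning step copies the affected piece rather than swapping in place, purely to keep the sketch short.

```python
import bisect


class CrackedColumn:
    """Toy single-threaded standard cracking over a list of integers."""

    def __init__(self, values):
        self.data = list(values)  # cracker column, reorganized by queries
        self.pivots = []          # sorted pivot values seen so far
        self.bounds = []          # bounds[i]: first position with value >= pivots[i]

    def _crack(self, pivot):
        """Partition one piece around pivot; return first position with value >= pivot."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.bounds[i]  # this pivot was already cracked by an earlier query
        # Only the piece between the neighboring boundaries needs to be touched.
        lo = self.bounds[i - 1] if i > 0 else 0
        hi = self.bounds[i] if i < len(self.bounds) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.bounds.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking the column as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]
```

Each range query partitions only the pieces that contain its two bounds, so the column becomes progressively more ordered as queries arrive, and a repeated bound is answered from the recorded boundaries without touching the data.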
Database (Lecture) Streams on the Cloud: Experience Report on Teaching an Undergrad Database Lecture During a Pandemic
This is an experience report on teaching the undergrad lecture Big Data Engineering online at Saarland University in the summer term of 2020. We describe our teaching philosophy, the tools used, what worked and what did not work. As we received extremely positive feedback from the students, we will continue to use the same teaching model for other lectures in the future.
What if an SQL Statement Returned a Database?
Every SQL statement is limited to returning a single, possibly denormalized,
table. This design decision has far-reaching consequences: (1) for database
users, in terms of slow query performance, long query result transfer times,
and usability issues of SQL in web applications and object-relational mappers;
and (2) for database architects, when designing query optimizers, in terms of
logical (algebraic) join enumeration effort, memory consumption for
intermediate result materialization, and physical operator selection effort.
In short, the entire query optimization stack is shaped
by that design decision. In this paper, we argue that the single-table
limitation should be dropped. We extend the SELECT-clause of SQL by a keyword
'RESULTDB' to support returning a result database. Our approach has clear
semantics, i.e., our extended SQL returns subsets of all tables with only those
tuples that would be part of the traditional (single-table) query result set,
but without performing any denormalization through joins. Our SQL extension
is backward compatible. Moreover, we discuss the surprisingly long list of
benefits of our approach. First, for database users: far simpler and more
readable application code, better query performance, smaller query results,
better query result transfer times. Second, for database architects, we show
how to leverage existing closed-source systems as well as how to change
open-source database systems to support our feature. We propose a couple of
algorithms to integrate our feature into both closed-source and open-source
database systems. We present an initial experimental study with promising results.
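The semantics described above (per-table subsets containing only the tuples that take part in the traditional join result) can be emulated with ordinary SQL. The sketch below does so with Python's built-in `sqlite3` module. The schema (`customers`, `orders`), the sample rows, and the function names are assumptions made for illustration. It does not use the paper's RESULTDB keyword, and it is not one of the paper's algorithms.

```python
import sqlite3


def make_demo_db():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1, 150), (11, 1, 200), (12, 2, 50), (13, 3, 120);
        """
    )
    return conn


# Shared FROM/WHERE part of the traditional single-table query.
JOIN = "FROM customers c JOIN orders o ON o.customer_id = c.id WHERE o.total > 100"


def single_table_result(conn):
    """Traditional result: one denormalized table, customer data repeated per order."""
    return conn.execute(f"SELECT c.id, c.name, o.id, o.total {JOIN} ORDER BY o.id").fetchall()


def result_db(conn):
    """Emulated result database: per-table subsets of the participating tuples."""
    return {
        "customers": conn.execute(f"SELECT DISTINCT c.* {JOIN} ORDER BY c.id").fetchall(),
        "orders": conn.execute(f"SELECT DISTINCT o.* {JOIN} ORDER BY o.id").fetchall(),
    }
```

In this toy data the customer with two qualifying orders appears twice in the denormalized result but only once in the `customers` subset, which hints at why the result can be smaller to transfer.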
Laser Beam Welding of Hard to Weld Al Alloys for a Regional Aircraft Fuselage Design – First Results
Lightweight design of fuselage structures is a major goal for future aircraft, since reducing structural weight increases fuel efficiency. One objective is to validate and demonstrate the technologies that offer the best opportunities for weight reduction and short production time. This involves the development of laser welding technologies for difficult-to-weld high-strength aluminum alloys containing Cu and/or Li. Another objective is to identify and evaluate approaches for first welding trials on T-joints of the alloy 2139 which are very promising regarding weld seam quality and achieved mechanical properties.
Only Aggressive Elephants are Fast Elephants
Yellow elephants are slow. A major reason is that they consume their inputs
entirely before responding to an elephant rider's orders. Some clever riders
have trained their yellow elephants to only consume parts of the inputs before
responding. However, the teaching time to make an elephant do that is high. So
high that the teaching lessons often do not pay off. We take a different
approach. We make elephants aggressive; only this will make them very fast. We
propose HAIL (Hadoop Aggressive Indexing Library), an enhancement of HDFS and
Hadoop MapReduce that dramatically improves runtimes of several classes of
MapReduce jobs. HAIL changes the upload pipeline of HDFS in order to create
different clustered indexes on each data block replica. An interesting feature
of HAIL is that we typically create a win-win situation: we improve both data
upload to HDFS and the runtime of the actual Hadoop MapReduce job. In terms of
data upload, HAIL improves over HDFS by up to 60% with the default replication
factor of three. In terms of query execution, we demonstrate that HAIL runs up
to 68x faster than Hadoop. In our experiments, we use six clusters including
physical and EC2 clusters of up to 100 nodes. A series of scalability
experiments also demonstrates the superiority of HAIL. Comment: VLDB201
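The core idea of keeping a different clustered index on each replica of a data block can be illustrated with a small in-memory sketch. Everything here is an assumption for illustration: rows are plain dictionaries, a "replica" is just a sorted copy of the block, and the function names are hypothetical. HAIL itself works inside the HDFS upload pipeline, which this toy does not model.

```python
import bisect


def build_replicas(block, sort_attributes):
    """Build one replica per attribute; each holds the same rows in a different sort order."""
    replicas = {}
    for attr in sort_attributes:
        rows = sorted(block, key=lambda row: row[attr])
        keys = [row[attr] for row in rows]  # sorted key column used for binary search
        replicas[attr] = (keys, rows)
    return replicas


def range_lookup(replicas, attr, low, high):
    """Return rows with low <= row[attr] <= high, using a matching replica if one exists."""
    if attr in replicas:
        keys, rows = replicas[attr]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return rows[start:end]  # contiguous slice of the replica clustered on attr
    # No replica is clustered on this attribute: fall back to a full scan.
    _, rows = next(iter(replicas.values()))
    return [row for row in rows if low <= row[attr] <= high]
```

With a replication factor of three, as in the abstract, three different attributes could each get their own sort order, so a selection on any of them can pick the matching replica instead of scanning the block.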