138,467 research outputs found
Sequentializing Parameterized Programs
We exhibit assertion-preserving (reachability preserving) transformations
from parameterized concurrent shared-memory programs, under a k-round
scheduling of processes, to sequential programs. The salient feature of the
sequential program is that it tracks the local variables of only one thread at
any point, and uses only O(k) copies of shared variables (it does not use extra
counters, not even one counter to keep track of the number of threads).
Sequentialization is achieved using the concept of a linear interface that
captures the effect an unbounded block of processes have on the shared state in
a k-round schedule. Our transformation utilizes linear interfaces to
sequentialize the program, and to ensure the sequential program explores only
reachable states and preserves local invariants.Comment: In Proceedings FIT 2012, arXiv:1207.348
DD-AMG on QPACE 3
We describe our experience porting the Regensburg implementation of the
DD-AMG solver from QPACE 2 to QPACE 3. We first review how the code was
ported from the first generation Intel Xeon Phi processor (Knights Corner) to
its successor (Knights Landing). We then describe the modifications in the
communication library necessitated by the switch from InfiniBand to Omni-Path.
Finally, we present the performance of the code on a single processor as well
as the scaling on many nodes, where in both cases the speedup factor is close
to the theoretical expectations.Comment: 12 pages, 6 figures, Proceedings of Lattice 201
A Cost-based Optimizer for Gradient Descent Optimization
As the use of machine learning (ML) permeates into diverse application
domains, there is an urgent need to support a declarative framework for ML.
Ideally, a user will specify an ML task in a high-level and easy-to-use
language and the framework will invoke the appropriate algorithms and system
configurations to execute it. An important observation towards designing such a
framework is that many ML tasks can be expressed as mathematical optimization
problems, which take a specific form. Furthermore, these optimization problems
can be efficiently solved using variations of the gradient descent (GD)
algorithm. Thus, to decouple a user specification of an ML task from its
execution, a key component is a GD optimizer. We propose a cost-based GD
optimizer that selects the best GD plan for a given ML task. To build our
optimizer, we introduce a set of abstract operators for expressing GD
algorithms and propose a novel approach to estimate the number of iterations a
GD algorithm requires to converge. Extensive experiments on real and synthetic
datasets show that our optimizer not only chooses the best GD plan but also
allows for optimizations that achieve orders of magnitude performance speed-up.Comment: Accepted at SIGMOD 201
Evolving NoSQL Databases Without Downtime
NoSQL databases like Redis, Cassandra, and MongoDB are increasingly popular
because they are flexible, lightweight, and easy to work with. Applications
that use these databases will evolve over time, sometimes necessitating (or
preferring) a change to the format or organization of the data. The problem we
address in this paper is: How can we support the evolution of high-availability
applications and their NoSQL data online, without excessive delays or
interruptions, even in the presence of backward-incompatible data format
changes?
We present KVolve, an extension to the popular Redis NoSQL database, as a
solution to this problem. KVolve permits a developer to submit an upgrade
specification that defines how to transform existing data to the newest
version. This transformation is applied lazily as applications interact with
the database, thus avoiding long pause times. We demonstrate that KVolve is
expressive enough to support substantial practical updates, including format
changes to RedisFS, a Redis-backed file system, while imposing essentially no
overhead in general use and minimal pause times during updates.Comment: Update to writing/structur
Tester versus Bug: A Generic Framework for Model-Based Testing via Games
We propose a generic game-based approach for test case generation. We set up
a game between the tester and the System Under Test, in such a way that test
cases correspond to game strategies, and the conformance relation ioco
corresponds to alternating refinement. We show that different test assumptions
from the literature can be easily incorporated, by slightly varying the moves
in the games and their outcomes. In this way, our framework allows a wide
plethora of game-theoretic techniques to be deployed for model based testing.Comment: In Proceedings GandALF 2018, arXiv:1809.0241
On-Demand Big Data Integration: A Hybrid ETL Approach for Reproducible Scientific Research
Scientific research requires access, analysis, and sharing of data that is
distributed across various heterogeneous data sources at the scale of the
Internet. An eager ETL process constructs an integrated data repository as its
first step, integrating and loading data in its entirety from the data sources.
The bootstrapping of this process is not efficient for scientific research that
requires access to data from very large and typically numerous distributed data
sources. a lazy ETL process loads only the metadata, but still eagerly. Lazy
ETL is faster in bootstrapping. However, queries on the integrated data
repository of eager ETL perform faster, due to the availability of the entire
data beforehand.
In this paper, we propose a novel ETL approach for scientific data
integration, as a hybrid of eager and lazy ETL approaches, and applied both to
data as well as metadata. This way, Hybrid ETL supports incremental integration
and loading of metadata and data from the data sources. We incorporate a
human-in-the-loop approach, to enhance the hybrid ETL, with selective data
integration driven by the user queries and sharing of integrated data between
users. We implement our hybrid ETL approach in a prototype platform, Obidos,
and evaluate it in the context of data sharing for medical research. Obidos
outperforms both the eager ETL and lazy ETL approaches, for scientific research
data integration and sharing, through its selective loading of data and
metadata, while storing the integrated data in a scalable integrated data
repository.Comment: Pre-print Submitted to the DMAH Special Issue of the Springer DAPD
Journa
Reachability Analysis of Communicating Pushdown Systems
The reachability analysis of recursive programs that communicate
asynchronously over reliable FIFO channels calls for restrictions to ensure
decidability. Our first result characterizes communication topologies with a
decidable reachability problem restricted to eager runs (i.e., runs where
messages are either received immediately after being sent, or never received).
The problem is EXPTIME-complete in the decidable case. The second result is a
doubly exponential time algorithm for bounded context analysis in this setting,
together with a matching lower bound. Both results extend and improve previous
work from La Torre et al
- …