625 research outputs found
Recommended from our members
Deploying Web-based Visual Exploration Tools on the Grid
We discuss a web-based portal for the exploration, encapsulation, and dissemination of visualization results over the Grid. This portal integrates three components: an interface client for structured visualization exploration, a visualization web application to manage the generation and capture of the visualization results, and a centralized portal application server to access and manage grid resources. Our approach uses standard web technologies to make the system accessible with minimal user setup. We demonstrate the usefulness of the developed system using an example for Adaptive Mesh Refinement (AMR) data visualization
Interoperability in the OpenDreamKit Project: The Math-in-the-Middle Approach
OpenDreamKit --- "Open Digital Research Environment Toolkit for the
Advancement of Mathematics" --- is an H2020 EU Research Infrastructure project
that aims at supporting, over the period 2015--2019, the ecosystem of
open-source mathematical software systems. From that, OpenDreamKit will deliver
a flexible toolkit enabling research groups to set up Virtual Research
Environments, customised to meet the varied needs of research projects in pure
mathematics and applications.
An important step in the OpenDreamKit endeavor is to foster the
interoperability between a variety of systems, ranging from computer algebra
systems over mathematical databases to front-ends. This is the mission of the
integration work package (WP6). We report on experiments and future plans with
the \emph{Math-in-the-Middle} approach. This information architecture consists
in a central mathematical ontology that documents the domain and fixes a joint
vocabulary, combined with specifications of the functionalities of the various
systems. Interaction between systems can then be enriched by pivoting off this
information architecture.Comment: 15 pages, 7 figure
DALiuGE: A Graph Execution Framework for Harnessing the Astronomical Data Deluge
The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for
processing large astronomical datasets at a scale required by the Square
Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex
data reduction pipelines consisting of both data sets and algorithmic
components and an implementation run-time to execute such pipelines on
distributed resources. By mapping the logical view of a pipeline to its
physical realisation, DALiuGE separates the concerns of multiple stakeholders,
allowing them to collectively optimise large-scale data processing solutions in
a coherent manner. The execution in DALiuGE is data-activated, where each
individual data item autonomously triggers the processing on itself. Such
decentralisation also makes the execution framework very scalable and flexible,
supporting pipeline sizes ranging from less than ten tasks running on a laptop
to tens of millions of concurrent tasks on the second fastest supercomputer in
the world. DALiuGE has been used in production for reducing interferometry data
sets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide
Spectral Radioheliograph; and is being developed as the execution framework
prototype for the Science Data Processor (SDP) consortium of the Square
Kilometre Array (SKA) telescope. This paper presents a technical overview of
DALiuGE and discusses case studies from the CHILES and MUSER projects that use
DALiuGE to execute production pipelines. In a companion paper, we provide
in-depth analysis of DALiuGE's scalability to very large numbers of tasks on
two supercomputing facilities.Comment: 31 pages, 12 figures, currently under review by Astronomy and
Computin
VerdictDB: Universalizing Approximate Query Processing
Despite 25 years of research in academia, approximate query processing (AQP)
has had little industrial adoption. One of the major causes of this slow
adoption is the reluctance of traditional vendors to make radical changes to
their legacy codebases, and the preoccupation of newer vendors (e.g.,
SQL-on-Hadoop products) with implementing standard features. Additionally, the
few AQP engines that are available are each tied to a specific platform and
require users to completely abandon their existing databases---an unrealistic
expectation given the infancy of the AQP technology. Therefore, we argue that a
universal solution is needed: a database-agnostic approximation engine that
will widen the reach of this emerging technology across various platforms.
Our proposal, called VerdictDB, uses a middleware architecture that requires
no changes to the backend database, and thus, can work with all off-the-shelf
engines. Operating at the driver-level, VerdictDB intercepts analytical queries
issued to the database and rewrites them into another query that, if executed
by any standard relational engine, will yield sufficient information for
computing an approximate answer. VerdictDB uses the returned result set to
compute an approximate answer and error estimates, which are then passed on to
the user or application. However, lack of access to the query execution layer
introduces significant challenges in terms of generality, correctness, and
efficiency. This paper shows how VerdictDB overcomes these challenges and
delivers up to 171 speedup (18.45 on average) for a variety of
existing engines, such as Impala, Spark SQL, and Amazon Redshift, while
incurring less than 2.6% relative error. VerdictDB is open-sourced under Apache
License.Comment: Extended technical report of the paper that appeared in Proceedings
of the 2018 International Conference on Management of Data, pp. 1461-1476.
ACM, 201
- …