On-Demand Big Data Integration: A Hybrid ETL Approach for Reproducible Scientific Research
Scientific research requires access, analysis, and sharing of data that is
distributed across various heterogeneous data sources at the scale of the
Internet. An eager ETL process constructs an integrated data repository as its
first step, integrating and loading data in its entirety from the data sources.
The bootstrapping of this process is not efficient for scientific research that
requires access to data from very large and typically numerous distributed data
sources. A lazy ETL process loads only the metadata, but still eagerly. Lazy
ETL is faster in bootstrapping. However, queries on the integrated data
repository of eager ETL perform faster, due to the availability of the entire
data beforehand.
In this paper, we propose a novel ETL approach for scientific data
integration, as a hybrid of the eager and lazy ETL approaches, applied to both
data and metadata. This way, hybrid ETL supports incremental integration
and loading of metadata and data from the data sources. We incorporate a
human-in-the-loop approach, to enhance the hybrid ETL, with selective data
integration driven by the user queries and sharing of integrated data between
users. We implement our hybrid ETL approach in a prototype platform, Obidos,
and evaluate it in the context of data sharing for medical research. Obidos
outperforms both the eager ETL and lazy ETL approaches, for scientific research
data integration and sharing, through its selective loading of data and
metadata, while storing the integrated data in a scalable integrated data
repository.
Comment: Pre-print Submitted to the DMAH Special Issue of the Springer DAPD
Journa
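The contrast between eager loading and query-driven incremental loading described in the abstract above can be illustrated with a minimal Python sketch. This is not the Obidos implementation or its API: the class names `EagerETL` and `HybridETL`, the helper `make_source`, and the in-memory dictionaries standing in for remote data sources and the integrated repository are all hypothetical, chosen only to show the idea.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a remote data source: calling it returns its records.
Source = Callable[[], List[dict]]


def make_source(name: str, log: List[str]) -> Source:
    """Build a fake source that records each fetch in `log` (illustration only)."""
    def fetch() -> List[dict]:
        log.append(name)
        return [{"source": name, "id": 1}]
    return fetch


class EagerETL:
    """Integrates and loads every source in its entirety at bootstrap."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self.repo = {name: fetch() for name, fetch in sources.items()}

    def query(self, name: str) -> List[dict]:
        return self.repo[name]


class HybridETL:
    """Loads a source only when a query first asks for it, then keeps it."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self.sources = sources
        self.repo: Dict[str, List[dict]] = {}  # grows incrementally

    def query(self, name: str) -> List[dict]:
        if name not in self.repo:
            # First request: integrate and load this source on demand.
            self.repo[name] = self.sources[name]()
        # Later requests are served from the integrated repository.
        return self.repo[name]
```

In this sketch, constructing `EagerETL` fetches every source immediately, while `HybridETL` fetches nothing until `query` is called and then fetches each requested source at most once.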
Extending Nunchaku to Dependent Type Theory
Nunchaku is a new higher-order counterexample generator based on a sequence
of transformations from polymorphic higher-order logic to first-order logic.
Unlike its predecessor Nitpick for Isabelle, it is designed as a stand-alone
tool, with frontends for various proof assistants. In this short paper, we
present some ideas to extend Nunchaku with partial support for dependent types
and type classes, to make frontends for Coq and other systems based on
dependent type theory more useful.
Comment: In Proceedings HaTT 2016, arXiv:1606.0542
Lazy unification with inductive simplification
Unification in the presence of an equational theory is an important problem in theorem-proving and in the integration of functional and logic programming languages. This paper presents an improvement of the proposed lazy unification methods by incorporating simplification with inductive axioms into the unification process. Inductive simplification reduces the search space so that in some cases infinite search spaces are reduced to finite ones. Consequently, more efficient unification algorithms can be achieved. We prove soundness and completeness of our method for equational theories represented by ground confluent and terminating rewrite systems.