Search CORE

2,041 research outputs found

Dynamic data transformation for low latency querying in big data systems

Author: De Turck Filip
Ordonez Ante Leandro
Van Seghbroeck Gregory
Vanhove Thomas
Volckaert Bruno
Wauters Tim
Publication venue
Publication date: 01/01/2017
Field of study

Map-Based Transparent Persistence for Very Large Models

Author: Cabot Jordi
Gómez Abel
Sunyé Gerson
Tisi Massimo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/04/2015
Field of study

International audienceThe progressive industrial adoption of Model-Driven Engineering (MDE) is fostering the development of large tool ecosystems like the Eclipse Modeling project. These tools are built on top of a set of base technologies that have been primarily designed for small-scale scenarios, where models are manually developed. In particular, efficient runtime manipulation for large-scale models is an under-studied problem and this is hampering the application of MDE to several industrial scenarios.In this paper we introduce and evaluate a map-based persistence model for MDE tools. We use this model to build a transparent persistence layer for modeling tools, on top of a map-based database engine. The layer can be plugged into the Eclipse Modeling Framework, lowering execution times and memory consumption levels of other existing approaches. Empirical tests are performed based on a typical industrial scenario, model-driven reverse engineering, where very large software models originate from the analysis of massive code bases. The layer is freely distributed and can be immediately used for enhancing the scalability of any existing Eclipse Modeling tool

INRIA a CCSD electronic archive server

HAL Mines Nantes

A generic persistence model for CLP systems (and two useful implementations)

Author: Cabeza Gras Daniel
Carro Liñares Manuel
Correas Fernandez Jesús
Gómez J. M.
Hermenegildo Manuel V.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/08/2003
Field of study

This paper describes a model of persistence in (C)LP languages and two different and practically very useful ways to implement this model in current systems. The fundamental idea is that persistence is a characteristic of certain dynamic predicates (Le., those which encapsulate state). The main effect of declaring a predicate persistent is that the dynamic changes made to such predicates persist from one execution to the next one. After proposing a syntax for declaring persistent predicates, a simple, file-based implementation of the concept is presented and some examples shown. An additional implementation is presented which stores persistent predicates in an external datábase. The abstraction of the concept of persistence from its implementation allows developing applications which can store their persistent predicates alternatively in files or databases with only a few simple changes to a declaration stating the location and modality used for persistent storage. The paper presents the model, the implementation approach in both the cases of using files and relational databases, a number of optimizations of the process (using information obtained from static global analysis and goal clustering), and performance results from an implementation of these ideas

Archivo Digital UPM

A workload‑driven approach for view selection in large dimensional datasets

Author: De Turck Filip
Ordonez Ante Leandro
Van Seghbroeck Gregory
Volckaert Bruno
Wauters Tim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

The information explosion the world has witnessed in the last two decades has forced businesses to adopt a data-driven culture for them to be competitive. These data-driven businesses have access to countless sources of information, and face the challenge of making sense of overwhelming amounts of data in a efficient and reliable manner, which implies the execution of read-intensive operations. In the context of this challenge, a framework for the dynamic read-optimization of large dimensional datasets has been designed, and on top of it a workload-driven mechanism for automatic materialized view selection and creation has been developed. This paper presents an extensive description of this mechanism, along with a proof-of-concept implementation of it and its corresponding performance evaluation. Results show that the proposed mechanism is able to derive a limited but comprehensive set of views leading to a drop in query latency ranging from 80% to 99.99% at the expense of 13% of the disk space used by the base dataset. This way, the devised mechanism enables speeding up query execution by building materialized views that match the actual demand of query workloads

Ghent University Academic Bibliography

Data Provenance and Management in Radio Astronomy: A Stream Computing Approach

Author: Biem Alain
Elmegreen Bruce
Ensor Andrew
Gulyaev Sergei
Mahmoud Mahmoud S.
Publication venue
Publication date: 12/12/2011
Field of study

New approaches for data provenance and data management (DPDM) are required for mega science projects like the Square Kilometer Array, characterized by extremely large data volume and intense data rates, therefore demanding innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with the emphasis on the use of accelerators. In particular, we make use of a new generation of high performance stream-based parallelization middleware known as InfoSphere Streams. Its viability for managing and ensuring interoperability and integrity of signal processing data pipelines is demonstrated in radio astronomy. IBM InfoSphere Streams embraces the stream-computing paradigm. It is a shift from conventional data mining techniques (involving analysis of existing data from databases) towards real-time analytic processing. We discuss using InfoSphere Streams for effective DPDM in radio astronomy and propose a way in which InfoSphere Streams can be utilized for large antennae arrays. We present a case-study: the InfoSphere Streams implementation of an autocorrelating spectrometer, and using this example we discuss the advantages of the stream-computing approach and the utilization of hardware accelerators

arXiv.org e-Print Archive

AUT Scholarly Commons

Recommended from our members

Technical Issues in the Development of Knowledge-Based Services for the Semantic Web

Author: Bernaras Amaia
Pedrinaci Carlos
Smithers Tim
Publication venue
Publication date: 01/01/2005
Field of study

The Semantic Web aims to extend the current Web with formal semantics in order to improve how users experience the Web, by ameliorating current activities and supporting the automation of some others. So far, current Semantic Web prototypes mostly aim at collecting and exposing information. Still, a semantic layer can support applying Knowledge-Based Systems techniques to the development of brand-new fully-ﬂedged Knowledge-Based Services for the Web. In this paper, we present the technical issues that have to be faced in the development of such a kind of application by presenting the Online Design of Events Application: a Semantic Web-based design support system that assists event organisers in the process of preparing events such as workshops and conferences, by eﬀectively reasoning over an inter-organisational process across the Web

Open Research Online (The Open University)

Mogwaï: a Framework to Handle Complex Queries on Large Models

Author: Cabot Jordi
Daniel Gwendal
Sunyé Gerson
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2016
Field of study

International audienceWhile Model Driven Engineering is gaining more industrial interest, scalability issues when managing large models have become a major problem in current modeling frameworks. Scalable model persistence has been achieved by using NoSQL backends for model storage, but existing modeling framework APIs have not evolved accordingly, limiting NoSQL query performance benefits. In this paper we present the Mogwaï a scalable and efficient model query framework based on a direct translation of OCL queries to Gremlin, a query language supported by several NoSQL databases. Generated Gremlin expressions are computed inside the database itself, bypassing limitations of existing framework APIs and improving overall performance, as confirmed by our experimental results showing an improvement of execution time up to a factor of 20 and a reduction of the memory overhead up to a factor of 75 for large models

INRIA a CCSD electronic archive server

HAL Mines Nantes