6 research outputs found

    Model query transformation framework - MQT: from EMF-based model query languages to persistence-specific query languages

    Memory problems with XML Metadata Interchange (XMI), the default persistence mechanism in the Eclipse Modelling Framework (EMF), when operating on large models have motivated alternative persistence mechanisms for EMF models. The most recent approaches propose database back-ends. These approaches support querying models with EMF-based model query languages (plain EMF, Object Constraint Language (OCL), EMF Query, Epsilon Object Language (EOL), etc.). However, these languages commonly require loading into memory all the model elements involved in the query. Queries that traverse models, the most commonly used type, require loading the entire model into memory. This loading strategy causes memory problems when the models are large. Most database back-ends provide database-specific query languages that leverage the capabilities of the database engine (better performance) without requiring models to be loaded into memory for query execution (lower memory footprint). For example, Structured Query Language (SQL) is a query language for relational databases and Cypher is one for Neo4J databases. In this dissertation we present MQT-Engine, a framework that supports the execution of model query languages with the efficiency (in terms of memory and performance) of a database-specific query language. To achieve this, MQT-Engine provides a two-step query transformation mechanism: first, queries expressed in a model query language are transformed into a Query Language Independent Model (QLI Model); then the QLI Model is transformed into a database-specific query that is executed directly over the database. This mechanism makes the framework extensible and reusable, since it facilitates the inclusion of new query languages at both sides of the transformation. A prototype of the framework is provided. 
It supports the transformation of EOL queries into SQL queries that are executed directly over a relational Connected Data Objects (CDO) repository. The prototype has been assessed in two experimental evaluations. The first is based on the reverse engineering domain. It compares the time and memory required by MQT-Engine and other query languages (EMF API, OCL and SQL) to execute a set of queries over models persisted with CDO. The second is based on the railway domain and compares the performance of MQT-Engine and other query languages (EMF API, OCL, IncQuery, SQL, etc.) when executing a set of queries. The results show that MQT-Engine successfully executes all the evaluated experiments. MQT-Engine is among the evaluated solutions with the best performance for the first execution of model queries. Among the query languages executed over CDO repositories, it is the fastest solution and the one requiring the least memory. For example, for the largest model in the reverse engineering case it is up to 162 times faster than a model query language executed at the client side, and it requires 23 times less memory. Additionally, the query transformation overhead is constant and small (less than 2 seconds). These results validate the main goal of this dissertation: to provide a framework that lets model engineers specify queries in a model query language and then execute them with performance and memory footprint similar to those of a persistence-specific query language. However, the framework has a set of limitations: the approach should be optimized for queries that are executed repeatedly; it only supports model traversal queries that do not modify the model; and the prototype is specific to EOL queries over CDO repositories with DBStore. 
Therefore, it is planned to extend the framework and address these limitations in a future version.
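The two-step transformation described in this abstract (model query, then a language-independent model, then a database-specific query) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `QliQuery` class, the `parse_model_query` and `to_sql` helpers, the tiny EOL-like syntax, and the table and column names are hypothetical and are not MQT-Engine's API or the CDO database schema.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QliQuery:
    """Hypothetical language-independent query: all instances of a type,
    optionally filtered by one attribute equality."""
    type_name: str
    attribute: Optional[str] = None
    value: Optional[str] = None


def parse_model_query(text: str) -> QliQuery:
    """Step 1: model query text -> language-independent model.

    Accepts only two illustrative forms:
    'Type.all' and "Type.all.select(x | x.attr = 'v')".
    """
    pattern = r"^(\w+)\.all(?:\.select\(\w+ \| \w+\.(\w+) = '([^']*)'\))?$"
    match = re.match(pattern, text.strip())
    if match is None:
        raise ValueError(f"unsupported query: {text}")
    return QliQuery(match.group(1), match.group(2), match.group(3))


def to_sql(query: QliQuery) -> Tuple[str, tuple]:
    """Step 2: language-independent model -> parameterised SQL.

    Assumes (hypothetically) one table per type and one column per attribute.
    Identifiers come from the \\w+ groups above, so quoting them is safe.
    """
    sql = f'SELECT * FROM "{query.type_name}"'
    if query.attribute is None:
        return sql, ()
    return f'{sql} WHERE "{query.attribute}" = ?', (query.value,)


qli = parse_model_query("Method.all.select(m | m.name = 'main')")
sql, params = to_sql(qli)
```

In this sketch, supporting another front-end language would mean adding another parser that produces `QliQuery`, and supporting another back end would mean adding another function alongside `to_sql`, which mirrors the extensibility at both sides that the abstract claims.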

    Extensibility of Enterprise Modelling Languages

    The thesis addresses three research areas in total. The first deals with BPMN extensions to be developed and sets out their methodological implications within the existing language standards. On the one hand, this covers concrete language extensions such as BPMN4CP, a BPMN extension for the multi-perspective modelling of clinical pathways. On the other hand, this part also concerns consequences for modelling methodology, with the aim of improving both the underlying language (i.e. the BPMN metamodel) and the method for developing extensions in parallel, and thereby addressing the shortcomings identified. The second area investigates language-independent questions of extensibility that either arose while working on the first part or were inferred inductively from its results. It focuses in particular on consolidating existing terminology, describing generically applicable extension mechanisms, and analysing potential extension needs from the user's perspective. This part thus lays the groundwork for developing a generic extension method. It also includes a fundamental examination of enterprise modelling languages in general, since only a holistic, consistent, and integrated language definition can make extensions possible and successful in the first place. This concerns, for example, the specification of the intended semantics of a language

    Enabling Model-Driven Live Analytics For Cyber-Physical Systems: The Case of Smart Grids

    Advances in software, embedded computing, sensors, and networking technologies will lead to a new generation of smart cyber-physical systems that will far exceed the capabilities of today’s embedded systems. They will be entrusted with increasingly complex tasks such as controlling electric grids or autonomously driving cars. These systems have the potential to lay the foundations for tomorrow’s critical infrastructures, to form the basis of emerging and future smart services, and to improve the quality of our everyday lives in many areas. To perform their tasks, they have to continuously monitor and collect data from physical processes, analyse this data, and make decisions based on it. Making smart decisions requires a deep understanding of the environment, the internal state, and the impacts of actions. Such understanding relies on efficient data models to organise the sensed data and on advanced analytics. Because cyber-physical systems control physical processes, decisions need to be taken very fast. This makes it necessary to analyse data live, as opposed to conventional batch analytics. However, the complex nature of such systems, combined with the massive amount of data they generate, imposes fundamental challenges. While data in the context of cyber-physical systems shares some characteristics with big data, it holds a particular complexity. This complexity results from the complicated physical phenomena described by the data, which make it difficult to extract a model able to explain the data and its various multi-layered relationships. Existing solutions fail to provide sustainable mechanisms to analyse such data live. This dissertation presents a novel approach, named model-driven live analytics. The main contribution of this thesis is a multi-dimensional graph data model that brings raw data, domain knowledge, and machine learning together in a single model, which can drive live analytic processes. 
This model is continuously updated with the sensed data and can be leveraged by live analytic processes to support the decision-making of cyber-physical systems. The presented approach has been developed in collaboration with an industrial partner and, in the form of a prototype, applied to the domain of smart grids. The addressed challenges are derived from this collaboration as a response to shortcomings in the current state of the art. More specifically, this dissertation provides solutions for the following challenges. First, data handled by cyber-physical systems is usually dynamic (data in motion, as opposed to traditional data at rest) and changes frequently and at different paces. Analysing such data is challenging, since data models can usually represent only a snapshot of a system at one specific point in time. A common approach is discretisation, which regularly samples and stores such snapshots at specific timestamps to keep track of the history. Continuously changing data is then represented as a finite sequence of such snapshots. Such data representations are very inefficient to analyse, since it is necessary to mine the snapshots, extract a relevant dataset, and finally analyse it. For this problem, this thesis presents a temporal graph data model and storage system that treat time as a first-class property. A time-relative navigation concept makes it possible to analyse frequently changing data very efficiently. Secondly, making sustainable decisions requires anticipating what impacts certain actions would have. In complex cyber-physical systems, situations can arise where hundreds or thousands of such hypothetical actions must be explored before a solid decision can be made. Every action leads to an independent alternative, from which a set of other actions can be applied, and so forth. 
Finding the sequence of actions that leads to the desired alternative requires efficiently creating, representing, and analysing many different alternatives. Given that every alternative has its own history, this creates a very high combinatorial complexity of alternatives and histories, which is hard to analyse. To tackle this problem, this dissertation introduces a multi-dimensional graph data model (an extension of the temporal graph data model) that makes it possible to efficiently represent, store, and analyse many different alternatives live. Thirdly, complex cyber-physical systems are often distributed, but to fulfil their tasks these systems typically need to share context information between computational entities. This requires analytic algorithms to reason over distributed data, which is a complex task since it relies on the aggregation and processing of various distributed and constantly changing data. To address this challenge, this dissertation proposes an approach to transparently distribute the presented multi-dimensional graph data model in a peer-to-peer manner and defines a stream processing concept to efficiently handle frequent changes. Fourthly, to meet future needs, cyber-physical systems need to become increasingly intelligent. To make smart decisions, these systems have to continuously refine behavioural models that are known at design time with what can only be learned from live data. Machine learning algorithms can help to learn this unknown behaviour by extracting commonalities over massive datasets. Nevertheless, searching for a coarse-grained common behaviour model can be very inaccurate for cyber-physical systems, which are composed of completely different entities with very different behaviour. For these systems, fine-grained learning can be significantly more accurate. However, modelling, structuring, and synchronising many fine-grained learning units is challenging. 
To tackle this, this thesis presents an approach to define reusable, chainable, and independently computable fine-grained learning units, which can be modelled together with, and on the same level as, domain data. This makes it possible to weave machine learning directly into the presented multi-dimensional graph data model. In summary, this thesis provides an efficient multi-dimensional graph data model to enable live analytics of complex, frequently changing, and distributed data of cyber-physical systems. This model can significantly improve data analytics for such systems and empower cyber-physical systems to make smart decisions live. The presented solutions combine and extend methods from model-driven engineering, models@run.time, data analytics, database systems, and machine learning
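The idea of treating time as a first-class property, with time-relative navigation instead of mining a sequence of full snapshots, can be illustrated with a small Python sketch. It is not the thesis's data model or storage system: the `TemporalNode` class and its `set_at` and `get_at` methods are hypothetical names, and the sketch covers only a single node's attributes, assuming that reading at time t should return the latest value written at or before t.

```python
from bisect import bisect_right
from typing import Any, Dict, List


class TemporalNode:
    """Hypothetical node whose attributes each keep a sorted timeline of values."""

    def __init__(self) -> None:
        self._times: Dict[str, List[int]] = {}
        self._values: Dict[str, List[Any]] = {}

    def set_at(self, name: str, time: int, value: Any) -> None:
        """Record that attribute `name` takes `value` from `time` onwards."""
        times = self._times.setdefault(name, [])
        values = self._values.setdefault(name, [])
        index = bisect_right(times, time)
        if index > 0 and times[index - 1] == time:
            values[index - 1] = value  # same timestamp: overwrite
        else:
            times.insert(index, time)  # keep the timeline sorted
            values.insert(index, value)

    def get_at(self, name: str, time: int) -> Any:
        """Return the latest value written at or before `time`, or None."""
        times = self._times.get(name, [])
        index = bisect_right(times, time)
        if index == 0:
            return None  # nothing known yet at this time
        return self._values[name][index - 1]


meter = TemporalNode()
meter.set_at("load", 10, 5.0)
meter.set_at("load", 20, 7.5)
```

Only changes are stored, so a value that stays constant for a long period costs one entry rather than one entry per sampled snapshot, and a lookup is a binary search over that attribute's timeline.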