    Verifying UML/OCL operation contracts

    In current model-driven development approaches, software models are the primary artifacts of the development process. Assessing their correctness is therefore a key issue in ensuring the quality of the final application. Research on model consistency has focused mostly on the models' static aspects. This paper instead addresses the verification of their dynamic aspects, expressed as a set of operations defined by means of pre/postcondition contracts. It presents an automatic method based on Constraint Programming to verify UML models extended with OCL constraints and operation contracts. In our approach, both static and dynamic aspects are translated into a Constraint Satisfaction Problem; compliance of the operations with respect to several correctness properties, such as operation executability and determinism, is then formally verified.
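
    To make the idea concrete, the following is a minimal sketch of a contract check by exhaustive search over a bounded domain, in the spirit of a CSP encoding but without a real constraint solver. The Account/withdraw example, the domain bound, and the property definitions are illustrative assumptions, not the method or tool described in the paper.

    # Minimal sketch: brute-force, CSP-flavoured check of an operation contract
    # over a bounded domain.  Toy model (hypothetical): an Account whose integer
    # balance lies in 0..10, with an operation withdraw(amount=3).

    DOMAIN = range(0, 11)    # bounded search space for the balance attribute
    AMOUNT = 3               # fixed operation argument for this sketch

    def invariant(balance):
        # Plays the role of the OCL class invariants.
        return 0 <= balance <= 10

    def pre(balance):
        # Precondition of withdraw: enough money in the account.
        return balance >= AMOUNT

    def post(balance_pre, balance_post):
        # Postcondition: the balance decreased by exactly the withdrawn amount.
        return balance_post == balance_pre - AMOUNT

    def executable():
        # Executability: some valid pre-state admits a valid post-state.
        return any(
            invariant(b) and pre(b)
            and any(invariant(b2) and post(b, b2) for b2 in DOMAIN)
            for b in DOMAIN
        )

    def deterministic():
        # Determinism: no valid pre-state has two distinct valid post-states.
        for b in DOMAIN:
            if invariant(b) and pre(b):
                successors = [b2 for b2 in DOMAIN if invariant(b2) and post(b, b2)]
                if len(successors) > 1:
                    return False
        return True

    print("executable:", executable())        # True
    print("deterministic:", deterministic())  # True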

    DL-Lite with attributes and datatypes

    We extend the DL-Lite languages by means of attributes and datatypes. Attributes -- a notion borrowed from data models -- associate concrete values from datatypes with abstract objects, and in this way complement roles, which describe relationships between abstract objects. The extended languages remain tractable (with a notable exception), even though they contain both existential and (a limited form of) universal quantification. We present complexity results for the two most important reasoning problems in DL-Lite: the combined complexity of knowledge base satisfiability and the data complexity of positive existential query answering.
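
    As a purely illustrative aside (not the paper's calculus), the distinction between roles and attributes can be seen in a toy knowledge base where an assertion violates the declared datatype range of an attribute; the attribute name, individuals, and datatype below are invented for the example.

    # Toy DL-Lite-style check (illustrative only): attributes relate objects to
    # datatype values, so an assertion whose value lies outside the declared
    # datatype range of the attribute produces a clash.

    # "TBox": hypothetical declaration that the range of salary is integer.
    ATTRIBUTE_RANGE = {"salary": int}

    # "ABox": attribute assertions salary(mary, "high") and salary(john, 4200).
    ABOX = [
        ("salary", "mary", "high"),   # string value, clashes with range int
        ("salary", "john", 4200),
    ]

    def datatype_clashes(abox, ranges):
        # Return the assertions whose value is not in the attribute's datatype.
        return [
            (attr, obj, val)
            for attr, obj, val in abox
            if attr in ranges and not isinstance(val, ranges[attr])
        ]

    print(datatype_clashes(ABOX, ATTRIBUTE_RANGE))   # flags salary(mary, "high")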

    Verification of Evolving Graph-structured Data under Expressive Path Constraints

    Integrity constraints play a central role in databases and, among other applications, are fundamental for preserving data integrity when databases evolve as a result of operations manipulating the data. In this context, an important task is static verification, which consists in deciding whether a given set of constraints is preserved after the execution of a given sequence of operations, for every possible database satisfying the initial constraints. In this paper, we consider constraints over graph-structured data formulated in an expressive Description Logic (DL) that allows for regular expressions over binary relations and their inverses, generalizing many of the well-known path constraint languages proposed for semi-structured data over the last two decades. In this setting, we study the problem of static verification for operations expressed in a simple yet flexible language built from additions and deletions of complex DL expressions. We establish undecidability of the general setting and identify suitable restricted fragments for which we obtain tight complexity results, building on techniques developed in our previous work for simpler DLs. As a by-product, we obtain new (un)decidability results for the implication problem of path constraints and improve previous upper bounds on the complexity of the problem.
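
    A drastically simplified instance of the static verification question is sketched below: one path constraint, one update, and a single test graph, checked by direct evaluation rather than by the decision procedures of the paper; the node labels, edge labels, and the update are invented for the illustration.

    # Minimal sketch (bounded testing on one instance, not a decision procedure):
    # check whether a simple path constraint survives an update on one graph.
    # Constraint (hypothetical): every node labelled A reaches a node labelled B
    # via an r-edge followed by an s-edge.  Update: delete all s-edges.

    NODES = {"a1": {"A"}, "m1": set(), "b1": {"B"}}
    EDGES = {("a1", "r", "m1"), ("m1", "s", "b1")}

    def satisfies_constraint(nodes, edges):
        for n, labels in nodes.items():
            if "A" in labels:
                witnesses = [
                    z for (x, p, y) in edges if x == n and p == "r"
                    for (y2, q, z) in edges if y2 == y and q == "s"
                    if "B" in nodes[z]
                ]
                if not witnesses:
                    return False
        return True

    def delete_s_edges(edges):
        # The update: remove every s-labelled edge from the graph.
        return {(x, p, y) for (x, p, y) in edges if p != "s"}

    print("before update:", satisfies_constraint(NODES, EDGES))                   # True
    print("after update: ", satisfies_constraint(NODES, delete_s_edges(EDGES)))   # False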

    Lenguajes austeros de modelado conceptual de datos basados en evidencias

    Multiple logic-based reconstructions of UML class diagrams, Entity-Relationship diagrams, and Object-Role Model diagrams exist. They mainly cover various fragments of these conceptual data modelling languages, and none are formalised such that the logic applies simultaneously to the three language families as a unifying mechanism. This hampers interchangeability, interoperability, and tooling support. In addition, due to the lack of a systematic design process for the logic used in the formalisation, hidden choices permeate the formalisations and have rendered them incompatible. We aim to address these problems, first, by structuring the logic design process in a methodological way. We generalise and extend the DSL design process to apply to logic language design more generally and, in particular, by incorporating a new phase of ontological analysis of language features in the process. Second, availing of this extended process, of evidence gathered of language feature usage, and of computational complexity insights from Description Logics (DL), we specify minimal logic profiles that take into account the ontological commitments embedded in the languages. The profiles characterise the essential logic structure needed to handle the semantics of conceptual models, thereby enabling the development of interoperability tools. No known DL language matches exactly the features of those profiles, and the common core is in the tractable DL ALNI. Although hardly any inconsistencies can be derived with the profiles, this is promising for scalable runtime use of conceptual data models.

    Evidence-based lean logic profiles for conceptual data modelling languages

    Multiple logic-based reconstructions of conceptual data modelling languages such as EER, UML Class Diagrams, and ORM exist. They mainly cover various fragments of the languages, and none are formalised such that the logic applies simultaneously to all three modelling language families as a unifying mechanism. This hampers interchangeability, interoperability, and tooling support. In addition, due to the lack of a systematic design process for the logic used in the formalisation, hidden choices permeate the formalisations and have rendered them incompatible. We aim to address these problems, first, by structuring the logic design process in a methodological way. We generalise and extend the DSL design process to apply to logic language design more generally and, in particular, by incorporating an ontological analysis of language features in the process. Second, availing of this extended process, of evidence gathered of language feature usage, and of computational complexity insights from Description Logics (DL), we specify logic profiles that take into account the ontological commitments embedded in the languages. The profiles characterise the minimum logic structure needed to handle the semantics of conceptual models, enabling the development of interoperability tools. There is no known DL language that matches exactly the features of those profiles, and the common core is small (in the tractable ALNI). Although hardly any inconsistencies can be derived with the profiles, this is promising for scalable runtime use of conceptual data models.
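
    One way to picture how such a profile could be used in tooling is as a whitelist of permitted language features against which a concrete model is checked; the profile contents and feature names below are invented for illustration and do not reproduce the profiles defined in the paper.

    # Minimal sketch (feature names are illustrative, not the paper's profiles):
    # treat a lean logic profile as a whitelist of language features and check
    # whether the features used by a conceptual model fall inside the profile.

    CORE_PROFILE = {
        "class", "binary_relationship", "attribute",
        "cardinality_0..1", "cardinality_1", "subsumption",
    }

    UML_MODEL_FEATURES = {
        "class", "binary_relationship", "attribute",
        "cardinality_1", "subsumption",
    }

    ORM_MODEL_FEATURES = {
        "class", "binary_relationship", "attribute",
        "ring_constraint_irreflexive",        # outside the core profile
    }

    def fits_profile(model_features, profile):
        # Return the features that fall outside the profile (empty set = fits).
        return model_features - profile

    print(fits_profile(UML_MODEL_FEATURES, CORE_PROFILE))   # set()
    print(fits_profile(ORM_MODEL_FEATURES, CORE_PROFILE))   # {'ring_constraint_irreflexive'}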

    An ontology-driven unifying metamodel of UML Class Diagrams, EER, and ORM2

    Software interoperability and application integration can be realized by using their respective conceptual data models, which may be represented in different conceptual data modeling languages. Such modeling languages seem similar, yet are known to be distinct. Several translations between subsets of the languages' features exist, but there is no unifying framework that respects most language features of the static structural components and constraints. We aim to fill this gap. To this end, we designed a common, unified, ontology-driven metamodel of the static structural components and constraints that unifies ER, EER, UML Class Diagrams v2.4.1, ORM, and ORM2, such that each is a proper fragment of the consistent metamodel. The paper also presents notable insights into the relatively few common entities and constraints, an analysis of roles, relationships, and attributes, and a discussion of further modeling motivations. We describe two practical use cases of the metamodel: a quantitative assessment of the entities of 30 models in ER/EER, UML, and ORM/ORM2, and a qualitative evaluation of inter-model assertions.
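
    The flavour of such a unifying metamodel can be hinted at with a toy mapping of language-specific constructs onto a small set of shared entities; the class definitions below are a heavily simplified illustration, not the published metamodel.

    # Minimal sketch (simplified, not the published metamodel): map a UML class
    # with attributes and an ORM fact type onto one shared set of metamodel
    # entities, so that cross-language comparison operates on common objects.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Role:
        name: str

    @dataclass
    class Relationship:
        name: str
        roles: List[Role] = field(default_factory=list)

    @dataclass
    class EntityType:
        name: str
        attributes: List[str] = field(default_factory=list)

    def from_uml_class(name, attributes):
        # A UML class with attributes maps to an entity type with attributes.
        return EntityType(name=name, attributes=list(attributes))

    def from_orm_fact_type(name, role_names):
        # An ORM fact type maps to a relationship with one role per player.
        return Relationship(name=name, roles=[Role(r) for r in role_names])

    print(from_uml_class("Person", ["name", "birthdate"]))
    print(from_orm_fact_type("works_for", ["employee", "employer"]))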

    Metadata-driven data integration

    Joint doctorate (cotutelle) between Universitat Politècnica de Catalunya and Université Libre de Bruxelles, IT4BI-DC programme for the joint Ph.D. degree in computer science. Data has an undoubtable impact on society. Storing and processing large amounts of available data is currently one of the key success factors for an organization. Nonetheless, we have recently been witnessing a change represented by huge and heterogeneous amounts of data; indeed, 90% of the data in the world has been generated in the last two years. Thus, in order to carry out these data exploitation tasks, organizations must first perform data integration, combining data from multiple sources to yield a unified view over them. Yet, the integration of massive and heterogeneous amounts of data requires revisiting the traditional integration assumptions to cope with the new requirements posed by such data-intensive settings. This PhD thesis aims to provide a novel framework for data integration in the context of data-intensive ecosystems, which entails dealing with vast amounts of heterogeneous data, from multiple sources and in their original format. To this end, we advocate an integration process consisting of sequential activities governed by a semantic layer, implemented via a shared repository of metadata. From a stewardship perspective, these activities are the deployment of a data integration architecture, followed by the population of the shared metadata. From a data consumption perspective, the activities are virtual and materialized data integration, the former an exploratory task and the latter a consolidation one. Following the proposed framework, we focus on providing contributions to each of the four activities. We begin by proposing a software reference architecture for semantic-aware data-intensive systems. This architecture serves as a blueprint to deploy a stack of systems, its core being the metadata repository. Next, we propose a graph-based metadata model as the formalism for metadata management, focusing on support for schema and data source evolution, a predominant factor in the heterogeneous sources at hand. For virtual integration, we propose query rewriting algorithms that rely on the previously proposed metadata model; we additionally consider semantic heterogeneities in the data sources, which the proposed algorithms are capable of resolving automatically. Finally, the thesis focuses on the materialized integration activity and, to this end, proposes a method to select intermediate results to materialize in data-intensive flows. Overall, the results of this thesis serve as a contribution to the field of data integration in contemporary data-intensive ecosystems.
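
    To illustrate the role of the shared metadata in the virtual integration activity, the following is a minimal sketch of rewriting a query over global-schema attributes into per-source projections; the mapping structure, attribute names, and sources are invented for the example and do not reproduce the thesis' metadata model or algorithms.

    # Minimal sketch (invented names, not the thesis' algorithms): a tiny shared
    # metadata store mapping global-schema attributes to attributes of individual
    # sources, and a naive rewriting of a global query into per-source queries.

    # Metadata: global attribute -> list of (source, source attribute) mappings.
    METADATA = {
        "customer_email": [("crm_db", "email"), ("web_logs", "user_mail")],
        "customer_name":  [("crm_db", "full_name")],
    }

    def rewrite(global_attributes):
        # Group the requested global attributes into one projection per source.
        per_source = {}
        for g in global_attributes:
            for source, attr in METADATA.get(g, []):
                per_source.setdefault(source, []).append((g, attr))
        return per_source

    # A virtual-integration query asking for two global attributes.
    for source, attrs in rewrite(["customer_email", "customer_name"]).items():
        projection = ", ".join(f"{a} AS {g}" for g, a in attrs)
        print(f"SELECT {projection} FROM {source}")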