646 research outputs found

    Data quality evaluation through data quality rules and data provenance.

    The application and exploitation of large amounts of data play an ever-increasing role in today’s research, government, and economy. Data understanding and decision making rely heavily on high-quality data; therefore, in many different contexts, it is important to assess the quality of a dataset in order to determine whether it is suitable for a specific purpose. Moreover, as access to and exchange of datasets have become easier and more frequent, and as scientists increasingly use the World Wide Web to share scientific data, there is a growing need to know the provenance of a dataset (i.e., information about the processes and data sources that led to its creation) in order to evaluate its trustworthiness. In this work, data quality rules and data provenance are used to evaluate the quality of datasets. Concerning the first topic, the applied solution consists of identifying types of data constraints that can be useful as data quality rules and developing a software tool to evaluate a dataset on the basis of a set of rules expressed in XML. We selected some of the data constraints and dependencies already considered in the data quality field, but we also used order dependencies and existence constraints as quality rules. In addition, we developed some algorithms to discover the types of dependencies used in the tool. To deal with the provenance of data, the Open Provenance Model (OPM) was adopted, an experimental query language for querying OPM graphs stored in a relational database was implemented, and an approach to designing OPM graphs was proposed.

    Relational Extensions : Object-Relational and XML Extensions

    Module 2 of the book Database Architecture. UOC, 20122022/202

    Providing in RDBMSs the Flexibility to Work with Various Non-Relational Data Models

    The inability of pure relational DBMSs to meet the new requirements of applications that have emerged on the web has led to the advent of NoSQL DBMSs. In recent years, significant progress has been made in integrating into relational DBMSs the features essential for addressing these new requirements, which mainly concern flexibility, performance, horizontal scaling, and very high availability. This paper focuses on the features that can enable relational DBMSs to provide applications with the flexibility to work with various non-relational data models while providing the guarantees of independence, integrity, and performance of query evaluation.

    An OCL-Based approach to derive constraint test cases for database applications

    The development of database applications in most CASE tools has been insufficient because most of these tools do not provide the software necessary to validate these applications. Validation means checking whether a given application fulfils the user requirements. We suggest validating database applications by using the functional testing technique, which is a fundamental black-box testing technique for checking the software without being concerned about its implementation and structure. Our main contribution in this work is an MDA approach for deriving testing software from the OCL specification of the integrity constraints. This testing software is used to validate the database applications that enforce these constraints. The generated testing software includes three components: validation queries, test cases, and initial data inserted before the testing process. Our approach is implemented as an add-in tool in Rational Rose called OCL2TestSW. This work has been partially supported by the project Thuban: Natural Interaction Platform for Virtual Attending in Real Environments (TIN2008-02711), and also by the Spanish research project MA2VICMR: Improving the access, analysis and visibility of the multilingual and multimedia information in web for the Region of Madrid (S2009/TIC-1542).

    Global Semantic Integrity Constraint Checking for a System of Databases

    In today’s emerging information systems, it is natural to have data distributed across multiple sites. We define a System of Databases (SyDb) as a collection of autonomous and heterogeneous databases. R-SyDb (System of Relational Databases) is a restricted form of SyDb, referring to a collection of independent relational databases. Similarly, X-SyDb (System of XML Databases) refers to a collection of XML databases. Global integrity constraints ensure the integrity and consistency of data spanning multiple databases. In this dissertation, we present (i) Constraint Checker, a general framework of a mobile-agent-based approach for checking global constraints on R-SyDb, and (ii) XConstraint Checker, a general framework for checking global XML constraints on X-SyDb. Furthermore, we formalize multiple efficient algorithms for varying semantic integrity constraints involving both arithmetic and aggregate predicates. The algorithms take as input an update statement and a list of all global semantic integrity constraints with arithmetic or aggregate predicates, and output sub-constraints to be executed on remote sites. The algorithms are efficient because (i) the constraint check is carried out at compile time, i.e., before executing the update statement, so time and resources are saved by avoiding rollbacks, and (ii) the implementation exploits parallelism. We have also implemented a prototype of the systems and algorithms for both R-SyDb and X-SyDb, and we present performance evaluations of the system.

    Proceedings of Monterey Workshop 2001 Engineering Automation for Software Intensive System Integration

    The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, Air Force Office of Scientific Research, Army Research Office, and the Defense Advanced Research Projects Agency. It is our pleasure to thank the workshop advisory and sponsors for their vision of a principled engineering solution for software and for their tireless effort over many years in supporting a series of workshops to bring everyone together. This workshop is the 8th in a series of international workshops. It was held at the Monterey Beach Hotel, Monterey, California, during June 18-22, 2001. The general theme of the workshop series has been to present and discuss research that aims at increasing the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops focused on issues including "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario". Approved for public release, distribution unlimited.

    Relaxed Functional Dependencies - A Survey of Approaches

    Recently, there has been renewed interest in functional dependencies due to the possibility of employing them in several advanced database operations, such as data cleaning, query relaxation, record matching, and so forth. In particular, the constraints defined for canonical functional dependencies have been relaxed to capture inconsistencies in real data, patterns of semantically related data, or semantic relationships in complex data types. In this paper, we survey 35 such functional dependencies, providing classification criteria, motivating examples, and a systematic analysis of them.

    A Formal Approach to Specification, Analysis and Implementation of Policy-Based Systems

    The design of modern computing systems largely exploits structured sets of declarative rules called policies. Their principled use permits controlling a wide variety of system aspects and achieving separation of concerns between the managing and functional parts of systems. These so-called policy-based systems are utilised within different application domains, from network management and autonomic computing to access control and emergency handling. The various policy-based proposals from the literature, however, lack a comprehensive methodology supporting the whole life-cycle of system development: specification, analysis and implementation. In this thesis we propose formally-defined, tool-assisted methodologies for supporting the development of policy-based access control and autonomic computing systems. We first present FACPL, a formal language that defines a core, yet expressive, syntax for the specification of attribute-based access control policies. On the basis of its denotational semantics, we devise a constraint-based analysis approach that enables the automatic verification of different properties of interest on policies. We then present PSCEL, a FACPL-based formal language for the specification of autonomic computing systems. FACPL policies are employed to enforce authorisation controls and context-dependent adaptation strategies. To statically point out the effects of policies on system behaviours, we rely again on a constraint-based analysis approach and reason on progress properties of PSCEL systems. The implementation of the languages and their analyses provides us with some practical software tools. The effectiveness of the proposed solutions is illustrated through real-world case studies from the e-Health and autonomic computing domains.

    The 2nd Twente Data Management Workshop (TDM'06) on Uncertainty in Databases


    Maintaining Integrity Constraints in Semantic Web

    As an expressive knowledge representation language for the Semantic Web, the Web Ontology Language (OWL) plays an important role in areas like science and commerce. The problem of maintaining integrity constraints arises because OWL employs the Open World Assumption (OWA) as well as the Non-Unique Name Assumption (NUNA). These assumptions are typically suitable for representing knowledge distributed across the Web, where complete knowledge about a domain cannot be assumed, but they make it challenging to use OWL itself for closed-world integrity constraint validation. Integrity constraints (ICs) on ontologies have to be enforced; otherwise conflicting results would be derivable from the same knowledge base (KB). The current trends of incorporating ICs into OWL are based on its query language SPARQL, alternative semantics, or logic programming. These methods usually suffer from the limited types of constraints they can handle and/or inherited computational expense. This dissertation presents a comprehensive and efficient approach to maintaining integrity constraints. The design enforces data consistency throughout the OWL life cycle, including the processes of OWL generation, maintenance, and interactions with other ontologies. For OWL generation, the Paraconsistent model is used to maintain integrity constraints during the relational database to OWL translation process. Then a new rule-based language with set extension is introduced as a platform that allows users to specify constraints, along with a demonstration of 18 commonly used constraints written in this language. In addition, a new constraint maintenance system, called Jena2Drools, is proposed and implemented to show its effectiveness and efficiency. To further handle inconsistencies among multiple distributed ontologies, this work constructs a framework to break down global constraints into several sub-constraints for efficient parallel validation.