
    A Review of integrity constraint maintenance and view updating techniques

    Two interrelated problems may arise when updating a database. On one hand, when an update is applied to the database, integrity constraints may become violated. In such a case, the integrity constraint maintenance approach tries to obtain additional updates to keep the integrity constraints satisfied. On the other hand, when updates of derived or view facts are requested, a view updating mechanism must be applied to translate the update request into correct updates of the underlying base facts. This survey reviews the research performed on integrity constraint maintenance and view updating. We propose a general framework to classify and compare methods that tackle integrity constraint maintenance and/or view updating. We then analyze some of these methods in more detail to identify their actual contribution and the main limitations they may present.
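
    The two problems are easy to see on a toy example. The sketch below is illustrative only: the relation names, the referential constraint, and the repair policy are assumptions of ours and are not taken from any particular surveyed method.

        # Base facts and a derived view, kept as plain Python sets of tuples.
        emp  = {("alice", "sales")}          # emp(name, dept)
        dept = {("sales",)}                  # dept(name)

        # View: works_in(name, dept) is derived directly from emp.
        def works_in():
            return set(emp)

        # Integrity constraint: every department referenced in emp must exist in dept.
        def violations():
            return {d for (_, d) in emp if (d,) not in dept}

        # Integrity constraint maintenance: after a user update violates the
        # constraint, derive an additional update (here: insert the missing
        # department) that restores consistency.
        emp.add(("bob", "hr"))               # user update; violates the constraint
        for d in violations():
            dept.add((d,))                   # repair update found by the maintenance step
        assert not violations()

        # View updating: a requested insertion into the view is translated into
        # an insertion on the underlying base table that produces that view fact.
        def insert_into_view(name, d):
            emp.add((name, d))               # translation into a base update
            if (d,) not in dept:
                dept.add((d,))               # keep the constraint satisfied as well

        insert_into_view("carol", "sales")
        assert ("carol", "sales") in works_in()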

    Functorial Data Migration

    In this paper we present a simple database definition language: that of categories and functors. A database schema is a small category and an instance is a set-valued functor on it. We show that morphisms of schemas induce three "data migration functors", which translate instances from one schema to the other in canonical ways. These functors parameterize projections, unions, and joins over all tables simultaneously and can be used in place of conjunctive and disjunctive queries. We also show how to connect a database and a functional programming language by introducing a functorial connection between the schema and the category of types for that language. We begin the paper with a multitude of examples to motivate the definitions, and near the end we provide a dictionary whereby one can translate database concepts into category-theoretic concepts and vice-versa.
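
    As a minimal sketch (not the paper's formalism: the schema names and the dictionary encoding are our own assumptions), the pullback data migration functor induced by a schema morphism is simply precomposition of the instance with the morphism:

        # A schema is a set of objects and a set of named arrows between them.
        # An instance assigns a set to every object and a function to every arrow
        # (a "set-valued functor"); here sets are Python sets and functions are dicts.

        # Target schema D: employees with a department arrow.
        D_objects = {"Emp", "Dept"}
        D_arrows  = {"dept": ("Emp", "Dept")}

        # An instance of D.
        I = {
            "Emp":  {"alice", "bob"},
            "Dept": {"sales", "hr"},
            "dept": {"alice": "sales", "bob": "hr"},
        }

        # Source schema C: a single object "Person" with no arrows.
        C_objects = {"Person"}
        C_arrows  = {}

        # A schema morphism F: C -> D sends objects to objects and arrows to arrows.
        F_on_objects = {"Person": "Emp"}
        F_on_arrows  = {}

        def delta(F_obj, F_arr, instance):
            """Pullback data migration: precompose the instance with F.
            The migrated instance assigns to each C-object the set that the
            instance assigns to its F-image (and likewise for arrows)."""
            migrated = {c: instance[d] for c, d in F_obj.items()}
            migrated.update({c: instance[d] for c, d in F_arr.items()})
            return migrated

        # Migrating I along F "projects" the Emp table onto the Person table.
        print(delta(F_on_objects, F_on_arrows, I))   # {'Person': {'alice', 'bob'}} (set order may vary)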

    Asynchronous replication of eventually consistent updatable views

    Users of software applications expect fast response times and high availability, even as applications move from local devices into the cloud. A cloud-based application that could function locally becomes unavailable if a network partition occurs. A fundamental challenge in distributed systems is maintaining the right tradeoffs between strong consistency, high availability, and tolerance to network partitions. The impossibility of achieving all three properties is described by the CAP theorem. To guarantee the highest degree of responsiveness and availability, applications can be run entirely locally on a device without directly relying on cloud services. Software that can run locally without a direct dependency on cloud services is called local-first software. Being local-first means that consistency guarantees may need to be relaxed: weaker consistency, such as eventual consistency, can be used instead of strong consistency. Implementing conflict-free replicated data types (CRDTs) is a provably correct way to achieve eventual consistency. These data types guarantee that the state of different replicas will converge towards a common state once the system becomes connected and quiescent. The drawback of using CRDTs is that they are unbounded in their growth, which means they can quickly become too large to handle on less capable devices such as smartphones, tablets, or other edge devices. To mitigate this, partial replication can be used to replicate only the data each device needs. This comes with the added benefit of limiting the information users obtain, thus potentially improving security and privacy. The main contribution of this thesis is a new approach to partial replication. It is based on an existing asynchronously replicated relational database, supports local-first software, and guarantees eventual consistency. The new approach uses database views to define partial replicas. The database views are made updatable by drawing inspiration from the large body of research on updatable views. We differentiate ourselves from earlier work on non-distributed updatable views by guaranteeing that the views are eventually consistent. The approach is evaluated against realistic usage scenarios and has proved usable in them. The replication of database views has also been tested experimentally to ensure that our approach to partial replication is viable for less capable devices.
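
    The convergence guarantee that CRDTs provide can be sketched with a textbook example (a state-based grow-only set; this is not the thesis's replication protocol, and the names are illustrative):

        # A state-based grow-only set (G-Set), one of the simplest CRDTs.
        # Replicas apply local updates independently; merging is a set union,
        # which is commutative, associative, and idempotent, so all replicas
        # converge to the same state once they have exchanged their states.

        class GSet:
            def __init__(self):
                self.items = set()

            def add(self, item):          # local update, no coordination needed
                self.items.add(item)

            def merge(self, other):       # join of the two replica states
                self.items |= other.items

            def value(self):
                return set(self.items)

        # Two replicas diverge while partitioned...
        a, b = GSet(), GSet()
        a.add("row-1"); a.add("row-2")
        b.add("row-3")

        # ...and converge once they merge, in either order.
        a.merge(b)
        b.merge(a)
        assert a.value() == b.value() == {"row-1", "row-2", "row-3"}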

    Efficient Management of Short-Lived Data

    Motivated by the increasing prominence of loosely-coupled systems, such as mobile and sensor networks, which are characterised by intermittent connectivity and volatile data, we study the tagging of data with so-called expiration times. More specifically, when data are inserted into a database, they may be tagged with time values indicating when they expire, i.e., when they are regarded as stale or invalid and thus are no longer considered part of the database. In a number of applications, expiration times are known and can be assigned at insertion time. We present data structures and algorithms for online management of data tagged with expiration times. The algorithms are based on fully functional, persistent treaps, which are a combination of binary search trees with respect to a primary attribute and heaps with respect to a secondary attribute. The primary attribute implements primary keys, and the secondary attribute stores expiration times in a minimum heap, thus keeping a priority queue of tuples to expire. A detailed and comprehensive experimental study demonstrates the well-behavedness and scalability of the approach as well as its efficiency with respect to a number of competitors.
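
    The underlying structure can be sketched as follows (a minimal, non-persistent treap; the paper uses fully functional, persistent treaps, and the key names and time handling here are assumptions). The tree is ordered as a binary search tree on the primary key and as a min-heap on the expiration time, so the next tuple to expire is always at the root.

        import time

        class Node:
            def __init__(self, key, expires, value):
                self.key, self.expires, self.value = key, expires, value
                self.left = self.right = None

        def rotate_right(n):
            l = n.left
            n.left, l.right = l.right, n
            return l

        def rotate_left(n):
            r = n.right
            n.right, r.left = r.left, n
            return r

        def insert(root, key, expires, value):
            # BST insert on the key, then rotations restore the min-heap on expires.
            if root is None:
                return Node(key, expires, value)
            if key < root.key:
                root.left = insert(root.left, key, expires, value)
                if root.left.expires < root.expires:
                    root = rotate_right(root)
            else:
                root.right = insert(root.right, key, expires, value)
                if root.right.expires < root.expires:
                    root = rotate_left(root)
            return root

        def delete_root(root):
            # Rotate the root down (promoting the child that expires sooner)
            # until it can be dropped without breaking the BST ordering.
            if root.left is None:
                return root.right
            if root.right is None:
                return root.left
            if root.left.expires < root.right.expires:
                root = rotate_right(root)
                root.right = delete_root(root.right)
            else:
                root = rotate_left(root)
                root.left = delete_root(root.left)
            return root

        def expire(root, now):
            # Pop expired tuples from the top of the priority queue.
            while root is not None and root.expires <= now:
                root = delete_root(root)
            return root

        root = None
        root = insert(root, "k1", time.time() + 60, "fresh")
        root = insert(root, "k2", time.time() - 1, "stale")
        root = expire(root, time.time())          # removes "k2", keeps "k1"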

    Updating XML Views

    Update operations over XML views are essential for applications using XML views. In this dissertation work, we provide scalable solutions to support updating through XML views defined over relational databases. In particular, we focus on the update-public semantic, where updates are always public (made to the public database), and the update-local semantic, where update effects are first kept local and then made public as and when required. Towards this, we propose the clean extended-source theory for determining whether a correct view update translation exists, which then serves as a theoretical foundation for designing practical XML view updating algorithms. Under the update-public semantic, state-of-the-art view updating work focuses on identifying the correct update translation purely from the data. We instead take a schema-centric approach, which utilizes the schema of the underlying source to effectively prune updates that are guaranteed not to be translatable and to pass updates that are guaranteed to be translatable directly to the SQL engine. Only those updates that cannot be classified using schema knowledge are finally analyzed by examining the data. This required data-level check is further optimized under schema guidance to prune the search space for finding a correct translation. As the first work addressing the update-local semantic, we propose a practical framework, called LoGo. LoGo localizes the view update translation while preserving the properties that views are side-effect free and updates are always updatable. LoGo also supports on-demand merging of the local database of the subject view into the public database (also called the global database), while still guaranteeing that the subject view remains free of side effects. A flexible synchronization service is provided in LoGo that enables all other views defined over the same public database to be refreshed, i.e., synchronized with the publicly committed changes, if so desired. Further, given that XML is an ordered data model, we propose an order-sensitive solution named O-HUX to support XML view updating with order. We have implemented the algorithms, along with respective optimization techniques. Experimental results confirm the effectiveness of the proposed services and highlight their performance characteristics.
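
    The data-level check mentioned above amounts to verifying that a candidate base-table translation reproduces exactly the requested view change and nothing more. A minimal sketch (a selection view over a single relational table; the table, names, and in-memory representation are our assumptions, not the dissertation's algorithms):

        # A view defined by a selection over one base table, and a check that a
        # candidate translation of a view insertion causes no side effects.

        orders = {(1, "open"), (2, "closed")}          # base table: (id, status)

        def open_orders():                             # view: status = 'open'
            return {t for t in orders if t[1] == "open"}

        def insert_into_view(tuple_):
            before = open_orders()
            candidate = set(orders) | {tuple_}         # candidate translation: base insert
            after = {t for t in candidate if t[1] == "open"}
            # Side-effect free iff the view gains exactly the requested tuple.
            if after == before | {tuple_}:
                orders.clear()
                orders.update(candidate)
                return True
            return False                               # reject an untranslatable update

        assert insert_into_view((3, "open"))           # accepted
        assert not insert_into_view((4, "closed"))     # rejected: would not appear in the view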

    Technology for large-scale translation of clinical practice guidelines: a pilot study of the performance of a hybrid human and computer-assisted approach

    Background: The construction of EBMPracticeNet, a national electronic point-of-care information platform in Belgium, was initiated in 2011 to optimize quality of care by promoting evidence-based decision-making. The project involved, among other tasks, the translation of 940 EBM Guidelines of Duodecim Medical Publications from English into Dutch and French. Considering the scale of the translation process, it was decided to make use of computer-aided translation performed by certificated translators with limited expertise in medical translation. Our consortium used a hybrid approach, involving a human translator supported by a translation memory (using SDL Trados Studio), terminology recognition from medical termbases (using SDL Multiterm termbases), and support from online machine translation. This has resulted in a validated translation memory which is now in use for the translation of new and updated guidelines. Objective: The objective of this study was to evaluate the performance of the hybrid human and computer-assisted approach in comparison with translation unsupported by translation memory and terminology recognition. A comparison was also made with the translation efficiency of an expert medical translator. Methods: We conducted a pilot trial in which two sets of 30 new and 30 updated guidelines were randomized to one of three groups. Comparable guidelines were translated (a) by certificated junior translators without medical specialization using the hybrid method, (b) by an experienced medical translator without this support, and (c) by the same junior translators without the support of the validated translation memory. A medical proofreader, who was blinded to the translation procedure, evaluated the translated guidelines for acceptability and adequacy. Translation speed was measured by recording translation and post-editing time. The Human Translation Edit Rate was calculated as a metric to evaluate the quality of the translation. A further evaluation was made of translation acceptability and adequacy. Results: The average number of words per guideline was 1,195 and the mean total translation time was 100.2 min/1,000 words. No meaningful differences were found in the translation speed for new guidelines. The translation of updated guidelines was 59 min/1,000 words faster (95% CI 2-115; P=.044) in the computer-aided group. Revisions due to terminology accounted for one third of the overall revisions by the medical proofreader. Conclusions: Use of the hybrid human and computer-aided translation by a non-expert translator makes the translation of updates of clinical practice guidelines faster and cheaper because of the benefits of translation memory. For the translation of new guidelines there was no apparent benefit in comparison with the efficiency of translation unsupported by translation memory (whether by an expert or non-expert translator).

    Towards an Efficient Evaluation of General Queries

    Database applications often require the evaluation of queries containing quantifiers or disjunctions, e.g., for handling general integrity constraints. Existing efficient methods for processing quantifiers depart from the relational model, as they rely on non-algebraic procedures. Looking at quantified query evaluation from a new angle, we propose an approach to process quantifiers that makes use of relational algebra operators only. Our approach proceeds in two phases. The first phase normalizes the queries, producing a canonical form. This form makes it possible to improve the translation into relational algebra performed during the second phase. The improved translation relies on a new operator, the complement-join, which generalizes the set difference; on algebraic expressions of universal quantifiers that avoid the expensive division operator in many cases; and on a special processing of disjunctions by means of constrained outer-joins. Our method achieves efficiency at least comparable to that of previous proposals, and better in most cases. Furthermore, it is considerably simpler to implement, as it completely relies on relational data structures and operators.
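
    As background for why avoiding relational division matters, the sketch below contrasts two equivalent formulations of the classical universally quantified query "suppliers that supply all parts" (the toy relations are our assumptions, and this illustrates only the classical rewriting via set difference, not the paper's complement-join operator):

        # Universal quantification: suppliers that supply *all* parts.
        supplies  = {("s1", "p1"), ("s1", "p2"), ("s2", "p1")}
        parts     = {"p1", "p2"}
        suppliers = {s for (s, _) in supplies}

        # Division-style: for each supplier, check that no required part is missing.
        def via_division():
            return {s for s in suppliers
                    if all((s, p) in supplies for p in parts)}

        # Difference/anti-join style: start from all suppliers and subtract those
        # for which some required (supplier, part) pair is absent.
        def via_difference():
            missing = {(s, p) for s in suppliers for p in parts} - supplies
            return suppliers - {s for (s, _) in missing}

        assert via_division() == via_difference() == {"s1"}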