3 research outputs found

    Static Analysis of Partial Referential Integrity for Better Quality SQL Data

    Referential integrity ensures the consistency of data between database relations. The SQL standard proposes different semantics to deal with partial information under referential integrity. Simple semantics neglects tuples with nulls, and enjoys built-in support by commercial database systems. Partial semantics does check tuples with nulls, but does not enjoy built-in support. We investigate this mismatch between the SQL standard and real database systems. Indeed, insight is gained into the trade-off between cleaner data under partial semantics and the efficiency of checking simple semantics. The cost for referential integrity checking is evaluated for various dataset sizes, indexing structures and degrees of cleanliness. While the cost of partial semantics exceeds that of simple semantics, their performance trends follow similar patterns under growing database sizes. Applying multiple index structures and exploiting appropriate validation mechanisms increase the efficiency of checking partial semantics.
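
The difference between the two semantics named in this abstract can be illustrated with a minimal sketch. It assumes the usual reading of the SQL standard's MATCH SIMPLE and MATCH PARTIAL options for a multi-column foreign key; the function names and the sample rows are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple

# A foreign-key or referenced-key value; None stands for SQL NULL.
Row = Tuple[Optional[str], ...]


def satisfies_simple(fk: Row, parent_keys: Sequence[Row]) -> bool:
    """Simple semantics: a tuple with any null is not checked at all."""
    if any(value is None for value in fk):
        return True
    return fk in set(parent_keys)


def satisfies_partial(fk: Row, parent_keys: Sequence[Row]) -> bool:
    """Partial semantics: the non-null components must agree with
    at least one referenced key; an all-null tuple is accepted."""
    if all(value is None for value in fk):
        return True
    return any(
        all(c is None or c == p for c, p in zip(fk, key))
        for key in parent_keys
    )


# Hypothetical referenced keys: (country, city).
parents = [("DE", "Berlin"), ("FR", "Paris")]

clean = ("DE", None)   # some referenced key has country "DE"
dirty = ("IT", None)   # no referenced key has country "IT"
```

Under these assumptions `dirty` passes the simple check because it contains a null, but fails the partial check, which is the kind of tuple the abstract describes as neglected by simple semantics. The linear scan in `satisfies_partial` is only for illustration; the abstract indicates that index structures matter for the cost of this check.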

    XML-Based Heterogeneous Database Integration For Data Warehouse Creation

    Refining imprecise data by integrity constraints

    Uncertain data in databases were originally denoted as null values, which represent the meaning of “values unknown at present.” Null values were generalized into partial values, which correspond to a set of possible values, to provide a more powerful notion. In this paper, we derive some properties to refine partial values into more informative ones. In some cases, they can even be refined into definite values. Such a refinement is possible when there exist range constraints on attribute domains, or referential integrities, functional dependencies, or multivalued dependencies among attributes. Our work actually eliminates redundant elements in a partial value. By this process, we not only provide a more concise and informative answer to users, but also speed up the computation of queries issued afterward. Besides, it reduces the communication cost when imprecise data are requested to be transmitted from one site to another site in a distributed environment.
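
The idea of refining a partial value can be sketched as follows. This is an illustration only, assuming a partial value is modelled as a plain set of candidate values and that a range constraint and a referential integrity constraint each rule out some candidates; the abstract does not state the paper's actual refinement properties, and all names and sample values below are hypothetical.

```python
from typing import Callable, Set


def refine_partial_value(
    possible: Set[int],
    in_range: Callable[[int], bool],
    referenced: Set[int],
) -> Set[int]:
    """Drop candidates that violate the range constraint or that do
    not occur among the referenced key values."""
    return {v for v in possible if in_range(v) and v in referenced}


# Hypothetical data: a department number known only as a set of
# candidates, a domain range of 1..50, and the existing department keys.
dept_keys = {10, 20, 40}


def valid_range(v: int) -> bool:
    return 1 <= v <= 50


refined = refine_partial_value({10, 20, 30, 99}, valid_range, dept_keys)
definite = refine_partial_value({20, 30}, valid_range, dept_keys)
```

Here `refined` keeps only the candidates consistent with both constraints, and `definite` shrinks to a single element, which corresponds to the case the abstract describes as refinement into a definite value.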