
    Agreement graphs and data dependencies

    The problem of deciding whether a join dependency [R] and a set F of functional dependencies logically imply an embedded join dependency [S] is known to be NP-complete. It is shown that if the set F of functional dependencies is required to be embedded in R, the problem can be decided in polynomial time. The problem is approached by introducing agreement graphs, a type of graph structure which helps expose the combinatorial structure of dependency implication problems. Agreement graphs provide an alternative formalism to tableaus and extend the application of graph and hypergraph theory in relational database research. Agreement graphs are also given a more abstract definition and are used to define agreement graph dependencies (AGDs). It is shown that AGDs are equivalent to Fagin's (unirelational) embedded implicational dependencies. A decision method is given for the AGD implication problem. Although the implication problem for AGDs is undecidable, the decision method works in many cases and lends insight into dependency implication. A number of properties of agreement graph dependencies are given and directions for future research are suggested.

    Student copula method in rainfall distribution

    Copulas are tools for modelling the dependence among several random variables. The term copula was first used by Sklar (1959) and is derived from the Latin word copulare, to connect or to join. The main purpose of copulas is to describe the interrelation of several random variables (Thorsten Schmidt, 2006). A copula is a function that joins two distributions and is also known as a dependence function. A copula connects a multivariate distribution function to its univariate marginal distributions. When two models exhibit dependence, they can be joined into a single model through their marginal functions, so that the dependence is accounted for. Copulas therefore play an important role in joining multivariate distributions to their one-dimensional marginal distribution functions.
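
The link between a joint distribution and its marginals that this abstract describes is usually stated as Sklar's theorem. The block below gives the bivariate form and the standard definition of the Student t copula named in the title. The symbols H, F, G, C, ν and ρ are notation assumed here for illustration; the abstract itself does not fix any notation.

```latex
% Sklar's theorem, bivariate case: H is the joint distribution function,
% F and G are its univariate marginals, and C is the copula.
H(x, y) = C\bigl(F(x),\, G(y)\bigr), \qquad C : [0,1]^2 \to [0,1]

% Student t copula with \nu degrees of freedom and correlation \rho:
% t_{\nu,\rho} is the bivariate Student t distribution function and
% t_\nu^{-1} is the quantile function of the univariate Student t.
C_{\nu,\rho}(u, v) = t_{\nu,\rho}\bigl(t_\nu^{-1}(u),\, t_\nu^{-1}(v)\bigr)
```

When F and G are continuous, the copula C in the first identity is unique.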

    Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems

    The performance of joins in parallel database management systems is critical for data-intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bottlenecks. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by removing the dependency on global skew knowledge and broadcasting. In this paper, we propose PRPQ (partial redistribution & partial query), an efficient and robust join algorithm for processing large-scale joins over distributed systems. We present the detailed implementation and a quantitative evaluation of our method. The experimental results demonstrate that the proposed PRPQ algorithm is indeed robust and scalable under a wide range of skew conditions. Specifically, compared to the state-of-the-art PRPD method, we achieve a 16% to 167% performance improvement and 24% to 54% less network communication under different join workloads.
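
The load imbalance that skew causes under plain hash partitioning can be illustrated with a small sketch. This is not the PRPQ or PRPD algorithm, whose details the abstract does not give; it only shows the problem those methods address. The function names, the node count, and the synthetic key sets are all assumptions made for illustration.

```python
import zlib


def partition_loads(keys, num_nodes):
    """Count how many tuples each node receives under hash partitioning."""
    loads = [0] * num_nodes
    for key in keys:
        # crc32 is used because it is deterministic across runs,
        # unlike the built-in hash() for strings.
        node = zlib.crc32(str(key).encode()) % num_nodes
        loads[node] += 1
    return loads


def imbalance(loads):
    """Ratio of the busiest node's load to the mean load (1.0 is perfectly even)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical inputs: 10,000 join keys spread over 4 nodes.
uniform_keys = list(range(10_000))
# One hot key accounts for half of all tuples.
skewed_keys = [0] * 5_000 + list(range(1, 5_001))

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print("uniform imbalance:", imbalance(uniform_loads))
print("skewed imbalance:", imbalance(skewed_loads))
```

Because every tuple with the hot key hashes to the same node, that node receives at least half of the input regardless of how many nodes are added, which is why skew-aware methods treat hot keys separately.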