13 research outputs found

    Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases

    Uncertain sequence databases are widely used to model data with inaccurate or imprecise timestamps in many real-world applications. In this paper, we use uniform distributions to model uncertain timestamps and adopt possible world semantics to interpret temporal uncertain databases. We design an incremental approach to manage temporal uncertainty efficiently and integrate it into the classic pattern-growth sequential pattern mining (SPM) algorithm to mine uncertain sequential patterns. Extensive experiments show that our algorithm performs well in both efficiency and scalability.
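    The possible world semantics mentioned in this abstract can be illustrated with a small brute-force sketch. It is not the incremental approach the paper describes; it assumes (hypothetically) that each event's timestamp is uniform over a small range of integer ticks, that events are independent, and that a pattern needs strictly increasing timestamps. The names `support_probability` and `_supports` are made up for this example.

```python
from fractions import Fraction
from itertools import product


def _supports(events, stamps, pattern):
    """Check one possible world: does the pattern occur in timestamp order?"""
    last = None
    for item in pattern:
        # Greedy match: take the earliest occurrence strictly after the last match.
        candidates = [
            t
            for (name, _lo, _hi), t in zip(events, stamps)
            if name == item and (last is None or t > last)
        ]
        if not candidates:
            return False
        last = min(candidates)
    return True


def support_probability(events, pattern):
    """Probability that an uncertain sequence supports `pattern`.

    events  -- list of (item, lo, hi); the timestamp is assumed uniform
               over the integers lo..hi, independently for each event
    pattern -- list of items that must appear at strictly increasing times
    """
    ranges = [range(lo, hi + 1) for _item, lo, hi in events]
    # Every possible world is equally likely under the uniform assumption.
    world_prob = Fraction(1)
    for r in ranges:
        world_prob /= len(r)
    total = Fraction(0)
    for stamps in product(*ranges):  # enumerate all possible worlds
        if _supports(events, stamps, pattern):
            total += world_prob
    return total


if __name__ == "__main__":
    seq = [("a", 1, 3), ("b", 2, 4)]
    print(support_probability(seq, ["a", "b"]))
```

    Enumerating worlds grows exponentially with the number of uncertain events, which is the cost an efficient method has to avoid; the sketch is useful only as a reference for checking small cases.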

    Mining Uncertain Sequential Patterns in Iterative MapReduce

    This paper proposes a sequential pattern mining (SPM) algorithm for large-scale uncertain databases. Uncertain sequence databases are widely used to model inaccurate or imprecise timestamped data in many real applications, where traditional SPM algorithms are inapplicable because of data uncertainty and scalability limits. In this paper, we develop an efficient approach to manage data uncertainty in SPM and design an iterative MapReduce framework to execute the uncertain SPM algorithm in parallel. We conduct extensive experiments on both synthetic and real uncertain datasets, and the results show that our algorithm is efficient and scalable.

    A Truly Dynamic Data Structure for Top-k Queries on Uncertain Data

    Top-k queries allow end-users to focus on the most important (top-k) answers amongst those which satisfy the query. In traditional databases, a user-defined score function assigns a score value to each tuple, and a top-k query returns the k tuples with the highest scores. In an uncertain database, the top-k answer depends not only on the scores but also on the membership probabilities of tuples. Several top-k definitions covering different aspects of the score-probability interplay have been proposed in the recent past [20, 13, 6, 18]. Most of the existing work in this research field is focused on developing efficient algorithms for answering top-k queries on static uncertain data. Any change in the underlying data (insertion or deletion of a tuple, or a change in a tuple's membership probability or score) forces re-computation of query answers. Such re-computations are not practical considering the dynamic nature of data in many applications. In this paper, we propose a truly dynamic data structure that uses the ranking function PRF^e(α) proposed by Li et al. [18] under the generally adopted model of x-relations [21]. PRF^e can effectively approximate various other top-k definitions on uncertain data based on the value of the parameter α. An x-relation consists of a number of x-tuples, where an x-tuple is a set of mutually exclusive tuples (up to a constant number) called alternatives. Each x-tuple in a relation randomly instantiates into one tuple from its alternatives. For an uncertain relation with N tuples, our structure can answer top-k queries in O(k log N) time, handle an update in O(log N) time, and takes O(N) space. Finally, we evaluate the practical efficiency of our structure on both synthetic and real data. © 2011 Springer-Verlag Berlin Heidelberg
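    To make the ranking function in this abstract concrete, here is a minimal static sketch, not the dynamic structure the paper proposes. It assumes that PRF^e(α) assigns a tuple t the value Σ_i α^i · Pr(rank(t) = i), that every x-tuple has a single alternative (so tuples are independent), that scores are distinct, and that α is a real number in (0, 1]. Under those assumptions the value of t reduces to p_t · α · Π (1 − p_j + p_j · α) over the higher-scored tuples j. The function names are hypothetical.

```python
def prf_e_values(tuples, alpha):
    """Compute the PRF^e(alpha) value of each tuple (independent tuples assumed).

    tuples -- list of (name, score, probability) with distinct scores
    alpha  -- real parameter, assumed here to lie in (0, 1]
    """
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)
    values = {}
    # Running product of (1 - p + p * alpha) over all higher-scored tuples.
    prefix = 1.0
    for name, _score, prob in ordered:
        values[name] = prob * alpha * prefix
        prefix *= 1.0 - prob + prob * alpha
    return values


def top_k(tuples, alpha, k):
    """Return the names of the k tuples with the largest PRF^e(alpha) values."""
    values = prf_e_values(tuples, alpha)
    return sorted(values, key=values.get, reverse=True)[:k]


if __name__ == "__main__":
    data = [("t1", 10, 0.5), ("t2", 8, 1.0)]
    print(prf_e_values(data, 0.5))
    print(top_k(data, 0.5, 1))
```

    This recomputes every value in one O(N log N) pass, so any insertion, deletion, or probability change forces a full rerun, which is the cost the abstract says a dynamic structure should avoid.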

    Diagnostic Value Of Microfilariae Search In Hydrocoele Fluid (A negative report)


    Ranking large temporal data


    Semantics of Ranking Queries for Probabilistic Data


    Discovering Influential Data Objects over Time
