    Structurally Tractable Uncertain Data

    Many data management applications must deal with data which is uncertain, incomplete, or noisy. However, on existing uncertain data representations, we cannot tractably perform the important query evaluation tasks of determining query possibility, certainty, or probability: these problems are hard on arbitrary uncertain input instances. We thus ask whether we could restrict the structure of uncertain data so as to guarantee the tractability of exact query evaluation. We present our tractability results for tree and tree-like uncertain data, and a vision for probabilistic rule reasoning. We also study uncertainty about order, proposing a suitable representation, and study uncertain data conditioned by additional observations. Comment: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium 201

    Range Queries on Uncertain Data

    Given a set $P$ of $n$ uncertain points on the real line, each represented by its one-dimensional probability density function, we consider the problem of building data structures on $P$ to answer range queries of the following three types for any query interval $I$: (1) top-$1$ query: find the point in $P$ that lies in $I$ with the highest probability, (2) top-$k$ query: given any integer $k \leq n$ as part of the query, return the $k$ points in $P$ that lie in $I$ with the highest probabilities, and (3) threshold query: given any threshold $\tau$ as part of the query, return all points of $P$ that lie in $I$ with probabilities at least $\tau$. We present data structures for these range queries with linear or nearly linear space and efficient query time. Comment: 26 pages. A preliminary version of this paper appeared in ISAAC 2014. In this full version, we also present solutions to the most general case of the problem (i.e., the histogram bounded case), which were left as open problems in the preliminary version.
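The three query types above can be illustrated with a naive linear-scan baseline. This is only a sketch under stated assumptions, not the data structures from the paper: each uncertain point is assumed to be uniformly distributed on an interval, and the names `UniformPoint`, `top_k`, and `threshold` are hypothetical.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class UniformPoint:
    """Hypothetical uncertain point, assumed uniform on [lo, hi]."""
    name: str
    lo: float
    hi: float

    def prob_in(self, a: float, b: float) -> float:
        # Probability mass of the uniform pdf that falls inside [a, b].
        overlap = min(self.hi, b) - max(self.lo, a)
        return max(0.0, overlap) / (self.hi - self.lo)


def top_k(points, a, b, k):
    """Top-k query by brute force: the k points most likely to lie in [a, b]."""
    return heapq.nlargest(k, points, key=lambda p: p.prob_in(a, b))


def threshold(points, a, b, tau):
    """Threshold query by brute force: points in [a, b] with probability >= tau."""
    return [p for p in points if p.prob_in(a, b) >= tau]


points = [
    UniformPoint("p1", 0.0, 4.0),
    UniformPoint("p2", 2.0, 4.0),
    UniformPoint("p3", 5.0, 9.0),
]
best = top_k(points, 2.0, 6.0, 1)          # top-1 is the case k = 1
likely = threshold(points, 2.0, 6.0, 0.5)
```

Every query here scans all $n$ points, which is the cost that dedicated data structures aim to avoid.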

    Integrating and Ranking Uncertain Scientific Data

    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve predicting the well-known). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates

    Model updating using uncertain experimental modal data

    The propagation of parameter uncertainty in structural dynamics has become a feasible method to determine the probabilistic description of the vibration response of industrial scale finite element models. Though methods for uncertainty propagation have been developed extensively, the quantification of parameter uncertainty has been neglected in the past. But a correct assumption for the parameter variability is essential for the estimation of the uncertain vibration response. This paper shows how to identify model parameter means and covariance matrix from uncertain experimental modal test data. The common gradient based approach from deterministic computational model updating was extended by an equation that accounts for the stochastic part. In detail an inverse approach for the identification of statistical parametric properties will be presented which will be applied on a numerical model of a replica of the GARTEUR SM-AG19 benchmark structure. The uncertain eigenfrequencies and mode shapes have been determined in an extensive experimental modal test campaign where the aircraft structure was tested repeatedly while it was 130 times dis- and reassembled in between each experimental modal analysis.

    Supporting User-Defined Functions on Uncertain Data

    Uncertain data management has become crucial in many sensing and scientific applications. As user-defined functions (UDFs) become widely used in these applications, an important task is to capture result uncertainty for queries that evaluate UDFs on uncertain data. In this work, we provide a general framework for supporting UDFs on uncertain data. Specifically, we propose a learning approach based on Gaussian processes (GPs) to compute approximate output distributions of a UDF when evaluated on uncertain input, with guaranteed error bounds. We also devise an online algorithm to compute such output distributions, which employs a suite of optimizations to improve accuracy and performance. Our evaluation using both real-world and synthetic functions shows that our proposed GP approach can outperform the state-of-the-art sampling approach with up to two orders of magnitude improvement for a variety of UDFs.

    On Markov Chains with Uncertain Data

    In this paper, a general method is described to determine uncertainty intervals for performance measures of Markov chains given an uncertainty region for the parameters of the Markov chains. We investigate the effects of uncertainties in the transition probabilities on the limiting distributions, on the state probabilities after n steps, on mean sojourn times in transient states, and on absorption probabilities for absorbing states. We show that the uncertainty effects can be calculated by solving linear programming problems in the case of interval uncertainty for the transition probabilities, and by second order cone optimization in the case of ellipsoidal uncertainty. Many examples are given, especially Markovian queueing examples, to illustrate the theory. Keywords: Markov chain; Interval uncertainty; Ellipsoidal uncertainty; Linear Programming; Second Order Cone Optimization
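As a small illustration of interval uncertainty in transition probabilities, consider the simplest possible case: a two-state chain. The sketch below is not the linear programming method of the paper. It assumes a chain with $P(0 \to 1) = p$ and $P(1 \to 0) = q$, for which the limiting probability of state 0 has the closed form $q/(p+q)$. Because this expression increases in $q$ and decreases in $p$, its extremes over a box of intervals are attained at interval endpoints. The function names are hypothetical.

```python
def stationary_pi0(p: float, q: float) -> float:
    """Limiting probability of state 0 for a two-state chain with
    P(0 -> 1) = p and P(1 -> 0) = q (requires p + q > 0)."""
    return q / (p + q)


def pi0_interval(p_range, q_range):
    """Uncertainty interval for pi_0 when p and q are only known to lie
    in the given closed intervals (assumed to contain positive values)."""
    p_lo, p_hi = p_range
    q_lo, q_hi = q_range
    # pi_0 decreases in p and increases in q, so the extremes sit at corners.
    lower = stationary_pi0(p_hi, q_lo)
    upper = stationary_pi0(p_lo, q_hi)
    return lower, upper


lower, upper = pi0_interval((0.1, 0.2), (0.3, 0.4))
```

For larger chains no such closed form is available in general, which is where an optimization formulation becomes necessary.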

    Improvements on the k-center problem for uncertain data

    In real applications, there are situations where we need to model some problems based on uncertain data. This leads us to define an uncertain model for some classical geometric optimization problems and propose algorithms to solve them. In this paper, we study the $k$-center problem for uncertain input. In our setting, each uncertain point $P_i$ is located independently from other points in one of several possible locations $\{P_{i,1},\dots,P_{i,z_i}\}$ in a metric space with metric $d$, with specified probabilities, and the goal is to compute $k$ centers $\{c_1,\dots,c_k\}$ that minimize the expected cost $Ecost(c_1,\dots,c_k)=\sum_{R\in\Omega} prob(R)\max_{i=1,\dots,n}\min_{j=1,\dots,k} d(\hat{P}_i,c_j)$, where $\Omega$ is the probability space of all realizations $R=\{\hat{P}_1,\dots,\hat{P}_n\}$ of the given uncertain points and $prob(R)=\prod_{i=1}^n prob(\hat{P}_i)$. In the restricted assigned version of this problem, an assignment $A:\{P_1,\dots,P_n\}\rightarrow\{c_1,\dots,c_k\}$ is given for any choice of centers and the goal is to minimize $Ecost_A(c_1,\dots,c_k)=\sum_{R\in\Omega} prob(R)\max_{i=1,\dots,n} d(\hat{P}_i,A(P_i))$. In the unrestricted version, the assignment is not specified and the goal is to compute $k$ centers $\{c_1,\dots,c_k\}$ and an assignment $A$ that minimize the above expected cost. We give several improved constant approximation factor algorithms for the assigned versions of this problem in a Euclidean space and in a general metric space. Our results significantly improve the results of \cite{guh} and generalize the results of \cite{wang} to any dimension. Our approach is to replace each uncertain point with a certain center point and study the properties of these certain points. The proposed algorithms are efficient and simple to implement.
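The expected cost of a fixed set of centers can be evaluated directly from its definition by enumerating every realization. The following is a minimal sketch, assuming Euclidean distance and a hypothetical input layout in which each uncertain point is a list of `(location, probability)` pairs. It is exponential in the number of points, so it only serves to check small instances and is not one of the approximation algorithms from the paper.

```python
from itertools import product
from math import dist


def expected_cost(uncertain_points, centers):
    """Sum over all realizations R of prob(R) times the largest distance
    from any realized point to its nearest center."""
    total = 0.0
    # Each realization picks one (location, probability) pair per point.
    for realization in product(*uncertain_points):
        prob = 1.0
        worst = 0.0
        for location, p in realization:
            prob *= p  # points are located independently
            nearest = min(dist(location, c) for c in centers)
            worst = max(worst, nearest)
        total += prob * worst
    return total


# Hypothetical instance: P1 has two equally likely locations, P2 is fixed.
uncertain_points = [
    [((0.0, 0.0), 0.5), ((2.0, 0.0), 0.5)],
    [((10.0, 0.0), 1.0)],
]
centers = [(0.0, 0.0), (10.0, 0.0)]
cost = expected_cost(uncertain_points, centers)
```

In this instance the two realizations have costs 0 and 2, each with probability 0.5, so the expected cost is 1.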

    Belief Hierarchical Clustering

    In the data mining field many clustering methods have been proposed, yet standard versions do not take into account uncertain databases. This paper deals with a new approach to cluster uncertain data by using a hierarchical clustering defined within the belief function framework. The main objective of the belief hierarchical clustering is to allow an object to belong to one or several clusters. To each belonging, a degree of belief is associated, and clusters are combined based on the pignistic properties. Experiments with real uncertain data show that our proposed method can be considered as a propitious tool

    Utility-driven Data Analytics on Uncertain Data

    Modern Internet of Things (IoT) applications generate massive amounts of data, much of it in the form of objects/items of readings, events, and log entries. Specifically, most of the objects in these IoT data contain rich embedded information (e.g., frequency and uncertainty) and different levels of importance (e.g., unit utility of items, interestingness, cost, risk, or weight). Many existing approaches in data mining and analytics have limitations: they consider only binary attributes within a transaction and treat all objects/items as having equal weight or importance. To address these drawbacks, a novel utility-driven data analytics algorithm named HUPNU is presented, to extract High-Utility patterns by considering both Positive and Negative unit utilities from Uncertain data. The qualified high-utility patterns can be effectively discovered for risk prediction, manufacturing management, and decision-making, among others. HUPNU uses the developed vertical Probability-Utility list with the Positive-and-Negative utilities structure, as well as several effective pruning strategies. Experiments showed that the developed HUPNU approach mined the qualified patterns efficiently and effectively. Comment: Under review in IEEE Internet of Things Journal since 2018, 11 pages