192,527 research outputs found
Recommended from our members
Supporting Scientific Analytics under Data Uncertainty and Query Uncertainty
Data management is becoming increasingly important in many applications, in particular, in large scientific databases where (1) data can be naturally modeled by continuous random variables, and (2) queries can involve complex predicates and/or be difficult for users to express explicitly. My thesis work aims to provide efficient support to both the data uncertainty and the query uncertainty .
When data is uncertain, an important class of queries requires query answers to be returned if their existence probabilities pass a threshold. I start with optimizing such threshold query processing for continuous uncertain data in the relational model by (i) expediting selections by reducing dimensionality of integration and using faster filters, (ii) expediting joins using new indexes on uncertain data, and (iii) optimizing a query plan using a dynamic, per-tuple based approach. Evaluation results using real-world data and benchmark queries show the accuracy and efficiency of my techniques and the dynamic query planning has over 50% performance gains in most cases over a state-of-the-art threshold query optimizer and is very close to the optimal planning in all cases.
Next I address uncertain data management in the array model, which has gained popu- larity for scientific data processing recently due to performance benefits. I define the formal semantics of array operations on uncertain data involving both value uncertainty within individual tuples and position uncertainty regarding where a tuple should belong in an array given uncertain dimension attributes, and propose a suite of storage and evaluation strategies for array operators, with a focus on a novel scheme that bounds the overhead of querying by strategically placing a few replicas of the tuples with large variances. Evaluation results show that for common workloads, my best-performing techniques outperform baselines up to 1 to 2 orders of magnitude while incurring only small storage overhead.
Finally, to bridge the increasing gap between the fast growth of data and the limited human ability to comprehend data and help the user retrieve high-value content from data more effectively, I propose to build interactive data exploration as a new database service, using an approach called “explore-by-example”. To build an effective system, my work is grounded in a rigorous SVM-based active learning framework and focuses on the following three problems: (i) accuracy-based and convergence-based stopping criteria, (ii) expediting example acquisition in each iteration, and (iii) expediting the final result retrieval. Evaluation results using real-world data and query patterns show that my system significantly outperforms state-of-the-art systems in accuracy (18x accuracy improvement for 4-dimensional workloads) while achieving desired efficiency for interactive exploration (2 to 5 seconds per iteration)
Recognizing Uncertainty in Speech
We address the problem of inferring a speaker's level of certainty based on
prosodic information in the speech signal, which has application in
speech-based dialogue systems. We show that using phrase-level prosodic
features centered around the phrases causing uncertainty, in addition to
utterance-level prosodic features, improves our model's level of certainty
classification. In addition, our models can be used to predict which phrase a
person is uncertain about. These results rely on a novel method for eliciting
utterances of varying levels of certainty that allows us to compare the utility
of contextually-based feature sets. We elicit level of certainty ratings from
both the speakers themselves and a panel of listeners, finding that there is
often a mismatch between speakers' internal states and their perceived states,
and highlighting the importance of this distinction.Comment: 11 page
Bayesian Updating, Model Class Selection and Robust Stochastic Predictions of Structural Response
A fundamental issue when predicting structural response by using mathematical models is how to treat both modeling and excitation uncertainty. A general framework for this is presented which uses probability as a multi-valued
conditional logic for quantitative plausible reasoning in the presence of uncertainty due to incomplete information. The
fundamental probability models that represent the structure’s uncertain behavior are specified by the choice of a stochastic
system model class: a set of input-output probability models for the structure and a prior probability distribution over this set
that quantifies the relative plausibility of each model. A model class can be constructed from a parameterized deterministic
structural model by stochastic embedding utilizing Jaynes’ Principle of Maximum Information Entropy. Robust predictive
analyses use the entire model class with the probabilistic predictions of each model being weighted by its prior probability, or if
structural response data is available, by its posterior probability from Bayes’ Theorem for the model class. Additional robustness
to modeling uncertainty comes from combining the robust predictions of each model class in a set of competing candidates
weighted by the prior or posterior probability of the model class, the latter being computed from Bayes’ Theorem. This higherlevel application of Bayes’ Theorem automatically applies a quantitative Ockham razor that penalizes the data-fit of more
complex model classes that extract more information from the data. Robust predictive analyses involve integrals over highdimensional spaces that usually must be evaluated numerically. Published applications have used Laplace's method of
asymptotic approximation or Markov Chain Monte Carlo algorithms
Certainty Closure: Reliable Constraint Reasoning with Incomplete or Erroneous Data
Constraint Programming (CP) has proved an effective paradigm to model and
solve difficult combinatorial satisfaction and optimisation problems from
disparate domains. Many such problems arising from the commercial world are
permeated by data uncertainty. Existing CP approaches that accommodate
uncertainty are less suited to uncertainty arising due to incomplete and
erroneous data, because they do not build reliable models and solutions
guaranteed to address the user's genuine problem as she perceives it. Other
fields such as reliable computation offer combinations of models and associated
methods to handle these types of uncertain data, but lack an expressive
framework characterising the resolution methodology independently of the model.
We present a unifying framework that extends the CP formalism in both model
and solutions, to tackle ill-defined combinatorial problems with incomplete or
erroneous data. The certainty closure framework brings together modelling and
solving methodologies from different fields into the CP paradigm to provide
reliable and efficient approches for uncertain constraint problems. We
demonstrate the applicability of the framework on a case study in network
diagnosis. We define resolution forms that give generic templates, and their
associated operational semantics, to derive practical solution methods for
reliable solutions.Comment: Revised versio
Robust filtering for a class of stochastic uncertain nonlinear time-delay systems via exponential state estimation
Copyright [2001] IEEE. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Brunel University's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to [email protected]. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.We investigate the robust filter design problem for a class of nonlinear time-delay stochastic systems. The system under study involves stochastics, unknown state time-delay, parameter uncertainties, and unknown nonlinear disturbances, which are all often encountered in practice and the sources of instability. The aim of this problem is to design a linear, delayless, uncertainty-independent state estimator such that for all admissible uncertainties as well as nonlinear disturbances, the dynamics of the estimation error is stochastically exponentially stable in the mean square, independent of the time delay. Sufficient conditions are proposed to guarantee the existence of desired robust exponential filters, which are derived in terms of the solutions to algebraic Riccati inequalities. The developed theory is illustrated by numerical simulatio
- …