Robust regression with imprecise data
We consider the problem of regression analysis with imprecise data. By imprecise data we mean imprecise observations of precise quantities in the form of sets of values. In this paper, we explore a recently introduced likelihood-based approach to regression with such data. The approach is very general, since it covers all kinds of imprecise data (i.e., not only intervals) and is not restricted to linear regression. Its result consists of a set of functions, reflecting the entire uncertainty of the regression problem. Here we study in particular a robust special case of the likelihood-based imprecise regression, which can be interpreted as a generalization of the method of least median of squares. Moreover, we apply it to data from a social survey and compare it with other approaches to regression with imprecise data. It turns out that the likelihood-based approach is the most generally applicable one and is the only approach accounting for multiple sources of uncertainty at the same time.
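For readers unfamiliar with the least median of squares criterion mentioned in this abstract, the following is a minimal sketch for ordinary (precise) data. It is not the likelihood-based imprecise regression method itself: the function name `least_median_of_squares` and the search over lines through pairs of data points are assumptions made for illustration only.

```python
import itertools
import statistics


def least_median_of_squares(xs, ys):
    """Approximate least-median-of-squares line fit for precise data.

    Candidate lines are those passing through two data points (an
    illustrative search strategy, not taken from the paper). The line
    with the smallest median squared residual is returned as
    (slope, intercept, median_squared_residual).
    """
    points = list(zip(xs, ys))
    best = None
    for (x1, y1), (x2, y2) in itertools.combinations(points, 2):
        if x1 == x2:
            continue  # skip vertical lines, which have no finite slope
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Robust criterion: median (not sum) of the squared residuals.
        criterion = statistics.median(
            (y - (slope * x + intercept)) ** 2 for x, y in points
        )
        if best is None or criterion < best[2]:
            best = (slope, intercept, criterion)
    return best


# Eight points on y = 2x + 1 plus two gross outliers.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[8], ys[9] = 100, -50
slope, intercept, criterion = least_median_of_squares(xs, ys)
```

Because the criterion is a median, up to roughly half of the observations can be arbitrarily wrong without moving the fitted line, which is the robustness property the abstract refers to.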
Empirical interpretation of imprecise probabilities
This paper investigates the possibility of a frequentist interpretation of imprecise probabilities by generalizing the approach of Bernoulli’s Ars Conjectandi. That is, it studies, in the case of games of chance, under which assumptions imprecise probabilities can be satisfactorily estimated from data. In fact, estimability on the basis of finite amounts of data is a necessary condition for imprecise probabilities to have a clear empirical meaning. Unfortunately, imprecise probabilities can be estimated arbitrarily well from data only in very limited settings.
IMPrECISE: Good-is-good-enough data integration
IMPrECISE is an XQuery module that adds probabilistic XML functionality to an existing XML DBMS, in our case MonetDB/XQuery. We demonstrate the probabilistic XML and data integration functionality of IMPrECISE. The prototype can be configured with domain knowledge so that the amount of uncertainty arising during data integration is reduced to an acceptable level, thus obtaining a "good is good enough" data integration with minimal human effort.
Likelihood-based Imprecise Regression
We introduce a new approach to regression with imprecisely observed data, combining likelihood inference with ideas from imprecise probability theory, and thereby taking different kinds of uncertainty into account. The approach is very general and applicable to various kinds of imprecise data, not only to intervals.
In the present paper, we propose a regression method based on this approach, where no parametric distributional assumption is needed and interval estimates of quantiles of the error distribution are used to identify plausible descriptions of the relationship of interest. Therefore, the proposed regression method is very robust.
We apply our robust regression method to an interesting question in the social sciences. The analysis, based on survey data, yields a relatively imprecise result, reflecting the high amount of uncertainty inherent in the analyzed data set.
Robust Inference of Trees
This paper is concerned with the reliable inference of optimal
tree-approximations to the dependency structure of an unknown distribution
generating data. The traditional approach to the problem measures the
dependency strength between random variables by the index called mutual
information. In this paper reliability is achieved by Walley's imprecise
Dirichlet model, which generalizes Bayesian learning with Dirichlet priors.
Adopting the imprecise Dirichlet model results in posterior interval
expectation for mutual information, and in a set of plausible trees consistent
with the data. Reliable inference about the actual tree is achieved by focusing
on the substructure common to all the plausible trees. We develop an exact
algorithm that infers the substructure in time O(m^4), m being the number of
random variables. The new algorithm is applied to a set of data sampled from a
known distribution. The method is shown to reliably infer edges of the actual
tree even when the data are very scarce, unlike the traditional approach.
Finally, we provide lower and upper credibility limits for mutual information
under the imprecise Dirichlet model. These enable the previous developments to
be extended to a full inferential method for trees.
Comment: 26 pages, 7 figures
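The "traditional approach" referred to in this abstract, which weights pairs of variables by empirical mutual information and keeps a maximum-weight spanning tree, is commonly known as the Chow-Liu algorithm. The sketch below shows that point-estimate baseline only; it is not the paper's robust algorithm based on the imprecise Dirichlet model, and the function names are assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(a, b):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a = Counter(a)
    count_b = Counter(b)
    # Sum over observed cells of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(columns):
    """Maximum-weight spanning tree over variables (Kruskal's algorithm).

    `columns` is a list of equally long sequences, one per variable.
    Returns the tree as a list of (i, j) index pairs.
    """
    m = len(columns)
    edges = sorted(
        (
            (mutual_information(columns[i], columns[j]), i, j)
            for i, j in combinations(range(m), 2)
        ),
        reverse=True,  # strongest dependencies first
    )
    parent = list(range(m))  # union-find forest for cycle detection

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding the edge creates no cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


a = [0, 0, 1, 1, 0, 1, 0, 1]
b = list(a)                      # exact copy of a: maximal dependence
c = [0, 1, 0, 1, 0, 1, 0, 1]     # only weakly related to a
tree = chow_liu_tree([a, b, c])
```

With scarce data the empirical mutual information values are noisy, so the selected tree can change under small perturbations of the sample; that instability is what the interval-valued treatment in the abstract is meant to address.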
Robust Estimators under the Imprecise Dirichlet Model
Walley's Imprecise Dirichlet Model (IDM) for categorical data overcomes
several fundamental problems which other approaches to uncertainty suffer from.
Yet, to be useful in practice, one needs efficient ways for computing the
imprecise=robust sets or intervals. The main objective of this work is to
derive exact, conservative, and approximate, robust and credible interval
estimates under the IDM for a large class of statistical estimators, including
the entropy and mutual information.
Comment: 16 LaTeX pages
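As a small illustration of the kind of interval the IDM produces, the sketch below computes the standard lower and upper posterior expectations of each category probability, n_i / (N + s) and (n_i + s) / (N + s), where n_i is the count for category i, N the total count, and s the IDM hyperparameter. The function name and the default s = 2 are assumptions for this example; the intervals for entropy and mutual information derived in the paper are more involved and are not reproduced here.

```python
def idm_probability_intervals(counts, s=2.0):
    """Lower and upper IDM posterior expectations of category probabilities.

    counts: observed counts n_i, one per category.
    s: IDM hyperparameter controlling imprecision (assumed default: 2).
    Returns a list of (lower, upper) pairs, one per category.
    """
    total = sum(counts)
    return [(n / (total + s), (n + s) / (total + s)) for n in counts]


intervals = idm_probability_intervals([3, 1, 0], s=2.0)
# Every interval has width s / (N + s), which shrinks as more data arrive.
```

Note that a category that was never observed still receives a non-degenerate interval starting at 0, rather than a point estimate of 0.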
Constructing a knowledge economy composite indicator with imprecise data.
This paper focuses on the construction of a composite indicator for the knowledge-based economy using imprecise data. Specifically, for some indicators we only have information on the bounds of the interval within which the true value is believed to lie. The proposed approach builds on a recent development in the Data Envelopment Analysis literature. In the setting of evaluating countries, the paper distinguishes a ‘strong country in weak environment’ scenario and a ‘weak country in strong environment’ scenario, which yield an upper and a lower bound on countries’ performance, respectively. Accordingly, we derive a classification into ‘benchmark countries’, ‘potential benchmark countries’, and ‘countries open to improvement’.
Keywords: Knowledge economy indicators; Composite indicators; Multiple Imputation; Benefit of the doubt; Weight restrictions; Data Envelopment Analysis; Data impreciseness
Evidential-EM Algorithm Applied to Progressively Censored Observations
The Evidential-EM (E2M) algorithm is an effective approach for computing
maximum likelihood estimates under finite mixture models, especially when
there is uncertain information about the data. In this paper we present an
extension of the E2M method to a particular case of incomplete data, where the
loss of information is due to both mixture models and censored observations.
The prior uncertain information is expressed by belief functions, while the
pseudo-likelihood function is derived from imprecise observations and prior
knowledge. The E2M method is then invoked to maximize the generalized
likelihood function and obtain the optimal parameter estimates. Numerical
examples show that the proposed method can effectively integrate the uncertain
prior information with the current imprecise knowledge conveyed by the
observed data.