
    Evaluation of matching noise for imputation techniques based on nonparametric local linear regression estimators

    A new matching procedure is introduced that imputes missing data by means of a local linear estimator of the underlying population regression function, which is not assumed to be linear. The procedure is compared with traditional approaches, namely hot deck methods and methods based on kNN estimators. Performance is measured by the matching noise, that is, the discrepancy between the distribution generating the genuine data and the distribution generating the imputed values. (C) 2008 Elsevier B.V. All rights reserved.
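
To make the idea of a local linear estimator concrete, here is a minimal sketch of imputing a missing variable for recipient records from donor records. It is not the authors' procedure: the Gaussian kernel, the single covariate, the fixed `bandwidth`, and the function names `local_linear_predict` and `impute_local_linear` are all assumptions made for illustration.

```python
import numpy as np


def local_linear_predict(x_donor, y_donor, x0, bandwidth):
    """Local linear estimate of E[Y | X = x0] from donor pairs (x, y).

    Assumption: Gaussian kernel weights with a fixed, user-chosen bandwidth.
    """
    x_donor = np.asarray(x_donor, dtype=float)
    y_donor = np.asarray(y_donor, dtype=float)

    # Kernel weights: donors close to x0 count more.
    u = (x_donor - x0) / bandwidth
    w = np.exp(-0.5 * u ** 2)

    # Weighted least squares of y on (1, x - x0).
    # The fitted intercept is the regression estimate at x0.
    design = np.column_stack([np.ones_like(x_donor), x_donor - x0])
    xtw = design.T * w
    beta = np.linalg.solve(xtw @ design, xtw @ y_donor)
    return beta[0]


def impute_local_linear(x_donor, y_donor, x_recipient, bandwidth):
    """Impute Y for each recipient record from its observed X value."""
    return np.array([
        local_linear_predict(x_donor, y_donor, x0, bandwidth)
        for x0 in np.asarray(x_recipient, dtype=float)
    ])
```

Because the fit is linear within each kernel window, the sketch reproduces an exactly linear relationship without error, and it adapts to curvature as the bandwidth shrinks. Comparing the distribution of the imputed values with that of genuine values is what the abstract calls matching noise; that comparison is not implemented here.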

    An assessment of deviations from conditional independence in binary data fusion

    Data fusion is a data integration technique that provides a way to combine information from different sources through a set of common characteristics (variables), thereby creating a single, all-inclusive data source. The success of a fusion largely depends on the accuracy of the underlying assumptions about the relationship between the common variables and the variables unique to each individual data source. The most common model used to fuse data is based on the assumption of conditional independence, which states that the variables unique to each data set (say Y and Z) are independent given the common variables (say X). This analysis evaluates data fusion procedures for binary data under the assumption of conditional independence, and assesses how deviations from this assumption influence the success of the fusion. The degree of conditional independence present in the data is quantified using a function of entropy, namely the conditional mutual information. The impact of the deviation from conditional independence on the success of the fusion is evaluated using the results from a number of different statistical tests, such as the Chi-square goodness-of-fit test and the 3T-test for a correlation structure, in relation to the level of conditional independence in the data.
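
As an illustration of the quantity named in this abstract, the conditional mutual information of Y and Z given X is I(Y;Z|X) = sum over (x,y,z) of p(x,y,z) * log[ p(x) p(x,y,z) / (p(x,y) p(x,z)) ], which equals zero exactly when Y and Z are conditionally independent given X. The sketch below computes it from a joint probability table. The function name `conditional_mutual_information`, the axis order (X, Y, Z), and the use of natural logarithms are assumptions for illustration; the abstract does not specify how the authors compute it.

```python
import numpy as np


def conditional_mutual_information(p_xyz):
    """I(Y; Z | X) in nats from a joint table indexed as p[x, y, z].

    Assumption: the table holds non-negative counts or probabilities;
    it is normalised here so that it sums to one.
    """
    p = np.asarray(p_xyz, dtype=float)
    p = p / p.sum()

    # Marginals, kept broadcastable against the full table.
    p_x = p.sum(axis=(1, 2), keepdims=True)   # shape (nx, 1, 1)
    p_xy = p.sum(axis=2, keepdims=True)       # shape (nx, ny, 1)
    p_xz = p.sum(axis=1, keepdims=True)       # shape (nx, 1, nz)

    # Cells with zero probability contribute nothing to the sum.
    mask = p > 0
    ratio = (p_x * p)[mask] / (p_xy * p_xz)[mask]
    return float(np.sum(p[mask] * np.log(ratio)))
```

For binary data the table has shape (2, 2, 2). A value near zero indicates that the conditional independence assumption is approximately satisfied, while larger values indicate stronger deviation from it.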