The quest for a donor: probability based methods offer help
When a patient in need of a stem cell transplant has no compatible donor within his or her closest family, and no matched unrelated donor can be found, a remaining option is to search within the patient’s extended family. This situation often arises when the patient is of an ethnic minority, originating from a country that lacks a well-developed stem cell donor program, and has HLA haplotypes that are rare in his or her country of residence. Searching within the extended family may be time-consuming and expensive, and tools to calculate the probability of a match within groups of untested relatives would facilitate the search. We present a general approach to calculating the probability of a match in a given relative, or group of relatives, based on the pedigree and on knowledge of the genotypes of some of the individuals. The method extends previous approaches by allowing the pedigrees to be consanguineous and arbitrarily complex, with deviations from Hardy-Weinberg equilibrium. We show that this extension has a considerable effect on results, in particular for rare haplotypes. The methods are exemplified using freeware programs to solve a case of practical importance.
Imprecise Imputation: A Nonparametric Micro Approach Reflecting the Natural Uncertainty of Statistical Matching with Categorical Data
We develop the first statistical matching micro approach reflecting the natural uncertainty arising during the integration of categorical data. A complete synthetic file is obtained by imprecise imputation, replacing missing entries by sets of suitable values. We discuss three imprecise imputation strategies and raise ideas on potential refinements by logical constraints or likelihood-based arguments. Additionally, we show how imprecise imputation can be embedded into the theory of finite random sets, providing tight lower and upper bounds for parameters. Our simulation results corroborate that their narrowness is practically relevant and that they almost always cover the true parameters.
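To illustrate how set-valued imputation can lead to lower and upper bounds, here is a minimal sketch. It is not the authors' method or any of their three strategies: the function name `proportion_bounds`, the toy records, and the choice of a simple proportion as the parameter are all assumptions made for illustration. It uses the standard random-set idea that a record surely counts toward an event when its candidate set lies entirely inside the event, and possibly counts when the set merely overlaps it.

```python
from fractions import Fraction


def proportion_bounds(imputed, event):
    """Bounds on the share of records whose true value lies in `event`.

    `imputed` is a list of non-empty sets; each set holds the candidate
    values for one record (a singleton means the value was observed).
    This is a hypothetical helper, not taken from the paper.
    """
    event = set(event)
    n = len(imputed)
    # Lower bound: records whose every candidate value is in the event.
    surely = sum(1 for s in imputed if s and s <= event)
    # Upper bound: records with at least one candidate value in the event.
    possibly = sum(1 for s in imputed if s & event)
    return Fraction(surely, n), Fraction(possibly, n)


# Toy data: two records were imputed with sets of more than one value.
records = [{"a"}, {"a", "b"}, {"b"}, {"a", "b", "c"}, {"c"}]
lower, upper = proportion_bounds(records, {"a"})
print(lower, upper)
```

The narrower the imputed sets, the closer the two bounds move together; with only singletons they coincide with the ordinary proportion.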
Marginal Release Under Local Differential Privacy
Many analysis and machine learning tasks require the availability of marginal
statistics on multidimensional datasets while providing strong privacy
guarantees for the data subjects. Applications for these statistics range from
finding correlations in the data to fitting sophisticated prediction models. In
this paper, we provide a set of algorithms for materializing marginal
statistics under the strong model of local differential privacy. We prove the
first tight theoretical bounds on the accuracy of marginals compiled under each
approach, perform empirical evaluation to confirm these bounds, and evaluate
them for tasks such as modeling and correlation testing. Our results show that
releasing information based on (local) Fourier transformations of the input is
preferable to alternatives based directly on (local) marginals.
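As a rough illustration of what releasing a marginal under local differential privacy can look like, the sketch below uses classic binary randomized response on a single binary attribute. This is a generic textbook mechanism chosen for brevity, not one of the algorithms from the paper (and not the Fourier-based approach it recommends); the function names, the seed, and the parameter values are assumptions.

```python
import math
import random


def randomize_bit(bit, epsilon, rng):
    """Each user reports the true bit with probability e^eps / (1 + e^eps),
    and the flipped bit otherwise. This satisfies eps-local DP."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_marginal(reports, epsilon):
    """Debias the noisy reports to estimate the true fraction of 1s.

    E[observed] = f * p + (1 - f) * (1 - p), which is solved for f.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Simulated population: 30% of users hold the value 1 (assumed toy data).
rng = random.Random(0)
true_bits = [1 if i % 10 < 3 else 0 for i in range(20000)]
reports = [randomize_bit(b, 1.0, rng) for b in true_bits]
estimate = estimate_marginal(reports, 1.0)
print(round(estimate, 3))
```

The aggregator never sees a true bit, only the perturbed reports; the estimate is unbiased, and its variance grows as epsilon shrinks.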
Eliminating small cells from census counts tables: empirical vs. design transition probabilities
The software SAFE has been developed at the State Statistical Institute Berlin-Brandenburg and has been in regular use there for several years now. It involves an algorithm that yields a controlled cell frequency perturbation. When a microdata set has been protected by this method, any table which can be computed on the basis of this microdata set will not contain any small cells, e.g. cells with frequency counts 1 or 2. We compare empirically observed transition probabilities resulting from this pre-tabular method to transition matrices in the context of variants of microdata key based post-tabular random perturbation methods suggested in the literature, e.g. Shlomo, N., Young, C. (2008) and Fraser, B., Wooton, J. (2006).