Confidence Sets Based on Penalized Maximum Likelihood Estimators in Gaussian Regression
Confidence intervals based on penalized maximum likelihood estimators such as
the LASSO, adaptive LASSO, and hard-thresholding are analyzed. In the
known-variance case, the finite-sample coverage properties of such intervals
are determined and it is shown that symmetric intervals are the shortest. The
length of the shortest intervals based on the hard-thresholding estimator is
larger than the length of the shortest interval based on the adaptive LASSO,
which is larger than the length of the shortest interval based on the LASSO,
which in turn is larger than the standard interval based on the maximum
likelihood estimator. In the case where the penalized estimators are tuned to
possess the `sparsity property', the intervals based on these estimators are
larger than the standard interval by an order of magnitude. Furthermore, a
simple asymptotic confidence interval construction in the `sparse' case, that
also applies to the smoothly clipped absolute deviation estimator, is
discussed. The results for the known-variance case are shown to carry over to
the unknown-variance case in an appropriate asymptotic sense.

Comment: second revision: new title, some comments added, proofs moved to appendix
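The abstract above compares intervals built around the LASSO and hard-thresholding estimators. In the simple Gaussian location model, these estimators reduce to soft- and hard-thresholding of the observation, which is what gives them the "sparsity property" (small observations are set exactly to zero). A minimal sketch of the two thresholding rules, purely for illustration (the tuning parameter `lam` and the example data are not from the paper):

```python
import numpy as np

def soft_threshold(y, lam):
    # LASSO estimator in the Gaussian location model:
    # shrink each observation toward zero by lam, clipping at zero.
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def hard_threshold(y, lam):
    # Hard-thresholding estimator: keep an observation unchanged
    # only if its magnitude exceeds lam, otherwise set it to zero.
    return np.where(np.abs(y) > lam, y, 0.0)

# Both rules map small observations exactly to zero (sparsity),
# but soft thresholding also shrinks the surviving values.
y = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
lam = 0.5
print(soft_threshold(y, lam))  # large entries shrunk by lam, small ones zeroed
print(hard_threshold(y, lam))  # large entries kept as-is, small ones zeroed
```

The bias that soft thresholding introduces on large observations (and the discontinuity of hard thresholding) is what forces confidence intervals around these estimators to be wider than the standard maximum-likelihood interval.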
A New Method for Protecting Interrelated Time Series with Bayesian Prior Distributions and Synthetic Data
Organizations disseminate statistical summaries of administrative data via the Web for unrestricted public use. They balance the trade-off between confidentiality protection and inference quality. Recent developments in disclosure avoidance techniques include the incorporation of synthetic data, which capture the essential features of underlying data by releasing altered data generated from a posterior predictive distribution. The United States Census Bureau collects millions of interrelated time series micro-data that are hierarchical and contain many zeros and suppressions. Rule-based disclosure avoidance techniques often require the suppression of count data for small magnitudes and the modification of data based on a small number of entities. Motivated by this problem, we use zero-inflated extensions of Bayesian Generalized Linear Mixed Models (BGLMM) with privacy-preserving prior distributions to develop methods for protecting and releasing synthetic data from time series about thousands of small groups of entities without suppression based on the magnitudes or number of entities. We find that as the prior distributions of the variance components in the BGLMM become more precise toward zero, confidentiality protection increases and inference quality deteriorates. We evaluate our methodology using a strict privacy measure, empirical differential privacy, and a newly defined risk measure, Probability of Range Identification (PoRI), which directly measures attribute disclosure risk. We illustrate our results with the U.S. Census Bureau’s Quarterly Workforce Indicators.
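The zero-inflated models above address count series with an excess of zeros: each observation is either a structural zero or a draw from a count distribution. A minimal sketch of the zero-inflated Poisson sampling step, not the paper's full BGLMM (the parameters `mu` and `pi` are hypothetical, chosen only to illustrate the mixture):

```python
import numpy as np

def sample_zip(mu, pi, size, rng):
    """Draw from a zero-inflated Poisson: with probability pi emit a
    structural zero, otherwise a Poisson(mu) count."""
    structural_zero = rng.random(size) < pi
    counts = rng.poisson(mu, size)
    return np.where(structural_zero, 0, counts)

rng = np.random.default_rng(0)
synthetic = sample_zip(mu=4.0, pi=0.3, size=10_000, rng=rng)

# The observed zero fraction mixes structural and sampling zeros:
# approximately pi + (1 - pi) * exp(-mu).
print((synthetic == 0).mean())
```

In the paper's setting this sampling step would sit inside a posterior predictive distribution, with `mu` driven by the mixed-model linear predictor rather than fixed.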
Study of leakage currents in pCVD diamonds as function of the magnetic field
pCVD diamond sensors are regularly used as beam loss monitors in accelerators
by measuring the ionization of the lost particles. In the past these beam loss
monitors showed sudden increases in the dark leakage current without beam
losses and these erratic leakage currents were found to decrease, if magnetic
fields were present. Here we report on a systematic study of leakage currents
inside a magnetic field. The decrease of erratic currents in a magnetic field
was confirmed. In contrast, diamonds without erratic currents showed an
increase of the leakage current in a magnetic field perpendicular to the
electric field for fields up to 0.6 T; for higher fields it decreases. A
preliminary model is introduced to explain the observations.

Comment: 6 pages, 16 figures, poster at Hasselt Diamond Workshop, Mar 2009, accepted version for publication
