155 research outputs found
NISS WebSwap: A Web Service for Data Swapping
Data swapping is a statistical disclosure limitation practice that alters records in the data to be released by switching values of attributes across pairs of records in a fraction of the original data. Web Services are an exciting new form of distributed computing that allows users to invoke remote applications nearly transparently. The National Institute of Statistical Sciences (NISS) has recently started hosting NISS Web Services as a service and example to the statistical sciences community. In this paper we describe and provide usage information for NISS WebSwap, the initial NISS Web Service, which swaps one or more attributes (fields) between user-specified records in a microdata file, uploading the original data file from the user's computer and downloading the file containing the swapped records.
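The swapping idea in the abstract above can be illustrated with a minimal sketch. This is not the WebSwap interface: the function name `swap_attributes`, its parameters, and the choice to pair records at random (WebSwap itself takes user-specified records) are all assumptions made for illustration.

```python
import random


def swap_attributes(records, attributes, swap_fraction, seed=None):
    """Return a copy of `records` with `attributes` swapped across random pairs.

    Hypothetical helper, not the NISS WebSwap API. `records` is a list of
    dicts; `swap_fraction` is the share of records that take part in a swap.
    """
    if not 0.0 <= swap_fraction <= 1.0:
        raise ValueError("swap_fraction must lie in [0, 1]")
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]  # leave the original file untouched
    n_pairs = int(len(records) * swap_fraction) // 2
    # Draw distinct record indices, then pair them off two at a time.
    chosen = rng.sample(range(len(records)), 2 * n_pairs)
    for a, b in zip(chosen[0::2], chosen[1::2]):
        for attr in attributes:
            swapped[a][attr], swapped[b][attr] = swapped[b][attr], swapped[a][attr]
    return swapped
```

Because values are only exchanged between records, the marginal distribution of each swapped attribute is unchanged, while its link to the unswapped attributes is perturbed.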
www.niss.org
Construction of Full Sample and Replicate Weights for Project Talent, with Applications
Project Talent is a large, nationally representative longitudinal study developed by the American Institutes for Research and conducted from 1960 to 1974. The goals were to assess the interests, abilities, and demographics of 9th–12th graders and to follow their trajectories into adulthood. More than 1,200 junior and senior high schools participated. Replicate weights were not constructed at the time, preventing the estimation of standard errors. Today, Project Talent is being revived to study the physical, cognitive, economic, and social processes of aging. In this paper, the retrospective construction of 104 sets of student-level replicate weights is described. Partitioning analysis was performed to generate variance strata and variance primary sampling units. The student-level replicate weights were constructed using a jackknife procedure. The process included adjustment of the base year weights and calibration of the full sample and replicate weights to the total number of secondary school students in the U.S. in the spring of 1960. The use of replicate weights is illustrated by estimating standard errors for means of composite cognitive scores constructed from student questionnaires. We also describe construction of mortality- and nonresponse-adjusted weights for the thre
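To make the role of replicate weights concrete, here is a minimal sketch of a generic delete-one-PSU stratified jackknife. The abstract does not say which jackknife variant was used for Project Talent, so the variant, the function names, and the input layout (parallel lists of weights, stratum labels, and PSU labels) are assumptions for illustration only.

```python
from collections import defaultdict


def jackknife_replicates(weights, strata, psus):
    """Build delete-one-PSU jackknife replicate weights.

    Returns a list of (replicate_weights, variance_factor) pairs, one per
    dropped PSU. Stratum and PSU labels must be sortable.
    """
    psus_in_stratum = defaultdict(set)
    for h, j in zip(strata, psus):
        psus_in_stratum[h].add(j)
    replicates = []
    for h in sorted(psus_in_stratum):
        n_h = len(psus_in_stratum[h])
        if n_h < 2:
            continue  # a single-PSU stratum contributes no variance replicate
        for dropped in sorted(psus_in_stratum[h]):
            rep = []
            for w, s, p in zip(weights, strata, psus):
                if s != h:
                    rep.append(w)                    # other strata unchanged
                elif p == dropped:
                    rep.append(0.0)                  # dropped PSU gets zero weight
                else:
                    rep.append(w * n_h / (n_h - 1))  # reweight the remaining PSUs
            replicates.append((rep, (n_h - 1) / n_h))
    return replicates


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_se(values, weights, replicates):
    """Standard error of the weighted mean from the replicate estimates."""
    full = weighted_mean(values, weights)
    variance = sum(
        factor * (weighted_mean(values, rep) - full) ** 2
        for rep, factor in replicates
    )
    return variance ** 0.5
```

Each replicate re-estimates the statistic with one PSU removed. The spread of those replicate estimates around the full-sample estimate yields the standard error, which is why the absence of replicate weights prevented variance estimation.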
www.niss.org
Why Data Availability is Such a Hard Problem
If data availability were a simple problem, it would already have been resolved. In this paper, I argue that by viewing data availability as a public good, it is possible to both understand the complexities with which it is fraught and identify a path to a solution.
1 Data Availability as a Public Good
Those who view data availability as a black-and-white issue (the purist view, as in the left-hand panel in Figure 1) are ignoring or attenuating not only reality, but also fundamental principles of economics and human behavior. Instead, data availability is composed of infinitely many shades of gray (the realist view, as in the right-hand panel in Figure 1). My fundamental point is that data availability is a public good (Varian, 1992). Like other public goods, it is extremely complex. Strikingly, however, much of the current conversation about data availability ignores this complexity, in many cases willfully. To purists who disagree, I submit that the empirical evidence is overwhelming. If data availability were a simple problem, it would have been resolved long ago. Solutions imposed by fiat are inefficient at best, and generally ineffective. Many proposals overlook the multiplicity of stakeholders (§3), as well as the complex, competing incentives to which they are subject
Estimation and reconstruction for zero-one Markov processes
Given a Markov process with state space {0, 1}, we treat parameter estimation of the transition intensities and state estimation of unobserved portions of the sample path, based on various partial observations of the process. Parameter estimators are devised and shown to be consistent and asymptotically normal. State estimators are computed explicitly and represented in recursive form. Observation mechanisms include regularly spaced samples, regular samples with time jitter, Poisson samples, Poisson samples with state 0 unobservable, observability defined by an alternating renewal process, averaged samples, observation of transition times into state 1, and observation of a random time change of the underlying process. The law of the observability process may be partly unknown. The combined problem of state estimation with estimated parameters is also examined.
Keywords: partially observed Markov process; interpolation; estimation of transition rates; filtering; consistency; prediction; asymptotic normality; state estimation with estimated parameters; state estimation
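For the simplest observation mechanism listed, regularly spaced samples, the following sketch shows one standard way to estimate the two transition intensities. It is not necessarily the estimator devised in the paper. It assumes a time-homogeneous continuous-time chain with rate `a` for 0→1 and rate `b` for 1→0, sampled at spacing `delta`. The function name is hypothetical.

```python
import math


def estimate_intensities(samples, delta):
    """Estimate (a, b) from a 0/1 sequence observed every `delta` time units.

    Step 1: estimate the one-step transition probabilities of the sampled
    chain by transition counts. Step 2: invert
        p01 = a / (a + b) * (1 - exp(-(a + b) * delta))
        p10 = b / (a + b) * (1 - exp(-(a + b) * delta))
    to recover the continuous-time intensities.
    """
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, cur in zip(samples, samples[1:]):
        counts[(prev, cur)] += 1
    from0 = counts[(0, 0)] + counts[(0, 1)]
    from1 = counts[(1, 0)] + counts[(1, 1)]
    if from0 == 0 or from1 == 0:
        raise ValueError("both states must be observed before the last sample")
    p01 = counts[(0, 1)] / from0
    p10 = counts[(1, 0)] / from1
    s = p01 + p10
    if not 0.0 < s < 1.0:
        raise ValueError("inversion requires 0 < p01 + p10 < 1")
    total_rate = -math.log(1.0 - s) / delta  # estimate of a + b
    return p01 / s * total_rate, p10 / s * total_rate
```

The inversion fails when the estimated `p01 + p10` is at least 1, which can occur in short samples or when `delta` is large relative to the mean sojourn times.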
State estimation for Cox processes on general spaces
Let $N$ be an observable Cox process on a locally compact space $E$ directed by an unobservable random measure $M$. Techniques are presented for estimation of $M$, using the observations of $N$ to calculate conditional expectations of the form $E[M \mid \mathcal{F}_A]$, where $\mathcal{F}_A$ is the $\sigma$-algebra generated by the restriction of $N$ to $A$. We introduce a random measure whose distribution depends on $N_A$, from which we obtain both exact estimates and a recursive method for updating them as further observations become available. Application is made to the specific cases of estimation of an unknown, random scalar multiplier of a known measure, of a symmetrically distributed directing measure $M$ and of a Markov-directed Cox process on . By means of a Poisson cluster representation, the results are extended to treat the situation where $N$ is conditionally additive and infinitely divisible given $M$.
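As a worked illustration of the scalar-multiplier case mentioned above, suppose $M = Y\nu$, where $\nu$ is a known measure with $\nu(A) < \infty$ and $Y \ge 0$ is a random scalar with prior law $\pi$. The symbols $Y$, $\nu$, $\pi$, $\alpha$, and $\beta$ are notation introduced here, not taken from the paper. Under these assumptions the standard mixed-Poisson calculation gives:

```latex
% Assumed setup: M = Y\nu, \nu known with \nu(A) < \infty, Y \ge 0 with prior \pi.
% \mathcal{F}_A is the \sigma-algebra generated by the restriction of N to A.
E[Y \mid \mathcal{F}_A]
  = \frac{\int_0^\infty y^{\,N(A)+1}\, e^{-y\,\nu(A)}\, \pi(dy)}
         {\int_0^\infty y^{\,N(A)}\, e^{-y\,\nu(A)}\, \pi(dy)},
\qquad
E[M(B) \mid \mathcal{F}_A] = \nu(B)\, E[Y \mid \mathcal{F}_A].

% Example: if \pi is a Gamma law with shape \alpha and rate \beta, the posterior
% is Gamma with shape \alpha + N(A) and rate \beta + \nu(A), so
E[Y \mid \mathcal{F}_A] = \frac{\alpha + N(A)}{\beta + \nu(A)}.
```

In this special case the estimate depends on the observations only through the count $N(A)$, and it updates recursively as $A$ grows, in the spirit of the recursive method the abstract describes.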
State estimation for Cox processes with unknown probability law
Let $N_i$, $i \ge 1$, be i.i.d. observable Cox processes on a compact metric space $E$, directed by unobservable random measures $M_i$. Assume that the probability law of the $M_i$ is completely unknown. Techniques are developed for approximation of state estimators using data from the processes $N_1, \dots, N_n$ to estimate necessary attributes of the unknown probability law of the $M_i$. The techniques are based on representation of the state estimators in terms of reduced Palm distributions of the $N_i$ and on estimation of these Palm distributions. Estimators of Palm distributions are shown to be strongly consistent and asymptotically normal. The difference between the true and the pseudo-state estimators converges to zero in $L^2$ at rate n- for each $\delta > 0$.
Keywords: Cox process; point process; Palm distribution; estimation for point processes; state estimation