37 research outputs found
PREDICTING LOW DOSE EFFECTS FOR CHEMICALS IN HIGH THROUGH-PUT STUDIES
High-throughput studies commonly use automated systems with 96-well plates in which multiple chemicals are tested at multiple doses using log-2 dose increments after a suitable incubation period. There are typically multiple (ranging from five to eleven) doses for each chemical, and occasionally plate replications of the dose-response studies. The target endpoint for such studies is typically the LC50, but for some chemicals, there may be multiple doses below a benchmark dose where there is no apparent adverse response relative to control response. We show how an estimation approach can lead to clearly interpretable results about response in the low dose region using data from a high-throughput study of 2189 chemicals on yeast. Accurate estimates of response can be obtained for study chemicals by using best linear unbiased predictors (BLUPs) in a mixed model. These estimates can be summarized via plots of expected response (assuming no low-dose effect) with confidence intervals for response below the benchmark dose for each chemical, providing an informative summary of response at low doses. We conclude that this approach can provide valuable insights that would be missed if the observational data were only considered through the lens of statistical methods appropriate for experimental studies.
The importance of friends and family to recreational gambling, at-risk gambling, and problem gambling
Background The variables correlated with problem gambling are routinely assessed and fairly well established. However, problem gamblers were all ‘at-risk’ and ‘recreational’ gamblers at some point. Thus, it is instructive from a prevention perspective to also understand the variables that discriminate between recreational gambling and at-risk gambling, and whether they are similar to or different from the ones correlated with problem gambling. This is the purpose of the present study.
Method Between September 2013 and May 2014, a representative sample of 9,523 Massachusetts adults was administered a comprehensive survey of their past year gambling behavior and problem gambling symptomatology. Based on responses to the Problem and Pathological Gambling Measure, respondents were categorized as Non-Gamblers (2,523), Recreational Gamblers (6,271), At-Risk Gamblers (600), or Problem/Pathological Gamblers (129). With the reference category of Recreational Gambler, a series of binary logistic regressions was conducted to identify the demographic, health, and gambling related variables that differentiated Recreational Gamblers from Non-Gamblers, At-Risk Gamblers, and Problem/Pathological Gamblers.
Results The strongest discriminator of being a Non-Gambler rather than a Recreational Gambler was having a lower portion of friends and family that were regular gamblers. Compared to Recreational Gamblers, At-Risk Gamblers were more likely to: gamble at casinos; play the instant and daily lottery; be male; gamble online; and be born outside the United States. Compared to Recreational Gamblers, Problem and Pathological Gamblers were more likely to: play the daily lottery; be Black; gamble at casinos; be male; gamble online; and play the instant lottery. Importantly, having a greater portion of friends and family who were regular gamblers was the second strongest correlate of being both an At-Risk Gambler and a Problem/Pathological Gambler.
Conclusions These analyses offer an examination of the similarities and differences between gambling subtypes. An important finding throughout the analyses is that the gambling involvement of family and friends is strongly related to Recreational Gambling, At-Risk Gambling, and Problem/Pathological Gambling. This suggests that targeting the social networks of heavily involved Recreational Gamblers and At-Risk Gamblers (in addition to Problem/Pathological Gamblers) could be an important focus of efforts in problem gambling prevention.
Predicting Random Effects from Finite Population Clustered Samples with Response Error
ABSTRACT In many situations there is interest in parameters (e.g. mean) associated with the response distribution of individual clusters in a finite clustered population. We develop predictors of such parameters using a two-stage sampling probability model with response error. The probability model stems directly from finite population sampling, without additional assumptions. The predictors are closely related to best linear unbiased predictors (BLUP) that arise from common mixed model methods, as well as to model-based predictors obtained via super-population approaches for survey sampling. The context assumes clusters of equal size and equal size sampling of units within clusters. Target parameters may correspond to clusters realized in the sample, as well as non-realized clusters. In either case, the predictors are linear and unbiased, and minimize the expected mean squared error. They correspond to the sum of predictors of responses for realized and non-realized units in the cluster, accounting directly for the second stage sampling fraction. In contrast, the BLUP commonly used in mixed models can be interpreted as predicting only the responses of second stage units not observed for a cluster, not the cluster mean. The development reveals that two-stage sampling does not give rise to a more general variance structure often assumed in super-population models, even when variances within clusters are heterogeneous. The proposed model is design based and requires minimal assumptions. With response error present, we predict target random variables defined as an expected (or average) response over units in a cluster.
Predicting Random Effects from Finite Population Clustered Samples with Response Error
ABSTRACT In many situations there is interest in a parameter representing the mean of an individual cluster in a finite clustered population. We develop predictors of such parameters using a two-stage sampling probability model with response error. The probability model arises directly from finite population sampling, without additional assumptions. Target parameters may correspond to clusters realized in the sample, as well as non-realized clusters. In either case, the predictors are linear and unbiased, and minimize the expected mean squared error. The predictor is the sum of predictors for realized and non-realized units in the cluster, accounting directly for the second stage sampling fraction. In contrast, the commonly used BLUP in a mixed model can be seen to predict only the responses of non-realized second stage units for a cluster, not the cluster mean. The development reveals that two-stage sampling does not give rise to a more general variance structure often assumed in super-population models, even when variances within clusters are heterogeneous. The predictors provide an interpretable alternative to an apparently artificial model-based approach. With response error present, we predict target random variables defined as an average over units in a cluster of response, or the expected value of response.
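The structure described above, a predictor built as the sum of a part for sampled units and a part for unsampled units weighted by the second stage sampling fraction, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not taken from the abstract: there is no response error (so sampled units contribute their observed mean), the design is balanced, and the shrinkage constant is supplied by the caller rather than derived from variance components. The function and argument names are hypothetical.

```python
def predict_cluster_mean(cluster_sample_mean, overall_sample_mean,
                         shrinkage, sampling_fraction):
    """Sketch of a predictor of a realized cluster's mean (no response error).

    shrinkage: constant k in [0, 1], assumed known and supplied by the caller.
    sampling_fraction: f = m / M, units sampled per cluster over cluster size.
    """
    if not 0.0 <= shrinkage <= 1.0:
        raise ValueError("shrinkage must lie in [0, 1]")
    if not 0.0 < sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must lie in (0, 1]")

    # Predicted mean of the units in the cluster that were NOT sampled:
    # the overall mean plus a shrunken cluster deviation.
    unsampled_part = overall_sample_mean + shrinkage * (
        cluster_sample_mean - overall_sample_mean
    )

    # Sampled units contribute their observed mean with weight f,
    # unsampled units contribute their prediction with weight 1 - f.
    return (sampling_fraction * cluster_sample_mean
            + (1.0 - sampling_fraction) * unsampled_part)


if __name__ == "__main__":
    print(predict_cluster_mean(10.0, 6.0, shrinkage=0.5, sampling_fraction=0.25))
```

When the sampling fraction is 1 the whole cluster is observed and the sketch returns the cluster sample mean; as the fraction approaches 0 it approaches the shrunken value alone, which corresponds to the part the abstract attributes to the usual mixed model BLUP.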
Daily Soil Ingestion Estimates for Children at a Superfund Site
Ingestion of contaminated soil by children may result in significant exposure to toxic substances at contaminated sites. Estimates of such exposure are based on extrapolation of short-term-exposure estimates to longer time periods. This article provides daily estimates of soil ingestion for 64 children between the ages of 1 and 4 residing at a Superfund site; these values are employed to estimate the distribution of 7-day average soil ingestion exposures (mean, 31 mg/day; median, 17 mg/day) at a contaminated site over different time periods. Best linear unbiased predictors of the 95th-percentile of soil ingestion over 7 days, 30 days, 90 days, and 365 days are 133 mg/day, 112 mg/day, 108 mg/day and 106 mg/day, respectively. Variance components estimates (excluding titanium and outliers, based on Tukey's far-out criteria) are given for soil ingestion between subjects (59 mg/day)², between days on a subject (95 mg/day)², and for uncertainty on a subject-day (132 mg/day)². These results expand knowledge of potential exposure to contaminants among young children from soil ingestion at contaminated sites. They also provide basic distributions that serve as a starting point for use in Monte Carlo risk assessments. KEY WORDS: Soil ingestion; Monte Carlo risk assessment; children; Superfund site; exposure assessment
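One way to see why the upper percentiles above shrink as the averaging period lengthens is to look at the spread of a subject's average over several days. The sketch below is a rough illustration, not the article's method: it assumes days are independent within a subject and ignores the subject-day uncertainty component, so it is not expected to reproduce the reported percentiles. The function name is hypothetical; the two standard deviations are the variance component values quoted in the abstract.

```python
import math

BETWEEN_SUBJECT_SD = 59.0  # mg/day, from the abstract
BETWEEN_DAY_SD = 95.0      # mg/day, from the abstract


def sd_of_period_average(days,
                         between_subject_sd=BETWEEN_SUBJECT_SD,
                         between_day_sd=BETWEEN_DAY_SD):
    """Standard deviation across subjects of a `days`-day average.

    Assumes independent days within a subject (an assumption of this
    sketch), so the day-to-day variance is divided by the number of days
    while the between-subject variance is unchanged.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return math.sqrt(between_subject_sd ** 2 + between_day_sd ** 2 / days)


if __name__ == "__main__":
    for period in (7, 30, 90, 365):
        print(period, round(sd_of_period_average(period), 1))
```

Under these assumptions the spread falls toward the between-subject standard deviation as the period grows, which matches the direction of the reported 7-day to 365-day percentiles.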
Predicting random effects with an expanded finite population mixed model
Prediction of random effects is an important problem with expanding applications. In the simplest context, the problem corresponds to prediction of the latent value (the mean) of a realized cluster selected via two-stage sampling. Recently, Stanek and Singer [Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130] developed best linear unbiased predictors (BLUP) under a finite population mixed model that outperform BLUPs from mixed models and superpopulation models. Their setup, however, does not allow for unequally sized clusters. To overcome this drawback, we consider an expanded finite population mixed model based on a larger set of random variables that span a higher dimensional space than those typically applied to such problems. We show that BLUPs for linear combinations of the realized cluster means derived under such a model have considerably smaller mean squared error (MSE) than those obtained from mixed models, superpopulation models, and finite population mixed models. We motivate our general approach by an example developed for two-stage cluster sampling and show that it faithfully captures the stochastic aspects of sampling in the problem. We also consider simulation studies to illustrate the increased accuracy of the BLUP obtained under the expanded finite population mixed model. (C) 2007 Elsevier B.V. All rights reserved.
Design-based random permutation models with auxiliary information
We extend the random permutation model to obtain the best linear unbiased estimator of a finite population mean accounting for auxiliary variables under simple random sampling without replacement (SRS) or stratified SRS. The proposed method provides a systematic design-based justification for well-known results involving common estimators derived under minimal assumptions that do not require specification of a functional relationship between the response and the auxiliary variables.
Funding: National Institutes of Health, USA [NIH-PHS-R01-HD36848, R01-HL071828-02, 5R01HL079483]; Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq); Fundacao de Amparo a Pesquisa do Estado de Sao Paulo (FAPESP), Brazil
Performance of balanced two-stage empirical predictors of realized cluster latent values from finite populations: A simulation study
Predictors of random effects are usually based on the popular mixed effects (ME) model developed under the assumption that the sample is obtained from a conceptual infinite population; such predictors are employed even when the actual population is finite. Two alternatives that incorporate the finite nature of the population are obtained from the superpopulation model proposed by Scott and Smith (1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64, 830-840) or from the finite population mixed model recently proposed by Stanek and Singer (2004. Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130). Predictors derived under the latter model with the additional assumptions that all variance components are known and that within-cluster variances are equal have smaller mean squared error (MSE) than the competitors based on either the ME or Scott and Smith's models. As population variances are rarely known, we propose method-of-moments estimators to obtain empirical predictors and conduct a simulation study to evaluate their performance. The results suggest that the finite population mixed model empirical predictor is more stable than its competitors since, in terms of MSE, it is either the best or the second best and, when second best, its performance lies within acceptable limits. When both cluster and unit intra-class correlation coefficients are very high (e.g., 0.95 or more), the performance of the empirical predictors derived under the three models is similar. (c) 2007 Elsevier B.V. All rights reserved.
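The abstract does not give the form of its method-of-moments estimators. As a point of reference only, the sketch below shows the textbook moment (ANOVA) estimators of the between-cluster and within-cluster variance components for balanced data under the ordinary one-way random effects model. The finite population versions used in the paper may differ, and the function name is hypothetical.

```python
from statistics import mean


def anova_variance_components(clusters):
    """Moment estimators for a balanced one-way random effects layout.

    clusters: list of equal-length lists of responses, one list per cluster.
    Returns (between_cluster_variance, within_cluster_variance).
    """
    n = len(clusters)
    if n < 2:
        raise ValueError("need at least two clusters")
    m = len(clusters[0])
    if m < 2 or any(len(c) != m for c in clusters):
        raise ValueError("clusters must have equal size of at least two")

    cluster_means = [mean(c) for c in clusters]
    grand_mean = mean(cluster_means)

    # Mean square between clusters and mean square within clusters.
    msb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (n - 1)
    msw = sum(
        (y - cm) ** 2 for c, cm in zip(clusters, cluster_means) for y in c
    ) / (n * (m - 1))

    # Equate mean squares to their expectations; truncate at zero because
    # the moment estimator of the between-cluster variance can be negative.
    between = max((msb - msw) / m, 0.0)
    return between, msw


if __name__ == "__main__":
    print(anova_variance_components([[1, 3], [5, 7]]))
```

Plugging such estimates into a predictor in place of known variances is what makes it an empirical predictor in the sense used above.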
Simple Random Sampling with Missing Data: Estimating the Population Mean From a Simple Random Sample When Some Responses are Missing
Abstract We develop a design-based prediction approach to estimate the finite population mean in a simple setting where some responses are missing. The approach is based on indicator sampling random variables that operate on labeled units (subjects). Missing data mechanisms are defined that may depend on a subject, or on a selection (such as when the study design assigns groups of selected subjects to different interviewers). Using an approach usually reserved for model-based inference, we develop a predictor that equals the sample total divided by the expected sample size. The methods are direct extensions of best linear unbiased prediction (BLUP) in finite population mixed models. When the probability of missing is estimated from the sample, the empirical estimator simplifies to the mean of the realized non-missing responses. The different missing data mechanisms are revealed by the notation that accounts for the labels and sample selections. Counterintuitively, the mean squared error (MSE) of the empirical estimator is smaller than the MSE of the predictor that uses a known probability of missing.
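The arithmetic behind the predictor described above can be sketched briefly. The sketch takes "expected sample size" to mean the number of selected units multiplied by the probability that a response is observed, which is an interpretation of the abstract rather than something it states. Missing responses are represented by None, and the function name is hypothetical.

```python
def predict_population_mean(responses, response_probability=None):
    """Sample total of observed responses divided by expected sample size.

    responses: selected units' responses, with None marking a missing value.
    response_probability: probability that a response is observed; when it is
    None it is estimated by the observed fraction in the sample.
    """
    n = len(responses)
    observed = [y for y in responses if y is not None]
    if n == 0 or not observed:
        raise ValueError("need at least one observed response")

    if response_probability is None:
        # Empirical version: estimate the probability from the sample.
        response_probability = len(observed) / n
    if not 0.0 < response_probability <= 1.0:
        raise ValueError("response_probability must lie in (0, 1]")

    expected_sample_size = n * response_probability
    return sum(observed) / expected_sample_size


if __name__ == "__main__":
    sample = [4.0, None, 6.0, 8.0]
    print(predict_population_mean(sample))        # estimated probability
    print(predict_population_mean(sample, 0.5))   # supplied probability
```

With the probability estimated from the sample, the denominator equals the number of observed responses, so the result is the mean of the non-missing values, as the abstract states.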