45 research outputs found

    Improving on analyses of self-reported data in a large-scale health survey by using information from an examination-based survey

    Common data sources for assessing the health of a population of interest include large-scale surveys based on interviews that often pose questions requiring a self-report, such as, ‘Has a doctor or other health professional ever told you that you have 〈health condition of interest〉?’ or ‘What is your 〈height/weight〉?’ Answers to such questions might not always reflect the true prevalences of health conditions (for example, if a respondent misreports height/weight or does not have access to a doctor or other health professional). Such ‘measurement error’ in health data could affect inferences about measures of health and health disparities. Drawing on two surveys conducted by the National Center for Health Statistics, this paper describes an imputation-based strategy for using clinical information from an examination-based health survey to improve on analyses of self-reported data in a larger interview-based health survey. Models predicting clinical values from self-reported values and covariates are fitted to data from the National Health and Nutrition Examination Survey (NHANES), which asks self-report questions during an interview component and also obtains clinical measurements during a physical examination component. The fitted models are used to multiply impute clinical values for the National Health Interview Survey (NHIS), a larger survey that obtains data solely via interviews. Illustrations involving hypertension, diabetes, and obesity suggest that estimates of health measures based on the multiply imputed clinical values are different from those based on the NHIS self-reported data alone and have smaller estimated standard errors than those based solely on the NHANES clinical data. The paper discusses the relationship of the methods used in the study to two-phase/two-stage/validation sampling and estimation, along with limitations, practical considerations, and areas for future research.
Published in 2009 by John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/65032/1/3809_ftp.pd

    Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance?

    We investigate the procedure of checking for overlap between confidence intervals or standard error intervals to draw conclusions regarding hypotheses about differences between population parameters. Mathematical expressions and algebraic manipulations are given, and computer simulations are performed to assess the usefulness of confidence and standard error intervals in this manner. We make recommendations for their use in situations in which standard tests of hypotheses do not exist. An example is given that tests this methodology for comparing effective dose levels in independent probit regressions, an application that is also pertinent to derivations of LC(50)s for insect pathogens and of detectability half-lives for prey proteins or DNA sequences in predator gut analysis.
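
The overlap check described in this abstract can be illustrated with a minimal sketch. It assumes two independent, approximately normal estimates with known standard errors; the function names and the numbers in the usage note are hypothetical and are not taken from the paper. The sketch compares "do the two 95% intervals overlap?" with the usual z-test on the difference, which uses the standard error sqrt(se1^2 + se2^2).

```python
import math

# 97.5th percentile of the standard normal, used for two-sided 95% intervals.
Z95 = 1.959963984540054


def intervals_overlap(est1, se1, est2, se2, z=Z95):
    """Return True if the two intervals est +/- z*se share at least one point."""
    lo1, hi1 = est1 - z * se1, est1 + z * se1
    lo2, hi2 = est2 - z * se2, est2 + z * se2
    return lo1 <= hi2 and lo2 <= hi1


def z_test_rejects(est1, se1, est2, se2, z=Z95):
    """Return True if a two-sided z-test on the difference rejects equality.

    Assumes the two estimates are independent, so the standard error of the
    difference is sqrt(se1**2 + se2**2).
    """
    return abs(est1 - est2) > z * math.hypot(se1, se2)
```

With hypothetical estimates 0 and 3, each with standard error 1, the intervals overlap even though the z-test rejects at the 5% level. Because se1 + se2 is never smaller than sqrt(se1^2 + se2^2), requiring non-overlap is the more conservative rule under these assumptions.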

    Survival Analysis Using Auxiliary Variables Via Multiple Imputation, with Application to AIDS Clinical Trial Data

    We develop an approach, based on multiple imputation, to using auxiliary variables to recover information from censored observations in survival analysis. We apply the approach to data from an AIDS clinical trial comparing ZDV and placebo, in which CD4 count is the time-dependent auxiliary variable. To facilitate imputation, a joint model is developed for the data, which includes a hierarchical change-point model for CD4 counts and a time-dependent proportional hazards model for the time to AIDS. Markov chain Monte Carlo methods are used to multiply impute event times for censored cases. The augmented data are then analyzed and the results combined using standard multiple-imputation techniques. A comparison of our multiple-imputation approach to simply analyzing the observed data indicates that multiple imputation leads to a small change in the estimated effect of ZDV and smaller estimated standard errors. A sensitivity analysis suggests that the qualitative findings are reproducible under a variety of imputation models. A simulation study indicates that improved efficiency over standard analyses and partial corrections for dependent censoring can result. An issue that arises with our approach, however, is whether the analysis of primary interest and the imputation model are compatible. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/66054/1/j.0006-341X.2002.00037.x.pd
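
The combining step mentioned in this abstract can be sketched as follows, assuming that "standard multiple-imputation techniques" refers to Rubin's combining rules for a scalar estimate. The function name `pool_rubin` and the numbers in the usage note are hypothetical; the abstract does not give the authors' code or results in this form.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data analyses of a scalar quantity with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)       # pooled point estimate
    u_bar = mean(variances)       # average within-imputation variance
    b = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total
```

For three hypothetical imputations with estimates 1.0, 2.0, 3.0 and within-imputation variance 0.5 each, the pooled estimate is 2.0 and the total variance is 0.5 + (4/3)(1.0), about 1.83. The (1 + 1/m) factor inflates the between-imputation component to account for using a finite number of imputations.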

    Combining Information from Two Surveys to Estimate County-Level Prevalence Rates of Cancer Risk Factors and Screening

    Cancer surveillance requires estimates of the prevalence of cancer risk factors and screening for small areas such as counties. Two popular data sources are the Behavioral Risk Factor Surveillance System (BRFSS), a telephone survey conducted by state agencies, and the National Health Interview Survey (NHIS), an area probability sample survey conducted through face-to-face interviews. Both data sources have advantages and disadvantages. The BRFSS is a larger survey, and almost every county is included in the survey; but it has lower response rates as is typical with telephone surveys, and it does not include subjects who live in households with no telephones. On the other hand, the NHIS is a smaller survey, with the majority of counties not included; but it includes both telephone and non-telephone households and has higher response rates. A preliminary analysis shows that the distributions of cancer screening and risk factors are different for telephone and non-telephone households. Thus, information from the two surveys may be combined to address both nonresponse and noncoverage errors. A hierarchical Bayesian approach that combines information from both surveys is used to construct county-level estimates. The proposed model incorporates potential noncoverage and nonresponse biases in the BRFSS as well as complex sample design features of both surveys. A Markov Chain Monte Carlo method is used to simulate draws from the joint posterior distribution of unknown quantities in the model based on the design-based direct estimates and county-level covariates. Yearly prevalence estimates at the county level for 49 states, as well as for the entire state of Alaska and the District of Columbia, are developed for six outcomes using BRFSS and NHIS data from the years 1997-2000. The outcomes include smoking and use of common cancer screening procedures. The NHIS/BRFSS combined county-level estimates are substantially different from those based on BRFSS alone.

    A New Reliability Growth Model

    No full text

    The Use of Splines in Data Analysis

    No full text