
74 research outputs found

Using Longitudinal Data to Estimate the Effect of Starting to Exercise on the Health of Sedentary Older Adults

Author: Diehr Paula
Hirsch Calvin
Publication venue: Collection of Biostatistics Research Archive
Publication date: 08/08/2008

Background: It is difficult to estimate the effect of exercise on future health from observational data because exercising may be both a cause and an effect of health status. Unadjusted analyses suffer from selection bias (healthier persons are more likely to exercise), while adjusted analyses may adjust away some of the benefits of exercise. Objective: To obtain a low-bias, interpretable estimate of the effect of exercise on future health. Methods: We used data from the Cardiovascular Health Study, a longitudinal study of 5,888 older adults. The number of blocks walked in the previous week, collected annually, was classified as Sedentary (less than 7 blocks per week), Moderate, or Active (28 or more blocks per week). The primary low-bias analysis was restricted to persons who were both Sedentary and Healthy (in Excellent, Very Good, or Good self-reported health) in the two years before baseline. Self-reported health status (Healthy versus Sick or Dead) at follow-up was regressed on the level of exercise at baseline, variously including or excluding demographics, health prior to baseline, and health at baseline. Findings: Exercise trends were associated as expected with age, sex, and race. Healthy persons were more likely than Sick to start to exercise, and Sick Active persons were more likely to become Healthy than Sick Sedentary persons. In the total sample, 77% of persons who were Active at baseline were Healthy at follow-up, as compared with 49% of Sedentary persons, a difference of 28 percentage points that is difficult to interpret. In the subset who were both Sedentary and Healthy in the two years before baseline, the difference was only 14 percentage points. That difference declined to 12 points after adjustment for demographics, and to 9 points after adjusting for other health variables measured prior to baseline. After adjustment for health variables measured at baseline (possibly in the causal pathway) the difference dropped to 7 points and was no longer significantly different from zero. Similar findings occurred when survival was the outcome. The apparent effect of exercise on health was substantially smaller if persons who were Dead at follow-up were excluded. Conclusion: At least a third of the apparent benefit of exercise could be explained by selection bias. Where possible, observational studies of the effects of exercise should measure exercise at every period instead of just at enrollment. This permits incorporating exercise and health data prior to baseline. Analysis should also allow for the benefits of exercise on survival. The low-bias estimate of the benefit to a Healthy Sedentary older adult of becoming Active (walking 28 or more blocks per week, median = 48) was 7 percentage points for being alive 2 years later, and 9 percentage points for being alive and healthy. A modest program of walking may confer modest health benefits

Collection Of Biostatistics Research Archive

Multi-state Life Tables, Equilibrium Prevalence, and Baseline Selection Bias

Author: Diehr Paula
Yanez David
Publication venue: Collection of Biostatistics Research Archive
Publication date: 15/06/2010

Consider a 3-state system with one absorbing state, such as Healthy, Sick, and Dead. If the system satisfies the 1-step Markov conditions, the prevalence of the Healthy state will converge to a value that is independent of the initial distribution. This equilibrium prevalence and its variance are known under the assumption of time homogeneity, and provided reasonable estimates in the time non-homogeneous systems studied. Here, we derived the equilibrium prevalence for a system with more than three states. Under time homogeneity, the equilibrium prevalence distribution was shown to be an eigenvector of a partition of the matrix of transition probabilities. The eigenvector worked well for time non-homogeneous examples as well. We developed a test for whether the available sample was at equilibrium, and used it to explore whether there was selection bias in the baseline distribution of a large longitudinal cohort sample
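The eigenvector idea in this abstract can be illustrated with a minimal sketch. It assumes a time-homogeneous 3-state chain (Healthy, Sick, Dead) and takes the "partition" to be the block of transition probabilities among the living states; the probabilities and the function name `equilibrium_prevalence` are hypothetical, not taken from the paper. Power iteration is used here simply to avoid external libraries.

```python
def equilibrium_prevalence(Q, init=None, iters=500):
    """Left dominant eigenvector of Q, scaled to sum to 1, by power iteration.

    Q holds one-step transition probabilities among the living states only;
    each row's shortfall from 1 is the probability of dying in that step.
    """
    n = len(Q)
    v = list(init) if init is not None else [1.0 / n] * n
    for _ in range(iters):
        # One step of the chain, restricted to the living states.
        w = [sum(v[i] * Q[i][j] for i in range(n)) for j in range(n)]
        total = sum(w)  # share of the current survivors still alive after the step
        v = [x / total for x in w]  # renormalize among survivors
    return v


# Hypothetical probabilities (assumed for illustration only).
Q = [
    [0.80, 0.15],  # Healthy -> Healthy, Sick (remaining 0.05 -> Dead)
    [0.30, 0.55],  # Sick    -> Healthy, Sick (remaining 0.15 -> Dead)
]

from_healthy = equilibrium_prevalence(Q, init=[1.0, 0.0])
from_sick = equilibrium_prevalence(Q, init=[0.0, 1.0])
print(from_healthy, from_sick)
```

Both starting cohorts converge to the same prevalence of Healthy among survivors, which is the independence from the initial distribution that the abstract describes.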


Probabilities of Transition Among Health States for Older Adults

Author: Diehr Paula
Patrick Donald L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/01/2001

Goal: To estimate the probabilities of transition among self-rated health states for older adults, and examine how they vary by age and sex. Methods: We used self-rated health (Excellent, Very Good, Good, Fair, Poor, Dead) collected in two longitudinal studies of older adults (mean age 75) to estimate the probability of transition in two years. We used the estimates to project future health for selected cohorts. Findings: These older adults were most likely to be in the same health state 2 years later, but a substantial proportion changed in both directions. Transition probabilities varied by initial health state, age and sex. Men were more likely than women to transition to Excellent or Dead. Women were more likely than men to transition to Good or Fair health. Although women aged 70 will have more years of life and more years of healthy life than men, they also have more years of unhealthy life, and the proportion of remaining life that is healthy is slightly higher for men. When observed and predicted Years of Healthy Life (YHL) were compared in various subgroups, the YHL of persons with less favorable baseline characteristics was lower than predicted, and vice versa. Differences, however, were small (about 5%). Conclusions: These transition probability estimates can be used to predict the future health of individuals or groups as a function of current age, sex, and self-rated health
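A rough sketch of how two-year transition probabilities might be used to project a cohort forward and accumulate years of life and Years of Healthy Life. For brevity it collapses the six self-rated states into Healthy, Sick, and Dead, uses made-up probabilities, and credits each two-year interval according to the state at its end; all of these are simplifying assumptions, not the paper's method.

```python
def project(P, start, steps):
    """Return the state distribution at time 0 and after each of `steps` transitions."""
    dist = list(start)
    history = [dist]
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]
        history.append(dist)
    return history


# Hypothetical two-year transition probabilities (rows sum to 1).
P = [
    [0.75, 0.18, 0.07],  # Healthy -> Healthy, Sick, Dead
    [0.30, 0.52, 0.18],  # Sick    -> Healthy, Sick, Dead
    [0.00, 0.00, 1.00],  # Dead is absorbing
]

years_per_step = 2
history = project(P, start=[1.0, 0.0, 0.0], steps=10)  # 20-year horizon

# Crude accounting: each interval counts as healthy (or alive) in proportion
# to the share of the cohort in that state at the end of the interval.
yhl = years_per_step * sum(d[0] for d in history[1:])
yol = years_per_step * sum(d[0] + d[1] for d in history[1:])
print(f"years of life: {yol:.1f}, years of healthy life: {yhl:.1f}")
```

Running the projection separately with age- and sex-specific matrices would give the kind of subgroup comparison the abstract reports.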


Age- and Sex-Specific Transformations of Health Status Measures to Incorporate Death

Author: Derleth Ann M
Diehr Paula
Publication venue: Collection of Biostatistics Research Archive
Publication date: 16/06/2006

Introduction: Measures of health status and physical function do not usually include a specific code for death. This can cause problems in longitudinal studies because analyses limited to survivors may bias the results. One approach is to recode the status variables to include a reasonable value for death. One method that has been used is to replace each scale value with the estimated probability that a person with this value will be “healthy”. “Healthy” has been defined as being above a particular threshold on the variable of interest one year later, or alternatively as being in excellent, very good, or good self-rated health in the same year. Transformation coefficients have been published for various health status measures, but the coefficients were estimated from data for older adults (usually older than 65). Methods: Here, we used data from the Medical Expenditure Panel Survey (MEPS) to develop new age-specific coefficients for self-rated health, activities of daily living (ADL), instrumental activities of daily living (IADL), and the SF-12 physical function scale (PCS). We computed new age-specific transformations for ages 0 through 85 and compared the new transformations with published transformations for persons aged 65 and older. Results: The transformed values differed by age. The new transformed values for persons 65 and over were remarkably similar to the published results, which were calculated from different datasets. Conclusion: The new transformation equations should be particularly useful for studies involving persons younger than 65. For older persons, either the published equations or these new equations may be used
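The recoding described here (replace each scale value with the estimated probability of being "healthy", and give death the value zero) can be sketched as follows. The records and the function name `healthy_probability_map` are hypothetical, and the estimate is a plain observed proportion; the published coefficients come from fitted, age-specific models that this sketch does not reproduce.

```python
from collections import defaultdict


def healthy_probability_map(records):
    """Map each scale value to the observed share of people who were healthy later.

    `records` is an iterable of (scale_value, healthy_later) pairs.
    Death is assigned 0.0 because a dead person cannot be healthy.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [n_healthy, n_total]
    for value, healthy_later in records:
        counts[value][0] += int(healthy_later)
        counts[value][1] += 1
    mapping = {value: h / n for value, (h, n) in counts.items()}
    mapping["Dead"] = 0.0
    return mapping


# Hypothetical follow-up records (assumed for illustration only).
records = (
    [("Excellent", True)] * 3 + [("Excellent", False)] * 1
    + [("Poor", True)] * 1 + [("Poor", False)] * 3
)
mapping = healthy_probability_map(records)
print(mapping)
```

An age-specific version would simply build one such map per age group.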


Statistical Measures for Admission Rates

Author: Diehr Paula
Publication venue: Collection of Biostatistics Research Archive
Publication date: 18/08/1978

Hospital admission rates are often shown and interpreted without consideration of their inherent variability, which may lead to faulty conclusions. This may be because theoretically correct variance estimates are not known for the type of estimates usually used; i.e., total admissions divided by total person-months of observation. Here, correct methods for testing and estimation are shown for situations where they exist. For other types of data, approximate procedures are proposed and their properties examined theoretically and empirically, yielding recommendations for exact and approximate estimation and testing methods for admission rates in common situations


Sample Size Calculations and Optimal Followup Time in Health Services Research Using Utilization Rates

Author: Diehr Paula
Publication venue: Collection of Biostatistics Research Archive
Publication date: 22/08/1980

It is not always possible to estimate the sample sizes needed in health services research because special formulas are needed, and the necessary data may not be available to use in the formulas. We provide some useful formulas for the sample size required in comparing the means of two groups. These include the special case where the two groups are not of equal size either because one is known to have a higher variability or because one group has already been chosen and its size is thus fixed. We also explore the relationship of the mean to the standard deviation for utilization measures, so that the latter can be estimated from the former for use in the equations. In general, the coefficient of variation is on the order of 2, suggesting that the standard deviation may be crudely estimated as twice the mean. The optimal follow-up period is also calculated
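As a worked illustration of the rule of thumb above, the sketch below plugs "standard deviation is roughly twice the mean" into the textbook normal-approximation formula for comparing two means with equal group sizes, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sd^2 / delta^2. This is the standard formula, not necessarily the one derived in the paper, and the function name and example numbers are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mean, delta, alpha=0.05, power=0.80, cv=2.0):
    """Approximate sample size per group to detect a difference `delta` in means.

    The standard deviation is crudely estimated as cv * mean (cv = 2 by default),
    and a two-sided test at level `alpha` with equal group sizes is assumed.
    """
    sd = cv * mean
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * (z * sd / delta) ** 2)


# Hypothetical example: utilization averages 4 visits per year and we want
# to detect a difference of 1 visit between the two groups.
n = n_per_group(mean=4.0, delta=1.0)
print(n)
```

Because the required size scales with (sd / delta) squared, halving the detectable difference roughly quadruples the sample size.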


Methods for Dealing with Death and Missing Data, and for Standardizing Different Health Variables in Longitudinal Datasets: The Cardiovascular Health Study

Author: Diehr Paula
Publication venue: Collection of Biostatistics Research Archive
Publication date: 29/04/2016

Longitudinal studies of older adults usually need to account for deaths and missing data. The study databases often include multiple health-related variables, whose trends over time are hard to compare because they were measured on different scales. Here we present a unified approach to these three problems that was developed and used in the Cardiovascular Health Study. Data were first transformed to a new scale that had integer/ratio properties, and on which “dead” logically takes the value zero. Missing data were then imputed on this new scale, using each person’s own data over time. Imputation could thus be informed by impending death. The new transformed and imputed variable has a value for every person at every potential time, accounts for death, and can also be considered as a measure of “standardized health” that permits comparison of variables that were originally measured on different scales. The imputed variable can also be transformed back to the original scale, which differs from the original data in that missing values have been imputed. Imputed values near death required an additional “post-adjustment”. One approach is shown in sections 5 and 6. In the resulting “tidy” dataset, every observation is labeled as to whether it was observed, imputed (and how), or the person was dead at the time. This dataset can be considered complete, but is flexible enough to permit analysts to handle missing data and deaths in other ways. This approach may be useful for other longitudinal studies as well as for the Cardiovascular Health Study
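A minimal sketch of the general idea: on a scale where dead equals zero, missing values can be filled in from a person's own neighboring values, so a death shortly after a gap pulls the imputed value down. Linear interpolation is an assumption made here for illustration; the abstract does not say which imputation rule was used, and the "post-adjustment" near death is not reproduced. The function name and labels are hypothetical.

```python
def impute_series(values, death_index=None):
    """Fill gaps in one person's series on a transformed scale where dead = 0.

    values: list of floats, with None for missing visits.
    death_index: first visit index at which the person is dead, or None.
    Returns (filled_values, labels), one label per visit.
    """
    filled = list(values)
    labels = ["observed" if v is not None else "missing" for v in values]
    if death_index is not None:
        for t in range(death_index, len(filled)):
            filled[t] = 0.0
            labels[t] = "dead"
    known = [t for t, v in enumerate(filled) if v is not None]
    for t in range(len(filled)):
        if filled[t] is not None:
            continue
        before = [k for k in known if k < t]
        after = [k for k in known if k > t]
        if before and after:
            a, b = before[-1], after[0]
            filled[t] = filled[a] + (filled[b] - filled[a]) * (t - a) / (b - a)
            labels[t] = "imputed: interpolated"
        elif before:
            filled[t] = filled[before[-1]]
            labels[t] = "imputed: carried forward"
        elif after:
            filled[t] = filled[after[0]]
            labels[t] = "imputed: carried backward"
    return filled, labels


# Hypothetical person: observed at visits 0 and 3, dead from visit 5 onward.
filled, labels = impute_series([80.0, None, None, 20.0, None, None], death_index=5)
print(list(zip(filled, labels)))
```

Keeping a label beside every value mirrors the abstract's point that analysts can later choose to treat imputed or post-death observations differently.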


Longitudinal Data with Follow-up Truncated by Death: Finding a Match Between Analysis Method and Research Aims

Author: Diehr Paula
Johnson Laura Lee
Kurland Brenda
Publication venue: Collection of Biostatistics Research Archive
Publication date: 27/11/2007

Diverse analysis approaches have been proposed to distinguish data missing due to death from nonresponse, and to summarize trajectories of longitudinal data truncated by death. We demonstrate how these analysis approaches arise from factorizations of the distribution of longitudinal data and survival information. Models are illustrated using hypothetical data examples (cognitive functioning in older adults, and quality of life under hospice care) and up to 10 annual assessments of longitudinal cognitive functioning data for 3814 participants in an observational study. For unconditional models, deaths do not occur, deaths are independent of the longitudinal response, or the unconditional longitudinal response averages over the survival distribution. Unconditional models, such as random effects models, may implicitly impute data beyond the time of death. Fully conditional models stratify the longitudinal response trajectory by time of death. Fully conditional models are effective for describing individual trajectories, in terms of either aging (age, or years from baseline) or dying (years from death). Partly conditional models summarize the longitudinal response in the dynamic cohort of survivors. Partly conditional models are serial cross-sectional snapshots of the response. They reflect the average response in survivors at a given timepoint, rather than individual trajectories. Joint models of survival and longitudinal response describe the evolving health status of the entire cohort. Researchers using longitudinal data should consider which method of accommodating deaths is consistent with research aims, and use analysis methods accordingly


Pooling Community Data for Community Interventions When the Number of Pairs is Small

Author: Andrilla Holly
Diehr Paula
Feng Ziding
Lystig Ted
Publication venue: Collection of Biostatistics Research Archive
Publication date: 20/05/1997

There is considerable interest in community interventions for health promotion, where the community is the experimental unit. Because such interventions are expensive, the number of experimental units (communities) is usually very small, yielding a study with low power. We examined the ability of a process known as “pooling” or “preliminary significance testing” to improve the power of such studies. In this process, one first tests whether there is significant community variation, using a type 1 error rate of perhaps 0.25. If there is significant variation, the usual community-level test is performed. If not, a person-level test is performed. We found through Monte Carlo simulation that for studies with 2, 3, or 4 communities per group, this procedure could improve power somewhat in situations where the community by time variation is known to be small. Estimates of community by time variation for a variety of health variables are also presented. Because of the limited information available on community variances, and the probable difficulties in defending a person-level analysis, we recommend against the pooling procedure at this time
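The two-stage logic of the pooling procedure can be sketched as a small decision function. The function name, its arguments, and the placeholder p-values are hypothetical; the actual preliminary test and the community-level and person-level tests are whatever the analyst supplies, and the abstract itself recommends against using the procedure.

```python
def pooling_procedure(p_variation, community_test, person_test, prelim_alpha=0.25):
    """Choose the analysis level from a preliminary test of community variation.

    p_variation: p-value from a test of community (by time) variation.
    community_test, person_test: zero-argument callables returning a p-value.
    Returns (level_used, p_value).
    """
    if p_variation < prelim_alpha:
        # Significant community variation: keep the community as the unit.
        return "community", community_test()
    # No significant variation detected: pool and analyze at the person level.
    return "person", person_test()


# Hypothetical p-values standing in for real test results.
level, p_value = pooling_procedure(
    p_variation=0.40,
    community_test=lambda: 0.12,
    person_test=lambda: 0.03,
)
print(level, p_value)
```

Wrapping the tests in callables means only the selected analysis is actually run, which is convenient when the procedure is repeated many times inside a Monte Carlo simulation.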
