Parity and body mass index in US women: a prospective 25-year study.
Objective: To investigate long-term body mass index (BMI) changes associated with childbearing. Design and methods: Adjusted mean BMI changes were estimated by race-ethnicity, baseline BMI, and parity using longitudinal regression models for 3,943 young females over 10- and 25-year follow-up from the ongoing 1979 National Longitudinal Survey of Youth cohort. Results: Estimated BMI increases varied by group, ranging from a low of 2.1 BMI units for white, non-overweight nulliparas over the first 10 years to a high of 10.1 BMI units for black, overweight multiparas over the full 25-year follow-up. Impacts of parity were strongest among overweight multiparas and primiparas at 10 years, with ranges of 1.4-1.7 and 0.8-1.3 BMI units, respectively. Among non-overweight women, parity-related gain at 10 years varied by number of births among black and white but not Hispanic women. After 25 years, childbearing significantly increased BMI only among overweight multiparous black women. Conclusion: Childbearing is associated with permanent weight gain in some women, but the relationship differs by maternal BMI in young adulthood, number of births, race-ethnicity, and length of follow-up. Given that overweight black women may be at special risk for accumulation of permanent, long-term weight after childbearing, effective interventions for this group are particularly needed.
Vertically Shifted Mixture Models for Clustering Longitudinal Data by Shape
Longitudinal studies play a prominent role in health, social and behavioral sciences as well as in the biological sciences, economics, and marketing. By following subjects over time, temporal changes in an outcome of interest can be directly observed and studied. An important question concerns the existence of distinct trajectory patterns. One way to determine these distinct patterns is through cluster analysis, which seeks to separate objects (subjects, patients, observational units) into homogeneous groups. Many methods have been adapted for longitudinal data, but almost all of them fail to explicitly group trajectories according to distinct pattern shapes. To fulfill the need for clustering based explicitly on shape, we propose vertically shifting the data before fitting a mixture model: subtracting the subject-specific mean directly removes the level. This non-invertible transformation can result in singular covariance matrices, which makes mixture model estimation difficult. Despite the challenges, this method outperforms existing clustering methods in a simulation study.
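The vertical shift described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `vertical_shift`, the toy trajectories, and the simulated data are all assumptions made for illustration, and the mixture-model fitting step itself is not shown. The sketch only shows that subtracting each subject's mean makes trajectories with the same shape coincide regardless of level, and that the covariance of the shifted data is singular.

```python
import numpy as np

def vertical_shift(y):
    """Subtract each subject's own mean from that subject's trajectory.

    y has one row per subject and one column per measurement occasion.
    """
    y = np.asarray(y, dtype=float)
    return y - y.mean(axis=1, keepdims=True)

# Hypothetical toy data: two rising and two falling trajectories,
# each pair observed at very different levels.
y = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [11.0, 12.0, 13.0, 14.0],
    [4.0, 3.0, 2.0, 1.0],
    [14.0, 13.0, 12.0, 11.0],
])
shifted = vertical_shift(y)  # rows 0 and 1 now coincide, as do rows 2 and 3

# Simulated data (an assumption): noise plus a large subject-specific level.
rng = np.random.default_rng(0)
sim = rng.normal(size=(200, 5)) + rng.normal(scale=10.0, size=(200, 1))

# Every shifted row sums to zero, so the sample covariance has the
# vector of ones in its null space and cannot have full rank.
cov = np.cov(vertical_shift(sim), rowvar=False)
rank = np.linalg.matrix_rank(cov, tol=1e-8)  # at most 4 for 5 occasions
```

Because the shifted rows each sum to zero, any model fitted to them has to cope with a rank-deficient covariance, which is the estimation difficulty the abstract mentions.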
Diversity of Artists in Major U.S. Museums
The U.S. art museum sector is grappling with diversity. While previous work
has investigated the demographic diversity of museum staffs and visitors, the
diversity of artists in their collections has remained unreported. We conduct
the first large-scale study of artist diversity in museums. By scraping the
public online catalogs of 18 major U.S. museums, deploying a sample of 10,000
artist records comprising over 9,000 unique artists to crowdsourcing, and
analyzing 45,000 responses, we infer artist genders, ethnicities, geographic
origins, and birth decades. Our results are threefold. First, we provide
estimates of gender and ethnic diversity at each museum, and overall, we find
that 85% of artists are white and 87% are men. Second, we identify museums that
are outliers, having significantly higher or lower representation of certain
demographic groups than the rest of the pool. Third, we find that the
relationship between museum collection mission and artist diversity is weak,
suggesting that a museum wishing to increase diversity might do so without
changing its emphases on specific time periods and regions. Our methodology can
be used to broadly and efficiently assess diversity in other fields.
Comment: 15 pages, 2 figures, minor revisions of and enhancements to tex
Population-level detection of early loss of kidney function: 7-year follow-up of a young adult cohort at risk of Mesoamerican nephropathy
BACKGROUND: Mesoamerican nephropathy is a leading contributor to premature mortality in Central America. Efforts to identify the cause are hampered by difficulties in distinguishing associations with potential initiating factors from common exposures thought to exacerbate the progression of all forms of established chronic kidney disease (CKD). We explored evidence of disease onset or departure from the healthy estimated glomerular filtration rate distribution [departure from ∼eGFR(healthy)] in an at-risk population. METHODS: Two community-based cohorts (adults aged 18-30 years, n = 351 and 420) from 11 rural communities in Northwest Nicaragua were followed up over 7 and 3 years respectively. We examined associations with both (i) incident CKD and (ii) the time point of departure from ∼eGFR(healthy), using a hidden Markov model. RESULTS: CKD occurred in men only (male incidence rate: 0.7%/year). Fifty-three (out of 1878 visits, 2.7%) and 8 (out of 1067 visits, 0.8%) episodes of probable departure from ∼eGFR(healthy) occurred in men and women, respectively. Cumulative time in sugarcane work and symptoms of excess occupational sun exposure were associated with incident CKD. The same exposures were associated with probability of departure from ∼eGFR(healthy) in time-updated analyses along with measured and self-reported weight loss, nausea, vomiting and cramps, as well as non-steroidal anti-inflammatory drug use. CONCLUSIONS: CKD burden in this population is high and risk factors for established disease are occupational. Additionally, a syndrome suggesting an alternative exposure is associated with evidence of disease onset supporting a possible separate unknown initiating factor for which further investigation is needed. Interventions to reduce the impact of occupational risks should be pursued meanwhile
Longitudinal Cluster Analysis with Applications to Growth Trajectories
Longitudinal studies play a prominent role in health, social, and behavioral sciences as well as in the biological sciences, economics, and marketing. By following subjects over time, temporal changes in an outcome of interest can be directly observed and studied. An important question concerns the existence of distinct trajectory patterns. One way to discover potential patterns in the data is through cluster analysis, which seeks to separate objects (individuals, subjects, patients, observational units) into homogeneous groups. There are many ways to cluster multivariate data. Most methods can be categorized into one of two approaches: nonparametric and model-based methods. The first approach makes no assumptions about how the data were generated and produces a sequence of clustering results indexed by the number of clusters k=2,3,... and the choice of dissimilarity measure. The latter approach assumes data vectors are generated from a finite mixture of distributions. The bulk of the available clustering algorithms are intended for use on data vectors with exchangeable, independent elements and are not appropriate to be directly applied to repeated measures with inherent dependence.
Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence; that is, given the mixture component label, the outcomes for subjects are independent. In Chapter 2, I study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and prior probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. I also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can provide evidence that the model is misspecified.
The main goal of a longitudinal study is to observe individual change over time; therefore, observed trajectories have two prominent features: level and shape of change over time. These features are typically associated with baseline characteristics of the individual. Grouping by shape and level separately provides an opportunity to detect and estimate these relationships. Although many nonparametric and model-based methods have been adapted for longitudinal data, most fail to explicitly group individuals according to the shape of their repeated measure trajectory. Some methods are thought to group by shape, but the dissimilarity between trajectories is not defined in terms of any one specific feature of the data. Rather, the methods are based on the entire vector and cluster trajectories by the level because it tends to dominate the variability between data vectors. These methods discover shape groups only if level and shape are correlated. To fulfill the need for clustering based explicitly on shape, I propose three methods in Chapter 4 that are adaptations of available algorithms. One approach is to use a dissimilarity measure based on estimated derivatives of functions underlying the trajectories. One challenge for this approach is estimating the derivatives with minimal bias and variance. The second approach explicitly models the variability in the level within a group of similarly shaped trajectories using a mixture model, resulting in a multilayer mixture model. One difficulty with this method comes in choosing the number of shape clusters. Lastly, vertically shifting the data by subtracting the subject-specific mean directly removes the level prior to modeling. This non-invertible transformation can result in singular covariance matrices, which makes parameter estimation difficult. In theory, all of these methods should cluster based on shape, but each method has shortfalls. I compare these methods with existing clustering methods in a simulation study in Chapter 5 and find that the vertically shifted mixture model outperforms the existing and other proposed methods. A subset of the clustering methods is then compared on a real data set of childhood growth trajectories from the Center for the Health Assessment of Mothers and Children of Salinas (CHAMACOS) study in Chapter 6. Vertically shifting the data prior to fitting a mixture model results in groups based on the shape of their growth over time, in contrast to the standard mixture model assuming either conditional independence or a more general correlation. The group means do not drastically change between methods for this data set, but group membership differs enough to impact inference about the relationship between baseline covariates and distinct groups.
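The first of the three proposed approaches, a dissimilarity based on estimated derivatives, can be illustrated roughly as follows. This is only a sketch under stated assumptions: the abstract does not say how the derivatives are estimated, so plain finite differences between consecutive occasions stand in for them here, and the function name `slope_dissimilarity`, the observation times, and the toy data are hypothetical.

```python
import numpy as np

def slope_dissimilarity(y, times):
    """Pairwise Euclidean distance between finite-difference slope vectors.

    y has one row per subject; times holds the shared observation times.
    Finite differences are a crude stand-in (an assumption) for the
    estimated derivatives mentioned in the abstract.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(times, dtype=float)
    slopes = np.diff(y, axis=1) / np.diff(t)         # one slope per interval
    diff = slopes[:, None, :] - slopes[None, :, :]   # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical toy data: same shapes observed at different levels.
times = [0.0, 1.0, 2.0, 3.0]
y = np.array([
    [1.0, 2.0, 3.0, 4.0],      # rising, low level
    [11.0, 12.0, 13.0, 14.0],  # rising, high level
    [4.0, 3.0, 2.0, 1.0],      # falling, low level
])
D = slope_dissimilarity(y, times)
```

In this toy case the two rising trajectories are at distance zero even though their levels differ by 10, while the rising and falling trajectories are far apart. With noisy data, raw finite differences would be highly variable, which is consistent with the abstract's remark that estimating derivatives with minimal bias and variance is the main challenge.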