30 research outputs found
State of the Multiple Imputation Software
Owing to its practicality as well as its strong inferential properties, multiple imputation has become increasingly popular in the analysis of incomplete data. Methods that are not only computationally elegant but also applicable to a wide spectrum of statistical incomplete data problems have been increasingly implemented in numerous computing environments. Unfortunately, however, the speed of this development has not been matched in reaching "sophisticated" users. While researchers have been quite successful in developing the underlying software, documentation in a style accessible to the greater scientific community has been lacking. The main goal of this special volume is to close this gap with articles that illustrate these software developments. Here I provide a brief history of multiple imputation and relevant software and highlight the contents of the contributions. Potential directions for the future of software development are also provided
A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers
This manuscript describes a novel, linear mixed-effects model-fitting technique for the setting in which correlated data indicators are not completely observed. Mixed modeling is a useful analytical tool for characterizing genotype–phenotype associations among multiple potentially informative genetic loci. This approach involves grouping individuals into genetic clusters, where individuals in the same cluster have similar or identical multilocus genotypes. In haplotype-based investigations of unrelated individuals, corresponding cluster assignments are unobservable since the alignment of alleles within chromosomal copies is not generally observed. We derive an expectation conditional maximization approach to estimation in the mixed modeling setting, where cluster assignments are ambiguous. The approach has broad relevance to the analysis of data with missing correlated data identifiers. An example is provided based on data arising from a cohort of human immunodeficiency virus type-1–infected individuals at risk for antiretroviral therapy–associated dyslipidemia
Longitudinal Analysis of Cepstral Peak Prominence in Children
Objectives: To evaluate whether the acoustic measure of cepstral peak prominence changes during typical development in children 2–7.
Methods: Data were retrospectively analyzed from the Arizona Child Acoustic Database Repository in this longitudinal cohort study. The Repository contains longitudinal data recordings from 63 total children between 2–7 years of age. Thirty-one children met the inclusion criteria for the current analysis (at least five time points of usable speech data, no history of speech or language difficulties, no significant dysphonia, and were monolingual speakers of American English). Cepstral peak prominence measures were calculated in Praat for each child, at each timepoint. Additional acoustic measures of vocal fundamental frequency, vocal intensity, and stimuli length were also calculated. These measures were chosen as previous work has shown they may impact cepstral peak prominence values.
Results: Linear mixed-effects regression models examined the relationship between cepstral peak prominence and age, after controlling for vocal fundamental frequency, vocal intensity, and stimuli length. Within-participant effects of age were found, indicating a trajectory change in which cepstral peak prominence increases with age in this population. This positive relationship between cepstral peak prominence and age was nonlinear, with a steeper slope between age and cepstral peak prominence after five years of age.
Conclusions: This is the first study to examine the typical developmental trajectory of cepstral peak prominence in children between 2–7 years, a critical period of vocal development. Cepstral peak prominence increased with age, suggesting an increase in periodicity of vocal fold vibration that coincides with the significant vocal fold structural changes occurring during this time. Outcomes present important normative information on vocal development, essential for effectively understanding the difference between what vocal changes are part of normative development and what changes indicate a voice disorder
Impact of non-normal random effects on inference by multiple imputation: A simulation assessment
Multivariate extensions of well-known linear mixed-effects models have been increasingly utilized in inference by multiple imputation in the analysis of multilevel incomplete data. The normality assumption for the underlying error terms and random effects plays a crucial role in simulating the posterior predictive distribution from which the multiple imputations are drawn. The plausibility of this normality assumption on the subject-specific random effects is assessed. Specifically, the performance of multiple imputation created under a multivariate linear mixed-effects model is investigated on a diverse set of incomplete data sets simulated under varying distributional characteristics. Under moderate amounts of missing data, the simulation study confirms that the underlying model leads to a well-calibrated procedure with negligible biases and actual coverage rates close to nominal rates in estimates of the regression coefficients. Estimation quality of the random-effect variance and association measures, however, is negatively affected by both the misspecification of the random-effect distribution and the number of incompletely observed variables. Some of the adverse impacts include lower coverage rates and increased biases.
Muslim/Non-Muslim Locational Attainment in Philadelphia: A New Fault Line in Residential Inequality?
This study examines Muslim/non-Muslim disparities in locational attainment. We pooled data from the 2004, 2006, and 2008 waves of the Public Health Management Corporation's Southeastern Pennsylvania Household Survey. These data contain respondents' religious identities and are geocoded at the census-tract level, allowing us to merge American Community Survey data and examine neighborhood-level outcomes to gauge respondents' locational attainment. Net of controls, our multivariate analyses reveal that among blacks and nonblacks, Muslims live in neighborhoods that have significantly lower shares of whites and greater representations of blacks. Among blacks, Muslims are significantly less likely to reside in suburbs, relative to non-Muslims. The Muslim disadvantages for blacks and nonblacks in neighborhood poverty and neighborhood median income, however, become insignificant. Our results provide support for the tenets of the spatial assimilation and place stratification models and suggest that Muslim/non-Muslim disparities in locational attainment comprise a new fault line in residential stratification
An Expectation Maximization Approach to Estimate Malaria Haplotype Frequencies in Multiply Infected Children
Characterizing genetic variability in the human pathogenic Plasmodium species, the group of parasites that cause malaria, may have broad global health implications. Specifically, discerning the combinations of mutations that lead to viable parasites and the population-level frequencies of these clonal sequences will allow for targeted vaccine development and individualized treatment choices. This presents an analytical challenge, however, since haplotypic phase (i.e. the alignment of bases on a single DNA strand) is generally unobservable in multiply infected individuals. This manuscript describes an expectation maximization (EM) approach to maximum likelihood estimation of haplotype frequencies in this missing data setting. The approach is applied to a cohort of N=341 malaria-infected children in Uganda, Cameroon and Sudan to characterize regional differences. A simulation study is also presented to characterize method performance and assess sensitivity to distributional assumptions