53 research outputs found

    A Comparison of Machine Learning Algorithms for the Surveillance of Autism Spectrum Disorder

    The Centers for Disease Control and Prevention (CDC) coordinates a labor-intensive process to measure the prevalence of autism spectrum disorder (ASD) among children in the United States. Random forest methods have shown promise in speeding up this process, but they lag behind human classification accuracy by about 5%. We explore whether more recently available document classification algorithms can close this gap. We applied 8 supervised learning algorithms to predict whether children meet the case definition for ASD based solely on the words in their evaluations. We compared the algorithms' performance across 10 random train-test splits of the data, using classification accuracy, F1 score, and number of positive calls to evaluate their potential use for surveillance. Across the 10 train-test cycles, the random forest and support vector machine with Naive Bayes features (NB-SVM) each achieved slightly more than 87% mean accuracy. The NB-SVM produced significantly more false negatives than false positives (P = 0.027), but the random forest did not, making its prevalence estimates very close to the true prevalence in the data. The best-performing neural network performed similarly to the random forest on both measures. The random forest performed as well as more recently available models like the NB-SVM and the neural network, and it also produced good prevalence estimates. NB-SVM may not be a good candidate for use in a fully automated surveillance workflow due to increased false negatives. More sophisticated algorithms, like hierarchical convolutional neural networks, may not be feasible to train due to characteristics of the data. Current algorithms might perform better if the data are abstracted and processed differently and if they take into account information about the children in addition to their evaluations.
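
The three evaluation measures named in this abstract (classification accuracy, F1 score, and number of positive calls) can be illustrated with a minimal sketch. This is not the authors' code: the function names `evaluate` and `summarize` and the toy labels are assumptions made up for illustration, and the abstract does not say how the measures were computed in practice.

```python
from statistics import mean

def evaluate(y_true, y_pred):
    """Score one train-test split; labels are 1 (meets case definition) or 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    denom = 2 * tp + fp + fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / denom if denom else 0.0,
        # Positive calls drive the prevalence estimate: if false positives
        # and false negatives balance, this matches the true case count.
        "positive_calls": tp + fp,
        "false_positives": fp,
        "false_negatives": fn,
    }

def summarize(splits):
    """Average each measure over several (y_true, y_pred) splits."""
    results = [evaluate(t, p) for t, p in splits]
    return {key: mean(r[key] for r in results) for key in results[0]}

# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
scores = evaluate(y_true, y_pred)
averaged = summarize([(y_true, y_pred), (y_true, y_true)])
```

In the toy split, one false positive offsets one false negative, so the number of positive calls equals the number of true cases even though accuracy is below 100%. That is the property the abstract credits to the random forest.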

    Services for Adults with an Autism Spectrum Disorder

    Objective: The need for useful evidence about services is increasing as larger numbers of children identified with an autism spectrum disorder age toward adulthood. The objective of this review was to characterize the topical and methodological aspects of research on services for supporting success in work, education, and social participation among adults with an autism spectrum disorder and to propose recommendations for moving this area of research forward. Method: Review of literature published in English from 2000 to 2010. Results: We found that the evidence base about services for adults with an ASD is underdeveloped and can be considered a field of inquiry that is relatively unformed. Extant research does not reflect the demographic or impairment heterogeneity of the population, the range of services that adults with autism require in order to lead purposeful lives in the community, or the need for coordination across service systems and sectors. Conclusions: Future studies must examine issues related to cost and efficiency given the broader sociopolitical and economic context of service provision. Furthermore, future research needs to consider how demographic and impairment heterogeneity have implications for building an evidence base that will have greater external validity.

    Autism Spectrum Disorder Among US Children (2002–2010): Socioeconomic, Racial, and Ethnic Disparities

    Objectives. To describe the association between indicators of socioeconomic status (SES) and the prevalence of autism spectrum disorder (ASD) in the United States during the period 2002 to 2010, when overall ASD prevalence among children more than doubled, and to determine whether SES disparities account for ongoing racial and ethnic disparities in ASD prevalence.

    Assessment of Community Event-Based Surveillance for Ebola Virus Disease, Sierra Leone, 2015

    In 2015, community event-based surveillance (CEBS) was implemented in Sierra Leone to assist with the detection of Ebola virus disease (EVD) cases. We assessed the sensitivity of CEBS for finding EVD cases during a 7-month period, and in a 6-week subanalysis, we assessed the timeliness of reporting cases with no known epidemiologic links at time of detection. Of the 12,126 CEBS reports, 287 (2%) met the suspected case definition, and 16 were confirmed positive. CEBS detected 30% (16/53) of the EVD cases identified during the study period. During the subanalysis, CEBS staff identified 4 of 6 cases with no epidemiologic links. These CEBS-detected cases were identified more rapidly than those detected by the national surveillance system; however, too few cases were detected to determine system timeliness. Although CEBS detected EVD cases, it largely generated false alerts. Future versions of community-based surveillance could improve case detection through increased staff training and community engagement.
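
The percentages in this abstract follow directly from the reported counts, as the short check below shows. The variable names are illustrative. The last ratio, confirmed cases per suspected report, is not stated in the abstract; it is derived here from the abstract's counts to illustrate the remark about false alerts.

```python
# Counts as reported in the abstract.
cebs_reports = 12_126      # all CEBS reports
suspected = 287            # reports meeting the suspected case definition
confirmed = 16             # suspected reports confirmed EVD-positive
total_evd_cases = 53       # all EVD cases identified in the study period

# Share of reports that met the suspected case definition (reported as 2%).
suspected_share = suspected / cebs_reports

# Sensitivity: share of all EVD cases that CEBS detected (reported as 30%).
sensitivity = confirmed / total_evd_cases

# Derived, not in the abstract: share of suspected reports that were confirmed.
confirmed_per_suspected = confirmed / suspected

print(f"suspected share:         {suspected_share:.1%}")
print(f"sensitivity:             {sensitivity:.1%}")
print(f"confirmed per suspected: {confirmed_per_suspected:.1%}")
```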

    Analysis of human mini-exome sequencing data from Genetic Analysis Workshop 17 using a Bayesian hierarchical mixture model

    Next-generation sequencing technologies are rapidly changing the field of genetic epidemiology and enabling exploration of the full allele frequency spectrum underlying complex diseases. Although sequencing technologies have shifted our focus toward rare genetic variants, statistical methods traditionally used in genetic association studies are inadequate for estimating effects of low minor allele frequency variants. For our study we use the Genetic Analysis Workshop 17 data from 697 unrelated individuals (genotypes for 24,487 autosomal variants from 3,205 genes). We apply a Bayesian hierarchical mixture model to identify genes associated with a simulated binary phenotype using a transformed genotype design matrix weighted by allele frequencies. A Metropolis-Hastings algorithm is used to jointly sample each indicator variable and additive genetic effect pair from its conditional posterior distribution, and the remaining parameters are sampled by Gibbs sampling. This method identified 58 genes with a posterior probability greater than 0.8 for being associated with the phenotype. One of these 58 genes, PIK3C2B, was correctly identified as being associated with affected status based on the simulation process. This project demonstrates the utility of Bayesian hierarchical mixture models using a transformed genotype matrix to detect genes containing rare and common variants associated with a binary phenotype.

    Detecting gene-environment interactions in genome-wide association data

    Despite the importance of gene-environment (G×E) interactions in the etiology of common diseases, little work has been done to develop methods for detecting these types of interactions in genome-wide association study data. This was the focus of Genetic Analysis Workshop 16 Group 10 contributions, which introduced a variety of new methods for the detection of G×E interactions in both case-control and family-based data using both cross-sectional and longitudinal study designs. Many of these contributions detected significant G×E interactions. Although these interactions have not yet been confirmed, the results suggest the importance of testing for interactions. Issues of sample size, quantifying the environmental exposure, longitudinal data analysis, family-based analysis, selection of the most powerful analysis method, population stratification, and computational expense with respect to testing G×E interactions are discussed.