    Graph-Sparse LDA: A Topic Model with Structured Sparsity

    Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the discovered topics may be used for prediction or some other downstream task. In other cases, the content of the topic itself may be of intrinsic scientific interest. Unfortunately, even using modern sparse techniques, the discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that leverages knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.

    Identifying predictive features of autism spectrum disorders in a clinical sample of adolescents and adults using machine learning

    Diagnosing autism spectrum disorders (ASD) is a complicated, time-consuming process that is particularly challenging in older individuals. One of the most widely used behavioral diagnostic tools is the Autism Diagnostic Observation Schedule (ADOS). Previous work using machine learning techniques suggested that ASD detection in children can be achieved with substantially fewer items than the original ADOS. Here, we expand on this work with a specific focus on adolescents and adults as assessed with the ADOS Module 4. We used a machine learning algorithm (support vector machine) to examine whether ASD detection can be improved by identifying a subset of behavioral features from the ADOS Module 4 in a routine clinical sample of N = 673 high-functioning adolescents and adults with ASD (n = 385) and individuals with suspected ASD but other best-estimate or no psychiatric diagnoses (n = 288). We identified reduced subsets of 5 behavioral features for the whole sample as well as age subgroups (adolescents vs. adults) that showed good specificity and sensitivity and reached performance close to that of the existing ADOS algorithm and the full ADOS, with no significant differences in overall performance. These results may help to improve the complicated diagnostic process of ASD by encouraging future efforts to develop novel diagnostic instruments for ASD detection based on the identified constructs, as well as by aiding clinicians in the difficult question of differential diagnosis.
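
The abstract above reports classifier quality in terms of sensitivity and specificity. As a reminder of how those two quantities are defined, here is a minimal Python sketch. The function names and the confusion-matrix counts are illustrative assumptions and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positive cases that the classifier flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negative cases that the classifier flags as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical toy counts (NOT results from the paper):
# 50 positive cases, of which 40 are detected; 50 negative cases, of which 45 are cleared.
toy_sensitivity = sensitivity(true_pos=40, false_neg=10)
toy_specificity = specificity(true_neg=45, false_pos=5)
```

A reduced feature subset is usually judged by whether both values stay close to those of the full instrument, since raising one by shifting the decision threshold tends to lower the other.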

    Identification and Analysis of Behavioral Phenotypes in Autism Spectrum Disorder via Unsupervised Machine Learning

    Background and objective: Autism spectrum disorder (ASD) is a heterogeneous disorder. Research has explored potential ASD subgroups with preliminary evidence supporting the existence of behaviorally and genetically distinct subgroups; however, research has yet to leverage machine learning to identify phenotypes on a scale large enough to robustly examine treatment response across such subgroups. The purpose of the present study was to apply Gaussian Mixture Models and Hierarchical Clustering to identify behavioral phenotypes of ASD and examine treatment response across the learned phenotypes. Materials and methods: The present study included a sample of children with ASD (N = 2400), the largest of its kind to date. Unsupervised machine learning was applied to model ASD subgroups as well as their taxonomic relationships. Retrospective treatment data were available for a portion of the sample (n = 1034). Treatment response was examined within each subgroup via regression. Results: The application of a Gaussian Mixture Model revealed 16 subgroups. Further examination of the subgroups through Hierarchical Agglomerative Clustering suggested 2 overlying behavioral phenotypes with unique deficit profiles, each composed of subgroups that differed in severity of those deficits. Furthermore, differentiated response to treatment was found across subtypes, with a substantially higher amount of variance accounted for due to the homogenization effect of the clustering. Discussion: The high amount of variance explained by the regression models indicates that clustering provides a basis for homogenization, and thus an opportunity to tailor treatment based on cluster memberships. These findings have significant implications for prognosis and targeted treatment of ASD, and pave the way for personalized intervention based on unsupervised machine learning.
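
The second stage described in the abstract above, grouping already-identified subgroups into a smaller number of broader phenotypes, can be illustrated with a small agglomerative clustering sketch in plain Python. It assumes that each subgroup has already been summarized by a centroid (for example, a mixture-component mean). The average-linkage criterion, the function name `agglomerate`, and the toy centroids are all assumptions made for illustration; the abstract does not state which linkage or features the authors used.

```python
from itertools import combinations
from math import dist


def agglomerate(centroids, n_groups):
    """Average-linkage agglomerative clustering over subgroup centroids.

    Returns a list of clusters, each a sorted list of subgroup indices.
    """
    # Start with every subgroup in its own cluster.
    clusters = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Pick the closest pair of clusters (a < b) and merge b into a.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical 2-D centroids for six subgroups (NOT data from the study).
toy_centroids = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.3),
]
phenotypes = agglomerate(toy_centroids, n_groups=2)
```

With these toy inputs the six subgroups collapse into two well-separated groups, mirroring in miniature the idea of many fine-grained subgroups nesting under a few overlying phenotypes.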

    EEG analytics for early detection of autism spectrum disorder: a data-driven approach

    Autism spectrum disorder (ASD) is a complex and heterogeneous disorder, diagnosed on the basis of behavioral symptoms during the second year of life or later. Finding scalable biomarkers for early detection is challenging because of the variability in presentation of the disorder and the need for simple measurements that could be implemented routinely during well-baby checkups. EEG is a relatively easy-to-use, low-cost brain measurement tool that is being increasingly explored as a potential clinical tool for monitoring atypical brain development. EEG measurements were collected from 99 infants with an older sibling diagnosed with ASD, and 89 low-risk controls, beginning at 3 months of age and continuing until 36 months of age. Nonlinear features were computed from EEG signals and used as input to statistical learning methods. Prediction of the clinical diagnostic outcome of ASD or not ASD was highly accurate when using EEG measurements from as early as 3 months of age. Specificity, sensitivity and PPV were high, exceeding 95% at some ages. Prediction of ADOS calibrated severity scores for all infants in the study using only EEG data taken as early as 3 months of age was strongly correlated with the actual measured scores. This suggests that useful digital biomarkers might be extracted from EEG measurements. This research was supported by National Institute of Mental Health (NIMH) grant R21 MH 093753 (to WJB), National Institute on Deafness and Other Communication Disorders (NIDCD) grant R21 DC08647 (to HTF), NIDCD grant R01 DC 10290 (to HTF and CAN) and a grant from the Simons Foundation (to CAN, HTF, and WJB). We are especially grateful to the staff and students who worked on the study and to the families who participated.