126 research outputs found

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys.

    Polygenic risk score-based phenome-wide association study identifies novel associations for Tourette syndrome

    Tourette syndrome (TS) is a complex neurodevelopmental disorder characterized by vocal and motor tics lasting more than a year. It is highly polygenic in nature, with both rare and common previously associated variants. Epidemiological studies have shown TS to be correlated with other phenotypes, but large-scale phenome-wide analyses in biobank-level data have not been performed to date. In this study, we used the summary statistics from the latest meta-analysis of TS to calculate the polygenic risk score (PRS) of individuals in the UK Biobank data and applied a phenome-wide association study (PheWAS) approach to determine the association of disease risk with a wide range of phenotypes. A total of 57 traits were found to be significantly associated with TS polygenic risk, including multiple psychosocial factors and mental health conditions such as anxiety disorder and depression. Additional associations were observed with complex non-psychiatric disorders such as Type 2 diabetes, heart palpitations, and respiratory conditions. Cross-disorder comparisons of phenotypic associations with genetic risk for other childhood-onset disorders (e.g., attention deficit hyperactivity disorder [ADHD], autism spectrum disorder [ASD], and obsessive-compulsive disorder [OCD]) indicated an overlap in associations between TS and these disorders. ADHD and ASD had a similar direction of effect with TS, while OCD had an opposite direction of effect for all traits except mental health factors. Sex-specific PheWAS analysis identified differences in the associations with TS genetic risk between males and females. Type 2 diabetes and heart palpitations were significantly associated with TS risk in males but not in females, whereas diseases of the respiratory system were associated with TS risk in females but not in males. This analysis provides further evidence of shared genetic and phenotypic architecture of different complex disorders.

    Gene selection for classification of microarray data based on the Bayes error

    BACKGROUND: With DNA microarray data, selecting a compact subset of discriminative genes from thousands of genes is a critical step for accurate classification of phenotypes for, e.g., disease diagnosis. Several widely used gene selection methods select top-ranked genes according to their individual discriminative power in classifying samples into distinct categories, without considering correlations among genes. A limitation of these gene selection methods is that they may result in gene sets with some redundancy and yield an unnecessarily large number of candidate genes for classification analyses. Some recent studies show that incorporating gene-to-gene correlations into gene selection can remove redundant genes and improve classification accuracy. RESULTS: In this study, we propose a new method, Based Bayes error Filter (BBF), to select relevant genes and remove redundant genes in classification analyses of microarray data. The effectiveness and accuracy of this method are demonstrated through analyses of five publicly available microarray datasets. The results show that our gene selection method is capable of achieving better accuracies than previous studies, while being able to effectively select relevant genes, remove redundant genes, and obtain efficient and small gene sets for sample classification purposes. CONCLUSION: The proposed method can effectively identify a compact set of genes with high classification accuracy. This study also indicates that application of the Bayes error is a feasible and effective way of removing redundant genes in gene selection.

    Enhancing neuroimaging genetics through meta-analysis for Tourette syndrome (ENIGMA-TS): A worldwide platform for collaboration

    Tourette syndrome (TS) is characterized by multiple motor and vocal tics, and high comorbidity rates with other neuropsychiatric disorders. Obsessive-compulsive disorder (OCD), attention deficit hyperactivity disorder (ADHD), autism spectrum disorders (ASDs), major depressive disorder (MDD), and anxiety disorders (AXDs) are among the most prevalent TS comorbidities. To date, studies on TS brain structure and function have been limited in size, with efforts mostly fragmented. This leads to low statistical power and discordant results due to differences in approaches, and hinders the ability to stratify patients according to clinical parameters and investigate comorbidity patterns. Here, we present the scientific premise, perspectives, and key goals that have motivated the establishment of the Enhancing Neuroimaging Genetics through Meta-Analysis for TS (ENIGMA-TS) working group. The ENIGMA-TS working group is an international collaborative effort bringing together a large network of investigators who aim to understand brain structure and function in TS and dissect the underlying neurobiology that leads to observed comorbidity patterns and clinical heterogeneity. Previously collected TS neuroimaging data will be analyzed jointly and integrated with TS genomic data, as well as equivalently large and already existing studies of highly comorbid OCD, ADHD, ASD, MDD, and AXD. Our work highlights the power of collaborative efforts and transdiagnostic approaches, and points to the existence of different TS subtypes. ENIGMA-TS will offer large-scale, high-powered studies that will lead to important insights toward understanding brain structure and function and genetic effects in TS and related disorders, and the identification of biomarkers that could help inform improved clinical practice.

    Age- and region-specific hepatitis B prevalence in Turkey estimated using generalized linear mixed models: a systematic review

    Toy M, Önder FO, Wörmann T, et al. Age- and region-specific hepatitis B prevalence in Turkey estimated using generalized linear mixed models: a systematic review. BMC Infectious Diseases. 2011;11(1):337. BACKGROUND: To provide a clear picture of the current hepatitis B situation, the authors performed a systematic review to estimate the age- and region-specific prevalence of chronic hepatitis B (CHB) in Turkey. METHODS: A total of 339 studies with original data on the prevalence of hepatitis B surface antigen (HBsAg) in Turkey and published between 1999 and 2009 were identified through a search of electronic databases, by reviewing citations, and by writing to authors. After a critical assessment, the authors included 129 studies, divided into categories: 'age-specific'; 'region-specific'; and 'specific population group'. To account for the differences among the studies, a generalized linear mixed model was used to estimate the overall prevalence across all age groups and regions. For specific population groups, the authors calculated the weighted mean prevalence. RESULTS: The estimated overall population prevalence was 4.57 (95% confidence interval [CI]: 3.58, 5.76), and the estimated total number of CHB cases was about 3.3 million. The outcomes of the age-specific groups varied from 2.84 (95% CI: 2.60, 3.10) for the 0-14-year-olds to 6.36 (95% CI: 5.83, 6.90) in the 25-34-year-old group. CONCLUSION: There are large age-group and regional differences in CHB prevalence in Turkey, where CHB remains a serious health problem.

    Towards Omni-Tomography—Grand Fusion of Multiple Modalities for Simultaneous Interior Tomography

    We recently elevated interior tomography from its origin in computed tomography (CT) to a general tomographic principle, and proved its validity for other tomographic modalities including SPECT, MRI, and others. Here we propose “omni-tomography”, a novel concept for the grand fusion of multiple tomographic modalities for simultaneous data acquisition in a region of interest (ROI). Omni-tomography can be instrumental when physiological processes under investigation are multi-dimensional, multi-scale, multi-temporal and multi-parametric. Both preclinical and clinical studies now depend on in vivo tomography, often requiring separate evaluations by different imaging modalities. Over the past decade, two approaches have been used for multimodality fusion: software-based image registration and hybrid scanners such as PET-CT, PET-MRI, and SPECT-CT, among others. While there are intrinsic limitations with both approaches, the main obstacle to the seamless fusion of multiple imaging modalities has been the bulkiness of each individual imager and the conflict of their physical (especially spatial) requirements. To address this challenge, omni-tomography is now unveiled as an emerging direction for biomedical imaging and systems biomedicine.

    Do Humans Optimally Exploit Redundancy to Control Step Variability in Walking?

    It is widely accepted that humans and animals minimize energetic cost while walking. While such principles predict average behavior, they do not explain the variability observed in walking. For robust performance, walking movements must adapt at each step, not just on average. Here, we propose an analytical framework that reconciles issues of optimality, redundancy, and stochasticity. For human treadmill walking, we defined a goal function to formulate a precise mathematical definition of one possible control strategy: maintain constant speed at each stride. We recorded stride times and stride lengths from healthy subjects walking at five speeds. The specified goal function yielded a decomposition of stride-to-stride variations into new gait variables explicitly related to achieving the hypothesized strategy. Subjects exhibited greatly decreased variability for goal-relevant gait fluctuations directly related to achieving this strategy, but far greater variability for goal-irrelevant fluctuations. More importantly, humans immediately corrected goal-relevant deviations at each successive stride, while allowing goal-irrelevant deviations to persist across multiple strides. To demonstrate that this was not the only strategy people could have used to successfully accomplish the task, we created three surrogate data sets. Each tested a specific alternative hypothesis that subjects used a different strategy that made no reference to the hypothesized goal function. Humans did not adopt any of these viable alternative strategies. Finally, we developed a sequence of stochastic control models of stride-to-stride variability for walking, based on the Minimum Intervention Principle. We demonstrate that healthy humans are not precisely “optimal,” but instead consistently slightly over-correct small deviations in walking speed at each stride. Our results reveal a new governing principle for regulating stride-to-stride fluctuations in human walking that acts independently of, but in parallel with, minimizing energetic cost. Thus, humans exploit task redundancies to achieve robust control while minimizing effort and allowing potentially beneficial motor variability.

    What’s retinoic acid got to do with it? Retinoic acid regulation of the neural crest in craniofacial and ocular development

    Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/151310/1/dvg23308.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/151310/2/dvg23308_am.pd

    Diet in irritable bowel syndrome
