
    Computational Optimizations for Machine Learning

    This book contains the 10 articles accepted for publication in the Special Issue “Computational Optimizations for Machine Learning” of the MDPI journal Mathematics. The articles cover a wide range of topics connected to the theory and applications of machine learning, neural networks and artificial intelligence. These topics include, among others, various classes of machine learning, such as supervised, unsupervised and reinforcement learning, as well as deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, K-means clustering, Q-learning, temporal difference, deep adversarial networks and more. It is hoped that the book will be interesting and useful to those developing mathematical algorithms and applications in artificial intelligence and machine learning, as well as to those with the appropriate mathematical background who wish to become familiar with recent advances in the computational optimization mathematics of machine learning, which has now permeated almost all sectors of human life and activity.

    SIS 2017. Statistics and Data Science: new challenges, new generations

    The 2017 SIS Conference aims to highlight the crucial role of Statistics in Data Science. In this new domain of ‘meaning’ extracted from data, the increasing amount of data produced and made available in databases has brought new challenges. These challenges involve different fields: statistics, machine learning, information and computer science, optimization, and pattern recognition. Together, these fields make a considerable contribution to the analysis of ‘Big data’, open data, and relational and complex data, both structured and unstructured. The aim is to collect contributions from the different domains of Statistics on high-dimensional data quality validation, sampling extraction, dimensional reduction, pattern selection, data modelling, testing hypotheses and confirming conclusions drawn from the data.

    Characterizing model uncertainty in ensemble learning


    Computational Methods to Identify Regulatory Variants in the Non-Coding Regions of the Human Genome

    Evidence from Genome-Wide Association Studies (GWAS) has provided insights into human phenotypes by identifying genetic variation statistically associated with diseases and complex traits. However, the functional consequences of these genetic variants remain unknown in many cases, especially for those in the non-coding regions of the human genome. My dissertation focuses on single nucleotide polymorphisms (SNPs), the most common type of genetic variation. I define regulatory SNPs as those that can alter transcription factor binding affinities within the DNA sequences of regulatory elements. This change affects downstream gene expression and plays a role in disease progression and trait development. Characterizing genome-wide regulatory variants is particularly challenging because the gene regulatory network is dynamic across cell types and environmental conditions. In addition to the DNA sequence context, the gene regulatory network relies on epigenetic factors, such as chromatin accessibility, histone modification, and chromatin looping. In this dissertation, I applied computational approaches to predict regulatory variants by incorporating sequence information and functional genomics annotations from various high-throughput assays. In chapter 2, I developed a computational tool, SURF, to prioritize clinically relevant regulatory variants within promoters and enhancers. These variants were validated by massively parallel reporter assays and used as an unbiased test set in the CAGI5 “Regulation Saturation” challenge. My algorithm achieved the best performance among the participating groups in this challenge. In chapter 3, I extended SURF to TURF, a computational tool that predicts tissue-specific functions of regulatory variants and provides more robust predictions across genome-wide non-coding regions. By leveraging tissue-specific genomic annotations of tissues from the same organ, I also calculated TURF organ-specific scores covering most ENCODE project organs. Many of the GWAS traits showed enrichment of regulatory variants prioritized by TURF scores in their relevant organs, which indicates that these regulatory variants are likely to be involved in trait development and can be a valuable resource for future studies. In chapter 4, to enable quick annotation of non-coding variants for the scientific community, I designed major updates to an online tool, RegulomeDB. Given a user's query variant, RegulomeDB returns the evidence from diverse functional genomics assays that overlaps the variant’s position, displayed with interactive charts and a genome browser view. The new probabilistic score derived from SURF was also integrated into the query system. To further provide functional hypotheses for putative regulatory variants, I finally explored a pipeline to assign their target genes using evidence from eQTL studies and Hi-C experiments. Together, the work in my dissertation developed computational tools for broad community use in prioritizing regulatory variants in non-coding regions of the human genome and assigning target genes to them.
    PHD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/167962/1/shengchd_1.pd

    Machine Learning Modeling from Omics Data as Prospective Tool for Improvement of Inflammatory Bowel Disease Diagnosis and Clinical Classifications

    Research on inflammatory bowel disease (IBD) has identified numerous molecular players involved in the development of the disease. Even so, the understanding of IBD is incomplete, and disease treatment is still far from precision medicine. Reliable diagnostic and prognostic biomarkers in IBD are limited, which may reduce the efficiency of therapeutic outcomes. High-throughput technologies and artificial intelligence have emerged as powerful tools in the search for unrevealed molecular patterns that could give important insights into IBD pathogenesis and help to address unmet clinical needs. Machine learning, a subtype of artificial intelligence, uses complex mathematical algorithms to learn from existing data in order to predict future outcomes. The scientific community has been increasingly employing machine learning for the prediction of IBD outcomes from comprehensive patient data: clinical records and genomic, transcriptomic, proteomic, metagenomic, and other IBD-relevant omics data. This review aims to present the fundamental principles behind machine learning modeling and its current application in IBD research, with a focus on studies that explored genomic and transcriptomic data. We describe different strategies used for dealing with omics data and outline the best-performing methods. Before being translated into clinical settings, the developed machine learning models should be tested in independent prospective studies as well as randomized controlled trials.

    Evolutionary Computation

    This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms for several applications, from optimization and learning to pattern recognition and bioinformatics. The book also presents new algorithms based on several analogies and metaphors, one of which is based on philosophy, specifically the philosophy of praxis and dialectics. It also presents interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. The book therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research in this field.

    Systems Analytics and Integration of Big Omics Data

    A “genotype” is essentially an organism's full hereditary information, which is obtained from its parents. A “phenotype” is an organism's actual observed physical and behavioral properties. These may include traits such as morphology, size, height, eye color, metabolism, etc. One of the pressing challenges in computational and systems biology is genotype-to-phenotype prediction. This is challenging given the amount of data generated by modern Omics technologies. This “Big Data” is so large and complex that traditional data processing applications are not up to the task. Challenges arise in collection, analysis, mining, sharing, transfer, visualization, archiving, and integration of these data. In this Special Issue, there is a focus on the systems-level analysis of Omics data, recent developments in gene ontology annotation, and advances in biological pathways and network biology. The integration of Omics data with clinical and biomedical data using machine learning is explored. This Special Issue covers new methodologies in the context of gene–environment interactions, tissue-specific gene expression, and how external factors or host genetics impact the microbiome.