11 research outputs found

    Ensembling Variable Selectors by Stability Selection for the Cox Model


    Review of automated time series forecasting pipelines

    Time series forecasting is fundamental for various use cases in different domains such as energy systems and economics. Creating a forecasting model for a specific use case requires an iterative and complex design process. The typical design process includes the five sections (1) data pre-processing, (2) feature engineering, (3) hyperparameter optimization, (4) forecasting method selection, and (5) forecast ensembling, which are commonly organized in a pipeline structure. One promising approach to handle the ever-growing demand for time series forecasts is automating this design process. The present paper, thus, analyzes the existing literature on automated time series forecasting pipelines to investigate how to automate the design process of forecasting models. Thereby, we consider both Automated Machine Learning (AutoML) and automated statistical forecasting methods in a single forecasting pipeline. For this purpose, we firstly present and compare the proposed automation methods for each pipeline section. Secondly, we analyze the automation methods regarding their interaction, combination, and coverage of the five pipeline sections. For both, we discuss the literature, identify problems, give recommendations, and suggest future research. This review reveals that the majority of papers only cover two or three of the five pipeline sections. We conclude that future research has to holistically consider the automation of the forecasting pipeline to enable the large-scale application of time series forecasting

    MULTIVARIATE ANALYSIS FOR UNDERSTANDING COGNITIVE SPEECH PROCESSING


    Effects of environmental factors on monogastric gut microbial community and functional dynamics

    Doctor of Philosophy. Genetics Interdepartmental Program. Major Professor Not Listed. Fundamental knowledge, for understanding establishment and disturbance of gut microbiota during both health and disease, is the composition and function of the gut microbiome. However, a healthy gut microbiota has not been defined at any profound taxonomic resolution, and even less on a functional level. Previous aims of monogastric microbiome, or comprehensive gut microbial membership, relied greatly on marker gene sequencing, which sequenced less than 0.01% of the microbial genome, limiting our knowledge on microbial functions and strain-level dynamics. The research on understanding the gut microbiota impact on the host is still at a juvenile stage, and much still needs to be learned in understanding the microbiota dynamics in healthy hosts and disturbances on the short- and long-term. To achieve such goals and understand the implications of environmental changes, associated with age development and antibiotic treatment, I utilized two distinct monogastric swine populations. With these swine, I evaluated their gut microbiota for microbial membership, function, and genetic variation. In my first study, I elucidated the dynamics of bacteria, archaea and fungi populations in the swine gut for the duration of the host lifetime. My objective was to provide a foundational understanding of healthy gut microbiome during long-term development. I collected 234 fecal samples, across 31 time points, from 10 swine from birth through 156 days of age. Samples were collected during the three swine development stages (preweaning, nursery, and growth adult). Next, I performed bacterial 16S rRNA amplicon sequencing for bacteria and fungal qPCR for the dominating fungus of the swine gut, Kazachstania slooffiae. My results demonstrated a highly volatile bacteriome, with low K. slooffiae presence, in the young, preweaning host. 
Following weaning, bacterial populations became relatively established with a peak in K. slooffiae abundance. Finally, I determined multiple negative, competitive interactions between bacterial and K. slooffiae fungi during the nursery and growth adult stages. I provided evidence for previously unknown competitive interactions which occur throughout the weaned and adult periods. This first study indicated a need for future genetic support of microbial functions pertaining to establishment and competitive dynamics. My second objective was a thorough investigation into the functions of methanogenic archaea during the host lifetime. Archaea of the monogastric gut are historically understudied relative to bacteria. I performed shotgun metagenome sequencing on a subset of the hosts (n=7) and samples (n=112) from my first objective. I resolved 1,130 microbial genomes termed metagenome assembled genomes (MAGs). Within these genomes were 8 methanogenic archaea MAGs which fell into two orders: Methanomassiliicoccales (5) and Methanobacteriales (3). I discovered the first US swine MAGs for two archaea, while describing novel evidence of acetoclastic methanogenesis. Furthermore, I described age-associated detection and methanogenic functions. My second objective provided a comprehensive, gene-supported analysis of monogastric-associated methanogens which furthered our understanding of microbiome development and functions. The focus of my final objective was to determine genetic variation and function of microbes following antibiotic treatments. A distinct swine population, relative to the first study, of 648 weaned swine were assigned to one of three treatments: control (no antibiotic ever), chlortetracycline (CTC) for 14 days, or tiamulin (TMU) for 14 days. Pigs were housed in pens where there were 8 pens/treatment and 27 pigs/pen (i.e. 216 pigs/treatment). Fecal samples were collected from 5 random swine from each of 2 random pens per treatment every collection. 
Collections occurred 7 days prior to treatment (i.e. day of weaning), and every 7 days until 14 days past antibiotic treatment with one final collection at 28 days post treatment. Samples were pooled according to pen and collection day, followed by gDNA extraction, library preparation, and shotgun metagenomic sequencing. I curated 81 MAGs and analyzed genetic variation according to pre- and post-treatment. I found 11 MAGs with no statistical difference in detection and statistically consistently high variation in the form of genetic entropy (SDHSE [sustained detection and high sustained entropy] MAGs). The SDHSE MAGs were suggested to be multidrug resistant (MDR) due to their continued detection throughout CTC and TMU treatments. Even though I identified 22 unique antimicrobial resistance genes in SDHSE MAGs, less than a third contained genes with TMU resistance. There are likely additional TMU resistance genes contributing to the SDHSE MAGs' detection throughout TMU treatment. Together, this investigation described how MDR microbial populations harbor genetic variation, with potential for additional resistance, and highlighted the need for further antimicrobial investigations into gene AMR functions. In conclusion, this dissertation offers a comprehensive, functional understanding of the many microbiome members, including bacteria, archaea and fungi. These studies are critical for understanding how monogastric microbes act through the host lifetime and in response to antibiotic treatments, which will aid future endeavors for monogastric health as it pertains to the gut microbiome.

    Metalearning

    This is an open access book. As one of the fastest-growing areas of research in machine learning, metalearning studies principled methods to obtain efficient models and solutions by adapting machine learning and data mining processes. This adaptation usually exploits information from past experience on other tasks and the adaptive processes can involve machine learning approaches. As a related area to metalearning and a hot topic currently, automated machine learning (AutoML) is concerned with automating the machine learning processes. Metalearning and AutoML can help AI learn to control the application of different learning methods and acquire new solutions faster without unnecessary interventions from the user. This book offers a comprehensive and thorough introduction to almost all aspects of metalearning and AutoML, covering the basic concepts and architecture, evaluation, datasets, hyperparameter optimization, ensembles and workflows, and also how this knowledge can be used to select, combine, compose, adapt and configure both algorithms and models to yield faster and better solutions to data mining and data science problems. It can thus help developers to develop systems that can improve themselves through experience. This book is a substantial update of the first edition published in 2009. It includes 18 chapters, more than twice as much as the previous version. This enabled the authors to cover the most relevant topics in more depth and incorporate the overview of recent research in the respective area. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining, data science and artificial intelligence. ; Metalearning is the study of principled methods that exploit metaknowledge to obtain efficient models and solutions by adapting machine learning and data mining processes. 
While the variety of machine learning and data mining techniques now available can, in principle, provide good model solutions, a methodology is still needed to guide the search for the most appropriate model in an efficient way. Metalearning provides one such methodology that allows systems to become more effective through experience. This book discusses several approaches to obtaining knowledge concerning the performance of machine learning and data mining algorithms. It shows how this knowledge can be reused to select, combine, compose and adapt both algorithms and models to yield faster, more effective solutions to data mining problems. It can thus help developers improve their algorithms and also develop learning systems that can improve themselves. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining and artificial intelligence

    Ensembling Variable Selectors by Stability Selection for the Cox Model

    As a pivotal tool to build interpretive models, variable selection plays an increasingly important role in high-dimensional data analysis. In recent years, variable selection ensembles (VSEs) have gained much interest due to their many advantages. Stability selection (Meinshausen and Bühlmann, 2010), a VSE technique based on subsampling in combination with a base algorithm like lasso, is an effective method to control false discovery rate (FDR) and to improve selection accuracy in linear regression models. By adopting lasso as a base learner, we attempt to extend stability selection to handle variable selection problems in a Cox model. According to our experience, it is crucial to set the regularization region Λ in lasso and the parameter λmin properly so that stability selection can work well. To the best of our knowledge, however, there is no literature addressing this problem in an explicit way. Therefore, we first provide a detailed procedure to specify Λ and λmin. Then, some simulated and real-world data with various censoring rates are used to examine how well stability selection performs. It is also compared with several other variable selection approaches. Experimental results demonstrate that it achieves better or competitive performance in comparison with several other popular techniques

    Ensembling variable selectors by stability selection for the Cox model


    Machine Learning in Discrete Molecular Spaces

    The past decade has seen an explosion of machine learning in chemistry. Whether it is in property prediction, synthesis, molecular design, or any other subdivision, machine learning seems poised to become an integral, if not a dominant, component of future research efforts. This extraordinary capacity rests on the interaction between machine learning models and the underlying chemical data landscape commonly referred to as chemical space. Chemical space has multiple incarnations, but is generally considered the space of all possible molecules. In this sense, it is one example of a molecular set: an arbitrary collection of molecules. This thesis is devoted to precisely these objects, and particularly how they interact with machine learning models. This work is predicated on the idea that by better understanding the relationship between molecular sets and the models trained on them we can improve models, achieve greater interpretability, and further break down the walls between data-driven and human-centric chemistry. The hope is that this enables the full predictive power of machine learning to be leveraged while continuing to build our understanding of chemistry. The first three chapters of this thesis introduce and review the necessary machine learning theory, particularly the tools that have been specially designed for chemical problems. This is followed by an extensive literature review in which the contributions of machine learning to multiple facets of chemistry over the last two decades are explored. Chapters 4-7 explore the research conducted throughout this PhD. 
Here we explore how we can meaningfully describe the properties of an arbitrary set of molecules through information theory; how we can determine the most informative data points in a set of molecules; how graph signal processing can be used to understand the relationship between the chosen molecular representation, the property, and the machine learning model; and finally how this approach can be brought to bear on protein space. Each of these sub-projects briefly explores the necessary mathematical theory before leveraging it to provide approaches that resolve the posed problems. We conclude with a summary of the contributions of this work and outline fruitful avenues for further exploration

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing