138 research outputs found

    Industrial Data Science for Batch Manufacturing Processes

    Batch processes exhibit several sources of variability, from raw material properties to initial and evolving conditions that change across the events of the manufacturing process. In this chapter, we will illustrate with an industrial example how to use machine learning to reduce this apparent excess of data while retaining the information relevant to process engineers. Two common use cases will be presented: 1) AutoML analysis to quickly find correlations in batch process data, and 2) trajectory analysis to monitor batches and identify anomalous ones, leading to process control improvements.
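
    The trajectory-analysis use case lends itself to a short illustration. The sketch below is a minimal, hypothetical version of batch trajectory monitoring: it builds a control band from historical good batches for a single process variable, then flags new batches that leave the band too often. The chapter does not specify its method at this level of detail, so all function names, thresholds, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def batch_trajectory_limits(good_batches, n_sigma=3.0):
    """Control band from historical good batches.

    good_batches: array of shape (n_batches, n_timesteps) for a single
    process variable, already aligned to a common batch-time axis.
    """
    mean_traj = good_batches.mean(axis=0)
    std_traj = good_batches.std(axis=0, ddof=1)
    return mean_traj - n_sigma * std_traj, mean_traj + n_sigma * std_traj

def is_anomalous(batch, lower, upper, max_violations=5):
    """Flag a batch whose trajectory leaves the control band too often."""
    violations = int(np.sum((batch < lower) | (batch > upper)))
    return violations > max_violations

# Usage with synthetic data: fit the band on 50 good batches, screen a new one.
rng = np.random.default_rng(0)
good = np.cumsum(rng.normal(0.1, 0.05, size=(50, 200)), axis=1)
lower, upper = batch_trajectory_limits(good)
suspect = np.cumsum(rng.normal(0.2, 0.05, size=200))  # drifts above the band
print(is_anomalous(suspect, lower, upper))
```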

    Anomaly Analysis in Cleaning-in-Place Operations of an Industrial Brewery Fermenter

    Analyzing historical data from industrial cleaning-in-place (CIP) operations is essential for avoiding potential operational failures, but it is rarely done. This paper presents a three-level analysis approach, based on the CIP operations of a brewery fermenter, that describes how to analyze the historical data step by step to detect anomalies. At the first level, the system is assessed before cleaning to ensure that the selected recipe and system are able to accomplish the task. At the second level, a multiway principal component analysis (MPCA) algorithm is applied to monitor the process variables, either online or after cleaning, in order to detect anomalies locally and explain the potential causes of an anomalous event. The third-level analysis is performed after cleaning to evaluate the cleaning results. The implemented approach has significant potential to automatically detect deviations and anomalies in future CIP cycles and to optimize the cleaning process.
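
    As a concrete illustration of the second-level analysis, the sketch below shows the textbook MPCA construction: batch-wise unfolding of a batches × variables × time array followed by ordinary PCA, with the squared prediction error (Q statistic) used to score a finished batch. This is a generic, minimal version, not the paper's implementation; the data layout, scaling, and names are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_mpca(X, n_components=3):
    """Fit an MPCA model on historical batches.

    X: array of shape (n_batches, n_variables, n_timesteps). Batch-wise
    unfolding turns each batch into a single row before ordinary PCA.
    """
    unfolded = X.reshape(X.shape[0], -1)
    mu = unfolded.mean(axis=0)
    sigma = unfolded.std(axis=0) + 1e-12  # avoid division by zero
    pca = PCA(n_components=n_components).fit((unfolded - mu) / sigma)
    return pca, mu, sigma

def spe(pca, mu, sigma, batch):
    """Squared prediction error (Q statistic) for one finished batch;
    large values indicate behavior the model has not seen before."""
    x = (batch.reshape(1, -1) - mu) / sigma
    residual = x - pca.inverse_transform(pca.transform(x))
    return float((residual ** 2).sum())

# Usage: fit on historical CIP batches, then score a batch.
X_hist = np.random.default_rng(1).normal(size=(40, 6, 120))
model, mu, sigma = fit_mpca(X_hist)
print(spe(model, mu, sigma, X_hist[0]))
```

    In practice, a control limit for the SPE is estimated from the training batches (for example, an empirical percentile), and batches exceeding it are inspected for the variables contributing most to the residual.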

    Computational methods for high-throughput metabolomics

    Hoffmann N. Computational methods for high-throughput metabolomics. Bielefeld: Universität Bielefeld; 2014.
    The advent of analytical technologies that are broadly and routinely applied in biology and biochemistry for the analysis and characterization of small molecules in biological organisms has brought with it the need to process, analyze, compare, and evaluate large amounts of experimental data in a highly automated fashion. The most prominent methods in these fields are chromatographic techniques, capable of separating complex mixtures of chemical compounds by properties such as size or charge, coupled to mass spectrometry detectors that measure the mass and intensity of a compound's ions or fragments as they elute from the chromatographic separation system. One major problem in these high-throughput applications is the automatic extraction of features that quantify the compounds contained in a measurement, and the reliable association of those features across multiple measurements for quantification and statistical analysis. The main goal of this thesis is the creation of scalable and robust methods for the highly automated processing of large numbers of samples. Of special importance is the comparison of different samples in order to find similarities and differences in the context of metabolomics, the study of small chemical compounds in biological organisms. We herein describe novel algorithms for retention time alignment of peak and chromatogram data from one- and two-dimensional gas chromatography-mass spectrometry experiments in the application area of metabolomics, and we comprehensively evaluate each method against other state-of-the-art methods on publicly available datasets with genuine biological backgrounds. In addition, we describe the underlying software framework Maltcms and the accompanying graphical user interface Maui, and demonstrate their use on instructive application examples.
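
    To make the alignment problem concrete, here is a deliberately simple sketch of matching peaks between a reference and a sample chromatogram by retention time. The thesis's algorithms (implemented in Maltcms) are substantially more sophisticated, handling two-dimensional separations and full chromatogram profiles; this greedy nearest-neighbor matcher with a fixed tolerance is only an assumed toy baseline.

```python
def align_peaks(ref_peaks, sample_peaks, tolerance=5.0):
    """Match sample peaks to reference peaks by retention time.

    ref_peaks, sample_peaks: sorted lists of retention times (seconds).
    Returns (ref_rt, sample_rt) pairs; unmatched peaks are skipped.
    Greedy one-pass matching; real aligners use dynamic programming
    or time warping to handle nonlinear retention time shifts.
    """
    pairs, j = [], 0
    for rt in ref_peaks:
        # Advance to the sample peak closest to this reference peak.
        while (j + 1 < len(sample_peaks)
               and abs(sample_peaks[j + 1] - rt) < abs(sample_peaks[j] - rt)):
            j += 1
        if j < len(sample_peaks) and abs(sample_peaks[j] - rt) <= tolerance:
            pairs.append((rt, sample_peaks[j]))
            j += 1  # each sample peak is consumed at most once
    return pairs

print(align_peaks([10.0, 42.5, 80.1], [11.2, 44.0, 95.0]))
# [(10.0, 11.2), (42.5, 44.0)] -- the 80.1 s peak finds no match within 5 s
```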

    Deeper Understanding of Tutorial Dialogues and Student Assessment

    Bloom (1984) reported a two-standard-deviation improvement with human tutoring, which inspired many researchers to develop Intelligent Tutoring Systems (ITSs) that are as effective as human tutoring. However, recent studies suggest that the 2-sigma result was misleading and that current ITSs are already as good as human tutors. Nevertheless, we can think of two standard deviations as the benchmark for the tutoring effectiveness of ideal expert tutors, and ITSs may yet surpass human tutors. One way to improve ITSs is to identify, understand, and then successfully implement effective tutorial strategies that lead to learning gains. Another step towards improving the effectiveness of ITSs is the accurate assessment of student responses. However, evaluating student answers in tutorial dialogues is challenging: student answers often refer to entities in previous dialogue turns and in the problem description, so they should be evaluated with the dialogue context taken into account. Moreover, the system should explain which parts of a student answer are correct and which are incorrect. Such an explanation capability allows an ITS to provide targeted feedback that helps students reflect upon and correct their knowledge deficits; targeted feedback also increases learners' engagement, enabling them to persist in solving the instructional task at hand on their own. In this dissertation, we describe our approach to discovering and understanding effective tutorial strategies employed by effective human tutors while interacting with learners. We also present various approaches to automatically assessing students' contributions, using general methods that we developed for the semantic analysis of short texts. We explain our use of generic semantic similarity approaches to evaluate the similarity between individual learner contributions and ideal answers provided by experts for the target instructional tasks. We also describe our method for assessing student performance based on tutorial dialogue context, accounting for linguistic phenomena such as ellipsis and pronouns. We then propose an approach that provides an explanatory capability for assessing student responses. Finally, we introduce a novel method based on concept maps for jointly evaluating and interpreting the correctness of student responses.
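
    As a minimal illustration of scoring a learner contribution against an expert ideal answer, the sketch below uses TF-IDF cosine similarity with a fixed acceptance threshold. This is a stand-in for the richer semantic similarity methods the dissertation develops, and it deliberately ignores dialogue context, ellipsis, and pronoun resolution; the function name and threshold are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_answer(student_answer, ideal_answer, threshold=0.5):
    """Score a student answer against an expert ideal answer.

    Returns the cosine similarity of the two TF-IDF vectors and a
    binary accept/reject decision based on a fixed threshold.
    """
    vec = TfidfVectorizer().fit([student_answer, ideal_answer])
    tfidf = vec.transform([student_answer, ideal_answer])
    sim = float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])
    return sim, sim >= threshold

sim, accepted = score_answer(
    "the force equals mass times acceleration",
    "Force is the product of mass and acceleration (F = ma).",
)
print(f"similarity={sim:.2f}, accepted={accepted}")
```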

    Deep generative models for biology: represent, predict, design

    Deep generative models have revolutionized the field of artificial intelligence, fundamentally changing how we generate novel objects that imitate or extrapolate from training data, and transforming how we access and consume various types of information such as text, images, speech, and computer programs. They have the potential to radically transform other scientific disciplines, ranging from mathematical problem solving to supporting fast and accurate simulations in high-energy physics or enabling rapid weather forecasting. In computational biology, generative models hold immense promise for improving our understanding of complex biological processes, designing new drugs and therapies, and forecasting viral evolution during pandemics, among many other applications. Biological objects, however, pose unique challenges due to their inherent complexity, encompassing massive spaces, multiple complementary data modalities, and a unique interplay between highly structured and relatively unstructured components. In this thesis, we develop several deep generative modeling frameworks motivated by key questions in computational biology. Given the interdisciplinary nature of this endeavor, we first provide a comprehensive background in generative modeling, uncertainty quantification, sequential decision making, and important concepts in biology and chemistry, to facilitate a thorough understanding of our work. We then delve into the core of our contributions, which are structured around three chapters. The first chapter introduces methods for learning representations of biological sequences, laying the foundation for subsequent analyses. The second chapter illustrates how these representations can be leveraged to predict complex properties of biomolecules, focusing on three applications: protein fitness prediction, the effects of genetic variation on human disease risk, and viral immune escape. Finally, the third chapter is dedicated to methods for designing novel biomolecules, including drug target identification, de novo molecular optimization, and protein engineering. This thesis also makes several methodological contributions to broader machine learning challenges, such as uncertainty quantification in high-dimensional spaces and efficient transformer architectures, which hold potential value in other application domains. We conclude by summarizing our key findings, highlighting shortcomings of current approaches, proposing potential avenues for future research, and discussing emerging trends within the field.
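
    The representation-learning theme of the first chapter can be illustrated with a toy model. Below is a minimal variational autoencoder over fixed-length, one-hot-encoded protein sequences in PyTorch; the thesis's models are far larger and more specialized, so treat this only as a sketch of the encode-sample-decode pattern, with all sizes and names chosen arbitrarily.

```python
import torch
import torch.nn as nn

class SequenceVAE(nn.Module):
    """Minimal VAE over fixed-length one-hot amino acid sequences."""

    def __init__(self, seq_len=100, n_tokens=20, latent_dim=16, hidden=256):
        super().__init__()
        self.seq_len, self.n_tokens = seq_len, n_tokens
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(seq_len * n_tokens, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, seq_len * n_tokens))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        logits = self.decoder(z).view(-1, self.seq_len, self.n_tokens)
        return logits, mu, logvar

def vae_loss(logits, x_onehot, mu, logvar):
    """Reconstruction cross-entropy plus KL divergence to the prior."""
    recon = nn.functional.cross_entropy(
        logits.permute(0, 2, 1), x_onehot.argmax(-1), reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Usage on random sequences, just to show the shapes line up.
x = torch.nn.functional.one_hot(torch.randint(0, 20, (8, 100)), 20).float()
model = SequenceVAE()
logits, mu, logvar = model(x)
print(vae_loss(logits, x, mu, logvar))
```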