Industrial Data Science for Batch Manufacturing Processes
Batch processes exhibit several sources of variability, from raw-material
properties to initial and evolving conditions that change during the different
events in the manufacturing process. In this chapter, we illustrate with an
industrial example how to use machine learning to reduce this apparent excess
of data while retaining the information relevant to process engineers. Two
common use cases are presented: 1) AutoML analysis to quickly find
correlations in batch process data, and 2) trajectory analysis to monitor and
identify anomalous batches, leading to process control improvements.
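The trajectory-analysis use case can be sketched in a few lines: build a mean-and-spread envelope per time point from historical batch trajectories, then flag the time points where a new batch leaves that envelope. This is a minimal illustrative sketch, not the chapter's actual method; the function names and the 3-sigma limit are assumptions.

```python
# Toy sketch of trajectory-based batch monitoring (illustrative names).
from statistics import mean, stdev

def trajectory_limits(historical_batches):
    """Per-time-point (mean, std) across aligned historical batch trajectories."""
    by_time = list(zip(*historical_batches))  # transpose: one tuple per time point
    return [(mean(v), stdev(v)) for v in by_time]

def anomalous_points(batch, limits, n_sigma=3.0):
    """Indices where the new batch leaves the mean +/- n_sigma envelope."""
    return [t for t, (x, (m, s)) in enumerate(zip(batch, limits))
            if abs(x - m) > n_sigma * s]
```

A real implementation would first align batches of unequal length (e.g. by indicator variable rather than clock time) before computing the envelope.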
Anomaly Analysis in Cleaning-in-Place Operations of an Industrial Brewery Fermenter
Analyzing historical data of industrial cleaning-in-place (CIP)
operations is essential to avoid potential operation failures but
is usually not done. This paper presents a three-level analysis
approach, illustrated with the CIP case of a brewery fermenter,
describing how to analyze the historical data step by step to
detect anomalies.
In the first level, the system is assessed before cleaning to ensure
that the selected recipe and system are able to accomplish the task.
In the second level, a multiway principal component analysis (MPCA)
algorithm is applied to monitor the process variables online or after
cleaning, with the purpose of detecting anomalies locally and
explaining the potential causes of the anomalous event. The third-level
analysis is performed after cleaning to evaluate the cleaning
results. The implemented analysis approach has significant potential
to automatically detect deviations and anomalies in future CIP cycles
and to optimize the cleaning process.
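The core data-handling step behind MPCA can be illustrated briefly: a three-way batch data set (batches × variables × time) is unfolded batch-wise so each batch becomes one long row, the rows are mean-centered, and a PCA model fitted on them yields the scores and residuals used for monitoring. The sketch below shows only the unfolding, centering, and a zero-component residual statistic; the function names are assumptions, and a full MPCA would subtract the PCA reconstruction before computing the residual.

```python
# Minimal sketch of batch-wise unfolding for MPCA-style monitoring.

def unfold_batchwise(data):
    """Unfold batches x variables x time into one row per batch."""
    return [[x for var in batch for x in var] for batch in data]

def mean_center(rows):
    """Center each unfolded column on its mean across batches."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered, means

def spe(row, means):
    """Squared prediction error of a new batch against the mean trajectory
    (zero-component model; full MPCA subtracts the PCA reconstruction)."""
    return sum((x - m) ** 2 for x, m in zip(row, means))
```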
The Computational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health.
Food and human health are inextricably linked. As such, revolutionary impacts on health have been derived from advances in the production and distribution of food relating to food safety and fortification with micronutrients. During the past two decades, it has become apparent that the human microbiome has the potential to modulate health, including in ways that may be related to diet and the composition of specific foods. Despite the excitement and potential surrounding this area, fully understanding the complexity of the gut microbiome, the chemical composition of food, and their interplay in situ remains a daunting task. However, recent advances in high-throughput sequencing, metabolomics profiling, compositional analysis of food, and the emergence of electronic health records provide new sources of data that can contribute to addressing this challenge. Computational science will play an essential role in this effort, as it will provide the foundation to integrate these data layers and derive insights capable of revealing and understanding the complex interactions between diet, gut microbiome, and health. Here, we review the current knowledge on diet, health, and the gut microbiota, along with relevant data sources, bioinformatics tools, machine learning capabilities, and the intellectual property and legislative regulatory landscape. We provide guidance on employing machine learning and data analytics, identify gaps in current methods, and describe new scenarios to be unlocked in the next few years in the context of current knowledge.
Computational methods for high-throughput metabolomics
Hoffmann N. Computational methods for high-throughput metabolomics. Bielefeld: Universität Bielefeld; 2014. The broad and routine application of analytical technologies in biology and biochemistry for the analysis and characterization of small molecules in biological organisms has brought with it the need to process, analyze, compare, and evaluate large amounts of experimental data in a highly automated fashion. The most prominent methods used in these fields are chromatographic methods capable of separating complex mixtures of chemical compounds by properties like size or charge, coupled to mass spectrometry detectors that measure the mass and intensity of a compound's ion or its fragments eluting from the chromatographic separation system.
One major problem in these high-throughput applications is the automatic extraction of features quantifying the compounds contained in the measured results and their reliable association among multiple measurements for quantification and statistical analysis.
The main goal of this thesis is the creation of scalable and robust methods for highly automated processing of large numbers of samples. Of special importance is the comparison of different samples in order to find similarities and differences in the context of metabolomics, the study of small chemical compounds in biological organisms.
We herein describe novel algorithms for retention time alignment of peak and chromatogram data from one- and two-dimensional gas chromatography-mass spectrometry experiments in the application area of metabolomics. We also perform a comprehensive evaluation of each method against other state-of-the-art methods on publicly available datasets with genuine biological backgrounds.
In addition to these methods, we describe the underlying software framework Maltcms and the accompanying graphical user interface Maui, and demonstrate their use on instructive application examples.
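The retention-time alignment problem described above can be illustrated with the simplest possible strategy: greedily match each reference peak to the closest unmatched peak in a second run within a tolerance window. This is a hedged toy baseline, not one of the thesis's algorithms (which handle one- and two-dimensional chromatography and full chromatogram data); the function name and default tolerance are assumptions.

```python
# Toy greedy retention-time alignment of two peak lists (illustrative only).

def align_peaks(ref_times, other_times, tolerance=0.5):
    """Return (ref_index, other_index) pairs of peaks matched within tolerance."""
    matches, used = [], set()
    for i, rt in enumerate(ref_times):
        candidates = [(abs(rt - ot), j) for j, ot in enumerate(other_times)
                      if j not in used and abs(rt - ot) <= tolerance]
        if candidates:
            _, j = min(candidates)  # closest unmatched peak wins
            used.add(j)
            matches.append((i, j))
    return matches
```

Greedy matching can make locally optimal but globally poor choices, which is one reason practical aligners use dynamic programming or clustering instead.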
Deeper Understanding of Tutorial Dialogues and Student Assessment
Bloom (1984) reported a two-standard-deviation improvement with human tutoring, which inspired many researchers to develop Intelligent Tutoring Systems (ITSs) that are as effective as human tutors. However, recent studies suggest that the 2-sigma result was misleading and that current ITSs are already as good as human tutors. Nevertheless, we can regard two standard deviations as the benchmark for the tutoring effectiveness of ideal expert tutors; in the case of ITSs, there is still the possibility that they could surpass humans. One way to improve ITSs would be identifying, understanding, and then successfully implementing effective tutorial strategies that lead to learning gains. Another step towards improving the effectiveness of ITSs is an accurate assessment of student responses. However, evaluating student answers in tutorial dialogues is challenging: the answers often refer to entities in previous dialogue turns and in the problem description, so they should be evaluated with the dialogue context taken into account. Moreover, the system should explain which parts of a student answer are correct and which are incorrect. Such explanation capability allows ITSs to provide targeted feedback that helps students reflect upon and correct their knowledge deficits. Furthermore, targeted feedback increases learners' engagement, enabling them to persist in solving the instructional task at hand on their own. In this dissertation, we describe our approach to discovering and understanding effective tutorial strategies employed by effective human tutors while interacting with learners. We also present various approaches to automatically assess students' contributions using general methods that we developed for semantic analysis of short texts.
We explain our work using generic semantic similarity approaches to evaluate the similarity between individual learner contributions and ideal answers provided by experts for target instructional tasks. We also describe our method to assess student performance based on tutorial dialogue context, accounting for linguistic phenomena such as ellipsis and pronouns. We then propose an approach that provides an explanatory capability for assessing student responses. Finally, we recommend a novel method based on concept maps for jointly evaluating and interpreting the correctness of student responses.
Deep generative models for biology: represent, predict, design
Deep generative models have revolutionized the field of artificial intelligence, fundamentally changing how we generate novel objects that imitate or extrapolate from training data, and transforming how we access and consume various types of information such as text, images, speech, and computer programs. They have the potential to radically transform other scientific disciplines, ranging from mathematical problem solving to supporting fast and accurate simulations in high-energy physics or enabling rapid weather forecasting. In computational biology, generative models hold immense promise for improving our understanding of complex biological processes, designing new drugs and therapies, and forecasting viral evolution during pandemics, among many other applications. Biological objects, however, pose unique challenges due to their inherent complexity, encompassing massive search spaces, multiple complementary data modalities, and a unique interplay between highly structured and relatively unstructured components.
In this thesis, we develop several deep generative modeling frameworks motivated by key questions in computational biology. Given the interdisciplinary nature of this endeavor, we first provide a comprehensive background in generative modeling, uncertainty quantification, and sequential decision making, as well as important concepts in biology and chemistry, to facilitate a thorough understanding of our work. We then turn to the core of our contributions, which are structured around three chapters. The first chapter introduces methods for learning representations of biological sequences, laying the foundation for subsequent analyses. The second chapter illustrates how these representations can be leveraged to predict complex properties of biomolecules, focusing on three specific applications: protein fitness prediction, the effects of genetic variation on human disease risk, and viral immune escape. Finally, the third chapter is dedicated to methods for designing novel biomolecules, including drug target identification, de novo molecular optimization, and protein engineering.
This thesis also makes several methodological contributions to broader machine learning challenges, such as uncertainty quantification in high-dimensional spaces and efficient transformer architectures, which hold potential value in other application domains. We conclude by summarizing our key findings, highlighting shortcomings of current approaches, proposing potential avenues for future research, and discussing emerging trends within the field.