271 research outputs found

    Evaluation of cell culture with a simulated continuous manufacturing (sCM) process in 50mL tubespins for clone selection

    Get PDF
    Continuous Manufacturing (CM) is a process in which perfusion cell culture is performed for >30 days at a pre-defined constant biomass set point, achieved by bleeding excess cells from the bioreactor (BR). Requirements for cell lines cultured in CM include: (1) good growth to achieve the biomass set point and maintain viability >90%; (2) constant cell-specific productivity as a function of culture time (and, consequently, volumetric productivity); and (3) constant product quality as a function of culture time. In comparison to traditional batch or fed-batch cultures, early screening of numerous clones for a CM process may need to include evaluation of these three additional attributes to better choose the top-performing clones in a CM-like culture. With this purpose, we evaluated a small-scale simulated CM process (sCM) in 50 mL Tubespins to screen up to 20 different clones simultaneously. This sCM small-scale model mimics a BR CM process with simulated perfusion via daily medium exchange. Additionally, sCM can match the cell-specific perfusion rate (CSPR) of the CM BR and includes a discrete daily manual bleed to maintain a target cell density. We performed two sets of experiments to determine sCM performance: (1) evaluation of 16 cell lines expressing a model molecule and cultured in both sCM and a small-scale fed-batch process, and (2) evaluation of 5 clones in both sCM and a 2 L CM BR. Our results indicate that clone ranking according to product quality is comparable between small-scale fed-batch and sCM, but ranking according to viability and growth can differ between the two formats. Compared with BR CM results, sCM predicts daily volumetric productivity and overall growth performance well, but final viability is lower in sCM for some clones. Overall product quality trends as a function of culture time were similar between BR CM and sCM. In summary, we established a small-scale Tubespin model for CM that could be used as an additional tool during clone screening.
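    The abstract describes the daily sCM routine (simulated perfusion by medium exchange at a matched CSPR, plus a discrete manual bleed to hold the biomass set point) but not its exact calculations. The Python sketch below is only a minimal illustration of how such a daily step could be computed; the function name, variable names, and example values are assumptions, not taken from the study.

```python
# Illustrative sketch of one daily sCM step: a medium exchange sized to match a
# target CSPR, plus a manual bleed back down to the biomass set point.
# All names and values here are hypothetical and not taken from the study.

def daily_scm_step(vcd, working_volume_ml, target_vcd, target_cspr_pl_cell_day):
    """Return (medium-exchange volume, bleed volume) in mL for one day.

    vcd                     -- measured viable cell density (cells/mL)
    working_volume_ml       -- Tubespin working volume (e.g. 30 mL in a 50 mL tube)
    target_vcd              -- biomass set point (cells/mL)
    target_cspr_pl_cell_day -- cell-specific perfusion rate (pL/cell/day)
    """
    # Medium exchanged per day so that each cell "sees" the target CSPR (pL -> mL).
    exchange_volume_ml = vcd * working_volume_ml * target_cspr_pl_cell_day * 1e-9

    # Discrete manual bleed: remove culture (cells plus medium) to return to set point.
    bleed_volume_ml = 0.0
    if vcd > target_vcd:
        bleed_volume_ml = (1.0 - target_vcd / vcd) * working_volume_ml

    return exchange_volume_ml, bleed_volume_ml


# Example: 20e6 cells/mL measured against a 15e6 cells/mL set point at 50 pL/cell/day.
print(daily_scm_step(vcd=20e6, working_volume_ml=30.0, target_vcd=15e6,
                     target_cspr_pl_cell_day=50.0))
```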

    An Unsupervised Learning Perspective on the Dynamic Contribution to Extreme Precipitation Changes

    Full text link
    Despite the importance of quantifying how the spatial patterns of extreme precipitation will change with warming, we lack tools to objectively analyze the storm-scale outputs of modern climate models. To address this gap, we develop an unsupervised machine learning framework to quantify how storm dynamics affect precipitation extremes and their changes without sacrificing spatial information. Over a wide range of precipitation quantiles, we find that the spatial patterns of extreme precipitation changes are dominated by spatial shifts in storm regimes rather than intrinsic changes in how these storm regimes produce precipitation.
    Comment: 14 pages, 9 figures. Accepted to "Tackling Climate Change with Machine Learning: workshop at NeurIPS 2022". arXiv admin note: text overlap with arXiv:2208.1184
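    The paper's framework is unsupervised and works on spatial storm representations that the abstract does not detail. As a purely illustrative stand-in, the sketch below uses k-means "storm regimes" to separate a change in regime occurrence (a dynamic shift) from a change in precipitation within each regime (an intrinsic change); every name and choice here is an assumption for illustration only, not the paper's method.

```python
# Generic illustration (not the paper's pipeline) of separating a "dynamic" change
# (shifts in how often each storm regime occurs) from an "intrinsic" change
# (how much precipitation each regime produces). Regimes here come from k-means
# on flattened precipitation snapshots.
import numpy as np
from sklearn.cluster import KMeans

def regime_decomposition(control, warmed, n_regimes=4, seed=0):
    """control, warmed: (n_samples, n_gridpoints) arrays of precipitation snapshots."""
    km = KMeans(n_clusters=n_regimes, n_init=10, random_state=seed).fit(control)
    lab_c = km.predict(control)
    lab_w = km.predict(warmed)

    dynamic, intrinsic = 0.0, 0.0
    for k in range(n_regimes):
        f_c = np.mean(lab_c == k)                                   # regime occurrence, control
        f_w = np.mean(lab_w == k)                                   # regime occurrence, warmed
        p_c = control[lab_c == k].mean() if (lab_c == k).any() else 0.0
        p_w = warmed[lab_w == k].mean() if (lab_w == k).any() else 0.0
        dynamic += (f_w - f_c) * p_c                                # shift in regime frequency
        intrinsic += f_c * (p_w - p_c)                              # change within a regime
    return dynamic, intrinsic

# Toy usage with random "snapshots" just to show the call signature.
rng = np.random.default_rng(0)
print(regime_decomposition(rng.gamma(2, 1, (500, 64)), rng.gamma(2, 1.2, (500, 64))))
```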

    Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

    Full text link
    Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic -- primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution for curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models after retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue that it offers attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to a 135-times speedup. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
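    The two-step procedure above is concrete enough to sketch. The Python fragment below is a minimal, generic rendering of it -- screen many hyperparameter configurations on a small data subset, then re-evaluate only the top candidates on the full dataset; the search space, evaluation function, and top-k cutoff are placeholders rather than the paper's actual setup.

```python
# Minimal sketch of the two-step HPO idea described above: score many candidate
# hyperparameter settings cheaply on a small fraction of the training data, then
# re-evaluate only the best few on the full dataset. The search space, evaluation
# function, and cutoff below are placeholders, not the paper's actual configuration.
import random

def two_step_hpo(configs, small_data, full_data, evaluate, top_k=5):
    """configs: list of hyperparameter dicts.
    evaluate(config, data) -> validation error (lower is better)."""
    # Step 1: cheap screening on the small subset of the training data.
    screened = sorted(configs, key=lambda c: evaluate(c, small_data))
    # Step 2: retrain/re-evaluate only the most promising candidates on the full data.
    finalists = screened[:top_k]
    return min(finalists, key=lambda c: evaluate(c, full_data))

# Toy usage: the "datasets" here are just noise levels standing in for how reliable
# an error estimate is on each data size (smaller data -> noisier estimate).
random.seed(0)
space = [{"hidden_units": h, "lr": lr} for h in (32, 64, 128, 256) for lr in (1e-2, 1e-3, 1e-4)]
toy_eval = lambda cfg, noise: abs(cfg["hidden_units"] - 128) / 128 + abs(cfg["lr"] - 1e-3) + random.gauss(0, noise)
print(two_step_hpo(space, small_data=0.2, full_data=0.02, evaluate=toy_eval))
```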

    Systematic Sampling and Validation of Machine Learning-Parameterizations in Climate Models

    Full text link
    Progress in hybrid physics-machine learning (ML) climate simulations has been limited by the difficulty of obtaining performant coupled (i.e. online) simulations. While evaluating hundreds of ML parameterizations of subgrid closures (here of convection and radiation) offline is straightforward, online evaluation at the same scale is technically challenging. Our software automation achieves an order-of-magnitude larger sampling of online modeling errors than has previously been examined. Using this, we evaluate hybrid climate model performance and define strategies to improve it. We show that online model performance improves when incorporating memory, a relative humidity input feature transformation, and additional input variables. We also reveal substantial variation in online error and inconsistencies between offline and online error statistics. The implication is that hundreds of candidate ML models should be evaluated online to detect the effects of parameterization design choices. This is considerably more sampling than tends to be reported in the current literature.
    Comment: 13 pages, 4 figures
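    One of the design choices highlighted above is replacing a specific-humidity input with relative humidity. The sketch below shows what such a feature transformation could look like; the saturation-vapor-pressure approximation (a Bolton-style formula) and all variable names are assumptions for illustration and are not taken from the paper's code.

```python
# Rough sketch of a relative-humidity input feature transformation: convert a
# specific-humidity input (kg/kg) to relative humidity (0-1) given temperature
# and pressure. Bolton (1980) saturation vapor pressure over liquid water is used.
import numpy as np

def specific_to_relative_humidity(q, temperature_k, pressure_pa):
    """Approximate relative humidity (0-1) from specific humidity q (kg/kg)."""
    t_c = temperature_k - 273.15
    # Saturation vapor pressure over liquid water (Bolton 1980), in Pa.
    e_sat = 611.2 * np.exp(17.67 * t_c / (t_c + 243.5))
    # Saturation specific humidity (epsilon = Rd/Rv ~ 0.622).
    q_sat = 0.622 * e_sat / (pressure_pa - 0.378 * e_sat)
    return np.clip(q / q_sat, 0.0, 1.5)  # allow mild supersaturation

# Example: a mid-troposphere level at ~600 hPa.
print(specific_to_relative_humidity(q=4e-3, temperature_k=270.0, pressure_pa=6e4))
```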

    Screening cell growth in simulated continuous manufacturing spin tubes determines optimal media conditions for cell lines

    Get PDF
    While continuous manufacturing (CM) offers significant advantages over batch, fed-batch, and batch perfusion cultures, it is typically more difficult to screen conditions or troubleshoot issues because of the added complexity of the bioreactor system and the long duration required to obtain representative results. For certain screening factors such as medium, the use of spin tubes (50 mL shaken conical vessels) in a simulated CM (sCM) format can approximate the conditions experienced in an instrumented bioreactor. We will discuss how sCM spin tube performance data were used to troubleshoot bioreactor performance while evaluating five cell lines in multiple media. Specifically, we will show how sCM spin tubes can successfully screen medium performance for parameters such as growth, viability, and productivity. The resulting output from the sCM study was then used to perform a confirmation bioreactor run, with significant process improvement that was in line with expected performance. In summary, we show how sCM spin tubes can be used as an effective tool to screen specific inputs such as media for improved bioreactor performance.

    A Fortran-Keras Deep Learning Bridge for Scientific Computing

    Get PDF
    Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables a hyperparameter search of more than one hundred candidate models of subgrid cloud and radiation physics, initially implemented in Keras, to be transferred and used in Fortran. Such a process allows the model's emergent behavior to be assessed, i.e., when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of the optimizer proves unexpectedly critical. This in turn reveals many new neural network architectures that produce considerable improvements in climate model stability, including some with reduced error, for an especially challenging training dataset.
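    FKB's Fortran-side API is not quoted here. Instead, the sketch below shows the Keras-side half of the workflow the abstract describes: a small dense emulator, trained in Python and saved to HDF5, which the bridge's conversion tooling is designed to carry over for use from Fortran. Layer sizes, activations, file names, and the synthetic training data are placeholder assumptions.

```python
# Keras-side sketch of the FKB workflow: build and train a small dense emulator,
# then save it to HDF5 so the bridge's conversion tooling can translate it for
# loading on the Fortran side. Shapes and data below are illustrative only.
import numpy as np
from tensorflow import keras

# Toy stand-in for a subgrid-physics emulator: 64 column inputs -> 32 tendency outputs.
model = keras.Sequential([
    keras.Input(shape=(64,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(32, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")

# Train on synthetic data just to make the example self-contained.
x = np.random.randn(1024, 64).astype("float32")
y = np.random.randn(1024, 32).astype("float32")
model.fit(x, y, epochs=2, batch_size=128, verbose=0)

# Save in HDF5 form; FKB's conversion tooling can then carry the architecture and
# weights into a representation the Fortran side loads for online inference.
model.save("emulator.h5")
```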

    Comparing Storm Resolving Models and Climates via Unsupervised Machine Learning

    Full text link
    Storm-resolving models (SRMs) have gained widespread interest because of the unprecedented detail with which they resolve the global climate. However, it remains difficult to quantify objective differences in how SRMs resolve complex atmospheric formations. This lack of appropriate tools for comparing model similarities is a problem in many disparate fields that involve simulation tools for complex data. To address this challenge, we develop methods to estimate distributional distances based on both nonlinear dimensionality reduction and vector quantization. Our approach automatically learns appropriate notions of similarity from low-dimensional latent data representations that the different models produce. This enables an intercomparison of nine SRMs based on their high-dimensional simulation data and reveals that only six are similar in their representation of atmospheric dynamics. Furthermore, we uncover signatures of the convective response to global warming in a fully unsupervised way. Our study provides a path toward evaluating future high-resolution simulation data more objectively.
    Comment: 22 pages, 19 figures. Submitted to journal for consideration
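    The abstract names the two ingredients of the comparison -- nonlinear dimensionality reduction and vector quantization of a latent space -- without the specifics. The sketch below is a generic stand-in (PCA in place of the paper's nonlinear reduction, k-means as the codebook, Jensen-Shannon distance between codebook-usage histograms); all of these concrete choices are assumptions for illustration.

```python
# Generic sketch of the comparison strategy described above (not the paper's exact
# pipeline): compress high-dimensional simulation snapshots to a low-dimensional
# latent space, vector-quantize that space into a shared codebook, and compare two
# models through the distance between their codebook-usage distributions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from scipy.spatial.distance import jensenshannon

def model_distance(snapshots_a, snapshots_b, latent_dim=8, codebook_size=32, seed=0):
    """snapshots_*: (n_samples, n_features) arrays from two different simulations."""
    both = np.vstack([snapshots_a, snapshots_b])
    latent = PCA(n_components=latent_dim, random_state=seed).fit_transform(both)

    # Shared codebook via vector quantization of the joint latent space.
    codes = KMeans(n_clusters=codebook_size, n_init=10, random_state=seed).fit_predict(latent)
    codes_a, codes_b = codes[:len(snapshots_a)], codes[len(snapshots_a):]

    # Compare how often each model occupies each codebook entry.
    hist_a = np.bincount(codes_a, minlength=codebook_size) / len(codes_a)
    hist_b = np.bincount(codes_b, minlength=codebook_size) / len(codes_b)
    return jensenshannon(hist_a, hist_b)

rng = np.random.default_rng(0)
print(model_distance(rng.normal(0, 1, (400, 100)), rng.normal(0.3, 1, (400, 100))))
```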