More accurate process understanding from process characterization studies using Monte Carlo simulation, regularized regression, and classification models
Establishing an appropriate control strategy, with defined operating ranges (OR) predicted to meet a target product profile, is a critical component of commercializing new biologics under the Quality by Design (QbD) approach. Process characterization (PC) studies are performed to expand process understanding by achieving two main goals: 1) determining which process parameters have significant effects on quality attributes and 2) establishing models that describe the relationships between these critical process parameters (CPP) and critical quality attributes (CQA). Risk assessment and design of experiments (DOE) techniques are effectively deployed in the industry to identify parameters to study and to build process understanding. However, the true value of the data produced by these studies can be compromised by flaws inherent in traditional data analysis techniques. In particular, p-value-based methods such as stepwise regression are prone to generating false positives and overestimated parameter coefficients. Many of the deficiencies of traditional stepwise regression can be alleviated by applying Monte Carlo cross-validation (MCCV) and simulations to stepwise algorithms. These methods can greatly enhance process understanding and assist in the selection of CPPs. Regularized regression methods such as LASSO, ridge, and elastic net are also designed to overcome many of the issues inherent in techniques based on ordinary least squares. A superior strategy, however, is to build multiple models using a variety of techniques and to use the insights gained from each to establish the relationships between CPPs and CQAs. Using complementary methods during data analysis allows more informed decisions to be made during model construction.
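As an illustration of how repeated random splits and regularization can work together, the following is a minimal sketch and not the procedure used in the study. It assumes a hypothetical two-level full factorial design with five coded parameters, a simulated response in which only the first two parameters have true effects, a hand-written coordinate-descent LASSO, and arbitrarily chosen values for the penalty `lam`, the number of splits, and the training fraction. The selection frequency of each parameter across Monte Carlo splits gives a rough indication of how stable its inclusion in the model is.

```python
import itertools
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate-descent LASSO for (1/2n)*||y - Xb||^2 + lam*||b||_1.
    Assumes the columns of X and the response y are already centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def mccv_selection(X, y, lam=0.2, n_splits=100, train_frac=0.8, seed=0):
    """Monte Carlo cross-validation: refit on random train subsets, then
    return per-parameter selection frequency and mean held-out RMSE."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_train = int(train_frac * n)
    counts = [0] * p
    rmses = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        tr, te = idx[:n_train], idx[n_train:]
        # Center using training data only, so the test runs stay unseen.
        xbar = [sum(X[i][j] for i in tr) / n_train for j in range(p)]
        ybar = sum(y[i] for i in tr) / n_train
        Xc = [[X[i][j] - xbar[j] for j in range(p)] for i in tr]
        yc = [y[i] - ybar for i in tr]
        beta = lasso_cd(Xc, yc, lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
        sq = 0.0
        for i in te:
            pred = ybar + sum(beta[j] * (X[i][j] - xbar[j]) for j in range(p))
            sq += (y[i] - pred) ** 2
        rmses.append((sq / len(te)) ** 0.5)
    return [c / n_splits for c in counts], sum(rmses) / len(rmses)

# Hypothetical data: 2^5 full factorial in coded units (-1, +1);
# only parameters 0 and 1 truly affect the simulated quality attribute.
data_rng = random.Random(42)
X = [list(run) for run in itertools.product([-1.0, 1.0], repeat=5)]
y = [2.0 * x[0] - 1.5 * x[1] + data_rng.gauss(0.0, 0.3) for x in X]

freq, mean_rmse = mccv_selection(X, y)
for j, f in enumerate(freq):
    print(f"parameter {j}: selected in {f:.0%} of splits")
print(f"mean held-out RMSE: {mean_rmse:.3f}")
```

Parameters that are selected in nearly every split are stronger CPP candidates than those that appear only occasionally. In practice the penalty would be tuned rather than fixed, and the same resampling loop could wrap a stepwise, ridge, or elastic net fit so that the resulting models can be compared.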