9 research outputs found

    The development of therapeutic proteins can be hindered by poor decision-making strategies in the early stage

    In this study we address two major issues in the current development and characterization of therapeutic proteins. First, because sample amounts are limited, lead molecules are often selected in the early stages on the basis of a limited physicochemical characterization. This characterization may rest on measurements of only 2-3 parameters, e.g. protein melting temperature and protein aggregation temperature, and is usually performed in only one buffer, e.g. PBS. Our hypothesis is that such an approach can lead to the rejection of lead candidates that are still manufacturable and could move on to clinical trials. The second issue is the correlations between protein physicochemical parameters that are often reported in the literature. We propose that such correlations can be found only in a small sample population, e.g. one protein in different solution conditions or different proteins from the same class, and we expect that they would not hold in a large population that includes various protein structures and solution conditions. To address these issues, we created the PIPPI consortium (http://www.pippi.kemi.dtu.dk) and applied a systematic approach to map the physicochemical properties of a wide range of proteins and to study extensively their stability as a function of the solution conditions. We show that a promising therapeutic protein lead candidate can appear non-manufacturable when only limited physicochemical characterization is performed, e.g. when only a few methods are used and only a few solution conditions are tested. Therefore, the rejection rate during early-stage development can be improved by more thorough physicochemical characterization. Moreover, only weak linear correlations between the biophysical properties of proteins are observed in a large population. This suggests that the often-reported correlations between parameters describing protein stability are not representative of a global population. Understanding the connections between the various physicochemical parameters would require a systematic database, which is currently in development by the PIPPI consortium.

    Novel Non-linear Curve Fitting to Resolve Protein Unfolding Transitions in Intrinsic Fluorescence Differential Scanning Fluorimetry

    In biotherapeutic protein research, estimating a protein's thermal stability is one of the important steps in determining developability as a function of solvent conditions. Differential Scanning Fluorimetry (DSF) can be applied to measure thermal stability. Label-free DSF measures amino acid fluorescence as a function of temperature, where conformational changes induce observable peak deformation, yielding apparent melting temperatures. Estimating the stability parameters can be difficult for multidomain, multimeric or aggregating proteins, where multiple transitions partially coincide. These overlapping protein unfolding transitions are hard to evaluate by the conventional methodology, as peak maxima are shifted by convolution. We show how non-linear curve fitting of intrinsic fluorescence DSF can deconvolute highly overlapping transitions in formulation screening in a semi-automated process. The proposed methodology relies on synchronous, constrained fits of the fluorescence intensity, the fluorescence ratio and their derivatives, combining linear baselines with generalized logistic transition functions. The proposed algorithm is applied to data from three proteins: one with a single transition, one with two separated transitions and one with two overlapping transitions. The extracted thermal stability parameters (apparent melting temperatures Tm,1 and Tm,2, and melting onset temperature Tonset) are compared with reference software analysis. The fits show R² = 0.94 for single and R² = 0.88 for separated transitions. Obtaining values and trends for Tonset in a well-described and automated way will help protein scientists to better evaluate the thermal stability of proteins.
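The signal model named in this abstract, a linear baseline plus generalized logistic transition terms, can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the Richards-type parameterization, the function names and every numeric parameter are assumptions chosen for the example, and no fitting step is included. The sketch only shows why overlapping transitions are hard to read from derivative peaks: when two transitions sit 4 degrees apart, the maximum of the first derivative no longer falls on either underlying transition midpoint.

```python
import math


def generalized_logistic(t, amplitude, t_m, width, nu=1.0):
    """Richards-type sigmoid with its inflection point at t_m.

    nu is an asymmetry parameter; nu = 1 gives the ordinary logistic curve.
    This parameterization is an assumption, one of several in common use.
    """
    return amplitude / (1.0 + nu * math.exp(-(t - t_m) / width)) ** (1.0 / nu)


def dsf_signal(t, intercept, slope, transitions):
    """Linear baseline plus one generalized logistic term per transition."""
    return intercept + slope * t + sum(
        generalized_logistic(t, *params) for params in transitions
    )


def derivative_peak(temps, values):
    """Temperature at which the central-difference first derivative is largest."""
    best_t, best_slope = temps[1], -math.inf
    for i in range(1, len(temps) - 1):
        s = (values[i + 1] - values[i - 1]) / (temps[i + 1] - temps[i - 1])
        if s > best_slope:
            best_t, best_slope = temps[i], s
    return best_t


# Hypothetical parameters (amplitude, midpoint, width), for illustration only.
temps = [20.0 + 0.05 * i for i in range(1501)]  # 20 to 95 degrees C
single = [(1.0, 60.0, 2.0)]
overlapping = [(1.0, 60.0, 2.0), (0.6, 64.0, 2.0)]

single_peak = derivative_peak(
    temps, [dsf_signal(t, 100.0, 0.01, single) for t in temps]
)
overlap_peak = derivative_peak(
    temps, [dsf_signal(t, 100.0, 0.01, overlapping) for t in temps]
)
print(f"single transition, derivative peak at {single_peak:.2f}")
print(f"overlapping transitions, derivative peak at {overlap_peak:.2f}")
```

With a single transition the derivative peak sits at the midpoint used to generate the curve. With the two overlapping transitions, the derivative shows one merged peak that lies between the two midpoints, which is the situation a constrained non-linear fit of the full model is meant to resolve.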

    Application of interpretable artificial neural networks to early monoclonal antibodies development

    The development of a new protein drug typically starts with the design, expression and biophysical characterization of many different protein constructs. The initially high number of constructs is radically reduced to a few candidates that exhibit the desired biological and physicochemical properties. This process of protein expression and characterization to find the most promising molecules is both expensive and time-consuming. Consequently, many companies adopt strategies such as platforms for protein expression and formulation, computational approaches and machine learning to save resources and facilitate protein drug development. Inspired by this, we propose the use of interpretable artificial neural networks (ANNs) to predict biophysical properties of therapeutic monoclonal antibodies, i.e. melting temperature Tm, aggregation onset temperature Tagg and interaction parameter kD, as a function of pH and salt concentration from the amino acid composition. Our ANNs were trained with typical early-stage screening datasets and achieved high prediction accuracy. By using only the amino acid composition, we could keep the ANNs simple, which allows for high general applicability, robustness and interpretability. Finally, we propose a novel “knowledge transfer” approach, which can be readily applied thanks to the simple algorithm design, to understand how our ANNs come to their conclusions.
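The kind of input this abstract describes, an amino acid composition combined with pH and salt concentration, can be sketched as a simple feature vector. This is an illustration under assumptions, not the authors' pipeline: the function names, the feature ordering, the salt unit (mM) and the toy sequence are all hypothetical, and the network itself is not shown.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed (alphabetical) order.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Characters outside the 20 standard one-letter codes are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in STANDARD_AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in STANDARD_AMINO_ACIDS]


def build_features(sequence, ph, salt_mm):
    """Hypothetical model input: 20 composition fractions, then pH and salt."""
    return amino_acid_composition(sequence) + [ph, salt_mm]


# Toy sequence and conditions, for illustration only.
features = build_features("ACDA", ph=7.0, salt_mm=140.0)
print(len(features))
```

Because the composition discards residue order, every sequence maps to a vector of the same length, which is one reason a model built on it can stay small.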

    Water quality monitoring based on chemometric analysis of high-resolution phytoplankton data measured with flow cytometry

    River water is an important source of Dutch drinking water. For this reason, continuous monitoring of river water quality is needed. However, comprehensive chemical analyses with high-resolution gas chromatography (GC)-mass spectrometry (MS) and liquid chromatography (LC)-MS are tedious and time consuming; this makes them poorly suited to routine water quality monitoring and, therefore, many pollution events are missed. Phytoplankton are highly sensitive and responsive to toxicity, which makes them well suited to effect-based water quality monitoring. Flow cytometry can measure the optical properties of phytoplankton every hour, generating a large amount of information-rich data in one year. However, this requires chemometrics, as the resulting fingerprints need to be processed into information about abnormal phytoplankton behaviour. We developed Discriminant Analysis of Multi-Aspect CYtometry (DAMACY) to model the “normal condition” of the phytoplankton community imposed by diurnal, meteorological, and other exogenous influences. DAMACY first describes the cellular variability and distribution of phytoplankton in each measurement using principal component analysis, and then aims to find subtle differences in these phytoplankton distributions that predict normal environmental conditions. Deviations from these normal environmental conditions indicated abnormal phytoplankton behaviour that happened alongside pollution events measured with the GC/MS and LC/MS systems. Thus, our results demonstrate that flow cytometry in combination with chemometrics may be used for an automated hourly assessment of river water quality and as a near real-time early warning for detecting harmful known or unknown contaminants. Finally, both the flow cytometer and the DAMACY algorithm run completely autonomously and only require maintenance once or twice per year. The warning system results may be uploaded automatically, so that drinking water companies may temporarily stop pumping water whenever abnormal phytoplankton behaviour is detected. In the case of prolonged abnormal phytoplankton behaviour, comprehensive analysis may still be used to identify the chemical compound, its origin, and toxicity.
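The general idea of describing each measurement as a distribution of cells in principal component space and then flagging deviations from normal can be sketched as follows. This is not DAMACY: the discriminant step is replaced here by a plain distance to the mean normal fingerprint, the data are simulated, and all names, feature scales, bin edges and the threshold rule are assumptions made for the illustration.

```python
import numpy as np


def fit_pca(cells, n_components=2):
    """Return (mean, components) of a PCA fitted on pooled per-cell features."""
    mean = cells.mean(axis=0)
    _, _, vt = np.linalg.svd(cells - mean, full_matrices=False)
    return mean, vt[:n_components]


def fingerprint(cells, mean, components, edges):
    """2-D histogram of cells in PC space, expressed as fractions of all cells."""
    scores = (cells - mean) @ components.T
    hist, _, _ = np.histogram2d(scores[:, 0], scores[:, 1], bins=[edges, edges])
    return hist.ravel() / len(cells)


rng = np.random.default_rng(0)
scales = np.array([3.0, 2.0, 0.5, 0.5])  # simulated optical feature spreads


def simulate(n_cells=2000, shifted_fraction=0.0):
    """Simulated measurement; a fraction of cells is shifted in feature 0."""
    cells = rng.normal(size=(n_cells, 4)) * scales
    n_shifted = int(shifted_fraction * n_cells)
    cells[:n_shifted, 0] += 6.0
    return cells


# Model the "normal condition" from 20 simulated normal measurements.
normal_runs = [simulate() for _ in range(20)]
mean, components = fit_pca(np.vstack(normal_runs))
edges = np.linspace(-9.0, 9.0, 11)
normal_prints = np.array(
    [fingerprint(c, mean, components, edges) for c in normal_runs]
)
reference = normal_prints.mean(axis=0)

# Score each measurement by its distance to the mean normal fingerprint.
normal_scores = np.linalg.norm(normal_prints - reference, axis=1)
threshold = normal_scores.mean() + 3.0 * normal_scores.std()

abnormal = fingerprint(simulate(shifted_fraction=0.4), mean, components, edges)
abnormal_score = np.linalg.norm(abnormal - reference)
print(abnormal_score > threshold)
```

A measurement whose score exceeds the threshold would be the trigger for a warning in this simplified setup. A real system would also need to account for the diurnal and meteorological variation that the abstract mentions, which this sketch ignores.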

    Advancing Therapeutic Protein Discovery and Development through Comprehensive Computational and Biophysical Characterization

    Therapeutic protein candidates should exhibit favorable properties that render them suitable to become drugs. Nevertheless, there are no well-established guidelines for the efficient selection of proteinaceous molecules with desired features during early-stage development. Such guidelines can emerge only from a large body of published research that employs orthogonal techniques to characterize therapeutic proteins in different formulations. In this work, we share a study on a diverse group of proteins, including their primary sequences, purity data, and computational and biophysical characterization at different pH and ionic strength. We report weak linear correlations between many of the biophysical parameters. We suggest that a stability comparison of diverse therapeutic protein candidates should be based on computational and biophysical characterization in multiple formulation conditions, as the formulation conditions can largely determine whether a protein is above or below a certain stability threshold. We use the presented data set to calculate several stability risk scores obtained with an increasing level of analytical effort and show how they correlate with protein aggregation during storage. Our work highlights the importance of developing combined risk scores that can be used for early-stage developability assessment. We suggest that such scores can have high prediction accuracy only when they are based on protein stability characterization in different solution conditions.

    Chemometrics in Protein Formulation: Stability Governed by Repulsion and Protein Unfolding

    Therapeutic proteins can be challenging to develop due to their complexity and the requirement of an acceptable formulation to ensure patient safety and efficacy. To date, there is no universal formulation development strategy that can identify optimal formulation conditions for all types of proteins in a fast and reliable manner. In this work, high-throughput characterization, employing a toolbox of five techniques, was performed on 14 structurally different proteins formulated in 6 different buffer conditions and in the presence of 4 different excipients. Multivariate data analysis and chemometrics were used to analyze the data in an unbiased way. First, observed changes in stability were primarily determined by the individual protein. Second, pH and ionic strength are the two most important factors determining the physical stability of proteins, with a statistically significant interaction between protein and pH/ionic strength. Additionally, we developed prediction methods by partial least-squares regression. Colloidal stability indicators are important for prediction of real-time stability, while conformational stability indicators are important for prediction of stability under accelerated stress conditions at 40 °C. In order to predict real-time storage stability, protein-protein repulsion and the initial monomer fraction are the most important properties to monitor.