Data science essentials for digital biopharma

Abstract

Cumulative dissertation in four articles.

Novel biopharmaceutical products and existing obstacles are driving the digital transformation of the biopharmaceutical industry. For the sake of transparent and reproducible processes, data is supposed to become the basis of all decisions. These decisions should be taken over by highly automated data science tools that are deeply integrated into the biopharmaceutical life cycle. At the same time, these tools should allow scarce human resources, such as process engineers and biotech scientists, to focus on more creative tasks that cannot be taken over by computers and algorithms. However, the development, deployment, and use of data science in pharmaceutical environments face various obstacles. Problems regarding data accessibility, data complexity, and data integrity are serious showstoppers that prevent the automated routine use of data science tools in today's pharmaceutical environments. Finding solutions for these obstacles is a prerequisite for a successful digital transformation of the biopharmaceutical industry. This thesis aims to develop workflows that increase data accessibility, decrease data complexity, and ensure data integrity. These workflows will improve process transparency and support rational decisions by reducing and simplifying human work, and they will contribute to the establishment of automated, robust, and reproducible processes. The goal was achieved through algorithms that preprocess, model, and condense data down to the physiological information that matters, with strong emphasis on the discovery of outliers, inconsistencies, and errors. Furthermore, visualization tools based on multivariate statistics were established to set up systematic top-down analysis workflows. The intended users of these tools were process experts, who usually have no programming skills but do have the process knowledge required to analyze the data and extract meaningful conclusions.
Another focus was on data integrity. Data and analysis workflows, the foundation of all decisions, need to be the ground truth. Both raw and derived data need to be beyond all doubt and keep their integrity by design. Therefore, an approach based on version control systems and the blockchain was suggested. The last emphasis was the task of data science development itself. A conservative industry, the vast number of different data sources, and old-fashioned IT/OT environments are high hurdles for the development and deployment of more powerful data science tools. This thesis shows approaches to improve this situation by applying DevOps technologies and mindsets already used in other industries. While biopharma is undergoing a digital transformation, the workflows and mindsets presented in this thesis act as essential cornerstones for the future development of more transparent and powerful data science tools within digital biopharma. These principles are important for people and companies that aim to develop and deploy data science tools in pharmaceutical environments, and for those who want to build frameworks for that purpose.
