
    A review of mechanistic learning in mathematical oncology

    Mechanistic learning, the synergistic combination of knowledge-driven and data-driven modeling, is an emerging field. Its use is growing in particular in mathematical oncology, the application of mathematical modeling to cancer biology and oncology. This review aims to capture the current state of the field and to provide a perspective on how mechanistic learning may further progress in mathematical oncology. We highlight the synergistic potential of knowledge-driven mechanistic mathematical modeling and data-driven modeling, such as machine and deep learning, and point out similarities and differences regarding model complexity, data requirements, outputs generated, and interpretability of the algorithms and their results. Then, organizing combinations of knowledge- and data-driven modeling into four categories (sequential, parallel, intrinsic, and extrinsic mechanistic learning), we summarize a variety of approaches at the interface between purely data-driven and purely knowledge-driven models. Using examples predominantly from oncology, we discuss a range of techniques including physics-informed neural networks, surrogate model learning, and digital twins. We see that mechanistic learning, with its intentional leveraging of the strengths of both knowledge-driven and data-driven modeling, can greatly impact the complex problems of oncology. Given the increasing ubiquity and impact of machine learning, it is critical to incorporate it into the study of mathematical oncology, with mechanistic learning providing a path to that end. As the field of mechanistic learning advances, we aim for this review and the proposed categorization framework to foster additional collaboration between the data- and knowledge-driven modeling fields. Further collaboration will help address difficult issues in oncology such as limited data availability, requirements of model transparency, and complex input data.
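    To make the physics-informed neural network (PINN) technique mentioned above concrete, the sketch below fits a small network to sparse tumor-volume measurements while penalizing violations of an assumed logistic growth ODE, dV/dt = r*V*(1 - V/K). This is a minimal illustration in PyTorch; the model form, parameter values, and data points are invented for the example, not taken from the review.

```python
# Minimal PINN sketch for a logistic tumor-growth ODE, dV/dt = r*V*(1 - V/K).
# Illustrative only: growth parameters and "measurements" below are made up.
import torch
import torch.nn as nn

r, K = 0.3, 1.0  # assumed growth rate and carrying capacity

net = nn.Sequential(
    nn.Linear(1, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

# A few sparse, synthetic tumor-volume measurements (time, volume).
t_data = torch.tensor([[0.0], [5.0], [10.0]])
v_data = torch.tensor([[0.10], [0.35], [0.75]])

# Collocation points where the ODE residual is enforced.
t_phys = torch.linspace(0.0, 15.0, 100).reshape(-1, 1).requires_grad_(True)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    # Data loss: fit the sparse measurements.
    loss_data = ((net(t_data) - v_data) ** 2).mean()
    # Physics loss: penalize violation of the logistic ODE at collocation points.
    v = net(t_phys)
    dv_dt = torch.autograd.grad(v, t_phys, torch.ones_like(v), create_graph=True)[0]
    loss_phys = ((dv_dt - r * v * (1 - v / K)) ** 2).mean()
    (loss_data + loss_phys).backward()
    opt.step()
```

    The physics loss acts as a mechanistic regularizer, letting the network interpolate plausibly between the few data points, which is the core appeal of mechanistic learning under data scarcity.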

    Data-driven prediction of spinal cord injury recovery: An exploration of current status and future perspectives

    Spinal Cord Injury (SCI) presents a significant challenge in rehabilitation medicine, with recovery outcomes varying widely among individuals. Machine learning (ML) is a promising approach to enhance the prediction of recovery trajectories, but its integration into clinical practice requires a thorough understanding of its efficacy and applicability. We systematically reviewed the current literature on data-driven models of SCI recovery prediction. The included studies were evaluated based on a range of criteria assessing the approach, implementation, input data preferences, and the clinical outcomes they aim to forecast. We observe a tendency to utilize routinely acquired data, such as International Standards for Neurological Classification of SCI (ISNCSCI) assessments, imaging, and demographics, for the prediction of functional outcomes derived from the Spinal Cord Independence Measure (SCIM) III and Functional Independence Measure (FIM) scores, with a focus on motor ability. Although there has been increasing interest in data-driven studies over time, traditional machine learning architectures, such as linear regression and tree-based approaches, remained the overwhelmingly popular choices for implementation. This leaves ample opportunities for exploring architectures that address the challenges of predicting SCI recovery, including techniques for learning from limited longitudinal data, improving generalizability, and enhancing reproducibility. We conclude with a perspective highlighting possible future directions for data-driven SCI recovery prediction and drawing parallels to other application fields in terms of diverse data types (imaging, tabular, sequential, multimodal), data challenges (limited, missing, longitudinal data), and algorithmic needs (causal inference, robustness).
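    As an illustration of the traditional architectures the review found to dominate, here is a minimal sketch of a tree-based baseline predicting a functional outcome score from routinely acquired admission variables. The features, synthetic data, and outcome are hypothetical stand-ins, not from any reviewed study.

```python
# Sketch of a tree-based baseline for SCI recovery prediction.
# Feature and target choices (baseline motor score, age, injury level,
# SCIM-like outcome) are hypothetical; the data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(0, 100, n),  # baseline ISNCSCI-style motor score
    rng.integers(18, 80, n),  # age at injury
    rng.integers(0, 2, n),    # injury level (0 = paraplegia, 1 = tetraplegia)
]).astype(float)
# Synthetic outcome: mostly driven by baseline motor score, plus noise.
y = 0.6 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```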

    reComBat: Batch effect removal in large-scale, multi-source omics data integration

    With the steadily increasing abundance of omics data produced all over the world, sometimes decades apart and under vastly different experimental conditions, residing in public databases, a crucial step in many data-driven bioinformatics applications is that of data integration. The challenge of batch-effect removal for entire databases lies in the large number and coincidence of both batches and desired biological variation, resulting in design-matrix singularity. This problem currently cannot be solved by any common batch-correction algorithm. In this study, we present reComBat, a regularised version of the empirical Bayes method, to overcome this limitation. We demonstrate our approach for the harmonisation of public gene-expression data of the human opportunistic pathogen Pseudomonas aeruginosa and study several metrics to empirically demonstrate that batch effects are successfully mitigated while biologically meaningful gene-expression variation is retained. reComBat fills the gap in batch-correction approaches applicable to large-scale, public omics databases and opens up new avenues for data-driven analysis of complex biological processes beyond the scope of a single study.

    reComBat: batch-effect removal in large-scale multi-source gene-expression data integration

    With the steadily increasing abundance of omics data produced all over the world under vastly different experimental conditions residing in public databases, a crucial step in many data-driven bioinformatics applications is that of data integration. The challenge of batch-effect removal for entire databases lies in the large number of batches and biological variation, which can result in design-matrix singularity. This problem can currently not be solved satisfactorily by any common batch-correction algorithm. We present reComBat, a regularized version of the empirical Bayes method to overcome this limitation, and benchmark it against popular approaches for the harmonization of public gene-expression data (both microarray and bulk RNA-seq) of the human opportunistic pathogen Pseudomonas aeruginosa. Batch effects are successfully mitigated while biologically meaningful gene-expression variation is retained. reComBat fills the gap in batch-correction approaches applicable to large-scale, public omics databases and opens up new avenues for data-driven analysis of complex biological processes beyond the scope of a single study. The code is available at https://github.com/BorgwardtLab/reComBat; all data and evaluation code can be found at https://github.com/BorgwardtLab/batchCorrectionPublicData. Supplementary data are available at Bioinformatics Advances online.
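    A minimal usage sketch of the released package, assuming the sklearn-style fit/transform interface described in the project README; consult the linked repository for the exact signature and options. The expression matrix and batch labels below are synthetic.

```python
# Sketch of batch correction with reComBat. The fit/transform interface is
# assumed from the README at github.com/BorgwardtLab/reComBat; the data are
# synthetic: 60 samples x 100 genes with an additive shift in the second batch.
import numpy as np
import pandas as pd
from reComBat import reComBat

rng = np.random.default_rng(0)
expr = pd.DataFrame(rng.normal(0.0, 1.0, (60, 100)),
                    columns=[f"gene_{i}" for i in range(100)])
expr.iloc[30:] += 2.0  # simulated additive batch effect on the second half
batches = pd.Series(["batch1"] * 30 + ["batch2"] * 30)

model = reComBat()  # regularized empirical Bayes; default options assumed
model.fit(expr, batches)
corrected = model.transform(expr, batches)
```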

    Fraction Size Sensitivity of Late Genitourinary Toxicity: Analysis of Alpha/Beta (α/β) Ratios in the CHHiP Trial

    PURPOSE: Moderately hypofractionated external beam intensity-modulated radiotherapy (IMRT) for prostate cancer is now standard of care. Normal tissue toxicity responses to fraction size alteration are non-linear: the linear-quadratic model is a widely used framework accounting for this through the α/β ratio. Few α/β ratio estimates exist for human late genitourinary endpoints; here we provide estimates derived from a hypofractionation trial. METHODS AND MATERIALS: The CHHiP trial randomised 3216 men with localised prostate cancer 1:1:1 between conventionally fractionated IMRT (74 Gy/37 fractions (Fr)) and two moderately hypofractionated regimens (60 Gy/20 Fr and 57 Gy/19 Fr). A radiotherapy plan and suitable follow-up assessments were available for 2206 men. Three prospectively assessed clinician-reported toxicity scales were amalgamated for common genitourinary endpoints: Dysuria, Haematuria, Incontinence, Reduced flow/Stricture, and Urine Frequency. Per endpoint, only patients with zero baseline toxicity were included. Three models for endpoint grade ≥1 (G1+) and G2+ toxicity were fitted: Lyman-Kutcher-Burman (LKB) without equivalent dose in 2 Gy/Fr (EQD2) correction [LKB-NoEQD2]; LKB with EQD2 correction [LKB-EQD2]; and LKB-EQD2 with dose-modifying factor (DMF) inclusion [LKB-EQD2-DMF]. DMFs were: age, diabetes, hypertension, pelvic surgery, prior transurethral resection of the prostate (TURP), overall treatment time, and acute genitourinary toxicity (G2+). Bootstrapping generated 95% confidence intervals and unbiased performance estimates. Models were compared by likelihood ratio test. RESULTS: The LKB-EQD2 model significantly improved performance over LKB-NoEQD2 for just three endpoints: Dysuria G1+ (α/β = 2.0 Gy, 95% CI 1.2-3.2 Gy), Haematuria G1+ (α/β = 0.9 Gy, 95% CI 0.1-2.2 Gy) and Haematuria G2+ (α/β = 0.6 Gy, 95% CI 0.1-1.7 Gy). For these three endpoints, further incorporation of two DMFs improved on LKB-EQD2: acute genitourinary toxicity and prior TURP (Haematuria G1+ only), but α/β ratio estimates remained stable. CONCLUSIONS: Inclusion of the EQD2 correction significantly improved model fitting for the Dysuria and Haematuria endpoints, where fitted α/β ratio estimates were low (0.6-2 Gy). This suggests the therapeutic gain for clinician-reported genitourinary toxicity through hypofractionation might be lower than expected under typical late α/β ratio assumptions of 3-5 Gy.
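    For reference, a worked sketch of the EQD2 correction and the probit form of the LKB NTCP model underlying this analysis. The formulas are the standard ones; the parameter values below are illustrative placeholders, not the fitted CHHiP estimates.

```python
# EQD2 correction and Lyman-Kutcher-Burman (LKB) NTCP model.
# Parameter values (alpha/beta, TD50, m, n) are illustrative placeholders.
import numpy as np
from scipy.stats import norm

def eqd2(total_dose, dose_per_fraction, alpha_beta):
    """Equivalent dose in 2 Gy fractions: EQD2 = D*(d + a/b)/(2 + a/b)."""
    return total_dose * (dose_per_fraction + alpha_beta) / (2.0 + alpha_beta)

def geud(doses, volumes, n):
    """Generalized equivalent uniform dose from a differential DVH."""
    return np.sum(volumes * doses ** (1.0 / n)) ** n

def lkb_ntcp(geud_val, td50, m):
    """LKB complication probability: NTCP = Phi((gEUD - TD50)/(m*TD50))."""
    return norm.cdf((geud_val - td50) / (m * td50))

# Example: 60 Gy in 20 fractions (3 Gy/Fr) with an assumed alpha/beta of 2 Gy
# gives 75 Gy EQD2, i.e. "hotter" than 74 Gy/37 Fr for such a low-a/b endpoint.
print(eqd2(60.0, 3.0, 2.0))
print(lkb_ntcp(eqd2(60.0, 3.0, 2.0), td50=80.0, m=0.35))  # placeholder TD50, m
```

    The example shows why the reported low α/β ratios matter: at α/β = 2 Gy, a 60 Gy/20 Fr schedule is biologically more intense for that endpoint than the 74 Gy/37 Fr conventional arm.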

    A review of mechanistic learning in mathematical oncology

    Mechanistic learning refers to the synergistic combination of mechanistic mathematical modeling and data-driven machine or deep learning. This emerging field finds increasing applications in (mathematical) oncology. This review aims to capture the current state of the field and provides a perspective on how mechanistic learning may progress in the oncology domain. We highlight the synergistic potential of mechanistic learning and point out similarities and differences between purely data-driven and mechanistic approaches concerning model complexity, data requirements, outputs generated, and interpretability of the algorithms and their results. Four categories of mechanistic learning (sequential, parallel, extrinsic, intrinsic) are presented with specific examples. We discuss a range of techniques including physics-informed neural networks, surrogate model learning, and digital twins. Example applications address complex problems predominantly from the domain of oncology research, such as longitudinal tumor response predictions or time-to-event modeling. As the field of mechanistic learning advances, we aim for this review and the proposed categorization framework to foster additional collaboration between the data- and knowledge-driven modeling fields. Further collaboration will help address difficult issues in oncology such as limited data availability, requirements of model transparency, and complex input data, which are embraced in a mechanistic learning framework.

    A Century of Fractionated Radiotherapy: How Mathematical Oncology Can Break the Rules

    Radiotherapy is involved in 50% of all cancer treatments and 40% of cancer cures. Most of these treatments are delivered in fractions of equal doses of radiation (Fractional Equivalent Dosing (FED)) over days to weeks. This treatment paradigm has remained unchanged over the past century and does not account for the development of radioresistance during treatment. Even if under-optimized, deviating from a century of successful therapy delivered in FED can be difficult. One way of exploring the vast space of fraction sizes and schedules to identify optimal fractionation schedules is through mathematical oncology simulations that allow for in silico evaluation. This review article explores the evidence that current fractionation promotes the development of radioresistance, summarizes mathematical solutions to account for radioresistance in both the curative and non-curative settings, and reviews current clinical data investigating non-FED fractionated radiotherapy.
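    For reference, a short sketch of the standard linear-quadratic (LQ) bookkeeping behind such fractionation comparisons: per-course cell survival and biologically effective dose (BED). The formulas are standard; the parameter values are illustrative.

```python
# Linear-quadratic fractionation bookkeeping. Parameter values illustrative.
import numpy as np

def surviving_fraction(n_fractions, dose_per_fraction, alpha, beta):
    """LQ model: S = exp(-n * (alpha*d + beta*d^2))."""
    d = dose_per_fraction
    return np.exp(-n_fractions * (alpha * d + beta * d * d))

def bed(n_fractions, dose_per_fraction, alpha_beta):
    """Biologically effective dose: BED = n*d*(1 + d/(alpha/beta))."""
    d = dose_per_fraction
    return n_fractions * d * (1.0 + d / alpha_beta)

# Same 60 Gy total dose, different fraction sizes (alpha/beta = 10 Gy, tumor):
print(bed(30, 2.0, 10.0))  # 30 x 2 Gy -> BED 72 Gy
print(bed(10, 6.0, 10.0))  # 10 x 6 Gy -> BED 96 Gy
```

    The example illustrates why fraction size, not just total dose, shapes biological effect, which is exactly the degree of freedom non-FED schedules exploit.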

    Studying missingness in spinal cord injury data: challenges and impact of data imputation

    BACKGROUND: In recent decades, medical research fields studying rare conditions such as spinal cord injury (SCI) have made extensive efforts to collect large-scale data. However, most analysis methods rely on complete data. This is particularly troublesome when studying clinical data, as they are prone to missingness. Often, researchers mitigate this problem by removing patients with missing data from the analyses. Less commonly, imputation methods to infer likely values are applied. OBJECTIVE: Our objective was to study how handling missing data influences the results reported, taking the example of SCI registries. We aimed to raise awareness of the effects of missing data and to provide guidelines for future research projects, in SCI research and beyond. METHODS: Using the Sygen clinical trial data (n = 797), we analyzed the impact of the type of variable in which data are missing, the pattern according to which data are missing, and the imputation strategy (e.g. mean imputation, last observation carried forward, multiple imputation). RESULTS: Our simulations show that mean imputation may lead to results strongly deviating from the underlying expected results. For repeated measures missing at late stages (≥6 months after injury in this simulation study), carrying the last observation forward seems the preferable option for imputation. This simulation study showed that a one-size-fits-all imputation strategy falls short in SCI data sets. CONCLUSIONS: Data-tailored imputation strategies are required (e.g., characterisation of the missingness pattern, last observation carried forward for repeated measures evolving to a plateau over time). Therefore, systematically reporting the extent and kind of missing data, and the decisions made in handling it, will be essential to improve the interpretation, transparency, and reproducibility of the research presented.
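    A minimal sketch contrasting two of the imputation strategies compared in the study, mean imputation and last observation carried forward (LOCF), on a small synthetic repeated-measures table; the column names and values are hypothetical.

```python
# Mean imputation vs. last observation carried forward (LOCF) on a
# synthetic repeated-measures table. Column names/values are hypothetical.
import numpy as np
import pandas as pd

# Motor scores at weeks 2, 26, 52 post-injury; later visits often missing.
df = pd.DataFrame({
    "week2":  [20.0, 35.0, 10.0, 50.0],
    "week26": [40.0, np.nan, 25.0, np.nan],
    "week52": [np.nan, np.nan, 30.0, 70.0],
})

# Mean imputation fills each column with its observed mean, pulling
# late-stage scores toward the cohort average.
mean_imputed = df.fillna(df.mean())

# LOCF carries each patient's last observed value forward, which suits
# outcomes that plateau over time (as found for late visits here).
locf_imputed = df.ffill(axis=1)

print(mean_imputed, locf_imputed, sep="\n\n")
```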

    Intermittent radiotherapy as alternative treatment for recurrent high grade glioma: a modeling study based on longitudinal tumor measurements

    Recurrent high grade glioma patients face a poor prognosis for which no curative treatment option currently exists. In contrast to prescribing high dose hypofractionated stereotactic radiotherapy (HFSRT, ≥6 Gy × 5 in daily fractions) with debulking intent, we suggest a personalized treatment strategy to improve tumor control by delivering high dose intermittent radiation treatment (iRT, ≥6 Gy × 1 every 6 weeks). We performed a simulation analysis to compare HFSRT, iRT, and iRT plus boost (≥6 Gy × 3 in daily fractions at the time of progression) based on a mathematical model of tumor growth, radiation response, and patient-specific evolution of resistance to additional treatments (pembrolizumab and bevacizumab). Model parameters were fitted from tumor growth curves of 16 patients enrolled in the phase 1 NCT02313272 trial that combined HFSRT with bevacizumab and pembrolizumab. iRT with and without boost was then simulated and compared to HFSRT based on time to tumor regrowth. The modeling results demonstrated that iRT plus boost was equal or superior to HFSRT in 15 of 16 cases (11 of 16 without boost) and that patients who remained responsive to pembrolizumab and bevacizumab would benefit most from iRT. Time to progression could be prolonged through the application of additional, intermittently delivered fractions. iRT hence provides a promising treatment option for recurrent high grade glioma patients for prospective clinical evaluation.
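    A minimal sketch of the kind of simulation logic described here, assuming logistic regrowth punctuated by instantaneous linear-quadratic cell kill on treatment days, and comparing an up-front HFSRT course against intermittent fractions every 6 weeks. All parameters are illustrative, not the patient-specific values fitted in the study.

```python
# Toy simulation: logistic tumor regrowth with instantaneous LQ cell kill at
# each radiation fraction, comparing HFSRT (6 Gy x 5, daily) against iRT
# (6 Gy every 6 weeks). All parameters are illustrative placeholders.
import numpy as np

def simulate(schedule_days, dose=6.0, alpha=0.06, beta=0.006,
             r=0.05, K=1.0, v0=0.2, t_end=365):
    """Daily time-stepping: logistic growth plus LQ kill on treatment days."""
    sf = np.exp(-(alpha * dose + beta * dose ** 2))  # survival per fraction
    v, curve = v0, []
    for day in range(t_end):
        v += r * v * (1 - v / K)  # logistic regrowth step
        if day in schedule_days:
            v *= sf               # instantaneous radiation kill
        curve.append(v)
    return np.array(curve)

hfsrt = simulate(schedule_days={0, 1, 2, 3, 4})          # 5 daily fractions
irt = simulate(schedule_days=set(range(0, 365, 42)))     # every 6 weeks
print(hfsrt[-1], irt[-1])  # relative tumor burden at one year
```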