31 research outputs found

    Augmentation of FTIR spectral datasets using Wasserstein generative adversarial networks for cancer liquid biopsies

    Over recent years, deep learning (DL) has become more widely used within the field of cancer diagnostics. However, DL often requires large training datasets to prevent overfitting, which can be difficult and expensive to acquire. Data augmentation is a method that can be used to generate new data points to train DL models. In this study, we use attenuated total reflectance Fourier-transform infrared (ATR-FTIR) spectra of patient dried serum samples and compare non-generative data augmentation methods to Wasserstein generative adversarial networks (WGANs) in their ability to improve the performance of a convolutional neural network (CNN) to differentiate between pancreatic cancer and non-cancer samples in a total cohort of 625 patients. The results show that WGAN augmented spectra improve CNN performance more than non-generative augmented spectra. When compared with a model that utilised no augmented spectra, adding WGAN augmented spectra to a CNN with the same architecture and same parameters increased the area under the receiver operating characteristic curve (AUC) from 0.661 to 0.757, a 15% relative increase in diagnostic performance. In a separate test on a colorectal cancer dataset, data augmentation using a WGAN led to an increase in AUC from 0.905 to 0.955. This demonstrates the impact data augmentation can have on DL performance for cancer diagnosis when the amount of real data available for model training is limited

    Recurrent neural networks for time domain modelling of FTIR spectra: application to brain tumour detection

    Attenuated total reflectance (ATR)-Fourier transform infrared (FTIR) spectroscopy alongside machine learning (ML) techniques is an emerging approach for the early detection of brain cancer in clinical practice. A crucial step in the acquisition of an IR spectrum is the transformation of the time domain signal from the biological sample to a frequency domain spectrum via a discrete Fourier transform. Further pre-processing of the spectrum is typically applied to reduce non-biological sample variance, and thus to improve subsequent analysis. However, the Fourier transformation is often assumed to be essential even though modelling of time domain data is common in other fields. We apply an inverse Fourier transform to frequency domain data to map these to the time domain. We use the transformed data to develop deep learning models utilising Recurrent Neural Networks (RNNs) to differentiate between brain cancer and control in a cohort of 1438 patients. The best performing model achieves a mean (cross-validated score) area under the receiver operating characteristic (ROC) curve (AUC) of 0.97 with sensitivity of 0.91 and specificity of 0.91. This is better than the optimal model trained on frequency domain data which achieves an AUC of 0.93 with sensitivity of 0.85 and specificity of 0.85. A dataset comprising 385 patient samples which were prospectively collected in the clinic is used to test a model defined with the best performing configuration and fit to the time domain. Its classification accuracy is found to be comparable to the gold-standard for this dataset demonstrating that RNNs can accurately classify disease states using spectroscopic data represented in the time domain

    Interrogation of IDH1 Status in Gliomas by Fourier Transform Infrared Spectroscopy

    Mutations in the isocitrate dehydrogenase 1 (IDH1) gene are found in a high proportion of diffuse gliomas. The presence of the IDH1 mutation is a valuable diagnostic, prognostic and predictive biomarker for the management of patients with glial tumours. Techniques involving vibrational spectroscopy, e.g., Fourier transform infrared (FTIR) spectroscopy, have previously demonstrated analytical capabilities for cancer detection, and have the potential to contribute to diagnostics. The implementation of FTIR microspectroscopy during surgical biopsy could present a fast, label-free method for molecular genetic classification. For example, the rapid determination of IDH1 status in a patient with a glioma diagnosis could inform intra-operative decision-making between alternative surgical strategies. In this study, we utilized synchrotron-based FTIR microanalysis to probe tissue microarray sections from 79 glioma patients, and distinguished the positive class (IDH1-mutated) from the IDH1-wildtype glioma, with a sensitivity and specificity of 82.4% and 83.4%, respectively. We also examined the ability of attenuated total reflection (ATR)-FTIR spectroscopy in detecting the biomolecular events and global epigenetic and metabolic changes associated with mutations in the IDH1 enzyme, in blood serum samples collected from an additional 72 brain tumour patients. Centrifugal filtration enhanced the diagnostic ability of the classification models, with balanced accuracies up to ~69%. Identification of the molecular status from blood serum prior to biopsy could further direct some patients to alternative treatment strategies

    Clinical validation of a spectroscopic liquid biopsy for earlier detection of brain cancer

    Background: Diagnostic delays impact the quality of life and survival of patients with brain tumors. Earlier and expeditious diagnoses in these patients are crucial to reduce the morbidities and mortalities associated with brain tumors. A simple, rapid blood test that can be administered easily in a primary care setting to efficiently identify symptomatic patients who are most likely to have a brain tumor would enable quicker referral to brain imaging for those who need it most. Methods: Blood serum samples from 603 patients were prospectively collected and analyzed. Patients either had non-specific symptoms that could be indicative of a brain tumor on presentation to the Emergency Department, or a new brain tumor diagnosis and referral to the neurosurgical unit, NHS Lothian, Scotland. Patient blood serum samples were analyzed using the Dxcover® Brain Cancer liquid biopsy. This technology utilizes infrared spectroscopy combined with a diagnostic algorithm to predict the presence of intracranial disease. Results: Our liquid biopsy approach reported an area under the receiver operating characteristic curve of 0.8. The sensitivity-tuned model achieves a 96% sensitivity with 45% specificity (NPV 99.3%) and identified 100% of glioblastoma multiforme patients. When tuned for a higher specificity, the model yields a sensitivity of 47% with 90% specificity (PPV 28.4%). Conclusions: This simple, non-invasive blood test facilitates the triage and radiographic diagnosis of brain tumor patients while providing reassurance to healthy patients. Minimizing time to diagnosis would facilitate the identification of brain tumor patients at an earlier stage, enabling more effective, less morbid surgical and adjuvant care

    A fluid-dynamical model for ‘anti-surfactants’

    We construct a fluid-dynamical model for the flow of a solution with a free surface at which surface tension acts. This model can describe both classical surfactants, which decrease the surface tension of the solution relative to that of the pure solvent, and ‘anti-surfactants’ (such as many salts when added to water, and small amounts of water when added to alcohol) which increase it. We demonstrate the utility of the model by considering the linear stability of an infinitely deep layer of initially quiescent fluid. In particular, we predict the occurrence of a novel instability driven by surface-tension gradients, which occurs for anti-surfactant, but not for surfactant, solutions

    Blinded predictions and post-hoc analysis of the second solubility challenge data: exploring training data and feature set selection for machine and deep learning models

    Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state-of-the-art, the American Chemical Society organised a “Second Solubility Challenge” in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019, but which have not previously been reported. These models were based on computationally inexpensive molecular descriptors and traditional machine learning algorithms, and were trained on a relatively small dataset of 300 molecules. In the second part of the article, to test the hypothesis that predictions would improve with more advanced algorithms and higher volumes of training data, we compare these original predictions with those made after the deadline using deep learning models trained on larger solubility datasets consisting of 2999 and 5697 molecules. The results show that there are several algorithms that are able to obtain near state-of-the-art performance on the solubility challenge datasets, with the best model, a graph convolutional neural network, resulting in a RMSE of 0.86 log units. Critical analysis of the models reveals systematic differences between the performance of models using certain feature sets and training datasets. The results suggest that careful selection of high quality training data from relevant regions of chemical space is critical for prediction accuracy, but that other methodological issues remain problematic for machine learning solubility models, such as the difficulty in modelling complex chemical spaces from sparse training datasets

    Rapid detection of heart failure using a spectroscopic liquid biopsy

    Heart disease is growing annually across the globe with numbers expected to rise to 46% of the population by 2030. Early detection is vital for several reasons: firstly, it improves the long-term prognosis of the patient by admitting them through the appropriate pathway faster; secondly, it reduces healthcare costs by streamlining diagnosis; and finally, in combination with management or treatment, it can prevent the progression of the disease, which in turn improves the patient’s quality of life. Therefore, there is an increasing need to develop assays which can rapidly detect heart disease at an early stage. The Dxcover® liquid biopsy platform employs infrared spectroscopy and artificial intelligence to quickly analyse minute amounts of patient serum. In this study, discrimination between healthy controls and diseased patients was obtained with an area under the receiver operating characteristic curve (AUC) of 0.89. When assessing heart failure vs all patients, which is most akin to what would be observed in a triage setting, the model tuned to a minimum of 45% specificity yielded a sensitivity of 89% and an NPV of 0.996. Conversely, when sensitivity was set at a 45% minimum, the specificity was 96%, giving an NPV of 0.991 when using a 1.5% prevalence. Other models were assessed in parallel, but the performance of the ORFPLS model was overall superior to the other models tested. In this large scale (n = 404) proof-of-concept study, we have shown that the Dxcover liquid biopsy platform has the potential to be a viable triage tool in emergency and routine situations for the diagnosis of heart failure
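
    The NPV figures quoted in this abstract can be reproduced from sensitivity, specificity and prevalence using Bayes' rule. The sketch below is illustrative only and is not taken from the paper: the function name `predictive_values` is hypothetical, and it assumes the 1.5% prevalence applies to both operating points.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    # Expected fractions of the whole population in each outcome cell.
    tp = sensitivity * prevalence                    # diseased, test positive
    fn = (1.0 - sensitivity) * prevalence            # diseased, test negative
    tn = specificity * (1.0 - prevalence)            # healthy, test negative
    fp = (1.0 - specificity) * (1.0 - prevalence)    # healthy, test positive
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Operating points quoted in the abstract; prevalence of 1.5% is assumed for both.
ppv_a, npv_a = predictive_values(sensitivity=0.89, specificity=0.45, prevalence=0.015)
ppv_b, npv_b = predictive_values(sensitivity=0.45, specificity=0.96, prevalence=0.015)

print(round(npv_a, 3), round(npv_b, 3))  # 0.996 0.991
```

    At low prevalence the NPV stays high at both operating points, because almost all negative results come from the large healthy group.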

    Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans

    Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions; 35 of which are now described by a novel more significantly associated lead SNP, while the originally reported variant remained as the lead SNP only in 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region

    The Efficacy of Exercise in Reducing Depressive Symptoms among Cancer Survivors: A Meta-Analysis

    INTRODUCTION: The purpose of this meta-analysis was to examine the efficacy of exercise to reduce depressive symptoms among cancer survivors. In addition, we examined the extent to which exercise dose and clinical characteristics of cancer survivors influence the relationship between exercise and reductions in depressive symptoms. METHODS: We conducted a systematic search identifying randomized controlled trials of exercise interventions among adult cancer survivors, examining depressive symptoms as an outcome. We calculated effect sizes for each study and performed weighted multiple regression moderator analysis. RESULTS: We identified 40 exercise interventions including 2,929 cancer survivors. Diverse groups of cancer survivors were examined in seven exercise interventions; breast cancer survivors were examined in 26; prostate cancer, leukemia, and lymphoma were examined in two; and colorectal cancer in one. Cancer survivors who completed an exercise intervention reduced depression more than controls, d(+) = -0.13 (95% CI: -0.26, -0.01). Increases in weekly volume of aerobic exercise reduced depressive symptoms in dose-response fashion (β = -0.24, p = 0.03), a pattern evident only in higher quality trials. Exercise reduced depressive symptoms most when exercise sessions were supervised (β = -0.26, p = 0.01) and when cancer survivors were between 47-62 yr (β = 0.27, p = 0.01). CONCLUSION: Exercise training provides a small overall reduction in depressive symptoms among cancer survivors but one that increased in dose-response fashion with weekly volume of aerobic exercise in high quality trials. Depressive symptoms were reduced to the greatest degree among breast cancer survivors, among cancer survivors aged between 47-62 yr, or when exercise sessions were supervised