6 research outputs found
Applying Feature Selection to Improve Predictive Performance and Explainability in Lung Cancer Detection with Soft Computing
The field of biomedicine is focused on the detection and subsequent treatment of various complex diseases. Among these, cancer stands out as one of the most studied, due to the high mortality it entails. The appearance of cancer depends directly on the correct functionality and balance of the genome. Therefore, it is essential to determine which of the approximately 25,000 human genes are linked with this undesirable condition. In this work, we focus on a case study of a population affected by lung cancer. Patient information has been obtained using liquid biopsy technology, i.e., capturing cell information from the bloodstream and applying an RNA-seq procedure to get the frequency of representation for each gene. The ultimate goal of this study is to find a good trade-off between predictive capacity and interpretability for the discernment of this type of cancer. To this end, we apply a large number of feature selection techniques, using different thresholds for the number of selected discriminant genes. Our experimental results, using Soft Computing techniques, show that model-based feature selection via Random Forest is essential for improving both the predictive capacity of the models and their explainability over a small subset of genes.
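As a rough illustration of the idea in this abstract (ranking genes with a Random Forest and keeping only the top k for several thresholds k), the following minimal sketch uses scikit-learn on synthetic data. The data set, the thresholds (5, 10, 50), the choice of a second Random Forest as the downstream classifier, and the function name `evaluate_top_k` are all assumptions made for illustration; they are not taken from the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a (samples x genes) expression matrix.
# This is NOT the study's data; sizes are chosen only to run quickly.
X, y = make_classification(
    n_samples=120, n_features=200, n_informative=8,
    n_redundant=0, random_state=0,
)


def evaluate_top_k(X, y, k):
    """Mean cross-validated ROC AUC when only the k top-ranked features are kept.

    Selection sits inside the pipeline, so it is refit on each training fold
    and the held-out fold never influences which features are chosen.
    """
    pipeline = Pipeline([
        # threshold=-inf together with max_features=k keeps exactly the k
        # features with the highest Random Forest importance.
        ("select", SelectFromModel(
            RandomForestClassifier(n_estimators=100, random_state=0),
            threshold=-np.inf, max_features=k,
        )),
        # Downstream classifier (hypothetical choice for this sketch).
        ("classify", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc")
    return scores.mean()


# Compare several thresholds for the number of selected features.
auc_by_k = {k: evaluate_top_k(X, y, k) for k in (5, 10, 50)}
for k, auc in auc_by_k.items():
    print(f"top {k:2d} features: mean CV AUC = {auc:.3f}")
```

Keeping the selection step inside the cross-validation pipeline is a deliberate choice here: selecting features on the full data set first would leak information from the test folds into the reported scores.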
Combinatorial Blood Platelets-Derived circRNA and mRNA Signature for Early-Stage Lung Cancer Detection
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms24054881/s1. Despite the diversity of the liquid biopsy transcriptomic repertoire, numerous studies exploit only a single RNA type signature for diagnostic biomarker potential. This frequently results in insufficient sensitivity and specificity to reach diagnostic utility. Combinatorial biomarker approaches may offer a more reliable diagnosis. Here, we investigated the synergistic contributions of circRNA and mRNA signatures derived from blood platelets as biomarkers for lung cancer detection. We developed a comprehensive bioinformatics pipeline permitting an analysis of platelet-circRNA and mRNA derived from non-cancer individuals and lung cancer patients. An optimally selected signature is then used to generate the predictive classification model using a machine learning algorithm. Using individual signatures of 21 circRNA and 28 mRNA, the predictive models reached an area under the curve (AUC) of 0.88 and 0.81, respectively. Importantly, combinatorial analysis including both types of RNAs resulted in an 8-target signature (6 mRNA and 2 circRNA), enhancing the differentiation of lung cancer from controls (AUC of 0.92). Additionally, we identified five biomarkers potentially specific for early-stage detection of lung cancer. Our proof-of-concept study presents the first multi-analyte-based approach for the analysis of platelets-derived biomarkers, providing a potential combinatorial diagnostic signature for lung cancer detection. Funding: European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 765492.
Digital multiplexed analysis of circular RNAs in FFPE and fresh non-small cell lung cancer specimens
We would like to thank Stephanie Davis for her language editing assistance. The investigators also wish to thank the patients for kindly agreeing to donate samples to this study. We thank all the physicians who collaborated by providing clinical information. Graphical Abstract, Figs 1A, 8A and Fig. S1 were created with Biorender.com. This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement ELBA No 765492. Although many studies highlight the implication of circular RNAs (circRNAs) in carcinogenesis and tumor progression, their potential as cancer biomarkers has not yet been fully explored in the clinic due to the limitations of current quantification methods. Here, we report the use of the nCounter platform as a valid technology for the analysis of circRNA expression patterns in non-small cell lung cancer (NSCLC) specimens. In this context, our custom-made circRNA panel was able to detect circRNA expression both in NSCLC cells and formalin-fixed paraffin-embedded (FFPE) tissues. CircFUT8 was overexpressed in NSCLC, contrasting with circEPB41L2, circBNC2, and circSOX13 downregulation even at the early stages of the disease. Machine learning (ML) approaches from different paradigms allowed discrimination of NSCLC from nontumor controls (NTCs) with an 8-circRNA signature. An additional 4-circRNA signature was able to classify early-stage NSCLC samples from NTC, reaching a maximum area under the ROC curve (AUC) of 0.981. Our results not only present two circRNA signatures with diagnostic potential but also introduce nCounter processing following ML as a feasible protocol for the study and development of circRNA signatures for NSCLC. European Commission 76549
Analysis of extracellular vesicle mRNA derived from plasma using the nCounter platform
Extracellular vesicles (EVs) are double-layered phospholipid membrane vesicles that are released by most cells and can mediate intercellular communication through their RNA cargo. In this study, we tested whether the NanoString nCounter platform can be used for the analysis of EV-mRNA. We developed and optimized a methodology for EV enrichment, EV-RNA extraction and nCounter analysis. Then, we demonstrated the validity of our workflow by analyzing EV-RNA profiles from the plasma of 19 cancer patients and 10 controls and developing a gene signature to differentiate cancer versus control samples. TRI reagent outperformed automated RNA extraction and, although lower plasma input is feasible, 500 μL provided the highest total counts and number of transcripts detected. A 10-cycle pre-amplification followed by DNase treatment yielded reproducible mRNA target detection. However, appropriate probe design to prevent genomic DNA binding is preferred. A gene signature, created using a bioinformatic algorithm, was able to distinguish between control and cancer EV-mRNA profiles with an area under the ROC curve of 0.99. Hence, the nCounter platform can be used to detect mRNA targets and develop gene signatures from plasma-derived EVs.
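Several of the abstracts in this listing report the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes it directly from its pairwise definition: the fraction of (cancer, control) pairs in which the cancer sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical and are not data from any of the studies.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    labels: iterable of 1 (case) or 0 (control)
    scores: classifier scores, higher meaning "more likely a case"
    """
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # correctly ordered pair
            elif case_score == control_score:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Made-up scores for three cases and two controls (illustration only).
example_labels = [1, 1, 1, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
print(round(roc_auc(example_labels, example_scores), 3))
```

In this toy example five of the six case-control pairs are ordered correctly, so the AUC is 5/6. A value of 1.0 would mean every case scores above every control, and 0.5 corresponds to chance-level ranking.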
Digital multiplexed analysis of circular RNAs in FFPE and fresh non-small cell lung cancer specimens
Although many studies highlight the implication of circular RNAs (circRNAs) in carcinogenesis and tumor progression, their potential as cancer biomarkers has not yet been fully explored in the clinic due to the limitations of current quantification methods. Here, we report the use of the nCounter platform as a valid technology for the analysis of circRNA expression patterns in non-small cell lung cancer (NSCLC) specimens. In this context, our custom-made circRNA panel was able to detect circRNA expression both in NSCLC cells and formalin-fixed paraffin-embedded (FFPE) tissues. CircFUT8 was overexpressed in NSCLC, contrasting with circEPB41L2, circBNC2, and circSOX13 downregulation even at the early stages of the disease. Machine learning (ML) approaches from different paradigms allowed discrimination of NSCLC from nontumor controls (NTCs) with an 8-circRNA signature. An additional 4-circRNA signature was able to classify early-stage NSCLC samples from NTC, reaching a maximum area under the ROC curve (AUC) of 0.981. Our results not only present two circRNA signatures with diagnostic potential but also introduce nCounter processing following ML as a feasible protocol for the study and development of circRNA signatures for NSCLC. Aberrant circular RNA (circRNA) expression is present in lung cancer. Using nCounter with machine learning, we discovered two signatures able to discriminate FFPE lung cancer samples from controls even at an early stage. Our results not only highlight the potential of circRNAs as lung cancer biomarkers but also introduce nCounter as a suitable platform for circRNA expression studies in these samples.
Combinatorial Blood Platelets-Derived circRNA and mRNA Signature for Early-Stage Lung Cancer Detection
Despite the diversity of the liquid biopsy transcriptomic repertoire, numerous studies exploit only a single RNA type signature for diagnostic biomarker potential. This frequently results in insufficient sensitivity and specificity to reach diagnostic utility. Combinatorial biomarker approaches may offer a more reliable diagnosis. Here, we investigated the synergistic contributions of circRNA and mRNA signatures derived from blood platelets as biomarkers for lung cancer detection. We developed a comprehensive bioinformatics pipeline permitting an analysis of platelet-circRNA and mRNA derived from non-cancer individuals and lung cancer patients. An optimally selected signature is then used to generate the predictive classification model using a machine learning algorithm. Using individual signatures of 21 circRNA and 28 mRNA, the predictive models reached an area under the curve (AUC) of 0.88 and 0.81, respectively. Importantly, combinatorial analysis including both types of RNAs resulted in an 8-target signature (6 mRNA and 2 circRNA), enhancing the differentiation of lung cancer from controls (AUC of 0.92). Additionally, we identified five biomarkers potentially specific for early-stage detection of lung cancer. Our proof-of-concept study presents the first multi-analyte-based approach for the analysis of platelets-derived biomarkers, providing a potential combinatorial diagnostic signature for lung cancer detection.