175 research outputs found

    Computational models and approaches for lung cancer diagnosis

    The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to develop novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results.

    Detecting reliable gene interactions by a hierarchy of Bayesian network classifiers

    The main purpose of a gene interaction network is to map the relationships among genes that are out of sight when a genomic study is tackled. DNA microarrays allow the expression of thousands of genes to be measured at the same time. These data constitute the numeric seed for the induction of gene networks. In this paper, we propose a new approach to building gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process by removing false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy, so the set of interactions and genes involved can vary from sparse to dense. Experimental results show that these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypotheses for future studies.

    Advancements in early cancer diagnosis using blood-based biomarkers and a machine learning approach

    Mutations that promote aberrant cell growth are the root of the condition known as cancer. There are over a hundred distinct forms of cancer that have been identified, with lung, colon, pancreatic, breast, kidney, and prostate cancer being the most prevalent. The likelihood that a patient will survive cancer is significantly improved by early identification. Most techniques used to detect cancer are invasive, which may be painful and uncomfortable for patients and prevent them from seeking treatment. As a result, cancer is frequently discovered only after substantial symptoms have developed and it may then be too late for treatment. In this review, we will discuss several methods for detecting cancer through blood tests, different elements that serve as biomarkers, and machine learning algorithms for predicting outcomes

    Identification of S100A8-correlated genes for prediction of disease progression in non-muscle invasive bladder cancer

    BACKGROUND: S100 calcium binding protein A8 (S100A8) has been implicated as a prognostic indicator in several types of cancer. However, previous studies are limited in their ability to predict the clinical behavior of the cancer. Here, we sought to identify a molecular signature based on S100A8 expression and to assess its usefulness as a prognostic indicator of disease progression in non-muscle invasive bladder cancer (NMIBC). METHODS: We used 103 primary NMIBC specimens for microarray gene expression profiling. The median follow-up period for all patients was 57.6 months (range: 3.2 to 137.0 months). Various statistical methods, including the leave-one-out cross validation method, were applied to identify a gene expression signature able to predict the likelihood of progression. The prognostic value of the gene expression signature was validated in an independent cohort (n = 302). RESULTS: Kaplan-Meier estimates revealed significant differences in disease progression associated with the expression signature of S100A8-correlated genes (log-rank test, P < 0.001). Multivariate Cox regression analysis revealed that the expression signature of S100A8-correlated genes was a strong predictor of disease progression (hazard ratio = 15.225, 95% confidence interval = 1.746 to 133.52, P = 0.014). We validated our results in an independent cohort and confirmed that this signature produced consistent prediction patterns. Finally, gene network analyses of the signature revealed that S100A8, IL1B, and S100A9 could be important mediators of the progression of NMIBC. CONCLUSIONS: The prognostic molecular signature defined by S100A8-correlated genes represents a promising diagnostic tool for the identification of NMIBC patients that have a high risk of progression to muscle invasive bladder cancer.

    Plasma Free Amino Acid Profiling of Five Types of Cancer Patients and Its Application for Early Detection

    BACKGROUND: Recently, rapid advances have been made in metabolomics-based, easy-to-use early cancer detection methods using blood samples. Among metabolites, profiling of plasma free amino acids (PFAAs) is a promising approach because PFAAs link all organ systems and have important roles in metabolism. Furthermore, PFAA profiles are known to be influenced by specific diseases, including cancers. Therefore, the purpose of the present study was to determine the characteristics of the PFAA profiles in cancer patients and the possibility of using this information for early detection. METHODS AND FINDINGS: Plasma samples were collected from approximately 200 patients from multiple institutes, each diagnosed with one of the following five types of cancer: lung, gastric, colorectal, breast, or prostate cancer. Patients were compared to gender- and age-matched controls also used in this study. The PFAA levels were measured using high-performance liquid chromatography (HPLC)-electrospray ionization (ESI)-mass spectrometry (MS). Univariate analysis revealed significant differences in the PFAA profiles between the controls and the patients with any of the five types of cancer listed above, even those with asymptomatic early-stage disease. Furthermore, multivariate analysis clearly discriminated the cancer patients from the controls in terms of the area under the receiver-operator characteristic curve (AUC of ROC >0.75 for each cancer), regardless of cancer stage. Because this study was designed as a case-control study, further investigations, including model construction and validation using cohorts with larger sample sizes, are necessary to determine the usefulness of PFAA profiling. CONCLUSIONS: These findings suggest that PFAA profiling has great potential for improving cancer screening and diagnosis and understanding disease pathogenesis. PFAA profiles can also be used to determine various disease diagnoses from a single blood sample, which involves a relatively simple plasma assay and imposes a lower physical burden on subjects when compared to existing screening methods.
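
The abstract above reports discrimination as the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the AUC can be computed as the fraction of case/control pairs in which the case receives the higher score, with ties counted as half. The function name `auc_from_scores` and the scores below are hypothetical illustrations, not data or code from the study.

```python
def auc_from_scores(case_scores, control_scores):
    """Return the AUC as the probability that a randomly chosen case
    scores higher than a randomly chosen control (ties count as 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores, invented purely for illustration.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]

# 8 of the 9 case/control pairs are ranked correctly here.
print(auc_from_scores(cases, controls))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported threshold of 0.75 sits midway between the two.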

    SED, a normalization free method for DNA microarray data analysis

    BACKGROUND: Analysis of DNA microarray data usually begins with a normalization step where intensities of different arrays are adjusted to the same scale so that the intensity levels from different arrays can be compared with one another. Both simple total array intensity-based and more complex "local intensity level" dependent normalization methods have been developed, some of which are widely used. Much less developed are methods for microarray data analysis that bypass the normalization step and therefore yield results that are not confounded by potential normalization errors. RESULTS: Instead of focusing on the raw intensity levels, we developed a new method for microarray data analysis that maps each gene's expression intensity level to a high dimensional space of SEDs (Signs of Expression Difference), the signs of the expression intensity difference between a given gene and every other gene on the array. Since SEDs are unchanged under any monotonic transformation of intensity levels, the SED-based method is normalization free. When tested on a multi-class tumor classification problem, simple Naive Bayes and Nearest Neighbor methods using the SED approach gave results comparable with normalized intensity-based algorithms. Furthermore, a high percentage of classifiers based on a single gene's SED gave good classification results, suggesting that SED does capture essential information from the intensity levels. CONCLUSION: The results of testing this new method on multi-class tumor classification problems suggest that the SED-based, normalization-free method of microarray data analysis is feasible and promising.
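
The SED mapping described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: the function name `sed_features` and the intensity values are assumptions, and the transformation used for the check is one arbitrary strictly increasing function.

```python
import numpy as np


def sed_features(intensities):
    """Map one array's intensities (length n_genes) to an n_genes x n_genes
    matrix whose (i, j) entry is the sign of intensity[i] - intensity[j]."""
    x = np.asarray(intensities, dtype=float)
    return np.sign(x[:, None] - x[None, :]).astype(int)


# Hypothetical raw intensities for four genes on one array.
raw = np.array([120.0, 35.0, 980.0, 35.0])

# Any strictly increasing rescaling preserves the ordering of the genes,
# so the sign matrix is unchanged and no normalization step is required.
rescaled = np.log2(raw * 3.0 + 10.0)

assert np.array_equal(sed_features(raw), sed_features(rescaled))
```

Row i of the matrix is the SED vector for gene i, which is the per-gene representation the abstract says was fed to the Naive Bayes and Nearest Neighbor classifiers.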

    Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

    The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host-phenotypes based on taxonomy-informed feature selection to establish an association between microbiome and predict disease states is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here is more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology. 
The manual identification of data sources has been complemented with: (1) an automated publication search through the digital libraries of the three major publishers using a natural language processing (NLP) toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on a learning-to-rank approach. This study was supported by COST Action CA18131 “Statistical and machine learning techniques in human microbiome studies”, Estonian Research Council grant PRG548 (JT), and Spanish State Research Agency Juan de la Cierva Grant IJC2019-042188-I (LM-Z). EO was funded and OA was supported by Estonian Research Council grant PUT 1371 and EMBO Installation grant 3573. AG was supported by the Statutory Research project of the Department of Computer Networks and Systems.