Elephant Search with Deep Learning for Microarray Data Analysis
Despite a plethora of research on microarray gene expression data analysis,
it remains challenging for researchers to analyze large and complex gene
expression data effectively and efficiently. Feature (gene) selection is of
paramount importance for understanding the differences in biological and
non-biological variation between samples. To address this problem, a novel
elephant search (ES) based optimization is proposed to select the best gene
expressions from the large volume of microarray data. Further, a promising
machine learning method is envisioned to leverage such high-dimensional and
complex microarray datasets, extracting hidden patterns to make meaningful
predictions and the most accurate classification. In particular, stochastic
gradient descent based deep learning (DL) with a softmax activation function
is then used on the reduced features (genes) for better classification of
different samples according to their gene expression levels. The experiments
are carried out on nine of the most popular cancer microarray gene selection
datasets, obtained from the UCI machine learning repository. The empirical
results obtained by the proposed elephant search based deep learning (ESDL)
approach are compared with the most recent published article to assess its
suitability for future bioinformatics research.
Comment: 12 pages, 5 Tabl
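The classification stage described in this abstract (a softmax output trained by stochastic gradient descent on the reduced gene set) can be illustrated with a minimal sketch. This is not the authors' ESDL implementation: the elephant search selection step is not reproduced, the model is a single softmax layer rather than a deep network, and the function names, toy expression values, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear score for each class: W[k] . x + b[k]."""
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(b))]


def train_softmax_sgd(X, y, n_classes, lr=0.1, epochs=200, seed=0):
    """Fit one softmax layer with per-sample SGD on the cross-entropy loss."""
    rng = random.Random(seed)
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            probs = softmax(class_scores(W, b, X[i]))
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. score k: p_k - 1[k == y_i]
                grad = probs[k] - (1.0 if k == y[i] else 0.0)
                for j in range(n_features):
                    W[k][j] -= lr * grad * X[i][j]
                b[k] -= lr * grad
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=lambda k: scores[k])


# Hypothetical toy data: two already-selected genes, two sample classes.
X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
y = [0, 0, 1, 1]
W, b = train_softmax_sgd(X, y, n_classes=2)
preds = [predict(W, b, x) for x in X]
```

In this sketch the inputs are assumed to be the features that survive gene selection, so any selection method could be placed in front of it.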
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Kernel methods in genomics and computational biology
Support vector machines and kernel methods are increasingly popular in
genomics and computational biology, due to their good performance in real-world
applications and strong modularity that makes them suitable to a wide range of
problems, from the classification of tumors to the automatic annotation of
proteins. Their ability to work in high dimension, to process non-vectorial
data, and the natural framework they provide to integrate heterogeneous data
are particularly relevant to various problems arising in computational biology.
In this chapter we survey some of the most prominent applications published so
far, highlighting the particular developments in kernel methods triggered by
problems in biology, and mention a few promising research directions likely to
expand in the future.
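To make the point about non-vectorial data concrete, here is a minimal sketch of a k-mer spectrum kernel, one well-known string kernel for biological sequences. It is offered only as an illustration of the general idea, not as a method taken from the chapter; the function names, the example sequences, and the choice of k are assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every length-k substring (k-mer) of a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers that are absent from the second sequence.
    return sum(count * cb[kmer] for kmer, count in ca.items())


def gram_matrix(seqs, k=3):
    """Pairwise kernel values, usable by any kernel method such as an SVM."""
    return [[spectrum_kernel(s, t, k) for t in seqs] for s in seqs]


# Hypothetical DNA fragments, compared without building explicit vectors.
seqs = ["ACGTACGT", "ACGTTTTT", "GGGGCCCC"]
K = gram_matrix(seqs, k=3)
```

Because the kernel is an inner product of count vectors, the resulting Gram matrix is symmetric and positive semidefinite, which is what lets sequences be passed to a support vector machine directly.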
Computational models and approaches for lung cancer diagnosis
The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to develop novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results.
Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer
Quantitative extraction of high-dimensional mineable data from medical images
is a process known as radiomics. Radiomics is foreseen as an essential
prognostic tool for cancer risk assessment and the quantification of
intratumoural heterogeneity. In this work, 1615 radiomic features (quantifying
tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET
and CT images of 300 patients from four different cohorts were analyzed for the
risk assessment of locoregional recurrences (LR) and distant metastases (DM) in
head-and-neck cancer. Prediction models combining radiomic and clinical
variables were constructed via random forests and imbalance-adjustment
strategies using two of the four cohorts. Independent validation of the
prediction and prognostic performance of the models was carried out on the
other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88).
Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the
potential of radiomics for assessing the risk of specific tumour outcomes using
multiple stratification groups. This could have important clinical impact,
notably by allowing for a better personalization of chemo-radiation treatments
for head-and-neck cancer patients from different risk groups.
Comment: (1) Paper: 33 pages, 4 figures, 1 table; (2) SUPP info: 41 pages, 7
figures, 8 table
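The two metrics reported in this abstract can be illustrated with a short sketch. It assumes that "CI" denotes the concordance index, which the abstract does not spell out, and it uses simple pairwise definitions rather than the authors' evaluation code. The function names and toy values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked
    (positive, negative) pairs; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell-style concordance index for right-censored outcomes.

    A pair (i, j) is comparable when patient i had the event and did so
    before patient j's last follow-up. It is concordant when the patient
    with the earlier event was assigned the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy values: higher score or risk should mean worse outcome.
example_auc = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
example_ci = concordance_index([2, 4, 6], [1, 1, 0], [0.9, 0.5, 0.1])
```

Both quantities equal 0.5 for a model that ranks at random and 1.0 for a perfect ranking, which is the scale on which the reported validation values should be read.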