2,725 research outputs found

    Elephant Search with Deep Learning for Microarray Data Analysis

    Full text link
    Even though there is a plethora of research in Microarray gene expression data analysis, still, it poses challenges for researchers to effectively and efficiently analyze the large yet complex expression of genes. The feature (gene) selection method is of paramount importance for understanding the differences in biological and non-biological variation between samples. In order to address this problem, a novel elephant search (ES) based optimization is proposed to select best gene expressions from the large volume of microarray data. Further, a promising machine learning method is envisioned to leverage such high dimensional and complex microarray dataset for extracting hidden patterns inside to make a meaningful prediction and most accurate classification. In particular, stochastic gradient descent based Deep learning (DL) with softmax activation function is then used on the reduced features (genes) for better classification of different samples according to their gene expression levels. The experiments are carried out on nine most popular Cancer microarray gene selection datasets, obtained from UCI machine learning repository. The empirical results obtained by the proposed elephant search based deep learning (ESDL) approach are compared with most recent published article for its suitability in future Bioinformatics research.Comment: 12 pages, 5 Tabl

    Algorithms Implemented for Cancer Gene Searching and Classifications

    Get PDF
    Understanding the gene expression is an important factor to cancer diagnosis. One target of this understanding is implementing cancer gene search and classification methods. However, cancer gene search and classification is a challenge in that there is no an obvious exact algorithm that can be implemented individually for various cancer cells. In this paper a research is con-ducted through the most common top ranked algorithms implemented for cancer gene search and classification, and how they are implemented to reach a better performance. The paper will distinguish algorithms implemented for Bio image analysis for cancer cells and algorithms implemented based on DNA array data. The main purpose of this paper is to explore a road map towards presenting the most current algorithms implemented for cancer gene search and classification

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    EapGAFS: Microarray Dataset for Ensemble Classification for Diseases Prediction

    Get PDF
    Microarray data stores the measured expression levels of thousands of genes simultaneously which helps the researchers to get insight into the biological and prognostic information. Cancer is a deadly disease that develops over time and involves the uncontrolled division of body cells. In cancer, many genes are responsible for cell growth and division. But different kinds of cancer are caused by a different set of genes. So to be able to better understand, diagnose and treat cancer, it is essential to know which of the genes in the cancer cells are working abnormally. The advances in data mining, machine learning, soft computing, and pattern recognition have addressed the challenges posed by the researchers to develop computationally effective models to identify the new class of disease and develop diagnostic or therapeutic targets. This paper proposed an Ensemble Aprior Gentic Algorithm Feature Selection (EapGAFS) for microarray dataset classification. The proposed algorithm comprises of the genetic algorithm implemented with aprior learning for the microarray attributes classification. The proposed EapGAFS uses the rule set mining in the genetic algorithm for the microarray dataset processing. Through framed rule set the proposed model extract the attribute features in the dataset. Finally, with the ensemble classifier model the microarray dataset were classified for the processing. The performance of the proposed EapGAFS is conventional classifiers for the collected microarray dataset of the breast cancer, Hepatities, diabeties, and bupa. The comparative analysis of the proposed EapGAFS with the conventional classifier expressed that the proposed EapGAFS exhibits improved performance in the microarray dataset classification. The performance of the proposed EapGAFS is improved ~4 – 6% than the conventional classifiers such as Adaboost and ensemble

    En-PaFlower: An Ensemble Approach using PSO and Flower Pollination Algorithm for Cancer Diagnosis

    Get PDF
    Machine learning now is used across many sectors and provides consistently precise predictions. The machine learning system is able to learn effectively because the training dataset contains examples of previously completed tasks. After learning how to process the necessary data, researchers have proven that machine learning algorithms can carry out the whole work autonomously. In recent years, cancer has become a major cause of the worldwide increase in mortality. Therefore, early detection of cancer improves the chance of a complete recovery, and Machine Learning (ML) plays a significant role in this perspective. Cancer diagnostic and prognosis microarray dataset is available with the biopsy dataset. Because of its importance in making diagnoses and classifying cancer diseases, the microarray data represents a massive amount. It may be challenging to do an analysis on a large number of datasets, though. As a result, feature selection is crucial, and machine learning provides classification techniques. These algorithms choose the relevant features that help build a more precise categorization model. Accurately classifying diseases is facilitated as a result, which aids in disease prevention. This work aims to synthesize existing knowledge on cancer diagnosis using machine learning techniques into a compact report.  Current research work aims to propose an ensemble-based machine learning model En-PaFlower using Particle Swarm Optimization (PSO) as the feature selection algorithm, Flower Pollination algorithm (FPA) as the optimization algorithm with the majority voting algorithm. Finally, the performance of the proposed algorithm is evaluated over three different types of cancer disease datasets with accuracy, precision, recall, specificity, and F-1 Score etc as the evaluation parameters. The empirical analysis shows that the proposed methodology shows highest accuracy as 95.65%

    Deep Functional Mapping For Predicting Cancer Outcome

    Get PDF
    The effective understanding of the biological behavior and prognosis of cancer subtypes is becoming very important in-patient administration. Cancer is a diverse disorder in which a significant medical progression and diagnosis for each subtype can be observed and characterized. Computer-aided diagnosis for early detection and diagnosis of many kinds of diseases has evolved in the last decade. In this research, we address challenges associated with multi-organ disease diagnosis and recommend numerous models for enhanced analysis. We concentrate on evaluating the Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and Positron Emission Tomography (PET) for brain, lung, and breast scans to detect, segment, and classify types of cancer from biomedical images. Moreover, histopathological, and genomic classification of cancer prognosis has been considered for multi-organ disease diagnosis and biomarker recommendation. We considered multi-modal, multi-class classification during this study. We are proposing implementing deep learning techniques based on Convolutional Neural Network and Generative Adversarial Network. In our proposed research we plan to demonstrate ways to increase the performance of the disease diagnosis by focusing on a combined diagnosis of histology, image processing, and genomics. It has been observed that the combination of medical imaging and gene expression can effectively handle the cancer detection situation with a higher diagnostic rate rather than considering the individual disease diagnosis. This research puts forward a blockchain-based system that facilitates interpretations and enhancements pertaining to automated biomedical systems. In this scheme, a secured sharing of the biomedical images and gene expression has been established. To maintain the secured sharing of the biomedical contents in a distributed system or among the hospitals, a blockchain-based algorithm is considered that generates a secure sequence to identity a hash key. This adaptive feature enables the algorithm to use multiple data types and combines various biomedical images and text records. All data related to patients, including identity, pathological records are encrypted using private key cryptography based on blockchain architecture to maintain data privacy and secure sharing of the biomedical contents

    Cancer prediction using graph-based gene selection and explainable classifier

    Get PDF
    Several Artificial Intelligence-based models have been developed for cancer prediction. In spite of the promise of artificial intelligence, there are very few models which bridge the gap between traditional human-centered prediction and the potential future of machine-centered cancer prediction. In this study, an efficient and effective model is developed for gene selection and cancer prediction. Moreover, this study proposes an artificial intelligence decision system to provide physicians with a simple and human-interpretable set of rules for cancer prediction. In contrast to previous deep learning-based cancer prediction models, which are difficult to explain to physicians due to their black-box nature, the proposed prediction model is based on a transparent and explainable decision forest model. The performance of the developed approach is compared to three state-of-the-art cancer prediction including TAGA, HPSO and LL. The reported results on five cancer datasets indicate that the developed model can improve the accuracy of cancer prediction and reduce the execution time

    Predicting breast cancer risk, recurrence and survivability

    Full text link
    This thesis focuses on predicting breast cancer at early stages by using machine learning algorithms based on biological datasets. The accuracy of those algorithms has been improved to enable the physicians to enhance the success of treatment, thus saving lives and avoiding several further medical tests

    Gene selection and classification in autism gene expression data

    Get PDF
    Autism spectrum disorders (ASD) are neurodevelopmental disorders that are currently diagnosed on the basis of abnormal stereotyped behaviour as well as observable deficits in communication and social functioning. Although a variety of candidate genes have been attributed to the disorder, no single gene is applicable to more than 1–2% of the general ASD population. Despite extensive efforts, definitive genes that contribute to autism susceptibility have yet to be identified. The major problems in dealing with the gene expression dataset of autism include the presence of limited number of samples and large noises due to errors of experimental measurements and natural variation. In this study, a systematic combination of three important filters, namely t-test (TT), Wilcoxon Rank Sum (WRS) and Feature Correlation (COR) are applied along with efficient wrapper algorithm based on geometric binary particle swarm optimization-support vector machine (GBPSO-SVM), aiming at selecting and classifying the most attributed genes of autism. A new approach based on the criterion of median ratio, mean ratio and variance deviations is also applied to reduce the initial dataset prior to its involvement. Results showed that the most discriminative genes that were identified in the first and last selection steps concluded the presence of a repetitive gene (CAPS2), which was assigned as the most ASD risk gene. The fused result of genes subset that were selected by the GBPSO-SVM algorithm increased the classification accuracy to about 92.10%, which is higher than those reported in literature for the same autism dataset. Noticeably, the application of ensemble using random forest (RF) showed better performance compared to that of previous studies. However, the ensemble approach based on the employment of SVM as an integrator of the fused genes from the output branches of GBPSO-SVM outperformed the RF integrator. The overall improvement was ascribed to the selection strategies that were taken to reduce the dataset and the utilization of efficient wrapper based GBPSO-SVM algorithm
    • …
    corecore