    Algorithms Implemented for Cancer Gene Searching and Classifications

    Understanding gene expression is an important factor in cancer diagnosis. One goal of this understanding is to implement cancer gene search and classification methods. However, cancer gene search and classification is challenging because there is no obvious exact algorithm that can be applied on its own across various cancer cells. This paper surveys the most common top-ranked algorithms implemented for cancer gene search and classification, and how they are implemented to reach better performance. The paper distinguishes between algorithms implemented for bio-image analysis of cancer cells and algorithms based on DNA array data. Its main purpose is to explore a road map towards presenting the most current algorithms implemented for cancer gene search and classification.

    Elephant Search with Deep Learning for Microarray Data Analysis

    Even though there is a plethora of research on microarray gene expression data analysis, it remains challenging for researchers to analyze the large and complex expression of genes effectively and efficiently. The feature (gene) selection method is of paramount importance for understanding the differences in biological and non-biological variation between samples. To address this problem, a novel elephant search (ES) based optimization is proposed to select the best gene expressions from the large volume of microarray data. Further, a machine learning method is envisioned to leverage such high-dimensional and complex microarray datasets, extracting hidden patterns to make meaningful predictions and accurate classifications. In particular, stochastic gradient descent based deep learning (DL) with a softmax activation function is used on the reduced features (genes) to better classify different samples according to their gene expression levels. The experiments are carried out on nine of the most popular cancer microarray gene selection datasets, obtained from the UCI machine learning repository. The empirical results obtained by the proposed elephant search based deep learning (ESDL) approach are compared with the most recently published article to assess its suitability for future bioinformatics research.
    Comment: 12 pages, 5 Tabl
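As a rough illustration of the classification step described in the abstract above, the following is a minimal sketch of a softmax classifier trained with stochastic gradient descent. It is not the authors' ESDL method: the elephant search step is omitted, the model is a single softmax layer rather than a deep network, and the function names, learning rate, epoch count, and toy data are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear score for each class: W[k] . x + b[k]."""
    return [sum(w * xj for w, xj in zip(W[k], x)) + b[k] for k in range(len(W))]


def train_softmax_sgd(X, y, n_classes, lr=0.2, epochs=500, seed=0):
    """Fit a single-layer softmax classifier, one sample per SGD update."""
    rng = random.Random(seed)
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            probs = softmax(class_scores(W, b, X[i]))
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. score k: p_k - 1[k == y_i]
                grad = probs[k] - (1.0 if k == y[i] else 0.0)
                for j in range(n_features):
                    W[k][j] -= lr * grad * X[i][j]
                b[k] -= lr * grad
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data: two already-selected "genes", three sample classes.
X = [(2.0, 0.1), (1.8, 0.3), (0.1, 2.0), (0.2, 1.7), (0.1, 0.1), (0.2, 0.1)]
y = [0, 0, 1, 1, 2, 2]
W, b = train_softmax_sgd(X, y, n_classes=3)
print([predict(W, b, x) for x in X])
```

In the abstract's setting, the rows of `X` would hold the expression levels of the genes kept by the selection step.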

    Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis

    Background and Objectives: This paper examines the accuracy and efficiency (time complexity) of high-performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The research is motivated by the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, so it is vitally important that cancer tissues be identified and classified accurately and rapidly, both to ensure fast detection of the disease and to expedite the drug discovery process. Methods: A three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and the classification algorithms separately, and Phase Three examined the performance of their combination. Results: Phase One found that the Particle Swarm Optimization (PSO) algorithm performed best as a feature selection method on the colon dataset (29 genes selected), and Phase Two found that the Support Vector Machine (SVM) algorithm outperformed the other classifiers, with an accuracy of almost 86%. Phase Three found that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). Conclusions: Applying feature selection algorithms prior to classification algorithms results in better accuracy than applying the latter alone. This conclusion is important and significant to industry and society.
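The select-then-classify structure described in the abstract above can be sketched as follows. This is only an illustration of the pipeline shape: a simple mean-difference filter stands in for PSO and a nearest-centroid rule stands in for the SVM, so it does not reproduce the paper's algorithms or results. All function names and the toy data are hypothetical.

```python
from statistics import mean


def select_top_features(X, y, k):
    """Rank features by |mean(class 1) - mean(class 0)| and keep the top k.

    Stand-in for the paper's PSO-based selection (assumption).
    """
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        scores.append((abs(m1 - m0), j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


def project(X, idx):
    """Keep only the selected feature columns."""
    return [[row[j] for j in idx] for row in X]


def fit_centroids(X, y):
    """Compute the per-class mean vector (stand-in for training an SVM)."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


# Hypothetical toy data: 4 "genes", of which columns 0 and 2 separate the classes.
X = [
    [5.0, 1.0, 0.2, 3.0],
    [4.8, 1.2, 0.1, 2.9],
    [1.0, 1.1, 2.0, 3.1],
    [1.2, 0.9, 2.2, 3.0],
]
y = [0, 0, 1, 1]

selected = select_top_features(X, y, k=2)        # phase 1: feature selection
centroids = fit_centroids(project(X, selected), y)  # phase 2: classifier
new_sample = [4.5, 1.0, 0.3, 3.0]
label = predict(centroids, [new_sample[j] for j in selected])  # phase 3: combined
print(selected, label)
```

The classifier only ever sees the columns chosen in the first step, which is the combination the abstract's Phase Three evaluates.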

    Supervised learning-based tagSNP selection for genome-wide disease classifications

    The article was originally published by BMC Genomics. doi:10.1186/1471-2164-9-S1-S6
    Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers.
    Research supports received from ICASA (Institute for Complex Additive Systems Analysis, a division of New Mexico Tech) and the Radiology Department of Brigham and Women's Hospital (BWH) are gratefully acknowledged. The authors highly appreciate Dr. Liang at SUNY-Buffalo for her invaluable help and insightful discussion during this study and Ms. Kim Lawson at BWH Radiology Department for her manuscript editing and very constructive comments.
    Keywords: Supervised Recursive Feature Additions; Support Vector based Recursive Feature Addition; complex disease; genetics; disease prediction

    Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data

    We present the extension and application of a new unsupervised statistical learning technique, the Partition Decoupling Method (PDM), to gene expression data. Because it can reveal non-linear and non-convex geometries present in the data, the PDM is an improvement over typical gene expression analysis algorithms, permitting a multi-gene analysis that can reveal phenotypic differences even when the individual genes do not exhibit differential expression. Here, we apply the PDM to publicly available gene expression data sets and demonstrate that we are able to identify cell types and treatments with higher accuracy than is obtained through other approaches. By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may be used to find sets of mechanistically related genes that discriminate phenotypes.
    Comment: Revise

    The Marker State Space (MSS) Method for Classifying Clinical Samples

    The development of accurate clinical biomarkers has been challenging in part due to the diversity between patients and diseases. One approach to account for the diversity is to use multiple markers to classify patients, based on the concept that each individual marker contributes information from its respective subclass of patients. Here we present a new strategy for developing biomarker panels that accounts for completely distinct patient subclasses. Marker State Space (MSS) defines "marker states" based on all possible patterns of high and low values among a panel of markers. Each marker state is defined as either a case state or a control state, and a sample is classified as case or control based on the state it occupies. MSS was used to define multi-marker panels that were robust in cross validation and training-set/test-set analyses and that yielded similar classification accuracy to several other classification algorithms. A three-marker panel for discriminating pancreatic cancer patients from control subjects revealed subclasses of patients based on distinct marker states. MSS provides a straightforward approach for modeling highly divergent subclasses of patients, which may be adaptable for diverse applications. © 2013 Fallon et al
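The marker-state idea in the abstract above can be sketched in a few lines. The abstract does not say how thresholds are chosen or how each state is assigned to case or control, so this sketch assumes fixed per-marker thresholds and a majority vote over training samples, with unseen or tied states falling back to a default label. The function names and the toy values are hypothetical, not data from the paper.

```python
from collections import Counter
from itertools import product


def marker_state(sample, thresholds):
    """Map marker values to a tuple of flags: 1 = high, 0 = low."""
    return tuple(int(v > t) for v, t in zip(sample, thresholds))


def fit_state_labels(samples, labels, thresholds, default="control"):
    """Label every possible high/low pattern as 'case' or 'control'.

    Assumption: a state takes the majority label of the training samples
    that occupy it; empty or tied states get the default label.
    """
    votes = {s: Counter() for s in product((0, 1), repeat=len(thresholds))}
    for sample, label in zip(samples, labels):
        votes[marker_state(sample, thresholds)][label] += 1
    state_labels = {}
    for state, counter in votes.items():
        if counter["case"] > counter["control"]:
            state_labels[state] = "case"
        elif counter["control"] > counter["case"]:
            state_labels[state] = "control"
        else:
            state_labels[state] = default
    return state_labels


def classify(sample, thresholds, state_labels):
    """Classify a sample by the marker state it occupies."""
    return state_labels[marker_state(sample, thresholds)]


# Hypothetical two-marker panel; each case subclass is high in a different marker.
thresholds = (1.0, 1.0)
train_samples = [(2.0, 0.5), (1.8, 0.2), (0.3, 2.5), (0.4, 0.1), (0.2, 0.3)]
train_labels = ["case", "case", "case", "control", "control"]

state_labels = fit_state_labels(train_samples, train_labels, thresholds)
print(classify((1.5, 0.1), thresholds, state_labels))
print(classify((0.2, 3.0), thresholds, state_labels))
print(classify((0.1, 0.2), thresholds, state_labels))
```

Because each state is labeled independently, two groups of cases that are high in different markers can both be labeled as cases, which is the kind of distinct patient subclass the abstract describes.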