4 research outputs found

Specific Tuning Parameter for Directed Random Walk Algorithm Cancer Classification

Author: Kasim Shahreen
Mohamad Mohd Saberi
Seah Choon Sen
Publication venue: 'Insight Society'
Publication date: 25/02/2017

Accurate classification of cancerous genes is a central challenge in clinical cancer research. Microarray-based gene biomarkers have been shown to outperform traditional clinical parameters. However, individual gene biomarkers are less robust because of their little reproducibility between different cohorts of patients. Several methods that incorporate pathway information, such as the directed random walk, have been proposed to infer pathway activity. This paper discusses the implementation of a group-specific tuning parameter in the directed random walk algorithm. Gene expression data and pathway data are used as input. The experiment identifies more significant pathway activities, which increases the accuracy of cancer classification. A lung cancer gene dataset is used as the experimental dataset, on which the sDRW is used to determine significant pathways. More risk-active pathways are identified in this experiment.

International Journal on Advanced Science, Engineering and Information Technology
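
The abstract above names the directed random walk but does not give its update rule. As a rough illustration only, the following sketch runs a generic random walk with restart on a small directed graph. The function name `directed_random_walk`, the restart probability `restart`, the handling of nodes without outgoing edges, and the toy graph are all assumptions made for this example; this is not the paper's sDRW, and the group-specific tuning parameter is not modelled.

```python
def directed_random_walk(edges, w0, restart=0.7, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on a directed graph (illustrative).

    edges: list of (source, target) pairs (hypothetical toy pathway graph).
    w0:    dict mapping node -> non-negative initial weight.
    Returns a dict mapping node -> stationary weight (weights sum to 1).
    """
    nodes = sorted(w0)
    total = sum(w0.values())
    p0 = {n: w0[n] / total for n in nodes}  # normalised restart distribution

    out = {n: [] for n in nodes}
    for source, target in edges:
        out[source].append(target)

    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart`, jump back to the initial distribution.
        nxt = {n: restart * p0[n] for n in nodes}
        for s in nodes:
            walk_mass = (1 - restart) * p[s]
            if out[s]:
                # Otherwise follow an outgoing edge chosen uniformly.
                share = walk_mass / len(out[s])
                for t in out[s]:
                    nxt[t] += share
            else:
                # Assumption: a node with no outgoing edge returns its mass
                # to the restart distribution so the total stays 1.
                for n in nodes:
                    nxt[n] += walk_mass * p0[n]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy graph: A and B both point to C, which points to D.
toy_edges = [("A", "C"), ("B", "C"), ("C", "D")]
toy_w0 = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
weights = directed_random_walk(toy_edges, toy_w0)
for node in sorted(weights):
    print(node, round(weights[node], 4))
```

In this toy graph, node C ends up with more weight than A or B because it receives walk mass from both; in a pathway setting such weights could be used to rank genes by topological importance, though how the paper does this is not stated in the abstract.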

Analysis of microarray and next generation sequencing data for classification and biomarker discovery in relation to complex diseases

Author: Elyasigomari Vahid
Publication venue: 'Queen Mary University of London'
Publication date: 21/09/2017

PhD

This thesis presents an investigation into gene expression profiling, using microarray and next generation sequencing (NGS) datasets, in relation to multi-category diseases such as cancer. It has been established that if the sequence of a gene is mutated, it can result in the unscheduled production of protein, leading to cancer. However, identifying the molecular signature of different cancers amongst thousands of genes is complex. This thesis investigates tools that can aid the study of gene expression to infer useful information towards personalised medicine. For microarray data analysis, this study proposes two new techniques to increase the accuracy of cancer classification. In the first method, a novel optimisation algorithm, COA-GA, was developed by synchronising the Cuckoo Optimisation Algorithm and the Genetic Algorithm for data clustering in a shuffle setup, to choose the most informative genes for classification purposes. Support Vector Machine (SVM) and Multilayer Perceptron (MLP) artificial neural networks are utilised for the classification step. Results suggest this method can significantly increase classification accuracy compared to other methods. An additional method involving a two-stage gene selection process was developed. In this method, a subset of the most informative genes is first selected by the Minimum Redundancy Maximum Relevance (MRMR) method. In the second stage, optimisation algorithms are used in a wrapper setup with SVM to minimise the number of selected genes whilst maximising the accuracy of classification. A comparative performance assessment suggests that the proposed algorithm significantly outperforms other methods at selecting fewer genes that are highly relevant to the cancer type, while maintaining a high classification accuracy.
In the case of NGS, a state-of-the-art pipeline for the analysis of RNA-Seq data is investigated to discover differentially expressed genes and differential exon usage between normal and AIP positive Drosophila datasets, which were produced in house at Queen Mary, University of London. The functional genomics of the differentially expressed genes were examined and found to be relevant to the case study under investigation. Finally, after normalising the RNA-Seq data, machine learning approaches similar to those used for the microarray data were successfully implemented for these datasets.

Queen Mary Research Online
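
The first stage of the two-stage process described in the abstract above is an MRMR filter. A minimal sketch of greedy MRMR-style selection follows, under loud assumptions: absolute Pearson correlation stands in for the mutual information that MRMR normally uses, the score is relevance minus mean redundancy, and the names `pearson`, `mrmr_select`, and the toy genes `g1` to `g3` are hypothetical. The second-stage SVM wrapper is not shown, and none of this is taken from the thesis itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_select(features, labels, k):
    """Greedily pick k features by relevance minus mean redundancy.

    features: dict mapping gene name -> list of expression values per sample.
    labels:   list of numeric class labels, one per sample.
    Assumption: |Pearson correlation| replaces mutual information.
    """
    relevance = {g: abs(pearson(v, labels)) for g, v in features.items()}
    selected = []
    remaining = set(features)

    def score(gene):
        if not selected:
            return relevance[gene]
        redundancy = sum(
            abs(pearson(features[gene], features[s])) for s in selected
        ) / len(selected)
        return relevance[gene] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: g1 and g2 are both informative but nearly duplicate each other;
# g3 is less informative but carries different information.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_genes = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [1.0, 1.3, 0.8, 3.0, 3.2, 2.8],
    "g3": [2.0, 1.0, 3.0, 3.0, 2.5, 4.0],
}
chosen = mrmr_select(toy_genes, toy_labels, k=2)
print(chosen)
```

On this toy data the filter takes `g1` first and then prefers `g3` over `g2`, because `g2` is almost perfectly correlated with the gene already chosen. A wrapper stage would then search within such a shortlist using classifier accuracy as the objective.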

Integrative gene selection for classification of microarray data

Author: Mustapha Norwati
Ong Huey Fang
Sulaiman Md. Nasir
Publication venue: 'Canadian Center of Science and Education'
Publication date: 01/01/2011

Microarray data classification is one of the major interests in health informatics; it aims to discover hidden patterns in gene expression profiles. The main challenge in building such a classification system is the curse of dimensionality. Thus, there is a considerable number of studies on gene selection methods for building effective classification models. However, most of the approaches rely solely on gene expression values, and as a result, the selected genes might not be biologically meaningful. This paper presents an integrative gene selection approach for improving microarray data classification performance. The proposed approach employs the association analysis technique to integrate both gene expression and biological data in identifying informative genes. The experimental results show that the proposed gene selection outperformed the traditional method in terms of accuracy and number of selected genes.

Crossref

Universiti Putra Malaysia Institutional Repository

Integrative Gene Selection for Classification of Microarray Data

Author
Publication venue: 'Canadian Center of Science and Education'
Publication date

Crossref