213 research outputs found

    Classification and Clustering Using Intelligent Techniques: Application to Microarray Cancer Data

    Analysis and interpretation of DNA microarray data is a fundamental task in bioinformatics. Feature extraction plays a critical role in classifier performance. We address the dimensionality reduction of DNA features, in which relevant features are extracted from among thousands of irrelevant ones. This enhances the speed and accuracy of the classifiers. Principal Component Analysis (PCA) is a feature extraction technique that retrieves intrinsic information from high-dimensional data in eigenspaces to address the curse of dimensionality. Neural Networks and Support Vector Machines (SVM) are applied to the reduced data set, and their performance is measured in terms of predictive accuracy, specificity, and sensitivity. Next, we propose a Multiobjective Genetic Algorithm (MOGA)-based fuzzy clustering technique using real-coded encoding of cluster centers for clustering and classification. This technique is applied to microarray cancer data to select training data using a multiobjective genetic algorithm with non-dominated sorting. The two objective functions of this multiobjective technique are cluster compactness and cluster separation, which are optimized simultaneously. This approach identifies the solution. An SVM classifier is then trained on the selected training points that have a high confidence value, and the remaining points are classified by the trained SVM. Finally, the four clustering label vectors are combined through a majority voting ensemble. The performance of the proposed MOGA-SVM classification and clustering method is compared with that of MOGA-BP, SVM, and BP. Performance is measured in terms of the Silhouette index and the Adjusted Rand Index (ARI). The experiments were carried out on three public-domain cancer data sets: Ovarian, Colon and Leukemia.
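The three evaluation measures named in this abstract (predictive accuracy, specificity, and sensitivity) can be illustrated with a minimal sketch for the two-class case. The function name `binary_metrics`, the label encoding (1 = positive class), and the toy labels are assumptions made for illustration; they are not taken from the thesis.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for a two-class problem.

    `positive` marks which label counts as the positive class (an assumption
    of this sketch; the source does not specify a label encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: fraction of true negatives that were detected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels, purely illustrative (not data from the thesis).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false positive and one false negative, so all three measures come out to 0.75.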

    Klasifikasi Data Microarray Menggunakan Discrete Wavelet Transform dan Extreme Learning Machine (Microarray Data Classification Using Discrete Wavelet Transform and Extreme Learning Machine)

    Microarray data is used as an alternative for cancer diagnosis because diagnosis based on morphological structure is difficult: different classes of cancer often show only subtle morphological differences. The aim of this research is to build a microarray data classifier. The classification process begins by reducing the dimension of the microarray data using the Discrete Wavelet Transform (DWT): each sample is decomposed to a certain level, and the approximation coefficients at that level are taken as the sample's features. These features then serve as input to the classifier. The classification method is an Extreme Learning Machine (ELM) implemented on a Radial Basis Function Network (RBFN). The datasets used are multiclass microarray data: GCM (16,063 genes, 14 classes) and Subtypes-Leukemia (12,600 genes, 7 classes). Testing is done by randomly splitting the data into training and test sets ten times, with the same proportion each time. The classifier for the GCM dataset does not yet perform well, with an accuracy of about 75% ± 6.25% and a mean minimum sensitivity of only 15% ± 19.95%; the low minimum sensitivity indicates that sensitivity is uneven across classes and that some classes still have low sensitivity. The classifier for the Subtypes-Leukemia dataset, which has fewer classes than GCM, performs better, with an accuracy of 87.68% ± 2.88% and a mean minimum sensitivity of 51.90% ± 20.29%. Keywords: microarray, gene expression, DWT, ELM, RBFN
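The dimension-reduction step described here (decompose each sample with the DWT to a chosen level and keep only the approximation coefficients as features) can be sketched as follows. The abstract does not name the wavelet family or the boundary handling, so the Haar wavelet and the padding rule below are assumptions, as are the function name `haar_approximation` and the toy expression values.

```python
import math


def haar_approximation(signal, level):
    """Return the Haar approximation coefficients after `level` decompositions.

    Each level halves the length of the vector, so a sample with n values is
    reduced to roughly n / 2**level features.
    """
    coeffs = list(signal)
    for _ in range(level):
        if len(coeffs) < 2:
            break  # nothing left to decompose
        if len(coeffs) % 2 == 1:
            # Assumption: pad odd-length input by repeating the last value.
            coeffs.append(coeffs[-1])
        # Haar low-pass filter: scaled sum of each adjacent pair.
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs


# Toy "expression profile" with 8 values (illustrative, not real data).
sample = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features = haar_approximation(sample, level=2)
print(len(sample), "->", len(features), "features:", features)
```

At level 2 the eight toy values shrink to two features (approximately 16 and 12). The same halving per level is what would reduce a profile with thousands of genes to a much shorter feature vector before classification.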

    Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. Funding: Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; GRC2014/049. Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; R2014/039. Instituto de Salud Carlos III; PI13/0028

    Computational models and approaches for lung cancer diagnosis

    The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to develop novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results

    An Improved Wavelet Neural Network For Classification And Function Approximation

    Properly designing a wavelet neural network (WNN) is crucial for achieving the optimal generalization performance. In this thesis, two different approaches were proposed for improving the predictive capability of WNNs. First, the types of activation functions used in the hidden layer of the WNN were varied. Second, the proposed enhanced fuzzy c-means clustering algorithm—specifically, the modified point symmetry-based fuzzy c-means (MPSDFCM) algorithm—was employed in selecting the locations of the translation vectors of the WNN. The modified WNN was then applied in the areas of classification and function approximation

    Improved sequential and batch learning in neural networks using the tangent plane algorithm

    The principal aim of this research is to investigate and develop improved sequential and batch learning algorithms based upon the tangent plane algorithm for artificial neural networks. A secondary aim is to apply the newly developed algorithms to multi-category cancer classification problems in the bioinformatics area, which involves the study of DNA or protein sequences, macro-molecular structures, and gene expressions