Effective Feature Selection for Classification of Promoter Sequences

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)
Publication venue
Publication date: 15/12/2016
Field of study

<div><p>Exploring novel computational methods for making sense of biological data has been both necessary and productive. Part of this trend is the search for more efficient in silico methods and tools for analysing promoters, the parts of DNA sequences involved in regulating the expression of genes into other functional molecules. Promoter regions vary greatly in function depending on their nucleotide sequence and on the arrangement of short protein-binding regions called motifs. In fact, the regulatory nature of promoters seems to be driven largely by the selective presence and/or arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can open a new route to functional analysis of promoters and to the discovery of functionally crucial motifs. We use Position Specific Motif Matrix (PSMM) features to explore whether promoter sequences can be classified accurately with some of the popular classification techniques. Classification accuracy on the complete feature set is low, perhaps because of the huge number of features. We propose two ways of reducing the features. Our test results show that classification improves after feature reduction. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and the ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some popular feature transformation methods such as PCA and SVD. The proposed methods are also as accurate as MRMR (a feature selection method) but much faster. Such methods could be useful for categorizing new promoters and exploring regulatory mechanisms of gene expression in complex eukaryotic species.</p></div>
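The abstract does not spell out the two feature-reduction methods, although the figure titles in this listing refer to "Variance Reduced" and "P value reduced" features. As a rough illustration only, the sketch below filters feature columns first by variance and then by a per-feature p-value. Everything in it is an assumption rather than the authors' procedure: the function names, the toy matrix, the thresholds, and in particular the use of a permutation test on class means as a stand-in for whatever significance test the paper uses.

```python
import random
from statistics import mean, pvariance


def variance_filter(rows, min_variance=0.0):
    """Return indices of feature columns whose population variance exceeds min_variance."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns) if pvariance(col) > min_variance]


def permutation_p_value(values, labels, n_permutations=2000, seed=0):
    """Permutation p-value for the absolute difference in class means of one feature.

    Assumes binary labels (0 and 1) with at least one sample in each class.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def mean_gap(current_labels):
        class0 = [v for v, lab in zip(values, current_labels) if lab == 0]
        class1 = [v for v, lab in zip(values, current_labels) if lab == 1]
        return abs(mean(class0) - mean(class1))

    observed = mean_gap(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling keeps the class sizes unchanged
        if mean_gap(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def p_value_filter(rows, labels, alpha=0.05):
    """Return indices of feature columns whose permutation p-value is below alpha."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns)
            if permutation_p_value(col, labels) < alpha]


# Hypothetical toy matrix: 10 promoters (rows) x 3 features (columns).
# Column 0 is constant, column 1 separates the classes, column 2 has equal class means.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
rows = [
    [7, 0, 1], [7, 0, 2], [7, 0, 3], [7, 0, 4], [7, 0, 5],
    [7, 1, 5], [7, 1, 4], [7, 1, 3], [7, 1, 2], [7, 1, 1],
]

kept_by_variance = variance_filter(rows)
kept_by_p_value = p_value_filter(rows, labels)
```

On this toy data the variance filter drops only the constant column, while the p-value filter keeps only the column that differs between the two classes. Either list of indices could then be used to subset the feature matrix before training a classifier.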

Directory of Open Access Journals

The Francis Crick Institute

SVM Classification Results for five different kernels for Test v/s Background1 (Variance Reduced).

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>SVM Classification Results for five different kernels for Test v/s Background1 (Variance Reduced).</p>

The Francis Crick Institute

Analysis of classification accuracies on dataset 2.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>10(a): decision trees; 10(b): different classifiers; 10(c): different feature selections/transformations.</p>

The Francis Crick Institute

Analysis of classification accuracies for various parameters on dataset 2.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>9(a), 9(b): KNN; 9(c), 9(d): SVM.</p>

The Francis Crick Institute

LibD3C classification accuracies for MRMR and P value reduced features on dataset 2.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>LibD3C classification accuracies for MRMR and P value reduced features on dataset 2.</p>

The Francis Crick Institute

Feature reduction (Variance) pattern for 3 files of dataset 2.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>Feature reduction (Variance) pattern for 3 files of dataset 2.</p>

The Francis Crick Institute

SVM Classification Results for Linear Kernel for test v/s all five backgrounds (Variance Reduced).

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>SVM Classification Results for Linear Kernel for test v/s all five backgrounds (Variance Reduced).</p>

The Francis Crick Institute

Hypothetical feature matrix of PSMMs of 4 promoters from two classes and their P values.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>Hypothetical feature matrix of PSMMs of 4 promoters from two classes and their P values.</p>

The Francis Crick Institute

PSMMs of two promoters/samples.

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>PSMMs of two promoters/samples.</p>

The Francis Crick Institute

Decision Tree Classification Results for test v/s all five backgrounds (Variance Reduced).

Author: Acharya Kshitish K. (3569336)
Kouser K. (3569339)
Lalitha Rangarajan (3569342)
Lavanya P. G. (3569345)

<p>Decision Tree Classification Results for test v/s all five backgrounds (Variance Reduced).</p>

The Francis Crick Institute
