To analyze
differentially expressed genes in colon cancer, we compared the expression
profiles of colorectal cancer cells with those of normal colonic cells using DNA
microarray data covering 6584 human genes. Each probe set on the array consisted
of an EST (expressed sequence tag) sequence represented by 20 feature pairs of 25 bp each.
The data set comprised 61 samples, divided into two groups: 40 tumor samples
(Group 1) and 21 normal samples (Group 2). To adjust for background and
negative expression values, the data were log2-transformed, and
missing values were estimated by the K-nearest neighbor
method, followed by normalization using the ‘minimum mean ratio’ among arrays. Significance analysis was based on the J5
test, which was computed for each probe and for each contrast with a threshold
value of 4.0 and the mean as the measure of central tendency. The differentially
expressed genes were expressed at high frequency in tumor samples. The Naive
Bayes classifier was used to test the predefined classification of the
samples. Distances between expression profiles were measured using Pearson’s
correlation distance. On the basis of J5 test scores, the top 5 upregulated genes, viz., vasopressin-neurophysin 2-copeptin
preproprotein, cytochrome P450 2A7 isoform, major centromere autoantigen B,
myelin-associated glycoprotein, and bone morphogenetic protein 1 isoform 3
precursor, were selected for further analysis. These genes have not
yet been reported to be differentially overexpressed in colon cancer cells,
although their overexpression has been reported in other cancers, such as lung and
breast cancer. These genes can be used for prediction and analysis of their
gene products, which will help in designing new diagnostic and treatment
strategies for colon cancer.