5 research outputs found
Novel and simple transformation algorithm for combining microarray data sets
BACKGROUND:
With microarray technology, variability in experimental environments such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and unreliable information. Therefore, one of the most pressing challenges in the field of microarray technology is how to integrate results from different microarray experiments or combine data sets prior to the specific analysis.
RESULTS:
Two microarray data sets based on a 17k cDNA microarray system were used, consisting of 82 normal colon mucosa and 72 colorectal cancer tissues. Each data set was prepared from either total RNA or amplified mRNA, and the RNA source difference between the two data sets was detected with an analysis of variance (ANOVA) model. A simple integration method was introduced, based on the distributions of gene expression ratios among different microarray data sets. The method transformed gene expression ratios into the form of a reference data set on a gene-by-gene basis. Hierarchical clustering analysis, density and box plots, and mixture scores with correlation coefficients showed that the two data sets were well intermingled, indicating that the proposed method minimized the experimental bias. In addition, no RNA source effect was detected after the proposed transformation was applied. In the mixed data set, the two previously identified subgroups of normal and tumor were well separated, and the efficiency of integration was more prominent in the tumor groups than in the normal groups. The transformation method was slightly more effective when a data set with strong homogeneity within the same experimental group was used as the reference data set.
CONCLUSION:
The proposed method is simple but useful for combining several data sets obtained under different experimental conditions. With this method, biologically useful information can be detected by applying various analytic methods to the combined data set, which has an increased sample size.
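The abstract describes the transformation only in outline (gene-by-gene, based on the distributions of expression ratios, mapping onto a reference data set). As a minimal sketch of what such a step could look like, the following assumes a simple location-scale match: each gene's log-ratios in the target data set are rescaled to the mean and standard deviation of the same gene in the reference data set. This choice, the function name `transform_to_reference`, and the toy values are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, stdev


def transform_to_reference(target, reference):
    """Rescale each gene in `target` to the mean and SD of that gene in `reference`.

    Both arguments map a gene ID to a list of log expression ratios,
    one value per sample. This is an assumed location-scale match,
    not necessarily the transformation used in the paper.
    """
    transformed = {}
    for gene, values in target.items():
        ref_values = reference.get(gene)
        # Skip genes that cannot be matched or have too few samples for an SD.
        if ref_values is None or len(values) < 2 or len(ref_values) < 2:
            continue
        t_mean, t_sd = mean(values), stdev(values)
        r_mean, r_sd = mean(ref_values), stdev(ref_values)
        if t_sd == 0:
            # A constant gene carries no spread; place it at the reference mean.
            transformed[gene] = [r_mean for _ in values]
        else:
            transformed[gene] = [r_mean + (v - t_mean) * r_sd / t_sd for v in values]
    return transformed


# Toy, made-up log-ratios for two hypothetical genes.
reference = {"geneA": [0.1, 0.3, 0.2], "geneB": [-1.0, -0.8, -1.2]}
target = {"geneA": [1.1, 1.5, 1.3], "geneB": [0.0, 0.4, -0.4]}
adjusted = transform_to_reference(target, reference)
```

After this step each gene in `adjusted` has the same mean and standard deviation as in `reference`, so the two data sets can be concatenated sample-wise. A rank-based or quantile-based mapping would be an alternative under the same description.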
An Attempt for Combining Microarray Data Sets by Adjusting Gene Expressions
PURPOSE:
The diverse experimental environments in microarray technology, such as different platforms or different RNA sources, can cause biases in the analysis of multiple microarrays. These systematic effects present a substantial obstacle to the analysis of microarray data, and the resulting information may be inconsistent and unreliable. We therefore introduced a simple integration method for combining microarray data sets derived from different experimental conditions, expecting the combined data set to yield more reliable information than the separate data sets.
MATERIALS AND METHODS:
This method is based on the distributions of the gene expression ratios among the different microarray data sets and it transforms, gene by gene, the gene expression ratios into the form of the reference data set. The efficiency of the proposed integration method was evaluated using two microarray data sets, which were derived from different RNA sources, and a newly defined measure, the mixture score.
RESULTS:
The proposed integration method intermixed the two data sets obtained from different RNA sources, which in turn reduced the experimental bias between them, and the mixture score increased by 24.2%. A data set combined by the proposed method preserved the inter-group relationship of the separate data sets.
CONCLUSION:
The proposed method worked well in adjusting systematic biases, including the source effect. The ability to use an effectively integrated microarray data set yields more reliable results due to the larger sample size and this also decreases the chance of false negatives.
Genome-wide screening of genes associated with liver metastasis of colorectal cancer using a cDNA microarray
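Department of Medical Science / Master's thesis
[Korean abstract] The leading cause of death in colorectal cancer is known to be liver metastasis. To understand the mechanism of liver metastasis, we used cDNA microarray gene expression profiling to select genes associated with liver metastasis in colorectal cancer.
From 22 colorectal cancer patients who had surgically resectable liver metastasis at diagnosis, normal colon tissue, primary colorectal tumor tissue, and liver metastatic tumor tissue were obtained as matched sets at the time of surgery, and normal liver tissue and colon polyps were additionally collected where possible. Total RNA was extracted, mRNA was selectively amplified by the T7 linear mRNA amplification method, and microarray experiments were performed on a 17K cDNA microarray in an indirect design using reference RNA. After normalization and preprocessing, overall gene expression patterns were examined by unsupervised hierarchical clustering, and the samples were then divided into a training set of 14 and a test set of 8 for analysis. In the training set, Significance Analysis of Microarrays (SAM) was used to select organ-specific genes whose expression differed significantly between normal colon tissue and liver tissue, and these genes were excluded from subsequent analysis. Using the genes that remained after removal of the organ-specific genes, genes specific to primary colorectal tumors and liver metastatic tumors were selected and validated in the training set with Prediction Analysis of Microarrays (PAM), and gene functions were then examined using the NIH-DAVID database (http://apps1.niaid.nih.gov/david/).
When unsupervised hierarchical clustering was performed across all experiments using 12,823 genes selected at a 100% non-missing proportion, the samples tended to separate broadly into normal colon, normal liver, primary colorectal tumor, and liver metastatic tumor tissue, even though the clustering was unsupervised. Using normal colon and liver tissue, 4,616 organ-specific genes were selected with SAM at a false discovery rate of 0.062%, and the analysis proceeded with the 8,207 genes that remained after their removal. The training set of 14 pairs of primary colorectal tumor and liver metastatic tumor tissue was trained with PAM, and significant gene sets of 81, 126, and 252 genes were first selected in the range where the training error and the cross-validation misclassification error were both 0. Each of these three gene sets was tested against the 8 independent test samples and by supervised hierarchical clustering, and an optimal set of 126 genes that adequately distinguished primary colorectal tumors from liver metastatic tumors was selected.
Of these 126 genes, 38 were up-regulated and 88 were down-regulated in liver metastatic tumors. The selected genes included known oncogenes and cell adhesion-related genes such as WNT5A, lipocalin 2, E-cadherin, and deiodinase, as well as 16 ESTs. In particular, expression of MMP-1, MMP-2, and WNT5A was lower in liver metastatic tumors than in primary colorectal tumors, and these genes are thought to be involved in the early stage of metastasis to the liver. In contrast, TIMP-1 expression was observed to increase (p < 0.01). In conclusion, we examined whole-genome expression patterns in colorectal cancer and identified 126 genes that are likely to play an important role in the liver metastasis process.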
[English abstract] Liver metastasis is the major cause of death in colorectal cancer patients. Using cDNA microarray gene expression profiling, we selected genes with metastatic potential from colorectal cancer patients with liver metastasis. We performed cDNA microarray experiments in an indirect design based on the T7 linear mRNA amplification method, using a 17K human cDNA microarray with RNA from normal colorectal mucosa, primary tumor, normal liver, and liver metastatic tumor tissues of 22 paired patients. After normalization, we evaluated the gene expression profiles of all samples by unsupervised clustering. Next, we selected organ-specific genes by comparing normal colon mucosa and liver tissue using Significance Analysis of Microarrays (SAM). We then selected and validated liver metastasis-related genes by comparing primary colorectal tumors and liver metastatic tumors, applying Prediction Analysis of Microarrays (PAM) to the genes that remained after the organ-specific genes were subtracted. The selected genes were functionally annotated using the NIH-DAVID database (http://apps1.niaid.nih.gov/david/).
Unsupervised hierarchical clustering of the 22 paired samples with 12,823 genes at a 100% non-missing proportion showed different expression profiles between primary tumors and liver metastases. We selected 4,616 organ-specific genes from normal colon mucosa and liver tissue at a false discovery rate of 0.062% using SAM and excluded them from further analysis. From the remaining 8,207 genes, we identified 38 up-regulated and 88 down-regulated genes in metastatic lesions using PAM on a training set of 14 pairs, and validated the selected genes in an independent test set of 8 pairs. The 126 selected genes included many known oncogenes and cell adhesion molecules, such as wingless-type MMTV integration site family member 5A (WNT5A), lipocalin 2, E-cadherin, and deiodinase, as well as 16 ESTs.
In particular, we observed that expression of MMP-1, MMP-2, and WNT5A was significantly decreased in liver metastatic tumors (p<0.001), suggesting that they play a role in early-stage colorectal cancer rather than in systemic metastasis. In contrast, TIMP-1 expression was significantly increased in liver metastatic tumors (p<0.001). In conclusion, we scanned the whole genome using a cDNA microarray and identified 126 genes that might play a significant role in liver metastasis in colorectal cancer.
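The gene-selection bookkeeping in this abstract (remove organ-specific genes, then split the selected genes by direction of change) can be sketched as follows. The sketch assumes the organ-specific list and the per-gene expression differences have already been produced by SAM and PAM; it does not reimplement either method. The function names and toy gene IDs are hypothetical, and only the counts in the final two checks come from the abstract.

```python
def remove_organ_specific(candidate_genes, organ_specific_genes):
    """Return the candidate genes that are not in the organ-specific list."""
    excluded = set(organ_specific_genes)
    return [gene for gene in candidate_genes if gene not in excluded]


def split_by_direction(mean_log_ratio_diff):
    """Split genes into up- and down-regulated lists.

    `mean_log_ratio_diff` maps a gene ID to (metastasis minus primary)
    mean log-ratio; positive means higher in the metastatic tumor.
    """
    up = sorted(g for g, d in mean_log_ratio_diff.items() if d > 0)
    down = sorted(g for g, d in mean_log_ratio_diff.items() if d < 0)
    return up, down


# Toy, made-up gene IDs and differences (not data from the thesis).
candidates = ["g1", "g2", "g3", "g4", "g5"]
organ_specific = ["g2", "g5"]
remaining = remove_organ_specific(candidates, organ_specific)

diffs = {"g1": 0.8, "g3": -1.2, "g4": -0.4}
up, down = split_by_direction(diffs)

# Consistency of the counts reported in the abstract.
assert 12823 - 4616 == 8207  # genes left after removing organ-specific genes
assert 38 + 88 == 126        # up-regulated plus down-regulated genes
```

Removing organ-specific genes before comparing primary and metastatic tumors keeps differences that only reflect colon versus liver background tissue out of the candidate list.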
Identification of novel gastric cancer-associated CNVs by integrated analysis of microarray
BACKGROUND: Microarray-CGH facilitates analysis of cancer-associated genomic differences between normal and tumor tissues and provides a genome-wide assessment of copy number variations (CNVs).
METHODS: To identify CNVs and their clinical significance in gastric cancer, Microarray-CGH was performed with genomic DNA (gDNA) from normal placenta tissue, peripheral blood mononuclear cells (PBMCs), and normal gastric tissue.
RESULTS: A total of 20 CNVs, including 8 novel CNVs, were identified by Microarray-CGH. Among the 20 CNVs, 5 showed an aberration frequency of over 50%. In addition, mRNA expression of W72437 (TFIIH), AI968311 (GAGE10), AI352361, and AA169807 (PTCH1) in normal tissues and of AA485362 (GPX1), AI201652, and AI968311 (GAGE10) in cancer tissues was associated with DNA change. Overall, the incidences of oncogene-like, suppressor-like, and innocent CNVs were 13.8%, 13.2%, and 73.0%, respectively (gain 11.4%, loss 11.8%). AA936795 (C19orf61) appeared as an oncogene-like CNV (9/30, 30%), whereas AI352361 (13/30, 43%) and AA281797 (LOC728340; 10/30, 33%) appeared as tumor suppressor-related CNVs.
CONCLUSIONS: This study identified gastric cancer-associated and innocent CNVs in gDNA isolated from placenta tissue and PBMC, which are generally used as reference samples in Microarray-CGH. These novel CNVs may be used for gastric cancer-specific gene selection in comparative analysis of genomics.
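The per-clone frequencies quoted in the results (for example 9/30 reported as 30%) are simple proportions of samples carrying the aberration. A minimal sketch of that arithmetic follows; the helper name `aberration_frequency` is hypothetical, and the counts are the three reported in the abstract, each out of 30 samples.

```python
def aberration_frequency(n_aberrant, n_total):
    """Percentage of samples in which a clone shows a copy number aberration."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_aberrant / n_total


# Counts reported in the abstract, each out of 30 samples.
reported_counts = {"AA936795": 9, "AI352361": 13, "AA281797": 10}
frequencies = {
    clone: round(aberration_frequency(count, 30))
    for clone, count in reported_counts.items()
}
```

Rounded to whole percentages, these reproduce the 30%, 43%, and 33% given in the results.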
Whole genome analysis for liver metastasis gene signatures in colorectal cancer
Liver metastasis is one of the major causes of death in colorectal cancer (CRC) patients. To understand this process, we investigated whether the gene expression profiling of matched colorectal carcinomas and liver metastases could reveal key molecular events involved in tumor progression and metastasis. We performed experiments using a cDNA microarray containing 17,104 genes with the following tissue samples: paired tissues of 25 normal colorectal mucosa, 27 primary colorectal tumors, 13 normal liver and 27 liver metastasis, and 20 primary colorectal tumors without liver metastasis. To remove the effect of normal cell contamination, we selected 4,583 organ-specific genes with a false discovery rate (FDR) of 0.0067% by comparing normal colon and liver tissues using Significance Analysis of Microarrays (SAM), and these genes were excluded from further analysis. We then identified and validated 46 liver metastasis-specific genes with an accuracy of 83.3% by comparing the expression of paired primary colorectal tumors and liver metastases using Prediction Analysis of Microarrays (PAM). The 46 selected genes included several known oncogenes and 2 ESTs. To confirm that the results correlated with the microarray expression patterns, we performed RT-PCR with WNT5A and carbonic anhydrase II. Additionally, we observed that 21 of the 46 genes were differentially expressed (FDR = 2.27%) in primary tumors with synchronous liver metastasis compared with primary tumors without liver metastasis. We scanned the human genome using a cDNA microarray and identified 46 genes that may play an important role in the progression of liver metastasis in CRC.
