Novel Rank-Based Statistical Methods Reveal MicroRNAs with Differential Expression in Multiple Cancer Types
BACKGROUND: MicroRNAs (miRNAs) regulate target genes at the post-transcriptional level and play important roles in cancer pathogenesis and development. Variation amongst individuals is a significant confounding factor in miRNA (or other) expression studies. The true character of biologically or clinically meaningful differential expression can be obscured by inter-patient variation. In this study we aim to identify miRNAs with consistent differential expression in multiple tumor types using a novel data analysis approach. METHODS: Using microarrays we profiled the expression of more than 700 miRNAs in 28 matched tumor/normal samples from 8 different tumor types (breast, colon, liver, lung, lymphoma, ovary, prostate and testis). This set is unique in its emphasis on minimizing tissue-type and patient-related variability by using normal and tumor samples from the same patient. We develop scores for comparing miRNA expression in the above matched sample data based on a rigorous characterization of the distribution of order statistics over a discrete state set, including exact p-values. Specifically, we compute a Rank Consistency Score (RCoS) for every miRNA measured in our data. Our methods are also applicable in various other contexts. We compare our methods, as applied to matched samples, to the paired t-test and to the Wilcoxon signed-rank test. RESULTS: We identify miRNAs that are consistently differentially expressed across the cancer types measured. 41 miRNAs are under-expressed in cancer compared to normal at an FDR (False Discovery Rate) of 0.05, and 17 are over-expressed at the same FDR level. Differentially expressed miRNAs include known oncomiRs (e.g., miR-96) as well as miRNAs that were not previously universally associated with cancer. Specific examples include miR-133b and miR-486-5p, which are consistently down-regulated, and miR-629*, which is consistently up-regulated in cancer, in the context of our cohort. Data is available in GEO.
Software is available at: http://bioinfo.cs.technion.ac.il/people/zohar/RCoS
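The abstract above reports miRNA counts at an FDR of 0.05 but does not say which FDR-controlling procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up rule (an assumption, not the authors' stated method), per-miRNA p-values could be thresholded as follows. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    # Sort hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical per-miRNA p-values (illustrative only, not data from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
rejected = benjamini_hochberg(example_p, fdr=0.05)
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.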
Multilocus analysis of SNP and metabolic data within a given pathway
BACKGROUND: Complex traits, which are under the influence of multiple and possibly interacting genes, have become a subject of new statistical methodological research. One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common multifactorial diseases and their association with different quantitative phenotypic traits. RESULTS: Two types of data from the same metabolic pathway were used in the analysis: categorical measurements of 18 SNPs, and quantitative measurements of plasma levels of several steroids and their precursors. Using the combinatorial partitioning method we tested various thresholds for each metabolic trait and each individual SNP locus. One SNP in CYP19 (3′UTR), two SNPs in CYP1B1 (R48G and A119S) and one in CYP1A1 (T461N) were distributed significantly differently between the high- and low-level metabolic groups. Leave-one-out cross-validation showed that 6 SNPs in concert predict the phenotype correctly in 65% of cases. Further, we used pattern recognition, computing the p-value by Monte Carlo simulation, to identify sets of SNPs and physiological characteristics such as age and weight that contribute to a given metabolic level. Since the SNPs detected by both methods reside either in the same gene (CYP1B1) or in 3 different genes in immediate vicinity on chromosome 15 (CYP19, CYP11 and CYP1A1), we investigated the possibility that they form intragenic and intergenic haplotypes, which may jointly account for a higher activity in the pathway. We identified such haplotypes associated with metabolic levels. CONCLUSION: The methods reported here may enable the study of multiple low-penetrance genetic factors that together determine various quantitative phenotypic traits.
Our preliminary data suggest that several genes coding for proteins involved in a common pathway, which happen to be located in common chromosomal areas and may form intragenic haplotypes, together account for a higher activity of the whole pathway.
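The abstract above mentions computing a p-value by Monte Carlo simulation without giving details. A minimal sketch of the general idea, assuming a simple label-shuffling test on the difference in mean metabolite level between two genotype groups, might look like this. It is a generic permutation test, not the authors' pattern-recognition procedure; the function name, parameters, and sample values are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Estimate a two-sided p-value for the difference in group means by shuffling labels."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        # Small tolerance guards against floating-point noise in ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical metabolite levels for carriers and non-carriers of one SNP allele.
carriers = [11.0, 12.0, 13.0, 14.0, 15.0]
non_carriers = [1.0, 2.0, 3.0, 4.0, 5.0]
p_estimate = permutation_p_value(carriers, non_carriers)
```

The precision of the estimate is limited by the number of permutations: with 10000 shuffles the smallest reportable p-value is roughly 0.0001.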
Robust interlaboratory reproducibility of a gene expression signature measurement consistent with the needs of a new generation of diagnostic tools
The increasing use of DNA microarrays in biomedical research, toxicogenomics, pharmaceutical development, and diagnostics has focused attention on the reproducibility and reliability of microarray measurements. While the reproducibility of microarray gene expression measurements has been the subject of several recent reports, there is still a need for systematic investigation into what factors most contribute to variability of measured expression levels observed among different laboratories and different experimenters.
Diversity of human copy number variation and multicopy genes
Copy number variants affect both disease and normal phenotypic variation, but those lying within heavily duplicated, highly identical sequence have been difficult to assay. By analyzing short-read mapping depth for 159 human genomes, we demonstrated accurate estimation of absolute copy number for duplications as small as 1.9 kilobase pairs, ranging from 0 to 48 copies. We identified 4.1 million singly unique nucleotide positions informative in distinguishing specific copies and used them to genotype the copy and content of specific paralogs within highly duplicated gene families. These data identify human-specific expansions in genes associated with brain development, reveal extensive population genetic diversity, and detect signatures consistent with gene conversion in the human species. Our approach makes ∼1000 genes accessible to genetic studies of disease association
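The abstract above estimates absolute copy number from short-read mapping depth. A minimal sketch of the underlying arithmetic, assuming depth scales linearly with copy number and is normalized against control regions of known diploid copy number, is shown below. The published pipeline involves further corrections that are not described here; the function name and the depth values are hypothetical.

```python
from statistics import median


def estimate_copy_number(region_depth, control_depths, control_copies=2):
    """Scale a region's read depth by the median depth of known-copy control regions."""
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control regions must have positive read depth")
    # Depth per copy is baseline / control_copies, so copies = depth / (depth per copy).
    return control_copies * region_depth / baseline


# Hypothetical depths: diploid control regions near 30x, duplicated region at 120x.
copies = estimate_copy_number(120.0, [30.0, 31.0, 29.0, 30.0])
```

Using the median rather than the mean keeps a single outlying control region from shifting the baseline.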
A Second-Generation Genomewide Screen for Asthma-Susceptibility Alleles in a Founder Population
A genomewide screen for asthma- and atopy-susceptibility loci was conducted, using 563 markers, in 693 Hutterites who are members of a single 15-generation pedigree, nearly doubling the sample size from the authors' earlier studies. The resulting increase in power led to the identification of 23 loci in 18 chromosomal regions showing evidence for linkage that is, in general, 10-fold more significant (P<.001 vs. P<.01) than the linkages reported previously in this population. Moreover, linkages to loci in 11 chromosomal regions were identified for the first time in the Hutterites in this report, including five regions (5p, 5q, 8p, 14q, and 16q) showing evidence both of linkage, by the likelihood ratio (LR) χ² test, and of disequilibrium, by the transmission/disequilibrium test. A region on chromosome 19 continues to show evidence for linkage, by both tests, in this study. Studies of 17 candidate genes provide evidence for association with variation in the IL4RA gene (16p12), the HLA class II genes (6p21), and the interferon-α gene cluster (9p22), but the lack of evidence for linkage in these regions by the LR χ² test suggests that these are minor susceptibility loci. A polymorphism in the CD14 gene is in linkage disequilibrium with an as yet unidentified susceptibility allele in the 5q cytokine cluster, a region showing evidence for linkage among the Hutterites. Finally, 10 of the regions showing evidence for linkage in the Hutterites have shown evidence of linkage to related phenotypes in other genome screens, suggesting that these regions may contain common alleles that have relatively large effects on asthma and atopy phenotypes in diverse populations.
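The abstract above relies on the transmission/disequilibrium test. As a minimal sketch, assuming the classic trio-based form of the test (a McNemar-type statistic on transmissions from heterozygous parents, referred to a χ² distribution with 1 degree of freedom), the computation is shown below. The study itself analyzes a large pedigree and may use an adapted version; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic TDT: chi-square statistic (1 df) and p-value from transmission counts."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("need at least one informative transmission")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: allele transmitted 60 times and not transmitted 40 times.
chi2_stat, p_val = tdt(60, 40)
```

Only heterozygous parents contribute to the counts, since a homozygous parent's transmission carries no information about preferential transmission.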
Analysis of SNP-Expression Association Matrices (© Imperial College Press)
High-throughput expression profiling and genotyping technologies provide the means to study the genetic determinants of population variation in gene expression. In this paper we present a general statistical framework for the simultaneous analysis of gene expression data and SNP genotype data measured for the same cohort. The framework consists of methods to associate transcripts with SNPs affecting their expression, algorithms to detect subsets of transcripts that share significantly many associations with a subset of SNPs, and methods to visualize the identified relations. We apply our framework to SNP-expression data collected from 50 breast cancer patients. Our results demonstrate an overabundance of transcript-SNP associations in this data, and pinpoint SNPs that are potential master regulators of transcription. We also identify several statistically significant transcript subsets with common putative regulators that fall into well-defined functional categories.
Antibody Arrays Identify Potential Diagnostic Markers of Hepatocellular Carcinoma
Hepatocellular carcinoma (HCC) is the third leading cause of cancer deaths worldwide. Effective treatment of HCC patients is hampered by the lack of sensitive and specific diagnostic markers of HCC. Alpha-fetoprotein (AFP), the currently used HCC marker, misses 30%–50% of HCC patients, who therefore remain undiagnosed and untreated. In order to identify novel diagnostic markers that can be used individually or in combination with AFP, we used an antibody array platform to detect the levels of candidate proteins in the plasma of HCC patients (n = 48) and patients with chronic hepatitis B or C viral infections (n = 19), both of which are major risk factors for HCC. We identified 7 proteins that significantly differentiate HCC patients from hepatitis patients (p < 0.05) (AFP, CTNNB, CSF1, SELL, IGFBP6, IL6R, and VCAM1). Importantly, we also identified 8 proteins that significantly differentiate HCC patients with ‘normal’ levels of AFP (<20 ng/ml) from hepatitis patients (p < 0.05) (IL1RN, IFNG, CDKN1A, RETN, CXCL14, CTNNB, FGF2, and SELL). These markers are potentially important complementary markers to AFP. Using an independent immunoassay method in an independent group of 23 HCC patients and 22 hepatitis patients, we validated that plasma levels of CTNNB were significantly higher in the HCC group (p = 0.020). In conclusion, we used an antibody array platform to identify potential circulating diagnostic markers of HCC, some of which may be valuable when used in combination with AFP. The clinical utility of these newly identified HCC diagnostic markers needs to be systematically evaluated.