Hierarchical maximum likelihood clustering approach
Objective: In this work, we focused on developing a clustering approach for biological data. In many biological analyses, such as multi-omics data analysis and genome-wide association studies (GWAS), it is crucial to find groups of data belonging to subtypes of diseases or tumors.
Methods: Conventionally, the k-means clustering algorithm is overwhelmingly applied in many areas, including the biological sciences. There are, however, several alternative clustering algorithms that can be applied, including support vector clustering. In this paper, taking into consideration the nature of biological data, we propose a maximum likelihood clustering scheme based on a hierarchical framework.
Results: This method can perform clustering even when the data belonging to different groups overlap. It can also perform clustering when the number of samples is lower than the data dimensionality.
Conclusion: The proposed scheme does not require selecting initial settings to begin the search process. In addition, it does not require the computation of the first and second derivatives of likelihood functions, as is required by many other maximum likelihood based methods.
Significance: This algorithm uses distribution and centroid information to cluster a sample and was applied to biological data. A Matlab implementation of this method can be downloaded from http://www.riken.jp/en/research/labs/ims/med_sci_math/
Differential quantification of CYP2D6 gene copy number by four different quantitative real-time PCR assays
Copy number variations (CNVs) in the CYP2D6 gene contribute to interindividual variation in drug metabolism. As the most common duplicated allele in Asian populations is the nonfunctional CYP2D6*36 allele, the goal of this study was to identify CNV assays that can differentiate between multiple copies of the CYP2D6*36 allele and multiple copies of other CYP2D6 alleles. We determined CYP2D6 gene copy numbers in 32 individuals with known CYP2D6 CNVs from the Coriell Japanese-Chinese panel using four quantitative real-time PCR assays. These assays target different regions of the CYP2D6 gene: 5'-flanking region, intron 2, intron 6, and exon 9 (Ex9). The specific target site of the Ex9 assay was verified by sequencing the PCR amplicon. Three of the CYP2D6 CNV assays (5'-flanking region, intron 2, and intron 6) estimated CYP2D6 copy numbers that were concordant for all 32 individuals. However, the Ex9 assay was concordant with them in only 10 of 32 samples. The 10 concordant samples did not contain any CYP2D6*36 alleles and the 22 discordant samples contained at least one CYP2D6*36 allele. In addition, the Ex9 assay accurately quantified all of the non-CYP2D6*36 alleles in all samples. Ex9 amplicon sequencing indicated that the assay targets a region of CYP2D6 exon 9 that undergoes partial gene conversion in the CYP2D6*36 allele. In conclusion, the CYP2D6 Ex9 CNV assay can be used to determine the copy number of non-CYP2D6*36 alleles. Selective amplification of non-CYP2D6*36 sequence by the Ex9 assay should be useful in determining the number of functional copies of CYP2D6 in Asian populations.
Genome Wide Association Study of Age at Menarche in the Japanese Population
Age at menarche (AAM) is a complex trait involving both genetic and environmental factors. To identify the genetic factors associated with AAM, we conducted a large-scale meta-analysis of genome-wide association studies using more than 15,000 Japanese female samples. Here, we identified an association between SNP (single nucleotide polymorphism) rs364663 at the LIN28B locus and AAM, with a P-value of 5.49×10−7 and an effect size of 0.089 (year). We also evaluated 33 SNPs that were previously reported to be associated with AAM in women of European ancestry. Among them, two SNPs rs4452860 and rs7028916 in TMEM38B indicated significant association with AAM in the same directions as reported in previous studies (P = 0.0013 with an effect size of 0.051) even after Bonferroni correction for the 33 SNPs. In addition, six loci in or near CCDC85A, LOC100421670, CA10, ZNF483, ARNTL, and RXRG exhibited suggestive association with AAM (P<0.05). Our findings elucidated the impact of genetic variations on AAM in the Japanese population
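The Bonferroni step mentioned in this abstract can be checked with a short calculation. The sketch below assumes a family-wise significance level of 0.05, which the abstract does not state explicitly; the function names are illustrative only and are not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed family-wise alpha of 0.05 (not stated in the abstract);
# 33 SNPs and P = 0.0013 are the values reported in the abstract.
threshold = bonferroni_threshold(0.05, 33)
print(f"{threshold:.5f}")                 # prints 0.00152
print(is_significant(0.0013, 0.05, 33))   # prints True
```

Under that assumed level, the reported P = 0.0013 falls below 0.05 / 33, which is consistent with the abstract's statement that the TMEM38B association survives correction for the 33 SNPs.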
Comprehensive Analysis of Risk Factors for Periodontitis Focusing on the Saliva Microbiome and Polymorphism
Few studies have exhaustively assessed relationships among polymorphisms, the microbiome, and periodontitis. The objective of the present study was to assess associations simultaneously among polymorphisms, the microbiome, and periodontitis. We used propensity score matching with a 1:1 ratio to select subjects, and then 22 individuals (mean age ± standard deviation, 60.7 ± 9.9 years) were analyzed. After saliva collection, V3-4 regions of the 16S rRNA gene were sequenced to investigate microbiome composition, alpha diversity (Shannon index, Simpson index, Chao1, and abundance-based coverage estimator), and beta diversity using principal coordinate analysis (PCoA) based on weighted and unweighted UniFrac distances. A total of 51 single-nucleotide polymorphisms (SNPs) related to periodontitis were identified. The frequencies of SNPs were collected from Genome-Wide Association Study data. The PCoA of unweighted UniFrac distance showed a significant difference between periodontitis and control groups (p < 0.05). Two families (Lactobacillaceae and Desulfobulbaceae) and one species (Porphyromonas gingivalis) were observed only in the periodontitis group. No SNPs showed significant expression. These results suggest that periodontitis was related to the presence of P. gingivalis and the families Lactobacillaceae and Desulfobulbaceae but not SNPs.
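Two of the alpha diversity measures named in this abstract can be illustrated with a minimal sketch. The taxon counts below are hypothetical, not data from the study, and the Simpson index is computed in its Gini-Simpson form (1 minus the sum of squared proportions), which is an assumption because the abstract does not say which variant was used.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p_i^2) (assumed variant)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical read counts for four taxa in one saliva sample.
example_counts = [40, 30, 20, 10]
print(shannon_index(example_counts))
print(simpson_index(example_counts))
```

Both measures increase as reads are spread more evenly across more taxa, and both are zero for a sample containing a single taxon.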
Genome-Wide Association Study of Breast Cancer in the Japanese Population
Breast cancer is the most common malignancy among women worldwide, including Japan. Several studies have identified common genetic variants to be associated with the risk of breast cancer. Due to the complex linkage disequilibrium structure and various environmental exposures in different populations, it is essential to identify variants associated with breast cancer in each population, which subsequently facilitates a better understanding of mammary carcinogenesis. In this study, we conducted a genome-wide association study (GWAS) as well as whole-genome imputation with 2,642 cases and 2,099 unaffected female controls. We further examined 13 suggestive loci (P−5) using an independent sample set of 2,885 cases and 3,395 controls and successfully validated two previously reported loci: rs2981578 (combined P-value of 1.31×10−12, OR = 1.23; 95% CI = 1.16–1.30) on chromosome 10q26 (FGFR2), and rs3803662 (combined P-value of 2.79×10−11, OR = 1.21; 95% CI = 1.15–1.28) and rs12922061 (combined P-value of 3.97×10−10, OR = 1.23; 95% CI = 1.15–1.31) on chromosome 16q12 (TOX3-LOC643714). A weighted genetic risk score based on three significantly associated variants and two previously reported breast cancer associated loci in East Asian populations revealed that individuals who carry the most risk alleles (category 5) have a 2.2 times higher risk of developing breast cancer in the Japanese population than those who carry the fewest risk alleles (reference category 1). Although we could not identify additional loci associated with breast cancer, our study utilized one of the largest sample sizes reported to date and provided genetic status that represents the Japanese population. Further local and international collaborative study is essential to identify additional genetic variants that could lead to a better, more accurate prediction for breast cancer.
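A weighted genetic risk score of the kind described in this abstract is commonly computed as the sum of risk-allele counts weighted by the natural log of each variant's odds ratio. The sketch below assumes that common form; it is not confirmed to be the authors' exact procedure. It uses only the three odds ratios reported in the abstract (the two additional East Asian loci are omitted because their odds ratios are not given), and the genotypes are hypothetical.

```python
import math

# Odds ratios as reported in the abstract for the three validated variants.
REPORTED_OR = {
    "rs2981578": 1.23,
    "rs3803662": 1.21,
    "rs12922061": 1.23,
}


def weighted_grs(risk_allele_counts, odds_ratios=REPORTED_OR):
    """Sum of ln(OR) * risk-allele count (0, 1, or 2) over the given SNPs.

    Assumed log-odds weighting; the abstract does not spell out the weights.
    """
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += math.log(odds_ratios[snp]) * count
    return score


# Hypothetical genotypes: risk-allele counts for two individuals.
low_risk = {"rs2981578": 0, "rs3803662": 0, "rs12922061": 0}
high_risk = {"rs2981578": 2, "rs3803662": 2, "rs12922061": 2}
print(weighted_grs(low_risk))
print(weighted_grs(high_risk))
```

Individuals would then be binned into ordered categories by score; the category boundaries used in the study are not stated in the abstract, so none are assumed here.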
Identification of a Functional Variant in the MICA Promoter Which Regulates MICA Expression and Increases HCV-Related Hepatocellular Carcinoma Risk
Hepatitis C virus (HCV) infection is the major cause of hepatocellular carcinoma (HCC) in Japan. We previously identified the association of SNP rs2596542 in the 5' flanking region of the MHC class I polypeptide-related sequence A (MICA) gene with the risk of HCV-induced HCC. In the current study, we performed detailed functional analysis of 12 candidate SNPs in the promoter region and found that SNP rs2596538, located 2.8 kb upstream of the MICA gene, affected the binding of a nuclear protein(s) to the genomic segment including this SNP. By electrophoretic mobility shift assay (EMSA) and chromatin immunoprecipitation (ChIP) assay, we identified that transcription factor Specificity Protein 1 (SP1) can bind to the protective G allele, but not to the risk A allele. In addition, a reporter construct containing the G allele was found to exhibit higher transcriptional activity than that containing the A allele. Moreover, SNP rs2596538 showed stronger association with HCV-induced HCC (P = 1.82×10−5 and OR = 1.34) than the previously identified SNP rs2596542. We also found a significantly higher serum level of soluble MICA (sMICA) in HCV-induced HCC patients carrying the G allele than in those carrying the A allele (P = 0.00616). In summary, we have identified a functional SNP that is associated with the expression of MICA and the risk for HCV-induced HCC.
Genome-wide association studies identify polygenic effects for completed suicide in the Japanese population
Suicide is a significant public health problem worldwide, and several Asian countries including Japan have relatively high suicide rates on a world scale. Twin, family, and adoption studies have suggested high heritability for suicide, but genetics lags behind due to difficulty in obtaining samples from individuals who died by suicide, especially in non-European populations. In this study, we carried out genome-wide association studies combining two independent datasets totaling 746 suicides and 14,049 non-suicide controls in the Japanese population. Although we identified no genome-wide significant single-nucleotide polymorphisms (SNPs), we demonstrated significant SNP-based heritability (35–48%; P < 0.001) for completed suicide by genomic restricted maximum-likelihood analysis and a shared genetic risk between two datasets (P best = 2.7 × 10−13) by polygenic risk score analysis. This study is the first genome-wide association study for suicidal behavior in an East Asian population, and our results provided the evidence of polygenic architecture underlying completed suicide
Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls
Pathogenic variants in highly penetrant genes are useful for the diagnosis, therapy, and surveillance for hereditary breast cancer. Large-scale studies are needed to inform future testing and variant classification processes in Japanese. We performed a case-control association study for variants in coding regions of 11 hereditary breast cancer genes in 7051 unselected breast cancer patients and 11,241 female controls of Japanese ancestry. Here, we identify 244 germline pathogenic variants. Pathogenic variants are found in 5.7% of patients, ranging from 15% in women diagnosed <40 years to 3.2% in patients ≥80 years, with BRCA1/2, explaining two-thirds of pathogenic variants identified at all ages. BRCA1/2, PALB2, and TP53 are significant causative genes. Patients with pathogenic variants in BRCA1/2 or PTEN have significantly younger age at diagnosis. In conclusion, BRCA1/2, PALB2, and TP53 are the major hereditary breast cancer genes, irrespective of age at diagnosis, in Japanese women
Aromatase inhibitors, estrogens and musculoskeletal pain: estrogen-dependent T-cell leukemia 1A (TCL1A) gene-mediated regulation of cytokine expression
Introduction: Arthralgias and myalgias are major side effects associated with aromatase inhibitor (AI) therapy of breast cancer. In a recent genome-wide association study, we identified SNPs - including one that created an estrogen response element near the 3' end of the T-cell leukemia 1A (TCL1A) gene - that were associated with musculoskeletal pain in women on adjuvant AI therapy for breast cancer. We also showed estrogen-dependent, SNP-modulated variation in TCL1A expression and, in preliminary experiments, showed that TCL1A could induce IL-17RA expression. In the present study, we set out to determine whether these SNPs might influence cytokine expression and effect more widely, and, if so, to explore the mechanism of TCL1A-related AI-induced side effects. Methods: The functional genomic experiments performed included determinations of TCL1A, cytokine and cytokine receptor expression in response to estrogen treatment of U2OS cells and lymphoblastoid cell lines that had been stably transfected with estrogen receptor alpha. Changes in mRNA and protein expression after gene knockdown and overexpression were also determined, as was NF-κB transcriptional activity. Results: Estradiol (E2) increased TCL1A expression and, in a TCL1A SNP-dependent fashion, also altered the expression of IL-17, IL-17RA, IL-12, IL-12RB2 and IL-1R2. TCL1A expression was higher in E2-treated lymphoblastoid cell lines with variant SNP genotypes, and induction of the expression of cytokine and cytokine receptor genes was mediated by TCL1A. Finally, estrogen receptor alpha blockade with ICI-182,780 in the presence of E2 resulted in greatly increased NF-κB transcriptional activity, but only in cells that carried variant SNP genotypes. 
These results linked variant TCL1A SNP sequences that are associated with AI-dependent musculoskeletal pain with increased E2-dependent TCL1A expression and with downstream alterations in cytokine and cytokine receptor expression as well as NF-κB transcriptional activity. Conclusions: SNPs near the 3' terminus of TCL1A were associated with AI-dependent musculoskeletal pain. E2 induced SNP-dependent TCL1A expression, which in turn altered IL-17, IL-17RA, IL-12, IL-12RB2, and IL-1R2 expression as well as NF-κB transcriptional activity. These results provide a pharmacogenomic explanation for a clinically important adverse drug reaction as well as insights into a novel estrogen-dependent mechanism for the modulation of cytokine and cytokine receptor expression