
    Hierarchical maximum likelihood clustering approach

    Objective: In this work, we focused on developing a clustering approach for biological data. In many biological analyses, such as multi-omics data analysis and genome-wide association studies (GWAS), it is crucial to find groups of data belonging to subtypes of diseases or tumors. Methods: Conventionally, the k-means clustering algorithm is overwhelmingly applied in many areas, including the biological sciences. There are, however, several alternative clustering algorithms that can be applied, including support vector clustering. In this paper, taking into consideration the nature of biological data, we propose a maximum likelihood clustering scheme based on a hierarchical framework. Results: This method can perform clustering even when the data belonging to different groups overlap. It can also perform clustering when the number of samples is lower than the data dimensionality. Conclusion: The proposed scheme does not require selecting initial settings to begin the search process. In addition, it does not require the computation of the first and second derivatives of the likelihood function, as many other maximum likelihood based methods do. Significance: This algorithm uses distribution and centroid information to cluster a sample and was applied to biological data. A Matlab implementation of this method can be downloaded from http://www.riken.jp/en/research/labs/ims/med_sci_math/

    Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls

    Pathogenic variants in highly penetrant genes are useful for the diagnosis, therapy, and surveillance of hereditary breast cancer. Large-scale studies are needed to inform future testing and variant classification processes in the Japanese population. We performed a case-control association study for variants in coding regions of 11 hereditary breast cancer genes in 7,051 unselected breast cancer patients and 11,241 female controls of Japanese ancestry. Here, we identify 244 germline pathogenic variants. Pathogenic variants are found in 5.7% of patients, ranging from 15% in women diagnosed <40 years to 3.2% in patients ≥80 years, with BRCA1/2 explaining two-thirds of pathogenic variants identified at all ages. BRCA1/2, PALB2, and TP53 are significant causative genes. Patients with pathogenic variants in BRCA1/2 or PTEN have a significantly younger age at diagnosis. In conclusion, BRCA1/2, PALB2, and TP53 are the major hereditary breast cancer genes, irrespective of age at diagnosis, in Japanese women.

    Clinical Pharmacogenetics Implementation Consortium (CPIC) Guidelines for Human Leukocyte Antigen B (HLA-B) Genotype and Allopurinol Dosing: 2015 update

    The Clinical Pharmacogenetics Implementation Consortium (CPIC) Guidelines for HLA-B*58:01 Genotype and Allopurinol Dosing was originally published in February 2013. We reviewed the recent literature and concluded that none of the evidence would change the therapeutic recommendations in the original guideline; therefore, the original publication remains clinically current. However, we have updated the Supplemental Material and included additional resources for incorporating CPIC guidelines into the electronic health record. Up-to-date information can be found at PharmGKB (http://www.pharmgkb.org).

    A Genome-Wide Association Study of Nephrolithiasis in the Japanese Population Identifies Novel Susceptible Loci at 5q35.3, 7p14.3, and 13q14.1

    Nephrolithiasis is a common nephrologic disorder with complex etiology. To identify the genetic factor(s) for nephrolithiasis, we conducted a three-stage genome-wide association study (GWAS) using a total of 5,892 nephrolithiasis cases and 17,809 controls of Japanese origin. Here we found three novel loci for nephrolithiasis: RGS14-SLC34A1-PFN3-F12 on 5q35.3 (rs11746443; P = 8.51×10^−12, odds ratio (OR) = 1.19), INMT-FAM188B-AQP1 on 7p14.3 (rs1000597; P = 2.16×10^−14, OR = 1.22), and DGKH on 13q14.1 (rs4142110; P = 4.62×10^−9, OR = 1.14). Subsequent analyses in 21,842 Japanese subjects revealed the association of SNP rs11746443 with the reduction of estimated glomerular filtration rate (eGFR) (P = 6.54×10^−8), suggesting a crucial role for this variation in renal function. Our findings elucidated the significance of genetic variations for the pathogenesis of nephrolithiasis.
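The odds ratios and P values quoted in this abstract are summary statistics of case-control allele counts. As a rough illustration only, the sketch below computes an allelic odds ratio, a Woolf 95% confidence interval, and a 1-degree-of-freedom chi-square P value from a single 2×2 table. The function name and the allele counts are hypothetical, and the study's actual three-stage analysis is not reproduced here.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (OR, CI low, CI high, P) for a 2x2 table of allele counts.

    All four counts are assumed to be positive integers.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # Woolf standard error of log(OR), used for a 95% confidence interval.
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)

    # Pearson chi-square statistic for a 2x2 table (no continuity correction).
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator

    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, ci_low, ci_high, p_value


# Hypothetical allele counts, chosen only to exercise the function.
or_, low, high, p = allelic_odds_ratio(300, 700, 250, 750)
print(f"OR = {or_:.3f}, 95% CI = ({low:.3f}, {high:.3f}), P = {p:.4f}")
```

A genome-wide scan repeats a test of this kind for every SNP, which is why the reported P values are judged against a much stricter threshold than 0.05.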

    Association of Common Variants in TNFRSF13B, TNFSF13, and ANXA3 with Serum Levels of Non-Albumin Protein and Immunoglobulin Isotypes in Japanese

    We performed a genome-wide association study (GWAS) on levels of serum total protein (TP), albumin (ALB), and non-albumin protein (NAP). We analyzed SNPs on autosomal chromosomes using data from 9,103 Japanese individuals, followed by a replication study of 1,600 additional individuals. We confirmed the previously reported association of GCKR on chromosome 2p23.3 with serum ALB (rs1260326, Pmeta = 3.1×10^−9), and additionally identified a genome-wide significant association of rs4985726 in TNFRSF13B on 17p11.2 with both TP and NAP (Pmeta = 1.2×10^−14 and 7.1×10^−24, respectively). For NAP, rs3803800 and rs11552708 in TNFSF13 on 17p13.1 (Pmeta = 7.2×10^−15 and 7.5×10^−10, respectively) as well as rs10007186 on 4q21.2 near ANXA3 (Pmeta = 1.3×10^−9) also showed significant associations. Interestingly, TNFRSF13B and TNFSF13 encode a tumor necrosis factor (TNF) receptor and its ligand, which together constitute an important receptor-ligand axis for B-cell homeostasis and immunoglobulin production. Furthermore, three SNPs, rs4985726, rs3803800, and rs11552708 in TNFRSF13B and TNFSF13, were associated with serum levels of IgG (P<2.3×10^−3) and IgM (P<0.018), while rs3803800 and rs11552708 were associated with IgA (P<0.013). rs10007186 on 4q21.2 was associated with serum levels of IgA (P = 0.036), IgM (P = 0.019), and IgE (P = 4.9×10^−4). Our results add to the understanding of the regulation of major serum components.

    GWAS of bipolar disorder

    Genome-wide association studies (GWASs) have identified several susceptibility loci for bipolar disorder (BD) and shown that the genetic architecture of BD can be explained by polygenicity, with numerous variants contributing to BD. In the present GWAS (Phase I/II), which included 2,964 BD and 61,887 control subjects from the Japanese population, we detected a novel susceptibility locus at 11q12.2 (rs28456, P = 6.4×10^−9), a region known to contain regulatory genes for plasma lipid levels (FADS1/2/3). A subsequent meta-analysis of Phase I/II and the Psychiatric GWAS Consortium for BD (PGC-BD) identified another novel BD gene, NFIX (Pbest = 5.8×10^−10), and supported three regions previously implicated in BD susceptibility: MAD1L1 (Pbest = 1.9×10^−9), TRANK1 (Pbest = 2.1×10^−9) and ODZ4 (Pbest = 3.3×10^−9). Polygenicity of BD within Japanese and trans-European-Japanese populations was assessed with risk profile score analysis. We detected higher scores in BD cases both within (Phase I/II) and across populations (Phase I/II and PGC-BD). These were defined by (1) Phase II as discovery and Phase I as target, or vice versa (for ‘within Japanese comparisons’, Pbest ~ 10^−29, R^2 ~ 2%), and (2) European PGC-BD as discovery and Japanese BD (Phase I/II) as target (for ‘trans-European-Japanese comparison’, Pbest ~ 10^−13, R^2 ~ 0.27%). This ‘trans population’ effect was supported by estimation of the genetic correlation using the effect size based on each population (liability estimates ~ 0.7). These results indicate that (1) two novel and three previously implicated loci are significantly associated with BD and that (2) BD ‘risk’ effects are shared between Japanese and European populations.
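The abstract does not spell out how its risk profile scores were computed. A common formulation, assumed here, weights each target-sample risk-allele count by the log odds ratio estimated in the discovery sample and sums over SNPs. The function name, the odds ratios, and the genotypes below are hypothetical.

```python
import math


def risk_profile_score(allele_counts, log_odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2 per SNP) weighted by discovery log(OR)."""
    if len(allele_counts) != len(log_odds_ratios):
        raise ValueError("one weight is required per SNP")
    return sum(count * weight for count, weight in zip(allele_counts, log_odds_ratios))


# Hypothetical discovery-sample odds ratios for three SNPs.
weights = [math.log(1.10), math.log(1.05), math.log(0.95)]

# Hypothetical target-sample genotypes, coded as counts of the scored allele.
person_a = [2, 1, 0]
person_b = [0, 1, 2]

score_a = risk_profile_score(person_a, weights)
score_b = risk_profile_score(person_b, weights)
print(f"person A: {score_a:.4f}, person B: {score_b:.4f}")
```

In an analysis of this kind, the scores of cases and controls in the target sample are then compared, for example by regressing case status on the score, which is where statistics such as P and R^2 come from.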

    ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing

    Background: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required. Methods: We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on JPT and CHB of HapMap. Results: ParaHaplo can detect smaller differences between 2 populations than SNP-based GWAS. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program. Conclusion: ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, the use of fast computations with parallel computing, such as that used in ParaHaplo, will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address: http://sourceforge.jp/projects/parallelgwas/?_sl=1
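To make the permutation idea in this abstract concrete, here is a minimal serial sketch of a max-statistic permutation test across several SNPs. It is not ParaHaplo's algorithm: ParaHaplo is haplotype-based and runs over MPI, whereas this toy version works on per-SNP allele counts in plain Python. The function names, the test statistic, and the data are all assumptions made for illustration.

```python
import random


def max_statistic(genotypes, labels):
    """Largest absolute case-control difference in mean allele count over all SNPs."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    best = 0.0
    for j in range(len(genotypes[0])):
        case_mean = sum(g[j] for g in cases) / len(cases)
        control_mean = sum(g[j] for g in controls) / len(controls)
        best = max(best, abs(case_mean - control_mean))
    return best


def permutation_p_value(genotypes, labels, n_perm=1000, seed=0):
    """Family-wise P value: share of label shuffles whose maximum statistic
    is at least as large as the observed one (with an add-one correction)."""
    rng = random.Random(seed)
    observed = max_statistic(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        if max_statistic(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 10 cases and 10 controls, two SNPs each.
# SNP 0 differs strongly between groups; SNP 1 does not.
toy_genotypes = [[2, 1]] * 10 + [[0, 1]] * 10
toy_labels = [1] * 10 + [0] * 10
print(permutation_p_value(toy_genotypes, toy_labels))
```

Because every shuffle is independent of the others, the loop can be divided among workers and the hit counts summed afterwards, which is the property a parallel implementation exploits.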