
    Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms

    Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is the popular algorithm Partitioning Around Medoids (PAM), also simply referred to as k-medoids. In Euclidean geometry the mean, as used in k-means, is a good estimator for the cluster center, but this does not hold for arbitrary dissimilarities. PAM uses the medoid instead, the object with the smallest dissimilarity to all others in the cluster. This notion of centrality can be used with any (dis-)similarity, and thus is of high relevance to many domains such as biology that require the use of Jaccard, Gower, or more complex distances. A key issue with PAM is its high run time cost. We propose modifications to the PAM algorithm that achieve an O(k)-fold speedup in the second (SWAP) phase of the algorithm while still finding the same results as the original PAM algorithm. If we slightly relax the choice of swaps performed (at comparable quality), we can further accelerate the algorithm by performing up to k swaps in each iteration. With the substantially faster SWAP, we can now also explore alternative strategies for choosing the initial medoids. We also show how the CLARA and CLARANS algorithms benefit from these modifications. The modifications can easily be combined with earlier approaches to using PAM and CLARA on big data (some of which use PAM as a subroutine, and hence can immediately benefit from these improvements), where the performance with high k becomes increasingly important. In experiments on real data with k=100, we observed a 200-fold speedup compared to the original PAM SWAP algorithm, making PAM applicable to larger data sets as long as we can afford to compute a distance matrix, and in particular to higher k (at k=2, the new SWAP was only 1.5 times faster, as the speedup is expected to increase with k).
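
To make the medoid and SWAP ideas in this abstract concrete, here is a minimal sketch of a naive k-medoids loop on a precomputed dissimilarity matrix. It is not the accelerated algorithm the abstract proposes: each candidate swap is scored by recomputing the full cost, which is the expensive baseline behaviour. The function names `total_deviation` and `naive_pam` are hypothetical, and initialising with the first k points is a simplifying assumption rather than PAM's greedy BUILD phase.

```python
from itertools import product


def total_deviation(dist, medoids):
    """Sum over all points of the dissimilarity to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def naive_pam(dist, k):
    """Naive k-medoids on a precomputed dissimilarity matrix.

    Assumption: the first k points serve as initial medoids (PAM itself
    uses a greedy BUILD phase). Each iteration applies the single best
    (medoid, non-medoid) swap and stops when no swap lowers the cost.
    """
    n = len(dist)
    medoids = list(range(k))
    cost = total_deviation(dist, medoids)
    while True:
        best_cost, best_pos, best_cand = cost, None, None
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            # Replace the medoid at position pos with the candidate point.
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_cost = total_deviation(dist, trial)
            if trial_cost < best_cost:
                best_cost, best_pos, best_cand = trial_cost, pos, cand
        if best_pos is None:
            return sorted(medoids), cost
        medoids[best_pos] = best_cand
        cost = best_cost


# Toy data: two well-separated groups on a line. Only the pairwise
# dissimilarities are used, so any (dis-)similarity could be substituted.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]
medoids, cost = naive_pam(dist, 2)
print(medoids, cost)
```

Because the loop only ever reads `dist`, the same sketch works with Jaccard, Gower, or any other dissimilarity, which is the property the abstract highlights. The repeated full cost evaluation inside the swap search is the part that becomes costly as k grows.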

    Association of neurexin 3 polymorphisms with smoking behavior.

    The Neurexin 3 gene (NRXN3) has been associated with dependence on various addictive substances, as well as with the degree of smoking in schizophrenic patients and impulsivity among tobacco abusers. To further evaluate the role of NRXN3 in nicotine addiction, we analyzed single nucleotide polymorphisms (SNPs) and a copy number variant (CNV) within the NRXN3 genomic region. An initial study was carried out on 157 smokers and 595 controls, all of Spanish Caucasian origin. Nicotine dependence was assessed using the Fagerstrom index and the number of cigarettes smoked per day. The 45 NRXN3 SNPs genotyped included all the SNPs previously associated with disease, and a previously described deletion within NRXN3. This analysis was replicated in 276 additional independent smokers and 568 controls. Case-control association analyses were performed at the allele, genotype and haplotype levels. Allelic and genotypic association tests showed that three NRXN3 SNPs were associated with a lower risk of being a smoker. The haplotype analysis showed that one block of 16 Kb, consisting of two of the significant SNPs (rs221473 and rs221497), was also associated with lower risk of being a smoker in both the discovery and the replication cohorts, reaching a higher level of significance when the whole sample was considered [odds ratio = 0.57 (0.42-0.77), permuted P = 0.0075]. By contrast, the NRXN3 CNV was not associated with smoking behavior. Taken together, our results confirm a role for NRXN3 in susceptibility to smoking behavior, and strongly implicate this gene in genetic vulnerability to addictive behaviors.

    Fat Mass and Obesity-Associated Gene (FTO) in Eating Disorders: Evidence for Association of the rs9939609 Obesity Risk Allele with Bulimia nervosa and Anorexia nervosa

    Objective: The common single nucleotide polymorphism (SNP) rs9939609 in the fat mass and obesity-associated gene (FTO) is associated with obesity. As genetic variants associated with weight regulation might also be implicated in the etiology of eating disorders, we evaluated whether SNP rs9939609 is associated with bulimia nervosa (BN) and anorexia nervosa (AN). Methods: Association of rs9939609 with BN and AN was assessed in 689 patients with AN, 477 patients with BN, 984 healthy non-population-based controls, and 3,951 population-based controls (KORA-S4). Based on the familial and premorbid occurrence of obesity in patients with BN, we hypothesized an association of the obesity risk A-allele with BN. Results: In accordance with our hypothesis, we observed evidence for association of the rs9939609 A-allele with BN when compared to the non-population-based controls (unadjusted odds ratio (OR) = 1.142, one-sided 95% confidence interval (CI) 1.001-infinity; one-sided p = 0.049) and a trend in the population-based controls (OR = 1.124, one-sided 95% CI 0.932-infinity; one-sided p = 0.056). Interestingly, compared to both control groups, we further detected a nominal association of the rs9939609 A-allele with AN (OR = 1.181, 95% CI 1.027-1.359, two-sided p = 0.020; or OR = 1.673, 95% CI 1.101-2.541, two-sided p = 0.015). Conclusion: Our data suggest that the obesity-predisposing FTO allele might be relevant in both AN and BN. Copyright (C) 2012 S. Karger GmbH, Freibur

    Genetic architecture distinguishes systemic juvenile idiopathic arthritis from other forms of juvenile idiopathic arthritis: clinical and therapeutic implications

    OBJECTIVES: Juvenile idiopathic arthritis (JIA) is a heterogeneous group of conditions unified by the presence of chronic childhood arthritis without an identifiable cause. Systemic JIA (sJIA) is a rare form of JIA characterised by systemic inflammation. sJIA is distinguished from other forms of JIA by unique clinical features and treatment responses that are similar to autoinflammatory diseases. However, approximately half of children with sJIA develop destructive, long-standing arthritis that appears similar to other forms of JIA. Using genomic approaches, we sought to gain novel insights into the pathophysiology of sJIA and its relationship with other forms of JIA. METHODS: We performed a genome-wide association study of 770 children with sJIA collected in nine countries by the International Childhood Arthritis Genetics Consortium. Single nucleotide polymorphisms were tested for association with sJIA. Weighted genetic risk scores were used to compare the genetic architecture of sJIA with other JIA subtypes. RESULTS: The major histocompatibility complex locus and a locus on chromosome 1 each showed association with sJIA exceeding the threshold for genome-wide significance, while 23 other novel loci were suggestive of association with sJIA. Using a combination of genetic and statistical approaches, we found no evidence of shared genetic architecture between sJIA and other common JIA subtypes. CONCLUSIONS: The lack of shared genetic risk factors between sJIA and other JIA subtypes supports the hypothesis that sJIA is a unique disease process and argues for a different classification framework. Research to improve sJIA therapy should target its unique genetics and specific pathophysiological pathways.

    A genome-wide association study of anorexia nervosa.

    Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome-wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2907 cases with AN from 14 countries (15 sites) and 14 860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery data sets. Seventy-six (72 independent) single nucleotide polymorphisms were taken forward for in silico (two data sets) or de novo (13 data sets) replication genotyping in 2677 independent AN cases and 8629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication data sets comprised 5551 AN cases and 21 080 controls. AN subtype analyses (1606 AN restricting; 1445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P = 3.01 × 10^-7) in SOX2OT and rs17030795 (P = 5.84 × 10^-6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P = 5.76 × 10^-6) between CUL3 and FAM124B and rs1886797 (P = 8.05 × 10^-6) near SPATA13. Comparing discovery with replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P = 4 × 10^-6), strongly suggesting that true findings exist but our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field.
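
The direction-of-effect argument in this abstract can be illustrated with a one-sided sign test. The abstract does not say which test was used or how many variants entered the comparison, so the sketch below rests on two loudly stated assumptions: that the 72 independent SNPs were compared, and that an exact binomial tail with a 50% chance of agreement is an adequate model. The helper name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(successes, trials + 1)
    )


# Assumption: 72 independent SNPs, 76% of which agree in direction.
n_snps = 72
n_same = round(0.76 * n_snps)

# Under the null hypothesis, each effect agrees in direction with
# probability 0.5, so the tail probability is a one-sided sign test.
p_value = binomial_upper_tail(n_same, n_snps)
print(n_same, p_value)
```

Under these assumptions the tail probability is very small, which is the intuition behind the abstract's statement that such agreement is unlikely to arise by chance. It should not be read as a reproduction of the authors' analysis.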

    New loci associated with birth weight identify genetic links between intrauterine growth and adult height and metabolism.

    Birth weight within the normal range is associated with a variety of adult-onset diseases, but the mechanisms behind these associations are poorly understood. Previous genome-wide association studies of birth weight identified a variant in the ADCY5 gene associated both with birth weight and type 2 diabetes and a second variant, near CCNL1, with no obvious link to adult traits. In an expanded genome-wide association meta-analysis and follow-up study of birth weight (of up to 69,308 individuals of European descent from 43 studies), we have now extended the number of loci associated at genome-wide significance to 7, accounting for a similar proportion of variance as maternal smoking. Five of the loci are known to be associated with other phenotypes: ADCY5 and CDKAL1 with type 2 diabetes, ADRB1 with adult blood pressure and HMGA2 and LCORL with adult height. Our findings highlight genetic links between fetal growth and postnatal growth and metabolism.