11 research outputs found
Genome-wide linkage and haplotype sharing analysis implicates the <i>MCDR3</i> locus as a candidate region for a developmental macular disorder in association with digit abnormalities
<p><i>Background</i>: Developmental macular disorders are a heterogeneous group of rare retinal conditions that can cause significant visual impairment from childhood. Among these disorders, autosomal dominant North Carolina macular dystrophy (NCMD) has been mapped to 6q16 (<i>MCDR1</i>), with recent support for a non-coding disease mechanism involving <i>PRDM13</i>. A second locus on 5p15-5p13 (<i>MCDR3</i>) has been implicated in a similar phenotype, but the disease-causing mechanism remains unknown.</p> <p><i>Methods</i>: Two families affected by a dominant developmental macular disorder that closely resembles NCMD in association with digit abnormalities were included in the study. Family members with available DNA were genotyped using the Affymetrix GeneChip Human Mapping 250K Sty array. A parametric multipoint linkage analysis assuming a fully penetrant dominant model was performed using MERLIN. Haplotype sharing analysis was carried out using the non-parametric Homozygosity Haplotype method. Whole-exome sequencing was conducted on selected affected individuals.</p> <p><i>Results</i>: Linkage analysis excluded <i>MCDR1</i> from the candidate regions (LOD &lt; −2). There was suggestive linkage (LOD = 2.7) at two loci, 9p24.1 and 5p15.32, the latter overlapping with <i>MCDR3</i>. The haplotype sharing analysis in one of the families revealed a 5 cM shared IBD segment at 5p15.32 (<i>p</i> value = 0.004). Whole-exome sequencing did not provide conclusive evidence for disease-causing alleles.</p> <p><i>Conclusions</i>: These findings do not exclude the possibility that this phenotype is allelic with NCMD <i>MCDR3</i> at 5p15, and they leave open a non-coding disease mechanism, in keeping with recent findings on 6q16. Further studies, including whole-genome sequencing, may help elucidate the underlying genetic cause of this phenotype and shed light on macular development and function.</p>
Comparison of results from stochastic GUESSFM and stepwise searches.
<p><i>p</i> values are shown for the stepwise search and compare the listed model to the model above (or the null model, for single SNPs). Bayesian Information Criterion (BIC) is shown for stepwise models and index models found through GUESSFM search.</p>
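<p>As a rough illustration of how BIC values like those in this comparison can be used to rank models, the sketch below applies the standard definition, BIC = k ln(n) − 2 ln(L). The log-likelihoods, parameter counts and sample size are made-up values, not results from the study, and the study's own model fitting is not reproduced here.</p>

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian Information Criterion, k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical fits on the same 1,000 samples (numbers are made up for illustration).
bic_one_snp = bic(log_likelihood=-520.0, n_params=2, n_samples=1000)
bic_two_snp = bic(log_likelihood=-512.0, n_params=3, n_samples=1000)

# A positive difference means the two-SNP model has the lower BIC and is preferred.
delta_bic = bic_one_snp - bic_two_snp
print(f"Delta BIC = {delta_bic:.2f}")
```

<p>Because the penalty term grows with ln(n), an extra SNP is only favoured when it improves the log-likelihood by more than ln(n)/2.</p>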
Comparison of several multivariate methods for fine mapping using simulated data.
<p>We simulated quantitative phenotype data with between two and five causal variants using genotype data from the T1D dataset for the <i>IL2RA</i> region. The simulated data sets were analysed using forward stepwise regression, GUESSFM, the lasso, the group lasso and the elastic net. GUESSFM produces credible sets for each variant chosen using the snp.picker algorithm described in Materials and Methods. We defined pseudo “credible sets” for the other approaches as the set of SNPs with <i>r</i><sup>2</sup> > 0.8 with a selected SNP. We calculated the discovery rate (the proportion of causal variants within at least one credible set, y axis) and false discovery rate (proportion of detected variants whose credible sets did not contain any causal variant, x axis) at different thresholds for the stepwise <i>p</i> value, the group marginal posterior probability of inclusion (gMPPI) for GUESSFM and the regularization parameter(s) across simulated datasets (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005272#sec004" target="_blank">Methods</a> for details). GUESSFM-3 and GUESSFM-5 refer to GUESSFM run with a prior expectation of three or five causal variants per region, respectively. Results are averaged over 1000 replicates.</p>
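<p>The two rates defined in this caption can be computed for a single simulated replicate roughly as follows. This is a minimal sketch: the function name and SNP identifiers are hypothetical, and each credible set is treated as one detected variant.</p>

```python
def discovery_and_fdr(credible_sets, causal_variants):
    """Return (discovery rate, false discovery rate) for one replicate.

    discovery rate: share of causal variants found in at least one credible set.
    false discovery rate: share of credible sets containing no causal variant.
    """
    causal = set(causal_variants)
    sets = [set(cs) for cs in credible_sets]
    covered = set().union(*sets)  # every SNP that appears in any credible set
    discovery = len(causal & covered) / len(causal) if causal else 0.0
    n_false = sum(1 for cs in sets if not (cs & causal))
    fdr = n_false / len(sets) if sets else 0.0
    return discovery, fdr


# Hypothetical replicate: three credible sets, three causal variants.
example_sets = [{"rsA", "rsB"}, {"rsC"}, {"rsD", "rsE"}]
example_causal = {"rsA", "rsD", "rsF"}
discovery, fdr = discovery_and_fdr(example_sets, example_causal)
print(discovery, fdr)
```

<p>Averaging these two values over replicates at each threshold would give one point on a curve of the kind described.</p>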
The proportion of naive CD4<sup>+</sup> T cells that express CD25 (log scale) increases with age.
<p>The MS-protective allele for the M2 SNP rs41295055:C > T associates with fewer CD4<sup>+</sup> T cells expressing CD25 across all ages (<i>p</i> = 3.45 × 10<sup>−8</sup>), and is statistically preferred to the previously reported M1 SNP, rs2104286:T > C (<i>p</i> = 2.56 × 10<sup>−6</sup>; ΔBIC = 8.43). S and P are used to represent the (common) MS-susceptible and (rare) MS-protective alleles respectively at each SNP. These SNPs are in limited LD (<i>r</i><sup>2</sup> = 0.3).</p>
Six sets of SNPs can best explain the association of T1D and MS in the chromosome 10p15 region.
<p><b>LD</b>: a heatmap indicating the <i>r</i><sup>2</sup> between SNPs. <b>Assoc</b>: MPPI for MS and T1D for the SNPs in a group, with total MPPI across a SNP group, gMPPI, indicated by the height of the shaded rectangle (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005272#pgen.1005272.t005" target="_blank">Table 5</a> for numerical details). SNP groups are labelled by the letters A-F for reference. SNPs in this track are ordered by SNP group for ease of visualisation. <b>Genes</b>: SNPs are mapped back to physical position and shown in relation to genes in the region. <b>RNAseq</b>: read counts in two pooled replicates of resting (“rest1” and “rest2”) and anti-CD3/CD28 stimulated (“stim1” and “stim2”) CD4<sup>+</sup> T cells; y axes were truncated to allow visualization of intronic read counts. Note the different limits for resting and stimulated cells, which show greater transcription of all protein coding genes in the region. <b>DNase</b>: DNase hypersensitivity measured in CD4 cells by the Roadmap consortium. Replicate 1 (“rest1”) is RO_01689; replicate 2 (“rest2”) is RO_01736; y axes were again truncated to improve visualization.</p>
Overview of the fine mapping tailored stochastic search strategy in GUESSFM.
<p>1. SNPs are clustered based on genotype data. Tagging is used to remove cases of extreme LD (<i>r</i><sup>2</sup> > 0.99) by selecting one SNP from each cluster (“tag set”): the SNP with the highest average <i>r</i><sup>2</sup> with all other SNPs. 2. All possible models that can be formed from the tag SNPs may be considered by GUESS. Here, all seven possible models are considered but, in practice, with larger numbers of tags than shown here, GUESS employs a stochastic search strategy to consider only a subset of models, prioritising those with greatest statistical support. 3. GUESS selects the most likely models amongst those it has visited. Here, it selects two of the seven, but in larger data sets we retain the 30,000 most likely. 4. Each of these selected models is expanded by considering all possible substitutions of tags by other members of their tag set. Each expanded model is then assessed again individually, using an approximate Bayes factor [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005272#pgen.1005272.ref014" target="_blank">14</a>].</p>
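<p>Steps 1, 2 and 4 of this overview can be illustrated with a toy example. The sketch below is not the GUESSFM implementation: the clusters are supplied by hand, the SNP names and <i>r</i><sup>2</sup> values are invented, and the average <i>r</i><sup>2</sup> is taken over the other members of the same cluster, which is one reading of the caption. The stochastic search and Bayes factor scoring are omitted.</p>

```python
from itertools import combinations, product


def pick_tags(clusters, r2):
    """Step 1: from each cluster keep the SNP with the highest mean r^2 to the rest.

    clusters: list of lists of SNP ids; r2: dict frozenset({a, b}) -> r^2.
    """
    tags = []
    for members in clusters:
        if len(members) == 1:
            tags.append(members[0])
            continue

        def mean_r2(snp, members=members):
            others = [m for m in members if m != snp]
            return sum(r2[frozenset((snp, m))] for m in others) / len(others)

        tags.append(max(members, key=mean_r2))
    return tags


def all_models(tags):
    """Step 2: every non-empty subset of the tag SNPs (feasible only for few tags)."""
    return [set(c) for k in range(1, len(tags) + 1) for c in combinations(tags, k)]


def expand_model(model, tag_sets):
    """Step 4: substitute each tag by every member of its tag set."""
    return [set(choice) for choice in product(*(tag_sets[t] for t in sorted(model)))]


# Invented example: three clusters in extreme LD (r^2 > 0.99 within a cluster).
clusters = [["s1", "s2", "s3"], ["s4"], ["s5", "s6"]]
r2 = {
    frozenset(("s1", "s2")): 0.995,
    frozenset(("s1", "s3")): 0.992,
    frozenset(("s2", "s3")): 0.999,
    frozenset(("s5", "s6")): 0.996,
}
tags = pick_tags(clusters, r2)
tag_sets = dict(zip(tags, clusters))
models = all_models(tags)
expanded = expand_model({"s2", "s5"}, tag_sets)
print(tags, len(models), len(expanded))
```

<p>With three tags there are 2<sup>3</sup> − 1 = 7 candidate models, matching the seven models mentioned in the overview; the count doubles with each added tag, which is why exhaustive enumeration gives way to a stochastic search for realistic numbers of tags.</p>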
Haplotype analysis of MS association using the index SNPs for the one and two SNP models selected.
<p>Minor alleles are indicated by bold font. Fq = haplotype frequency.</p>
LD (<i>r</i><sup>2</sup>) between SNPs in selected models for T1D using stochastic GUESSFM and stepwise searches.
<p>Each GUESSFM search signal has a corresponding SNP found by conditional stepwise regression in moderate to strong LD.</p>
AJ individuals have higher CD polygenic risk score than NJ controls.
<p>NJ: non-Jewish; AJ: Ashkenazi Jewish; CD: Crohn’s disease; PRS: polygenic risk score. <b>A</b>) Density plot of CD polygenic risk scores in 454 AJ (green) and 35,007 NJ (purple) controls. AJ controls have a higher CD polygenic risk score than NJ controls (0.97 s.d. higher, p &lt; 10<sup>−16</sup>). <b>B</b>) Density plot of CD polygenic risk scores in 1,938 AJ (green) and 20,652 NJ (purple) CD cases (0.54 s.d. higher, p &lt; 10<sup>−16</sup>). For both density plots the scores have been scaled to NJ controls, so that the NJ control PRS density has mean 0 and variance 1 (see Online Methods). <b>C</b>) CD-associated variants ranked in decreasing order of estimated contribution to the difference in genetic risk between AJ and NJ. Associated variants with estimated contribution greater than or equal to 0.01, computed as 2 × log(odds ratio) × (AJ frequency − NJ frequency), assuming additive effects on the log scale, are highlighted in green. Associated variants with estimated contribution less than or equal to −0.01 are highlighted in purple. Forward slashes represent a break in the variants highlighted.</p>
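<p>The per-variant contribution in panel C and the scaling used in panels A and B can be sketched as below. The function names, the odds ratio, the allele frequencies and the scores are hypothetical, and the use of the population standard deviation for scaling is an assumption rather than a detail taken from the study.</p>

```python
import math
from statistics import mean, pstdev


def contribution(odds_ratio: float, freq_aj: float, freq_nj: float) -> float:
    """2 * log(odds ratio) * (AJ frequency - NJ frequency), additive on the log scale."""
    return 2.0 * math.log(odds_ratio) * (freq_aj - freq_nj)


def scale_to_reference(scores, reference_scores):
    """Centre and scale scores so the reference group has mean 0 and variance 1."""
    mu = mean(reference_scores)
    sd = pstdev(reference_scores)  # assumption: population s.d. of the reference group
    return [(s - mu) / sd for s in scores]


# Hypothetical variant: risk allele more common in AJ than in NJ.
c = contribution(odds_ratio=1.2, freq_aj=0.30, freq_nj=0.25)
colour = "green" if c >= 0.01 else "purple" if c <= -0.01 else "none"
print(round(c, 4), colour)

# Hypothetical raw scores; the reference group plays the role of NJ controls.
nj_controls = [0.8, 1.1, 0.9, 1.4, 1.0]
aj_controls = [1.3, 1.6, 1.2]
aj_scaled = scale_to_reference(aj_controls, nj_controls)
nj_scaled = scale_to_reference(nj_controls, nj_controls)
```

<p>A risk-increasing allele (odds ratio above 1) that is more frequent in AJ gives a positive contribution, as does a protective allele that is less frequent in AJ.</p>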
Enrichment of alleles discovered in AJ exome sequencing project.
<p><b>A)</b> Histogram of the estimated log enrichment statistic, defined as the log of the bias-corrected odds ratio comparing the allele frequency in the AJ population to the maximum allele frequency estimated from NFE, AFR, and AMR populations in ExAC. For each histogram bin we show a bar plot of the expected number of alleles belonging to the two groups we analyzed: 1) enriched (green) and 2) not enriched (white). <b>B)</b> Bar plots of the estimated percentage of alleles belonging to the two groups we analyzed for all protein-coding (ALL), synonymous (SYN), protein-altering (PRA), and protein-truncating variants (PTV). An estimated 34% of protein-coding alleles observed in AJ have a mean shift of 15-fold increased odds of the alternate allele compared to other reference populations. This observation is supported by the property that, compared to intergenic variants, coding variants tend to be younger for a given frequency, and the more pathogenic a variant, the younger it is, so that such variants tend to be population specific [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007329#pgen.1007329.ref013" target="_blank">13</a>].</p>
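<p>One way to compute a log enrichment statistic of the kind described in panel A is sketched below. The caption does not say which bias correction was used; this sketch assumes the common Haldane-Anscombe correction (adding 0.5 to each cell of the 2 × 2 table), and the function name and allele counts are invented.</p>

```python
import math


def log_enrichment(ac_aj, an_aj, ref_counts):
    """Log odds ratio of the alternate allele in AJ versus the reference
    population with the highest allele frequency.

    ac_aj / an_aj: alternate allele count and total allele number in AJ.
    ref_counts: dict population -> (alternate allele count, total allele number).
    Assumption: Haldane-Anscombe correction (+0.5 per cell) as the bias correction.
    """
    ref_pop = max(ref_counts, key=lambda p: ref_counts[p][0] / ref_counts[p][1])
    ac_ref, an_ref = ref_counts[ref_pop]
    odds_aj = (ac_aj + 0.5) / (an_aj - ac_aj + 0.5)
    odds_ref = (ac_ref + 0.5) / (an_ref - ac_ref + 0.5)
    return math.log(odds_aj / odds_ref), ref_pop


# Invented counts for a single allele.
ref = {"NFE": (12, 60000), "AFR": (1, 10000), "AMR": (0, 11000)}
stat, ref_pop = log_enrichment(ac_aj=30, an_aj=10000, ref_counts=ref)
print(ref_pop, round(stat, 2))
```

<p>The correction keeps the statistic finite when an allele is absent from every reference population, which matters for alleles discovered only in AJ samples.</p>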