Additional file 1: Tables S1-S12. of Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations
This file contains supplementary Tables S1-S13 as well as a table of contents with table names (XLSX 1249 kb).
Additional file 4: of Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations
FASTA sequences of 68 Alu elements. This file contains high-quality sequence from Sanger sequencing of 68 Alu elements. The nucleotides are color-coded for Alu, TSD, A and B boxes, SRP9/14 sites, and pol III termination signals (DOCX 34 kb).
Additional file 2: Supplementary Methods. of Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations
This file contains the supplementary methods, which include the improved ME-Scan protocol (DOCX 29 kb).
Relationship Estimation from Whole-Genome Sequence Data
The determination of the relationship between a pair of individuals is a fundamental application of genetics. Previously, we and others have demonstrated that identity-by-descent (IBD) information generated from high-density single-nucleotide polymorphism (SNP) data can greatly improve the power and accuracy of genetic relationship detection. Whole-genome sequencing (WGS) marks the final step in increasing genetic marker density by assaying all single-nucleotide variants (SNVs), and thus has the potential to further improve relationship detection by enabling more accurate detection of IBD segments and more precise resolution of IBD segment boundaries. However, WGS introduces new complexities that must be addressed in order to achieve these improvements in relationship detection. To evaluate these complexities, we estimated genetic relationships from WGS data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. We identified several genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA. These spurious IBD segments produced a 10-fold increase in the rate of detected false-positive relationships among controls compared to high-density microarray datasets. To address this issue, we developed a new method to identify and mask genomic regions with excess IBD. This method, implemented in ERSA 2.0, fully resolved the inflated cryptic relationship detection rates while improving relationship estimation accuracy. ERSA 2.0 detected all 1st through 6th degree relationships, and 55% of 9th through 11th degree relationships in the 30 families. We estimate that WGS data provides a 5% to 15% increase in relationship detection power relative to high-density microarray data for distant relationships.
Our results identify regions of the genome that are highly problematic for IBD mapping and introduce new software to accurately detect 1st through 9th degree relationships from whole-genome sequence data.
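The idea of identifying and masking regions with excess IBD can be illustrated with a minimal sketch. This is not the ERSA 2.0 implementation, which the abstract does not describe in detail. The window layout, the use of the median window count as the "expected" level, the 4-fold threshold (borrowed from the table title in this listing), and the function names `flag_excess_windows` and `mask_segments` are all assumptions made for illustration.

```python
from statistics import median

def flag_excess_windows(ibd_counts, fold=4.0):
    """Return indices of windows whose IBD count is at least
    `fold` times the median count (an assumed stand-in for 'expected')."""
    expected = median(ibd_counts)
    return [i for i, count in enumerate(ibd_counts) if count >= fold * expected]

def mask_segments(segments, windows, flagged):
    """Drop IBD segments (start, end) that overlap any flagged window."""
    bad = [windows[i] for i in flagged]
    return [
        (start, end)
        for (start, end) in segments
        if not any(start < w_end and end > w_start for (w_start, w_end) in bad)
    ]

# Toy data: five equal windows and the number of control pairs
# sharing IBD in each (hypothetical values).
windows = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
ibd_counts = [2, 3, 60, 2, 3]

flagged = flag_excess_windows(ibd_counts)
segments = [(1, 8), (18, 25), (32, 45)]
kept = mask_segments(segments, windows, flagged)

print(flagged)  # [2]
print(kept)     # [(1, 8), (32, 45)]
```

Dropping a whole segment that touches a flagged window is the simplest choice for a sketch; a real pipeline might instead trim the segment or exclude the region before IBD detection.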
Genomic Regions in hg19 coordinates of at least 5-to-expected IBD of at least 4-fold.
Predicted relationships for 595 individual pairs in three groups of population controls: 561 pairs from 34 European controls (CEU), 28 pairs from 8 East Asian controls (ASI), and 6 pairs from 4 Mexican-American controls (MXL).
Numerical values in the table are results from WGS data; values in parentheses are results from “SNP microarray” and “exon” data.
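The pair counts quoted for the three control groups match the number of unordered pairs within each group, n(n-1)/2. A quick check, assuming pairs were formed only within groups (which is inferred from the numbers, not stated in the caption):

```python
from math import comb

# Control group sizes as given in the table title.
group_sizes = {"CEU": 34, "ASI": 8, "MXL": 4}

# Unordered pairs within each group: n choose 2.
pairs_per_group = {name: comb(n, 2) for name, n in group_sizes.items()}
total_pairs = sum(pairs_per_group.values())

print(pairs_per_group)  # {'CEU': 561, 'ASI': 28, 'MXL': 6}
print(total_pairs)      # 595
```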
Performance of relationship estimation in 30 sequenced families using (A) GERMLINE-ERSA2.0, (B) fastIBD-ERSA2.0, and (C) ISCA-ERSA2.0.
The area of each circle indicates the percentage of individual pairs whose estimated degree of relationship is exactly the same as the reported relationship. FS: full sibling. PO: parent-offspring. UN: unrelated individuals. All ERSA analyses employed IBD masking. Histograms represent the number of pairs in each relationship category. Most of the pedigrees were ascertained on the basis of common complex or rare Mendelian diseases. As we have previously reported, this ascertainment can produce a downward bias in distant relationship estimates [10], which may account for the differences in relationship estimates between sequenced and simulated pedigrees for 10th through 12th degree relationships (see Figure S5).
Regions where excess IBD is detected by three IBD methods among the control populations.
Regions that give rise to excess IBD inferences in GERMLINE (A–C), fastIBD (D–F), and ISCA (G–I). Black and red shading denote the degree of excess IBD detected (see legend).
Comparison of IBD inferred by GERMLINE, fastIBD, and ISCA.
IBD for one simulated sibling pair is shown as an example (chromosome 1). Blue segments indicate haploid identity (IBD1) and red segments indicate diploid identity (IBD2).