A framework for the detection of de novo mutations in family-based sequencing data
Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low-coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports.
Francioli LC, Cretu-Stancu M, Garimella KV, et al. A framework for the detection of de novo mutations in family-based sequencing data. European Journal of Human Genetics. 2016;25(2):227-233
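The posterior computation described in the PhaseByTransmission abstract above can be illustrated with a minimal sketch for a single biallelic autosomal site. This is not the PBT implementation: the function names, the flat prior on parental genotypes, the default mutation rate of 1e-8, and the choice to give every Mendelian-violating trio configuration a prior weight equal to the mutation rate are all assumptions made here for illustration.

```python
from itertools import product


def transmit(parent_gt):
    """Probability that a parent carrying 0, 1 or 2 alt alleles passes on the alt allele."""
    return parent_gt / 2.0


def mendel(child, mother, father):
    """Mendelian probability of the child genotype given both parental genotypes."""
    pm, pf = transmit(mother), transmit(father)
    probs = {
        0: (1 - pm) * (1 - pf),
        1: pm * (1 - pf) + (1 - pm) * pf,
        2: pm * pf,
    }
    return probs[child]


def de_novo_posterior(gl_mother, gl_father, gl_child, mu=1e-8,
                      parent_prior=(1 / 3, 1 / 3, 1 / 3)):
    """Posterior probability that the trio shows a Mendelian violation (candidate DNM).

    gl_* are genotype likelihoods P(reads | genotype) for genotypes 0, 1, 2.
    mu and parent_prior are illustrative assumptions, not values from the paper.
    """
    violating = 0.0
    total = 0.0
    for m, f, c in product(range(3), repeat=3):
        p_mendel = mendel(c, m, f)
        # Assumed transmission prior: Mendelian configurations keep their
        # Mendelian probability; impossible ones get the mutation rate.
        trans = (1 - mu) * p_mendel if p_mendel > 0 else mu
        weight = (parent_prior[m] * parent_prior[f] * trans
                  * gl_mother[m] * gl_father[f] * gl_child[c])
        total += weight
        if p_mendel == 0:
            violating += weight
    return violating / total


# Hypothetical likelihoods: both parents confidently hom-ref, child confidently het.
confident_ref = (1.0, 1e-12, 1e-12)
confident_het = (1e-12, 1.0, 1e-12)
print(de_novo_posterior(confident_ref, confident_ref, confident_het))
```

Ranking candidate sites by this posterior mirrors the prioritization for validation that the abstract describes; at low coverage the genotype likelihoods are flatter, so the posterior for a true DNM drops accordingly.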
Prioritization of genes driving congenital phenotypes of patients with de novo genomic structural variants
Background: Genomic structural variants (SVs) can affect many genes and regulatory elements. Therefore, the molecular mechanisms driving the phenotypes of patients carrying de novo SVs are frequently unknown.
Methods: We applied a combination of systematic experimental and bioinformatic methods to improve the molecular diagnosis of 39 patients with multiple congenital abnormalities and/or intellectual disability harboring apparent de novo SVs, most with an inconclusive diagnosis after regular genetic testing.
Results: In 7 of these cases (18%), whole-genome sequencing analysis revealed disease-relevant complexities of the SVs missed in routine microarray-based analyses. We developed a computational tool to predict the effects on genes directly affected by SVs and on genes indirectly affected likely due to the changes in chromatin organization and impact on regulatory mechanisms. By combining these functional predictions with extensive phenotype information, candidate driver genes were identified in 16/39 (41%) patients. In 8 cases, evidence was found for the involvement of multiple candidate drivers contributing to different parts of the phenotypes. Subsequently, we applied this computational method to two cohorts containing a total of 379 patients with previously detected and classified de novo SVs and identified candidate driver genes in 189 cases (50%), including 40 cases whose SVs were previously not classified as pathogenic. Pathogenic position effects were predicted in 28% of all studied cases with balanced SVs and in 11% of the cases with copy number variants.
Conclusions: These results demonstrate an integrated computational and experimental approach to predict driver genes based on analyses of WGS data with phenotype association and chromatin organization datasets. These analyses nominate new pathogenic loci and have strong potential to improve the molecular diagnosis of patients with de novo SVs.
Lower frequency of the HLA-G UTR-4 haplotype in women with unexplained recurrent miscarriage
HLA-G expressed by trophoblasts at the fetal-maternal interface and its soluble form have immunomodulatory effects. HLA-G expression depends on the combination of DNA polymorphisms. We hypothesized that combinations of specific single nucleotide polymorphisms (SNPs) in the 3' untranslated region (3'UTR) of HLA-G play a role in unexplained recurrent miscarriage. In a case-control design, 100 cases with at least three unexplained consecutive miscarriages prior to the 20th week of gestation were included. Cases were younger than 36 years at the time of the third miscarriage, and they conceived all their pregnancies from the same partner. The control group included 89 women with an uneventful pregnancy. The association of HLA-G 3'UTR SNPs and specific HLA-G haplotypes with recurrent miscarriage was studied with logistic regression. Odds ratios (OR) and 95% confidence intervals (95% CI) were reported. Individual SNPs were not significantly associated with recurrent miscarriage after correction for multiple comparisons. However, the frequency of the UTR-4 haplotype, which included +3003C, was significantly lower in women with recurrent miscarriage (OR 0.4, 95% CI 0.2-0.8, p = 0.015). In conclusion, this is the first study to perform a comprehensive analysis of HLA-G SNPs and HLA-G haplotypes in a well-defined group of women with recurrent miscarriage and women with uneventful pregnancy. The UTR-4 haplotype was less frequently observed in women with recurrent miscarriage, suggesting an immunoregulatory role of this haplotype for continuation of the pregnancy without complications. Thus, association of HLA-G with recurrent miscarriage is not related to single polymorphisms in the 3'UTR, but is rather dependent on haplotypes.
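For readers unfamiliar with the statistic reported in the abstract above, the following sketch shows how an odds ratio and a Wald 95% confidence interval are obtained from a 2x2 carrier table, which matches a logistic regression with a single binary predictor. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the study's own model may have differed.

```python
import math


def odds_ratio_wald_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    case_pos / case_neg: cases with / without the haplotype.
    ctrl_pos / ctrl_neg: controls with / without the haplotype.
    z = 1.96 corresponds to a 95% interval.
    """
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study.
or_, lo, hi = odds_ratio_wald_ci(case_pos=10, case_neg=90, ctrl_pos=20, ctrl_neg=80)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An odds ratio below 1 with an interval that excludes 1, as reported for UTR-4, indicates that the haplotype is less common among cases than among controls.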