89 research outputs found
Identification of a RAI1-associated disease network through integration of exome sequencing, transcriptomics, and 3D genomics.
Smith-Magenis syndrome (SMS) is a developmental disability/multiple congenital anomaly disorder resulting from haploinsufficiency of RAI1. It is characterized by distinctive facial features, brachydactyly, sleep disturbances, and stereotypic behaviors.
We investigated a cohort of 15 individuals with a clinical suspicion of SMS who showed neither deletion in the SMS critical region nor damaging variants in RAI1 using whole exome sequencing. A combination of network analysis (co-expression and biomedical text mining), transcriptomics, and circularized chromatin conformation capture (4C-seq) was applied to verify whether modified genes are part of the same disease network as known SMS-causing genes.
Potentially deleterious variants were identified in nine of these individuals using whole-exome sequencing. Eight of these changes affect KMT2D, ZEB2, MAP2K2, GLDC, CASK, MECP2, KDM5C, and POGZ, known to be associated with Kabuki syndrome 1, Mowat-Wilson syndrome, cardiofaciocutaneous syndrome, glycine encephalopathy, mental retardation and microcephaly with pontine and cerebellar hypoplasia, X-linked mental retardation 13, X-linked mental retardation Claes-Jensen type, and White-Sutton syndrome, respectively. The ninth individual carries a de novo variant in JAKMIP1, a regulator of neuronal translation that was recently found deleted in a patient with autism spectrum disorder. Analyses of co-expression and biomedical text mining suggest that these pathologies and SMS are part of the same disease network. Further support for this hypothesis was obtained from transcriptome profiling that showed that the expression levels of both Zeb2 and Map2k2 are perturbed in Rai1 (-/-) mice. As an orthogonal approach to potentially contributory disease gene variants, we used chromatin conformation capture to reveal chromatin contacts between RAI1 and the loci flanking ZEB2 and GLDC, as well as between RAI1 and human orthologs of the genes that show perturbed expression in our Rai1 (-/-) mouse model.
These holistic studies of RAI1 and its interactions allow insights into SMS and other disorders associated with intellectual disability and behavioral abnormalities. Our findings support a pan-genomic approach to the molecular diagnosis of a distinctive disorder
Microcephaly, epilepsy, and neonatal diabetes due to compound heterozygous mutations in IER3IP1: Insights into the natural history of a rare disorder
Neonatal diabetes mellitus is known to have over 20 different monogenic causes. A syndrome of permanent neonatal diabetes along with primary microcephaly with simplified gyral pattern associated with severe infantile epileptic encephalopathy was recently described in two independent reports in which disease-causing homozygous mutations were identified in the immediate early response-3 interacting protein-1 (IER3IP1) gene. We report here an affected male born to a non-consanguineous couple who was noted to have insulin-requiring permanent neonatal diabetes, microcephaly, and generalized seizures. He was also found to have cortical blindness, severe developmental delay and numerous dysmorphic features. He experienced a slow improvement but not abrogation of seizure frequency and severity on numerous anti-epileptic agents. His clinical course was further complicated by recurrent respiratory tract infections and he died at 8 years of age. Whole exome sequencing was performed on DNA from the proband and parents. He was found to be a compound heterozygote with two different mutations in IER3IP1: p.Val21Gly (V21G) and a novel frameshift mutation p.Phe27fsSer*25. IER3IP1 is a highly conserved protein with marked expression in the cerebral cortex and in beta cells. This is the first reported case of compound heterozygous mutations within IER3IP1 resulting in neonatal diabetes. The triad of microcephaly, generalized seizures, and permanent neonatal diabetes should prompt screening for mutations in IER3IP1. As mutations in genes such as NEUROD1 and PTF1A could cause a similar phenotype, next-generation sequencing approaches, such as the exome sequencing reported here, may be an efficient means of uncovering a diagnosis in future cases
Identification of genetic risk variants for deep vein thrombosis by multiplexed next-generation sequencing of 186 hemostatic/pro-inflammatory genes
BACKGROUND:
Next-generation DNA sequencing is opening new avenues for genetic association studies in common diseases that, like deep vein thrombosis (DVT), have a strong genetic predisposition still largely unexplained by currently identified risk variants. In order to develop sequencing and analytical pipelines for the application of next-generation sequencing to complex diseases, we conducted a pilot study sequencing the coding area of 186 hemostatic/proinflammatory genes in 10 Italian cases of idiopathic DVT and 12 healthy controls.
RESULTS:
A molecular-barcoding strategy was used to multiplex DNA target capture and sequencing, while retaining individual sequence information. Genomic libraries with barcode sequence-tags were pooled (in pools of 8 or 16 samples) and enriched for target DNA sequences. Sequencing was performed on ABI SOLiD-4 platforms. We produced > 12 gigabases of raw sequence data to sequence at high coverage (average: 42X) the 700-kilobase target area in 22 individuals. A total of 1876 high-quality genetic variants were identified (1778 single nucleotide substitutions and 98 insertions/deletions). Annotation on databases of genetic variation and human disease mutations revealed several novel, potentially deleterious mutations. We tested 576 common variants in a case-control association analysis, carrying the top-5 associations over to replication in up to 719 DVT cases and 719 controls. We also conducted an analysis of the burden of nonsynonymous variants in coagulation factor and anticoagulant genes. We found an excess of rare missense mutations in anticoagulant genes in DVT cases compared to controls and an association for a missense polymorphism of FGA (rs6050; p = 1.9 × 10(-5), OR 1.45; 95% CI, 1.22-1.72; after replication in > 1400 individuals).
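As a rough consistency check on the reported rs6050 association, the standard error of log(OR) can be recovered from the 95% confidence interval and converted into a Wald z statistic and two-sided p-value. The sketch below assumes a normal approximation on the log-odds scale; the function name `p_from_or_ci` is hypothetical, and this is not necessarily the test the authors used.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate Wald z and two-sided p from an odds ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale (normal approximation).
    """
    # Half-width of the CI on the log scale equals z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values quoted in the abstract for FGA rs6050
z, p = p_from_or_ci(1.45, 1.22, 1.72)
print(f"z = {z:.2f}, two-sided p = {p:.1e}")
```

Under these assumptions the recovered p-value is on the order of 10(-5), the same order of magnitude as the value quoted in the abstract; small differences are expected because the published OR and CI are rounded.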
CONCLUSIONS:
We implemented a barcode-based strategy to efficiently multiplex sequencing of hundreds of candidate genes in several individuals. In the relatively small dataset of our pilot study we were able to identify bona fide associations with DVT. Our study illustrates the potential of next-generation sequencing for the discovery of genetic variation predisposing to complex diseases
An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types
The Drosophila melanogaster Genetic Reference Panel
A major challenge of biology is understanding the relationship between molecular genetic variation and variation in quantitative traits, including fitness. This relationship determines our ability to predict phenotypes from genotypes and to understand how evolutionary forces shape variation within and between species. Previous efforts to dissect the genotype-phenotype map were based on incomplete genotypic information. Here, we describe the Drosophila melanogaster Genetic Reference Panel (DGRP), a community resource for analysis of population genomics and quantitative traits. The DGRP consists of fully sequenced inbred lines derived from a natural population. Population genomic analyses reveal reduced polymorphism in centromeric autosomal regions and the X chromosome, evidence for positive and negative selection, and rapid evolution of the X chromosome. Many variants in novel genes, most at low frequency, are associated with quantitative traits and explain a large fraction of the phenotypic variance. The DGRP facilitates genotype-phenotype mapping using the power of Drosophila genetics
An integrated map of structural variation in 2,504 human genomes
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. © 2015 Macmillan Publishers Limited. All rights reserved
Exome-wide assessment of isolated biliary atresia: A report from the National Birth Defects Prevention Study using child–parent trios and a case–control design to identify novel rare variants
The etiology of biliary atresia (BA) is unknown, but recent studies suggest a role for rare protein-altering variants (PAVs). Exome sequencing data from the National Birth Defects Prevention Study on 54 child–parent trios, one child–mother duo, and 1513 parents of children with other birth defects were analyzed. Most (91%) cases were isolated BA. We performed (1) a trio-based analysis to identify rare de novo, homozygous, and compound heterozygous PAVs and (2) a case–control analysis using a sequence kernel-based association test to identify genes enriched with rare PAVs. While we replicated previous findings on PKD1L1, our results do not suggest that recurrent de novo PAVs play important roles in BA susceptibility. In fact, our finding in NOTCH2, a disease gene associated with Alagille syndrome, highlights the difficulty in BA diagnosis. Notably, IFRD2 has been implicated in other gastrointestinal conditions and warrants additional study. Overall, our findings strengthen the hypothesis that the etiology of BA is complex
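The trio-based step (identifying de novo, homozygous, and compound heterozygous variants) can be illustrated with a minimal sketch. It assumes genotypes are encoded as alternate-allele counts (0, 1, or 2) for child, mother, and father; the function name, the record layout, and the gene labels are hypothetical, and the study's actual pipeline, quality filters, and handling of the child–mother duo are not reproduced here.

```python
from collections import defaultdict

def classify_trio(variants):
    """Group trio variants by inheritance pattern.

    variants: list of dicts with keys 'gene', 'child', 'mother', 'father',
    where the last three are alternate-allele counts (0, 1 or 2).
    """
    calls = {"de_novo": [], "homozygous": [], "compound_het": []}
    maternal, paternal = defaultdict(list), defaultdict(list)
    for v in variants:
        c, m, f = v["child"], v["mother"], v["father"]
        if c >= 1 and m == 0 and f == 0:
            calls["de_novo"].append(v)        # absent from both parents
        elif c == 2 and m >= 1 and f >= 1:
            calls["homozygous"].append(v)     # one copy from each parent
        elif c == 1 and m >= 1 and f == 0:
            maternal[v["gene"]].append(v)     # heterozygous, maternal origin
        elif c == 1 and f >= 1 and m == 0:
            paternal[v["gene"]].append(v)     # heterozygous, paternal origin
    # Compound heterozygous: same gene hit on both parental haplotypes
    for gene in maternal:
        if gene in paternal:
            calls["compound_het"].append((gene, maternal[gene], paternal[gene]))
    return calls

# Toy data with made-up gene labels
example = [
    {"gene": "GENE_A", "child": 1, "mother": 0, "father": 0},
    {"gene": "GENE_B", "child": 2, "mother": 1, "father": 1},
    {"gene": "GENE_C", "child": 1, "mother": 1, "father": 0},
    {"gene": "GENE_C", "child": 1, "mother": 0, "father": 1},
]
result = classify_trio(example)
print({k: len(v) for k, v in result.items()})
```

Variants heterozygous in the child and present in both parents are left unclassified in this sketch because their parental origin is ambiguous without phasing.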
Molecular Features of Cancers Exhibiting Exceptional Responses to Treatment
A small fraction of cancer patients with advanced disease survive significantly longer than patients with clinically comparable tumors. Molecular mechanisms for exceptional responses to therapy have been identified by genomic analysis of tumor biopsies from individual patients. Here, we analyzed tumor biopsies from an unbiased cohort of 111 exceptional responder patients using multiple platforms to profile genetic and epigenetic aberrations as well as the tumor microenvironment. Integrative analysis uncovered plausible mechanisms for the therapeutic response in nearly a quarter of the patients. The mechanisms were assigned to four broad categories—DNA damage response, intracellular signaling, immune engagement, and genetic alterations characteristic of favorable prognosis—with many tumors falling into multiple categories. These analyses revealed synthetic lethal relationships that may be exploited therapeutically and rare genetic lesions that favor therapeutic success, while also providing a wealth of testable hypotheses regarding oncogenic mechanisms that may influence the response to cancer therapy. Profiling multi-platform genomics of 110 cancer patients with an exceptional therapeutic response, Wheeler et al. identify putative molecular mechanisms explaining this survival phenotype in ∼23% of cases. Therapeutic success is related to rare molecular features of responding tumors, exploiting synthetic lethality and oncogene addiction
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel
A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants. © 2014 Macmillan Publishers Limited. All rights reserved
Polygenic transcriptome risk scores for COPD and lung function improve cross-ethnic portability of prediction in the NHLBI TOPMed program
While polygenic risk scores (PRSs) enable early identification of genetic risk for chronic obstructive pulmonary disease (COPD), predictive performance is limited when the discovery and target populations are not well matched. Hypothesizing that the biological mechanisms of disease are shared across ancestry groups, we introduce a PrediXcan-derived polygenic transcriptome risk score (PTRS) to improve cross-ethnic portability of risk prediction. We constructed the PTRS using summary statistics from application of PrediXcan on large-scale GWASs of lung function (forced expiratory volume in 1 s [FEV1] and its ratio to forced vital capacity [FEV1/FVC]) in the UK Biobank. We examined prediction performance and cross-ethnic portability of PTRS through smoking-stratified analyses both on 29,381 multi-ethnic participants from TOPMed population/family-based cohorts and on 11,771 multi-ethnic participants from TOPMed COPD-enriched studies. Analyses were carried out for two dichotomous COPD traits (moderate-to-severe and severe COPD) and two quantitative lung function traits (FEV1 and FEV1/FVC). While the proposed PTRS showed weaker associations with disease than PRS for European ancestry, the PTRS showed stronger association with COPD than PRS for African Americans (e.g., odds ratio [OR] = 1.24 [95% confidence interval [CI]: 1.08–1.43] for PTRS versus 1.10 [0.96–1.26] for PRS among heavy smokers with ≥ 40 pack-years of smoking) for moderate-to-severe COPD. Cross-ethnic portability of the PTRS was significantly higher than the PRS (paired t test p < 2.2 × 10^−16 with portability gains ranging from 5% to 28%) for both dichotomous COPD traits and across all smoking strata. Our study demonstrates the value of PTRS for improved cross-ethnic portability compared to PRS in predicting COPD risk
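The difference between a PRS and a PTRS can be sketched as two weighted sums: a PRS sums variant dosages weighted by per-variant effect sizes, while a PTRS first predicts gene expression from genotype and then sums the predicted expression weighted by per-gene effect sizes. The sketch below is a simplified illustration under that assumption; all identifiers, variant labels, gene labels, and weights are made up, and it does not reproduce the PrediXcan models or the weights used in the study.

```python
def weighted_score(weights, values):
    """Sum of weight * value over the keys present in both mappings."""
    return sum(w * values[k] for k, w in weights.items() if k in values)

def predicted_expression(eqtl_weights, dosages):
    """Predict expression per gene as a weighted sum of variant dosages."""
    return {gene: weighted_score(w, dosages) for gene, w in eqtl_weights.items()}

# Hypothetical individual: alternate-allele dosage per variant
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

# PRS: per-variant effect sizes (made-up numbers)
snp_effects = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.20}
prs = weighted_score(snp_effects, dosages)

# PTRS: variant -> expression weights, then per-gene effect sizes (made-up numbers)
eqtl_weights = {"GENE_X": {"rs_a": 0.5, "rs_b": 0.2}, "GENE_Y": {"rs_c": 0.3}}
gene_effects = {"GENE_X": 0.4, "GENE_Y": -0.1}
ptrs = weighted_score(gene_effects, predicted_expression(eqtl_weights, dosages))

print(f"PRS = {prs:.2f}, PTRS = {ptrs:.2f}")
```

Scoring at the gene level is what the abstract's hypothesis relies on: if disease mechanisms are shared across ancestry groups, gene-level weights may transfer better than variant-level weights.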
- …