    Identification of De Novo Copy Number Variants Associated with Human Disorders of Sexual Development

    Disorders of sexual development (DSD), ranging in severity from genital abnormalities to complete sex reversal, are among the most common human birth defects, with incidence rates reaching almost 3%. Although causative alterations in key genes controlling gonad development have been identified, the majority of DSD cases remain unexplained. To improve diagnosis, we screened 116 children born with idiopathic DSD using a clinically validated array-based comparative genomic hybridization platform. Our cohort of affected patients was compared with 8951 controls without urogenital defects. Clinically relevant imbalances were found in 21.5% of the analyzed patients. Most anomalies (74.2%) evaded detection by the routinely ordered karyotype and were scattered across the genome in gene-enriched subtelomeric loci. Among these defects, confirmed de novo duplication and deletion events were noted on 1p36.33, 9p24.3 and 19q12-q13.11 for ambiguous genitalia, 10p14 and Xq28 for cryptorchidism, and 12p13 and 16p11.2 for hypospadias. These variants were significantly associated with genitourinary defects (P = 6.08 × 10⁻¹²). The causality of defects observed in 5p15.3, 9p24.3, 22q12.1 and Xq28 was supported by the presence of overlapping chromosomal rearrangements in several unrelated patients. In addition to known gonad-determining genes including SRY and DMRT1, novel candidate genes such as FGFR2, KANK1, ADCY2 and ZEB2 were encompassed. The identification of risk germline rearrangements for urogenital birth defects may impact diagnosis and genetic counseling and contribute to the elucidation of the molecular mechanisms underlying the pathogenesis of human sexual development.

    Genome-wide association study of primary tooth eruption identifies pleiotropic loci associated with height and craniofacial distances

    Twin and family studies indicate that the timing of primary tooth eruption is highly heritable, with estimates typically exceeding 80%. To identify variants involved in primary tooth eruption, we performed a population-based genome-wide association study of ‘age at first tooth’ and ‘number of teeth’ using 5998 and 6609 individuals respectively from the Avon Longitudinal Study of Parents and Children (ALSPAC) and 5403 individuals from the 1966 Northern Finland Birth Cohort (NFBC1966). We tested 2,446,724 SNPs imputed in both studies. Analyses were controlled for the effect of gestational age, sex and age of measurement. Results from the two studies were combined using fixed effects inverse variance meta-analysis. We identified a total of fifteen independent loci, with ten loci reaching genome-wide significance (p < 5 × 10⁻⁸) for ‘age at first tooth’ and eleven loci for ‘number of teeth’. Together these associations explain 6.06% of the variation in ‘age at first tooth’ and 4.76% of the variation in ‘number of teeth’. The identified loci included eight previously unidentified loci, some containing genes known to play a role in tooth and other developmental pathways, including a SNP in the protein-coding region of BMP4 (rs17563, P = 9.080 × 10⁻¹⁷). Three of these loci, containing the genes HMGA2, AJUBA and ADK, also showed evidence of association with craniofacial distances, particularly those indexing facial width. Our results suggest that the genome-wide association approach is a powerful strategy for detecting variants involved in tooth eruption, and potentially craniofacial growth and, more generally, organ development.

    The Genetic Architecture of Structural Renal and Urinary Tract Malformations

    Structural renal and urinary tract malformations are the most common cause of kidney failure in children. These congenital anomalies of the kidneys and urinary tract (CAKUT) are a phenotypically diverse group of malformations that result from defects in embryonic kidney, ureter, and bladder development. A genetic basis for CAKUT has been proposed, with over 50 monogenic causes reported; however, a molecular diagnosis is reached in fewer than 20% of patients. In this thesis, I used bioinformatics and statistical genetics methodology to investigate the genetic architecture of structural renal and urinary tract malformations using whole-genome sequencing (WGS) data from the 100,000 Genomes Project. Population-based rare and common variant association testing was performed in over 800 cases and 20,000 controls of diverse ancestry, seeking enrichment of single-nucleotide/indel and structural variation on a genome-wide, per-gene, and cis-regulatory element basis. Using a sequencing-based genome-wide association study (GWAS), I identified the first robust genetic associations of posterior urethral valves (PUV), the most common cause of kidney failure in boys. Bayesian fine-mapping and functional annotation mapped these two loci to the transcription factor TBX5 and planar cell polarity gene PTK7, with both signals replicated in an independent cohort. Significant enrichment of rare structural variation affecting cis-regulatory elements was also detected, providing novel insights into the pathogenesis of this poorly understood disorder. I also demonstrated that the contribution of known monogenic disease to CAKUT has been overestimated and that common and low-frequency variation plays an important role in phenotypic variability. These findings support an omnigenic rather than monogenic model of inheritance for CAKUT and are consistent with the extensive genotypic-phenotypic heterogeneity, variable expressivity, and incomplete penetrance observed in this condition. Finally, this work demonstrates the value of sequencing-based GWAS methodology in rare disease, beyond conventional monogenic gene discovery, and provides strong support for an inclusive diverse-ancestry approach.

    Copy number variation analysis in the context of electronic medical records and large-scale genomics consortium efforts

    The goal of this paper is to review recent research on copy number variations (CNVs) and their association with complex and rare diseases. In the latter part of this paper, we focus on how large biorepositories such as the electronic medical record and genomics (eMERGE) consortium may be best leveraged to systematically mine for potentially pathogenic CNVs, and we end with a discussion of how such variants might be reported back for inclusion in electronic medical records as part of medical history.

    Prevalence and Comorbidity of Atopic Dermatitis in Children: A Large-Scale Population Study Based on Real-World Data

    This study aimed to explore atopic dermatitis (AD) prevalence in children and to analyze their comorbidity exhaustively. We conducted a descriptive analysis of their socio-demographic and comorbidity characteristics in the EpiChron Cohort (Aragon, Spain). Adjusted odds ratios (OR) were calculated for each comorbidity using logistic regression models. In total, 33,591 children had a diagnosis of AD, resulting in an overall prevalence of 15.5%. AD prevalence was higher in girls than in boys, in 3- to 9-year-olds than in children of other ages, and in Spanish children than in those of other nationalities. Multimorbidity was present in 43% of children, with the most frequent chronic comorbidities being asthma (13.1%), psychosocial disorders (7.9%), and visual impairment (7.8%). Many diseases were, regardless of their prevalence, statistically associated with AD. The strongest associations (OR (95% confidence interval (CI))) were found for asthma (2.10 (2.02-2.17)), allergic rhinitis (2.00 (1.91-2.10)), and irritable bowel syndrome (1.90 (1.56-2.31)). A better understanding of the array of comorbidities associated with AD in children might help improve their clinical management. Future longitudinal studies are encouraged to shed light on the potential underlying pathophysiological mechanisms involved in the identified associations.

    Epigenomic and Transcriptomic Profiling for the Study of Monogenic and Polygenic Traits and Disease

    Many trait-associated genomic loci are in non-coding regions of the genome. Determining which genetic variants in these regions are causally related to a trait and elucidating their downstream effects can be difficult. Layering transcriptomic and epigenomic data on top of genetic variation data can help nominate causal phenotype-associated variants and generate hypotheses about their effects in different cellular contexts. In this thesis, I first apply RNA-sequencing (RNA-seq) and the assay for transposase accessible chromatin using sequencing (ATAC-seq) to investigate gene expression and chromatin accessibility in the Danforth mouse, a model of caudal birth defects. The Danforth phenotype results from an endogenous retroviral insertion near the Ptf1a gene. I identify 49 genes differentially expressed between Danforth and WT E9.5 tailbuds, including increased expression of Ptf1a and the nearby Gm13344 lncRNA in Danforth. A gene ontology enrichment analysis indicates differentially expressed genes are enriched in the hedgehog signaling pathway, suggesting disruption of hedgehog signaling may cause the Danforth phenotype. I identify one region of increased chromatin accessibility in Danforth relative to WT mice, localizing to the Gm13344 promoter. This region is orthologous to a human PTF1A enhancer, suggesting it may mediate Ptf1a overexpression in the Danforth mouse. Next, I apply a software package for the quality control of ATAC-seq data (developed in our lab) to public datasets to measure heterogeneity, and analyze GM12878 ATAC-seq data to quantify the impact of Tn5 transposase concentration and sequencing lane cluster density. I find that increasing cluster density shifts the ATAC-seq fragment length distribution towards shorter fragments and results in greater transcription start site enrichment. I show that increasing Tn5 transposase concentration increases the enrichment of reads in enhancers and promoters, with ~80% of ATAC-seq peaks showing increased signal with increasing Tn5 concentration (5% FDR). Peaks bound by the CTCF transcription factor are less sensitive to Tn5 concentration than those bound by other transcription factors. This analysis demonstrates the difficulties in reliably quantifying chromatin accessibility and utilizing public datasets. I then apply single-nucleus ATAC-seq and RNA-seq to human and rat skeletal muscle to generate cell type specific transcriptomic and chromatin accessibility maps. I integrate these maps with UK Biobank genome-wide association study (GWAS) data to explore enrichment of GWAS signals in cell type specific ATAC-seq peaks. I demonstrate the utility of these maps by nominating causal genetic variants and cell types at several GWAS loci, including the T2D-associated ARL15 locus. At the ARL15 locus I nominate a credible set variant in a highly mesenchymal stem cell specific ATAC-seq peak. Lastly, to gain insight into the genetic regulation of chromatin architecture and its association with aerobic exercise capacity, I analyze skeletal muscle ATAC-seq (n = 129) and RNA-seq (n = 143) from a rat model for untrained running capacity. Although no genes associate with running capacity at 5% FDR, a gene ontology enrichment analysis indicates that the genes with the strongest association are enriched in fatty acid oxidation pathways, consistent with previous findings in this rat model. I identify no ATAC-seq peaks associated with running capacity (5% FDR) but find 4,477 ATAC-seq peaks associate with at least one SNP (5% FDR). Together, these projects demonstrate the value of epigenomic and transcriptomic data in the investigation of monogenic and polygenic traits, as well as the challenges and limitations of applying epigenomic and transcriptomic data in this context.
    PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/163000/1/porchard_1.pd

    Analysis of Genotype, Phenotype, and Age Progression in Phelan-McDermid Syndrome

    Phelan-McDermid syndrome is a developmental disability syndrome associated with deletions of the terminal end of one copy of chromosome 22q13. The observed chromosomal aberrations include simple terminal deletions, interstitial deletions, deletions and duplications, and duplications without deletions. All patients have some degree of developmental disability and many also have hypotonia, autism, minor dysmorphic features, and seizures. I performed an epidemiological and cytogenetic investigation to better understand the etiology of Phelan-McDermid syndrome and to provide information to patients and their families, clinicians, and researchers investigating developmental disabilities. Deletions vary widely in size, from 60 kb to more than 9 Mb, but almost all cases are missing one copy of the subtelomeric gene SHANK3, a candidate gene for neurological features. The results of this study established that larger deletions are associated with more severe disability, establishing the rationale to investigate the role of additional genes or genomic regions for clinical features. Statistical association analyses identified specific genomic regions as associated with 22 clinical features. In particular, speech is highly correlated with deletion size, indicating that speech-related genes or genomic elements located in genomic bands 22q13.2q13.31 may be critical in determining a patient's ability to communicate verbally. The use of protein interaction networks identified candidate genes within these narrowed genomic regions. Also, a longitudinal assessment of phenotypes observed among individuals aged 0.4 to 64 years established significant variation of phenotypes by age, such that future investigations need to take age into account when conducting genotype-phenotype studies. In particular, we find that behavioral difficulties subside and low muscle tone becomes less prominent as children age; however, seizures, autism, and some chronic diseases become more apparent in teens and adults.