    NovoGraph: Human genome graph construction from multiple long-read de novo assemblies [version 2; referees: 2 approved]

    Genome graphs are emerging as an important novel approach to the analysis of high-throughput human sequencing data. By explicitly representing genetic variants and alternative haplotypes in a mappable data structure, they can enable the improved analysis of structurally variable and hyperpolymorphic regions of the genome. In most existing approaches, graphs are constructed from variant call sets derived from short-read sequencing. As long-read sequencing becomes more cost-effective and enables de novo assembly for increasing numbers of whole genomes, a method for the direct construction of a genome graph from sets of assembled human genomes would be desirable. Such assembly-based genome graphs would encompass the wide spectrum of genetic variation accessible to long-read-based de novo assembly, including large structural variants and divergent haplotypes. Here we present NovoGraph, a method for the construction of a human genome graph directly from a set of de novo assemblies. NovoGraph constructs a genome-wide multiple sequence alignment of all input contigs and creates a graph by merging the input sequences at positions that are both homologous and sequence-identical. NovoGraph outputs resulting graphs in VCF format that can be loaded into third-party genome graph toolkits. To demonstrate NovoGraph, we construct a genome graph with 23,478,835 variant sites and 30,582,795 variant alleles from de novo assemblies of seven ethnically diverse human genomes (AK1, CHM1, CHM13, HG003, HG004, HX1, NA19240). Initial evaluations show that mapping against the constructed graph reduces the average mismatch rate of reads from sample NA12878 by approximately 0.2%, albeit at a slightly increased rate of reads that remain unmapped

    Exome Sequencing Implicates Ancestry-Related Mendelian Variation at SYNE1 in Childhood-Onset Essential Hypertension

    Childhood-onset essential hypertension (COEH) is an uncommon form of hypertension that manifests in childhood or adolescence and, in the United States, disproportionately affects children of African ancestry. The etiology of COEH is unknown, but its childhood onset, low prevalence, high heritability, and skewed ancestral demography suggest the potential to identify rare genetic variation segregating in a Mendelian manner among affected individuals and thereby implicate genes important to disease pathogenesis. However, no COEH genes have been reported to date. Here, we identify recessive segregation of rare and putatively damaging missense variation in the spectrin domain of spectrin repeat containing nuclear envelope protein 1 (SYNE1), a cardiovascular candidate gene, in 3 of 16 families with early-onset COEH without an antecedent family history. By leveraging exome sequence data from an additional 48 COEH families, 1,700 in-house trios, and publicly available data sets, we demonstrate that compound heterozygous SYNE1 variation in these COEH individuals occurred more often than expected by chance and that this class of biallelic rare variation was significantly enriched among individuals of African genetic ancestry. Using in vitro shRNA knockdown of SYNE1, we show that reduced SYNE1 expression resulted in a substantial decrease in the elasticity of smooth muscle vascular cells that could be rescued by pharmacological inhibition of the downstream RhoA/Rho-associated protein kinase pathway. These results provide insights into the molecular genetics and underlying pathophysiology of COEH and suggest a role for precision therapeutics in the future

    Epigenome-wide association studies identify novel DNA methylation sites associated with PTSD: A meta-analysis of 23 military and civilian cohorts

    BACKGROUND: The occurrence of post-traumatic stress disorder (PTSD) following a traumatic event is associated with biological differences that can represent the susceptibility to PTSD, the impact of trauma, or the sequelae of PTSD itself. These effects include differences in DNA methylation (DNAm), an important form of epigenetic gene regulation, at multiple CpG loci across the genome. Moreover, these effects can be shared or specific to both central and peripheral tissues. Here, we aim to identify blood DNAm differences associated with PTSD and characterize the underlying biological mechanisms by examining the extent to which they mirror associations across multiple brain regions. METHODS: As the Psychiatric Genomics Consortium (PGC) PTSD Epigenetics Workgroup, we conducted the largest cross-sectional meta-analysis of epigenome-wide association studies (EWASs) of PTSD to date, involving 5077 participants (2156 PTSD cases and 2921 trauma-exposed controls) from 23 civilian and military studies. PTSD diagnosis assessments were harmonized following the standardized guidelines established by the PGC-PTSD Workgroup. DNAm was assayed from blood using either Illumina HumanMethylation450 or MethylationEPIC (850K) BeadChips. A common QC pipeline was applied. Within each cohort, DNA methylation was regressed on PTSD, sex (if applicable), age, blood cell proportions, and ancestry. An inverse variance-weighted meta-analysis was performed. We conducted replication analyses in tissue from multiple brain regions, neuronal nuclei, and a cellular model of prolonged stress. RESULTS: We identified 11 CpG sites associated with PTSD in the overall meta-analysis (1.44e-09 < p < 5.30e-08), as well as 14 associated in analyses of specific strata (military vs civilian cohort, sex, and ancestry), including CpGs in AHRR and CDC42BPB. Many of these loci exhibit blood-brain correlation in methylation levels and cross-tissue associations with PTSD in multiple brain regions. Methylation at most CpGs correlated with their annotated gene expression levels. CONCLUSIONS: This study identifies 11 PTSD-associated CpGs and leverages data from postmortem brain samples, GWAS, and genome-wide expression data to interpret the biology underlying these associations and prioritize genes whose regulation differs in those with PTSD
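
    The inverse variance-weighted meta-analysis named in the METHODS can be illustrated with a minimal sketch. It assumes a fixed-effect model in which each cohort contributes one effect estimate and one standard error for a given CpG; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-cohort effect estimates with inverse-variance weights.

    Assumption: fixed-effect model, one (beta, standard error) pair
    per cohort for a single CpG site.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("need one standard error per effect estimate")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical effect sizes from three cohorts, for illustration only.
beta, se, z, p = inverse_variance_meta([0.10, 0.30, 0.20], [0.10, 0.10, 0.20])
print(beta, se, z, p)
```

    Cohorts with smaller standard errors dominate the pooled estimate, which is why this weighting is a common default when cohort sizes differ widely.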

    Optimal Diffeomorphic Matching in Biomedical Image Processing

    We consider optimal matching of submanifolds such as curves and surfaces by a variational approach based on Hilbert spaces of diffeomorphic transformations. In an abstract setting, the optimal matching is formulated as a minimization problem involving actions of diffeomorphisms on regular Borel measures considered as supporting measures of the reference and the target submanifolds. The objective functional consists of two parts measuring the elastic energy of the dynamically deformed surfaces and the quality of the matching. To make the problem computationally accessible, we use reproducing kernel Hilbert spaces with radial kernels and weighted sums of Dirac measures which gives rise to diffeomorphic point matching and amounts to the solution of a finite dimensional minimization problem. We present a matching algorithm based on the first order necessary optimality conditions which include an initial-value problem for a dynamical system in the trajectories describing the deformation of the surfaces and a final-time problem associated with the adjoint equations. The performance of the algorithm is illustrated by numerical results for examples from medical image analysis
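
    The point-matching setting described above, where the deformation is driven by a velocity field built from a radial kernel centered at the moving points, can be sketched as follows. This is only an illustration of the forward initial-value problem for the trajectories: it assumes a Gaussian kernel, forward Euler time stepping, and momentum vectors supplied by the caller. The adjoint (final-time) problem and the optimization loop from the abstract are not implemented, and all function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Radial (Gaussian) kernel value for two points; an assumed kernel choice."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def velocity(x, points, momenta, sigma):
    """Velocity at x as a kernel-weighted sum of momentum vectors."""
    v = [0.0] * len(x)
    for q, a in zip(points, momenta):
        k = gaussian_kernel(x, q, sigma)
        for d in range(len(x)):
            v[d] += k * a[d]
    return v


def flow_points(points, momenta_at, sigma=1.0, steps=20):
    """Integrate dq_i/dt = v(q_i, t) on [0, 1] with forward Euler.

    momenta_at(t) returns one momentum vector per point at time t.
    """
    dt = 1.0 / steps
    q = [list(p) for p in points]
    for n in range(steps):
        a = momenta_at(n * dt)
        # Evaluate all velocities before moving any point.
        vel = [velocity(p, q, a, sigma) for p in q]
        q = [[pi + dt * vi for pi, vi in zip(p, v)] for p, v in zip(q, vel)]
    return q


# Two points pushed by constant, hypothetical momenta.
moved = flow_points(
    [[0.0, 0.0], [1.0, 0.0]],
    lambda t: [[0.5, 0.0], [0.0, 0.5]],
)
print(moved)
```

    In a full matching algorithm the momenta would be the unknowns of the finite dimensional minimization problem, updated using the adjoint equations rather than fixed in advance.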

    Diffeomorphic Matching and Dynamic Deformable Surfaces in 3D Medical Imaging

    We consider optimal matching of submanifolds such as curves and surfaces by a variational approach based on Hilbert spaces of diffeomorphic transformations. In an abstract setting, the optimal matching is formulated as a minimization problem involving actions of diffeomorphisms on regular Borel measures considered as supporting measures of the reference and the target submanifolds. The objective functional consists of two parts measuring the elastic energy of the dynamically deformed surfaces and the quality of the matching. To make the problem computationally accessible, we use reproducing kernel Hilbert spaces with radial kernels and weighted sums of Dirac measures which gives rise to diffeomorphic point matching and amounts to the solution of a finite dimensional minimization problem. We present a matching algorithm based on the first order necessary optimality conditions which include an initial-value problem for a dynamical system in the trajectories describing the deformation of the surfaces and a final-time problem associated with the adjoint equations. The performance of the algorithm is illustrated by numerical results for examples from medical image analysis