27 research outputs found
ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2
MOTIVATION: The advent of long-read sequencing technologies has increased demand for the visualization and interpretation of transcripts. However, tools that perform such visualizations remain inflexible and lack the ability to easily identify differences between transcript structures. Here, we introduce ggtranscript, an R package that provides a fast and flexible method to visualize and compare transcripts. As a ggplot2 extension, ggtranscript inherits the functionality and familiarity of ggplot2, making it easy to use. AVAILABILITY: ggtranscript is an R package available at https://github.com/dzhang32/ggtranscript (DOI: https://doi.org/10.5281/zenodo.6374061) via an open-source MIT license. Further documentation is available at https://dzhang32.github.io/ggtranscript/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
IntroVerse: a comprehensive database of introns across human tissues
Dysregulation of RNA splicing contributes to both rare and complex diseases. RNA-sequencing data from human tissues has shown that this process can be inaccurate, resulting in the presence of novel introns detected at low frequency across samples and within an individual. To enable the full spectrum of intron use to be explored, we have developed IntroVerse, which offers an extensive catalogue on the splicing of 332,571 annotated introns and a linked set of 4,679,474 novel junctions covering 32,669 different genes. This dataset has been generated through the analysis of 17,510 human control RNA samples from 54 tissues provided by the Genotype-Tissue Expression Consortium. IntroVerse has two unique features: (i) it provides a complete catalogue of novel junctions and (ii) each novel junction has been assigned to a specific annotated intron. This unique, hierarchical structure offers multiple uses, including the identification of novel transcripts from known genes and their tissue-specific usage, and the assessment of background splicing noise for introns thought to be mis-spliced in disease states. IntroVerse provides a user-friendly web interface and is freely available at https://rytenlab.com/browser/app/introverse
Human-lineage-specific genomic elements are associated with neurodegenerative disease and APOE transcript usage
Knowledge of genomic features specific to the human lineage may provide insights into brain-related diseases. We leverage high-depth whole genome sequencing data to generate a combined annotation identifying regions simultaneously depleted for genetic variation (constrained regions) and poorly conserved across primates. We propose that these constrained, non-conserved regions (CNCRs) have been subject to human-specific purifying selection and are enriched for brain-specific elements. We find that CNCRs are depleted from protein-coding genes but enriched within lncRNAs. We demonstrate that per-SNP heritability of a range of brain-relevant phenotypes is enriched within CNCRs. We find that genes implicated in neurological diseases have high CNCR density, including APOE, highlighting an unannotated intron-3 retention event. Using human brain RNA-sequencing data, we show the intron-3-retaining transcript to be more abundant in Alzheimer’s disease with more severe tau and amyloid pathological burden. Thus, we demonstrate potential association of human-lineage-specific sequences in brain development and neurological disease. FUNDING: The authors are grateful to the participants in the Religious Orders Study and the Memory and Aging Project. Z.C. and R.H.R. were supported by grants from the Leonard Wolfson Foundation. M.R. was supported by the United Kingdom Medical Research Council (MRC) through the award of a Tenure Track Clinician Scientist Fellowship (MR/N008324/1). J.H. was supported by the UK Dementia Research Institute which receives its funding from DRI Limited, funded by the UK Medical Research Council, Alzheimer’s Society and Alzheimer’s Research UK. J.H. has also been funded by the Medical Research Council (award MR/N026004/1), Wellcome Trust (award 202903/Z/16/Z), Dolby Family Fund and National Institute for Health Research University College London Hospitals Biomedical Research Centre. J.B. is supported through the Science and Technology Agency, Séneca Foundation, CARM, Spain (research project 00007/COVI/20).
Functional genomics provide key insights to improve the diagnostic yield of hereditary ataxia
Improvements in functional genomic annotation have led to a critical mass of neurogenetic discoveries. This is exemplified in hereditary ataxia, a heterogeneous group of disorders characterised by incoordination from cerebellar dysfunction. Associated pathogenic variants in more than 300 genes have been described, leading to a detailed genetic classification partitioned by age-of-onset. Despite these advances, up to 75% of patients with ataxia remain molecularly undiagnosed even following whole genome sequencing, as exemplified in the 100,000 Genomes Project. This study aimed to understand whether we can improve our knowledge of the genetic architecture of hereditary ataxia by leveraging functional genomic annotations, and as a result, generate insights and strategies that raise the diagnostic yield. To achieve these aims, we used publicly-available multi-omics data to generate 294 genic features, capturing information relating to a gene's structure, genetic variation, tissue-specific, cell-type-specific and temporal expression, as well as protein products of a gene. We studied these features across genes typically causing childhood-onset, adult-onset or both types of disease first individually, then collectively. This led to the generation of testable hypotheses which we investigated using whole genome sequencing data from up to 2,182 individuals presenting with ataxia and 6,658 non-neurological probands recruited in the 100,000 Genomes Project. Using this approach, we demonstrated a high short tandem repeat (STR) density within childhood-onset genes suggesting that we may be missing pathogenic repeat expansions within this cohort. This was verified in both childhood- and adult-onset ataxia patients from the 100,000 Genomes Project who were unexpectedly found to have a trend for higher repeat sizes even at naturally-occurring STRs within known ataxia genes, implying a role for STRs in pathogenesis. 
Using unsupervised analysis, we found significant similarities in genomic annotation across the gene panels, which suggested adult- and childhood-onset patients should be screened using a common diagnostic gene set. We tested this within the 100,000 Genomes Project by assessing the burden of pathogenic variants among childhood-onset genes in adult-onset patients and vice versa. This demonstrated a significantly higher burden of rare, potentially pathogenic variants in conventional childhood-onset genes among individuals with adult-onset ataxia. Our analysis has implications for the current clinical practice in genetic testing for hereditary ataxia. We suggest that the diagnostic rate for hereditary ataxia could be increased by removing the age-of-onset partition, and through a modified screening for repeat expansions in naturally-occurring STRs within known ataxia-associated genes, in effect treating these regions as candidate pathogenic loci
RAB32 Ser71Arg in autosomal dominant Parkinson's disease: linkage, association, and functional analyses
BACKGROUND: Parkinson's disease is a progressive neurodegenerative disorder with multifactorial causes, among which genetic risk factors play a part. The RAB GTPases are regulators and substrates of LRRK2, and variants in the LRRK2 gene are important risk factors for Parkinson's disease. We aimed to explore genetic variability in RAB GTPases within cases of familial Parkinson's disease. METHODS: We did whole-exome sequencing in probands from families in Canada and Tunisia with Parkinson's disease without a genetic cause, who were recruited from the Centre for Applied Neurogenetics (Vancouver, BC, Canada), an international consortium that includes people with Parkinson's disease from 36 sites in 24 countries. 61 RAB GTPases were genetically screened, and candidate variants were genotyped in relatives of the probands to assess disease segregation by linkage analysis. Genotyping was also done to assess variant frequencies in individuals with idiopathic Parkinson's disease and controls, matched for age and sex, who were also from the Centre for Applied Neurogenetics but unrelated to the probands or each other. All participants were aged 18 years or older. The sequencing and genotyping findings were validated by case-control association analyses using bioinformatic data obtained from publicly available clinicogenomic databases (AMP-PD, GP2, and 100 000 Genomes Project) and a private German clinical diagnostic database (University of Tübingen). Clinical and pathological findings were summarised and haplotypes were determined. In-vitro studies were done to investigate protein interactions and enzyme activities. FINDINGS: Between June 1, 2010, and May 31, 2017, 130 probands from Canada and Tunisia (47 [36%] female and 83 [64%] male; mean age 72·7 years [SD 11·7; range 38-96]; 109 White European ancestry, 18 north African, two east Asian, and one Hispanic) underwent whole-exome sequencing. 
15 variants in RAB GTPase genes were identified, of which the RAB32 variant c.213C>G (Ser71Arg) cosegregated with autosomal dominant Parkinson's disease in three families (nine affected individuals; non-parametric linkage Z score=1·95; p=0·03). 2604 unrelated individuals with Parkinson's disease and 344 matched controls were additionally genotyped, and five more people originating from five countries (Canada, Italy, Poland, Turkey, and Tunisia) were identified with the RAB32 variant. From the database searches, in which 6043 individuals with Parkinson's disease and 62 549 controls were included, another eight individuals were identified with the RAB32 variant from four countries (Canada, Germany, UK, and USA). Overall, the association of RAB32 c.213C>G (Ser71Arg) with Parkinson's disease was significant (odds ratio [OR] 13·17, 95% CI 2·15-87·23; p=0·0055; I2=99·96%). In the people who had the variant, Parkinson's disease presented at age 54·6 years (SD 12·75, range 31-81, n=16), and two-thirds had a family history of parkinsonism. RAB32 Ser71Arg heterozygotes shared a common haplotype, although penetrance was incomplete. Findings in one individual at autopsy showed sparse neurofibrillary tangle pathology in the midbrain and thalamus, without Lewy body pathology. In functional studies, RAB32 Arg71 activated LRRK2 kinase to a level greater than RAB32 Ser71. INTERPRETATION: RAB32 Ser71Arg is a novel genetic risk factor for Parkinson's disease, with reduced penetrance. The variant was found in individuals with Parkinson's disease from multiple ethnic groups, with the same haplotype. In-vitro assays show that RAB32 Arg71 activates LRRK2 kinase, which indicates that genetically distinct causes of familial parkinsonism share the same mechanism. The discovery of RAB32 Ser71Arg also suggests several genetically inherited causes of Parkinson's disease originated to control intracellular immunity. 
This shared aetiology should be considered in future translational research, while the global epidemiology of RAB32 Ser71Arg needs to be assessed to inform genetic counselling. FUNDING: National Institutes of Health, the Canada Excellence Research Chairs program, Aligning Science Across Parkinson's, the Michael J Fox Foundation for Parkinson's Research, and the UK Medical Research Council.
Genomewide Association Studies of LRRK2 Modifiers of Parkinson's Disease.
OBJECTIVE: The aim of this study was to search for genes/variants that modify the effect of LRRK2 mutations in terms of penetrance and age-at-onset of Parkinson's disease. METHODS: We performed the first genomewide association study of penetrance and age-at-onset of Parkinson's disease in LRRK2 mutation carriers (776 cases and 1,103 non-cases at their last evaluation). Cox proportional hazard models and linear mixed models were used to identify modifiers of penetrance and age-at-onset of LRRK2 mutations, respectively. We also investigated whether a polygenic risk score derived from a published genomewide association study of Parkinson's disease was able to explain variability in penetrance and age-at-onset in LRRK2 mutation carriers. RESULTS: A variant located in the intronic region of CORO1C on chromosome 12 (rs77395454; p value = 2.5E-08, beta = 1.27, SE = 0.23, risk allele: C) met genomewide significance for the penetrance model. Co-immunoprecipitation analyses of LRRK2 and CORO1C supported an interaction between these 2 proteins. A region on chromosome 3, within a previously reported linkage peak for Parkinson's disease susceptibility, showed suggestive associations in both models (penetrance top variant: p value = 1.1E-07; age-at-onset top variant: p value = 9.3E-07). A polygenic risk score derived from publicly available Parkinson's disease summary statistics was a significant predictor of penetrance, but not of age-at-onset. INTERPRETATION: This study suggests that variants within or near CORO1C may modify the penetrance of LRRK2 mutations. In addition, common Parkinson's disease associated variants collectively increase the penetrance of LRRK2 mutations. ANN NEUROL 2021;90:82-94
Genome-wide association study of REM sleep behavior disorder identifies polygenic risk and brain expression effects
Rapid-eye movement (REM) sleep behavior disorder (RBD), enactment of dreams during REM sleep, is an early clinical symptom of alpha-synucleinopathies and defines a more severe subtype. The genetic background of RBD and its underlying mechanisms are not well understood. Here, we perform a genome-wide association study of RBD, identifying five RBD risk loci near SNCA, GBA, TMEM175, INPP5F, and SCARB2. Expression analyses highlight SNCA-AS1 and potentially SCARB2 differential expression in different brain regions in RBD, with SNCA-AS1 further supported by colocalization analyses. Polygenic risk score, pathway analysis, and genetic correlations provide further insights into RBD genetics, highlighting RBD as a unique alpha-synucleinopathy subpopulation that will allow future early intervention
TDP-43 loss and ALS-risk SNPs drive mis-splicing and depletion of UNC13A
Variants of UNC13A, a critical gene for synapse function, increase the risk of amyotrophic lateral sclerosis and frontotemporal dementia, two related neurodegenerative diseases defined by mislocalization of the RNA-binding protein TDP-43. Here we show that TDP-43 depletion induces robust inclusion of a cryptic exon in UNC13A, resulting in nonsense-mediated decay and loss of UNC13A protein. Two common intronic UNC13A polymorphisms strongly associated with amyotrophic lateral sclerosis and frontotemporal dementia risk overlap with TDP-43 binding sites. These polymorphisms potentiate cryptic exon inclusion, both in cultured cells and in brains and spinal cords from patients with these conditions. Our findings, which demonstrate a genetic link between loss of nuclear TDP-43 function and disease, reveal the mechanism by which UNC13A variants exacerbate the effects of decreased TDP-43 function. They further provide a promising therapeutic target for TDP-43 proteinopathies
Genome-wide association study of REM sleep behavior disorder identifies polygenic risk and brain expression effects
Rapid-eye movement (REM) sleep behavior disorder (RBD), enactment of dreams during REM sleep, is an early clinical symptom of alpha-synucleinopathies and defines a more severe subtype. The genetic background of RBD and its underlying mechanisms are not well understood. Here, we perform a genome-wide association study of RBD, identifying five RBD risk loci near SNCA, GBA, TMEM175, INPP5F, and SCARB2. Expression analyses highlight SNCA-AS1 and potentially SCARB2 differential expression in different brain regions in RBD, with SNCA-AS1 further supported by colocalization analyses. Polygenic risk score, pathway analysis, and genetic correlations provide further insights into RBD genetics, highlighting RBD as a unique alpha-synucleinopathy subpopulation that will allow future early intervention
Genome sequencing analysis identifies new loci associated with Lewy body dementia and provides insights into its genetic architecture
The genetic basis of Lewy body dementia (LBD) is not well understood. Here, we performed whole-genome sequencing in large cohorts of LBD cases and neurologically healthy controls to study the genetic architecture of this understudied form of dementia and to generate a resource for the scientific community. Genome-wide association analysis identified five independent risk loci, whereas genome-wide gene-aggregation tests implicated mutations in the gene GBA. Genetic risk scores demonstrate that LBD shares risk profiles and pathways with Alzheimer’s and Parkinson’s disease, providing a deeper molecular understanding of the complex genetic architecture of this age-related neurodegenerative condition