276 research outputs found

    Non-coding regulatory elements: potential roles in disease and the case of epilepsy

    Non-coding DNA (ncDNA) refers to the portion of the genome that does not code for proteins and accounts for the greatest physical proportion of the human genome. ncDNA includes sequences that are transcribed into RNA molecules, such as ribosomal RNAs (rRNAs), microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), and un-transcribed sequences that have regulatory functions, including gene promoters and enhancers. Variation in non-coding regions of the genome has an established role in human disease, with growing evidence from many areas, including several cancers, Parkinson's disease and autism. Here, we review the features and functions of the regulatory elements that are present in the non-coding genome and the role that these regions have in human disease. We then review the existing research in epilepsy and emphasise the potential value of further exploring non-coding regulatory elements in epilepsy. In addition, we outline the most widely used techniques for recognising regulatory elements throughout the genome, current methodologies for investigating variation and the main challenges associated with research in the field of non-coding DNA.

    Improving GENCODE reference gene annotation using a high-stringency proteogenomics workflow.

    Complete annotation of the human genome is indispensable for medical research. The GENCODE consortium strives to provide this, augmenting computational and experimental evidence with manual annotation. The rapidly developing field of proteogenomics provides evidence for the translation of genes into proteins and can be used to discover and refine gene models. However, for both the proteomics and annotation groups, there is a lack of guidelines for integrating these data. Here we report a stringent workflow for the interpretation of proteogenomic data that could be used by the annotation community to interpret novel proteogenomic evidence. Based on reprocessing of three large-scale publicly available human data sets, we show that a conservative approach using stringent filtering is required to generate valid identifications. Evidence has been found supporting the addition of 16 novel protein-coding genes to GENCODE. Despite this, many peptide identifications in pseudogenes cannot be annotated due to the absence of orthogonal supporting evidence.

    Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.

    Proteogenomics leverages information derived from proteomic data to improve genome annotations. Of particular interest are "novel" peptides that provide direct evidence of protein expression for genomic regions not previously annotated as protein-coding. We present a modular, automated data analysis pipeline aimed at detecting such "novel" peptides in proteomic data sets. This pipeline implements criteria developed by proteomics and genome annotation experts for high-stringency peptide identification and filtering. Our pipeline is based on the OpenMS computational framework; it incorporates multiple database search engines for peptide identification and applies a machine-learning approach (Percolator) to post-process search results. We describe several new and improved software tools that we developed to facilitate proteogenomic analyses and that add to the wealth of tools provided by OpenMS. We demonstrate the application of our pipeline to a human testis tissue data set previously acquired for the Chromosome-Centric Human Proteome Project, which led to the addition of five new gene annotations on the human reference genome.

    The cultural capitalists: notes on the ongoing reconfiguration of trafficking culture in Asia

    Most analyses of the international flows of the illicit art market have described a global situation in which a postcolonial legacy of acquisition and collection exploits cultural heritage by pulling it westwards towards major international trade nodes in the USA and Europe. As the locus of consumptive global economic power shifts, however, these traditional flows are pulled in other directions: notably for the present commentary, towards and within Asia.

    Genome-wide association study: Exploring the genetic basis for responsiveness to ketogenic dietary therapies for drug-resistant epilepsy

    OBJECTIVE: With the exception of specific metabolic disorders, predictors of response to ketogenic dietary therapies (KDTs) are unknown. We aimed to determine whether common variation across the genome influences the response to KDT for epilepsy. METHODS: We genotyped individuals who were negative for glucose transporter type 1 deficiency syndrome or other metabolic disorders, who received KDT for epilepsy. Genotyping was performed with the Infinium HumanOmniExpressExome Beadchip. Hospital records were used to obtain demographic and clinical data. KDT response (≥50% seizure reduction) at 3-month follow-up was used to dissect out nonresponders and responders. We then performed a genome-wide association study (GWAS) in nonresponders vs responders, using a linear mixed model and correcting for population stratification. Variants with minor allele frequency <0.05 and those that did not pass quality control filtering were excluded. RESULTS: After quality control filtering, the GWAS of 112 nonresponders vs 123 responders revealed an association locus at 6p25.1, 61 kb upstream of CDYL (rs12204701, P = 3.83 × 10⁻⁸, odds ratio [A] = 13.5, 95% confidence interval [CI] 4.07-44.8). Although analysis of regional linkage disequilibrium around rs12204701 did not strengthen the likelihood of CDYL being the candidate gene, additional bioinformatic analyses suggest it is the most likely candidate. SIGNIFICANCE: CDYL deficiency has been shown to disrupt neuronal migration and to influence susceptibility to epilepsy in mice. Further exploration with a larger replication cohort is warranted to clarify whether CDYL is the causal gene underlying the association signal.

    Do aluminium-based phosphate binders continue to have a role in contemporary nephrology practice?

    Background: Aluminium-containing phosphate binders have long been used for treatment of hyperphosphatemia in dialysis patients. Their safety became controversial in the early 1980s after reports of aluminium-related neurological and bone disease began to appear. Available historical evidence, however, suggests that neurological toxicity may have primarily been caused by excessive exposure to aluminium in dialysis fluid, rather than aluminium-containing oral phosphate binders. Limited evidence suggests that aluminium bone disease may also be on the decline in the era of aluminium removal from dialysis fluid, even with continued use of aluminium binders.

    Discerning natural and anthropogenic organic matter inputs to salt marsh sediments of Ria Formosa lagoon (South Portugal)

    Sedimentary organic matter (OM) origin and molecular composition provide useful information to understand carbon cycling in coastal wetlands. Core sediments from three transects along the Ria Formosa lagoon intertidal zone were analysed using analytical pyrolysis (Py-GC/MS) to determine composition, distribution and origin of sedimentary OM. The distribution of alkyl compounds (alkanes, alkanoic acids and alkan-2-ones), polycyclic aromatic hydrocarbons (PAHs), lignin-derived methoxyphenols, linear alkylbenzenes (LABs), steranes and hopanes indicated OM inputs to the intertidal environment from natural (autochthonous and allochthonous) as well as anthropogenic sources. Several n-alkane geochemical indices used to assess the distribution of main OM sources (terrestrial and marine) in the sediments indicate that algal and aquatic macrophyte derived OM inputs dominated over terrigenous plant sources. The lignin-derived methoxyphenol assemblage, dominated by vinylguaiacol and vinylsyringol derivatives in all sediments, points to a large OM contribution from higher plants. The spatial distributions of PAHs showed that most pollution sources were mixed, including both pyrogenic and petrogenic inputs. Low carbon preference indexes (CPI > 1) for n-alkanes, the presence of UCM (unresolved complex mixture) and the distribution of hopanes (C-29-C-36) and steranes (C-27-C-29) suggested localized petroleum-derived hydrocarbon inputs to the core sediments. Series of LABs were found in most sediment samples, also pointing to anthropogenic contributions from domestic sewage to the sediment OM. Funding: EU Erasmus Mundus Joint Doctorate fellowship (FUECA, University of Cadiz, Spain); European Commission [FP7-ENV-2011, 282845, FP7-534 ENV-2012, 308392]; MINECO project INTERCARBON [CGL2016-78937-R].

    Novel Association Strategy with Copy Number Variation for Identifying New Risk Loci of Human Diseases

    Copy number variations (CNVs) are important causal genetic variations for human disease; however, the lack of a statistical model has impeded the systematic testing of CNVs associated with disease in large-scale cohorts. Here, we developed a novel integrated strategy to test CNV-association in genome-wide case-control studies. We converted the single-nucleotide polymorphism (SNP) signal to copy number states using a well-trained hidden Markov model. We mapped the susceptible CNV-loci through SNP site-specific testing to cope with the physiological complexity of CNVs. We also ensured the credibility of the associated CNVs through further window-based CNV-pattern clustering. Genome-wide data for seven diseases were used to test our strategy and, in total, we identified 36 new susceptible loci that are associated with CNVs for the seven diseases: 5 with bipolar disorder, 4 with coronary artery disease, 1 with Crohn's disease, 7 with hypertension, 9 with rheumatoid arthritis, 7 with type 1 diabetes and 3 with type 2 diabetes. Fifteen of these identified loci were validated through genotype-association and physiological function from previous studies, which provides further confidence in our results. Notably, the genes associated with bipolar disorder converged in phosphoinositide/calcium signaling, a well-known affected pathway in bipolar disorder, which further supports that CNVs have an impact on bipolar disorder. Our results demonstrated the effectiveness and robustness of our CNV-association analysis and provided an alternative avenue for discovering new associated loci of human diseases.

    RNAcentral 2021: secondary structure integration, improved sequence search and new member databases

    RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world’s largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org

    The fitness cost of mis-splicing is the main determinant of alternative splicing patterns

    Background: Most eukaryotic genes are subject to alternative splicing (AS), which may contribute to the production of protein variants or to the regulation of gene expression via nonsense-mediated messenger RNA (mRNA) decay (NMD). However, a fraction of splice variants might correspond to spurious transcripts and the question of the relative proportion of splicing errors to functional splice variants remains highly debated. Results: We propose a test to quantify the fraction of AS events corresponding to errors. This test is based on the fact that the fitness cost of splicing errors increases with the number of introns in a gene and with expression level. We analyzed the transcriptome of the intron-rich eukaryote Paramecium tetraurelia. We show that in both normal and in NMD-deficient cells, AS rates strongly decrease with increasing expression level and with increasing number of introns. This relationship is observed for AS events that are detectable by NMD as well as for those that are not, which invalidates the hypothesis of a link with the regulation of gene expression. Our results show that in genes with a median expression level, 92–98% of observed splice variants correspond to errors. We observed the same patterns in human transcriptomes and we further show that AS rates correlate with the fitness cost of splicing errors. Conclusions: These observations indicate that genes under weaker selective pressure accumulate more maladaptive substitutions and are more prone to splicing errors. Thus, to a large extent, patterns of gene expression variants simply reflect the balance between selection, mutation, and drift.