77 research outputs found

    Flexible RNA design under structure and sequence constraints using formal languages

    The problem of RNA secondary structure design (also called inverse folding) is the following: given a target secondary structure, one aims to create a sequence that folds into, or is compatible with, that structure. In several practical applications in biology, additional constraints must be taken into account, such as the presence/absence of regulatory motifs, either at a specific location or anywhere in the sequence. In this study, we investigate the design of RNA sequences from their targeted secondary structure, given these additional sequence constraints. To this purpose, we develop a general framework based on concepts of language theory, namely context-free grammars and finite automata. We efficiently combine a comprehensive set of constraints into a unifying context-free grammar of moderate size. From there, we use generic algorithms to perform a (weighted) random generation, or an exhaustive enumeration, of candidate sequences. The resulting method, whose complexity scales linearly with the length of the RNA, was implemented as a standalone program. The resulting software was embedded into a publicly available dedicated web server. The applicability of the method was demonstrated on a concrete case study dedicated to Exon Splicing Enhancers, in which our approach was successfully used in the design of in vitro experiments. Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013)
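
As a rough illustration of the design task described in this abstract, the sketch below samples sequences that are compatible with a dot-bracket target structure and that avoid a set of forbidden motifs. It is not the authors' method: the abstract describes combining constraints into a context-free grammar with finite automata, whereas this sketch uses plain rejection sampling, which may need many tries when constraints are tight. The function names (`parse_pairs`, `sample_compatible`, `is_compatible`), the allowed pair set, and the example motif are all assumptions made for illustration.

```python
import random

# Assumption of this sketch: Watson-Crick and G-U wobble pairs are allowed.
PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]
BASES = "ACGU"


def parse_pairs(structure):
    """Map each opening position to its closing partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            partner[stack.pop()] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return partner


def is_compatible(seq, structure):
    """True if every paired position in the structure holds an allowed base pair."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in PAIRS for i, j in parse_pairs(structure).items())


def sample_compatible(structure, forbidden=(), rng=random, max_tries=10000):
    """Rejection-sample a compatible sequence containing no forbidden motif."""
    partner = parse_pairs(structure)
    for _ in range(max_tries):
        seq = [None] * len(structure)
        for i, ch in enumerate(structure):
            if ch == ".":
                seq[i] = rng.choice(BASES)  # unpaired position: any base
            elif ch == "(":
                # paired positions: draw both ends together from the allowed pairs
                seq[i], seq[partner[i]] = rng.choice(PAIRS)
        candidate = "".join(seq)
        if not any(motif in candidate for motif in forbidden):
            return candidate
    raise RuntimeError("no sequence found within max_tries")


if __name__ == "__main__":
    target = "((..((...))..))"
    print(sample_compatible(target, forbidden=("GGG",), rng=random.Random(0)))
```

Rejection sampling is chosen here only because it is short; a grammar-based generator of the kind the abstract describes would enforce the motif constraints during generation instead of discarding candidates afterwards.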

    Extent, causes, and consequences of small RNA expression variation in human adipose tissue.

    Small RNAs are functional molecules that modulate mRNA transcripts and have been implicated in the aetiology of several common diseases. However, little is known about the extent of their variability within the human population. Here, we characterise the extent, causes, and effects of naturally occurring variation in expression and sequence of small RNAs from adipose tissue in relation to genotype, gene expression, and metabolic traits in the MuTHER reference cohort. We profiled the expression of 15 to 30 base pair RNA molecules in subcutaneous adipose tissue from 131 individuals using high-throughput sequencing, and quantified levels of 591 microRNAs and small nucleolar RNAs. We identified three genetic variants and three RNA editing events. Highly expressed small RNAs are more conserved within mammals than average, as are those with highly variable expression. We identified 14 genetic loci significantly associated with nearby small RNA expression levels, seven of which also regulate an mRNA transcript level in the same region. In addition, these loci are enriched for variants significant in genome-wide association studies for body mass index. Contrary to expectation, we found no evidence for negative correlation between expression level of a microRNA and its target mRNAs. Trunk fat mass, body mass index, and fasting insulin were associated with more than twenty small RNA expression levels each, while fasting glucose had no significant associations. This study highlights the similar genetic complexity and shared genetic control of small RNA and mRNA transcripts, and gives a quantitative picture of small RNA expression variation in the human population

    Chromatin and Epigenetics

    Genomics has gathered broad public attention since Lamarck put forward his top-down hypothesis of 'motivated change' in 1809 in his famous book "Philosophie Zoologique" and even more so since Darwin published his famous bottom-up theory of natural selection in "The Origin of Species" in 1859. The public awareness culminated in the much anticipated race to decipher the sequence of the human genome in 2002. Over all those years, it has become apparent that genomic DNA is compacted into chromatin with a dedicated 3D higher-order organization and dynamics, and that epigenetic modifications exist on each structural level. The book "Chromatin and Epigenetics" addresses current issues in the fields of epigenetics and chromatin, ranging from more theoretical overviews in the first four chapters to much more detailed methodologies and insights into diagnostics and treatments in the following chapters. The chapters illustrate in their depth and breadth that genetic information is stored on all structural and dynamical levels within the nucleus, with corresponding modifications of functional relevance. Thus, only an integrative systems approach makes it possible to understand, treat, and manipulate the holistic interplay of genotype and phenotype that creates functional genomes. The book chapters therefore contribute to this general perspective, not only opening opportunities for a truly universal view on genetic information but also being key for a general understanding of genomes, their function, as well as life and evolution in general

    Tailoring Microalgae for Efficient Biofuel Production

    Depleting fossil fuels, soaring prices, growing demand, and global climate change concerns have driven the search for an alternative source of sustainable fuel. Microalgae have emerged as a potential feedstock for biofuel production as many strains accumulate higher amounts of lipid, with faster biomass growth and higher photosynthetic yield than their land plant counterparts. In addition, microalgae can be cultured without needing agricultural land or ecological landscapes and offer opportunities for mitigating global climate change, allowing waste water treatment and carbon dioxide sequestration. Despite these benefits, microalgae pose many challenges, including low lipid yield under limiting growth conditions and slow growth in high lipid content strains. Biotechnological interventions can make major advances in strain improvement for the commercial scale production of biofuel. We discuss various strategies, including an efficient transformation toolbox, to increase lipid accumulation and its quality through the regulation of key enzymes involved in lipid production, by blocking the competing pathways, pyramiding genes, enabling high cell biomass under nutrient-deprived conditions and other environmental stresses, and controlling the upstream regulators of targets, the transcription factors, and microRNAs. We highlight the opportunities emerging from the current progress in the application of genome editing in microalgae for accelerating the strain improvement program

    A profile of differential DNA methylation in sporadic human prion disease blood: precedent, implications and clinical promise

    Sporadic Creutzfeldt-Jakob Disease (sCJD) is a rare but devastating neurodegenerative disorder characterised by misfolding, propagation and deposition of the prion protein in the brain, leading to neuronal death and rapid cognitive and functional decline. As there is no obvious genetic cause of sCJD, the epigenetic status of sCJD patients may clarify spontaneous prion disease aetiology or reveal biomarkers of the disease. Blood from patients was profiled to document genome-wide differential DNA methylation. 38 loci were identified as being differentially methylated in sCJD blood, including two which associated with disease severity as measured by the MRC Scale score. Of 7 loci considered for replication, 5 showed similar effects in a second cohort of patients, but not in patients with Alzheimer’s disease, iatrogenic CJD, or inherited prion disease, suggesting these effects are specific to the sporadic form of CJD. Notably, hypomethylation at a site in the promoter of AIM2, an inflammasome component, retained its association with disease severity. Hypomethylation of FKBP5, a gene known to regulate the cellular response to cortisol, prompted further investigation which revealed that circulating cortisol is indeed elevated in sCJD patients. Profiling of frontal cortex-derived DNA showed that differential methylation observed in blood is absent from the brain methylome. Machine learning classification based on genome-wide methylation data was able to distinguish sCJD from healthy control status with an accuracy of 87.04%. This is an appreciable level of accuracy and, importantly, sets a precedent for further classification of prion patients in more complex clinical and research settings, as well as assisting differential diagnosis of less conventional rapid dementias

    Integrative bioinformatics applications for complex human disease contexts

    This thesis presents new methods for the analysis of high-throughput data from modern sources in the context of complex human diseases, using the example of a bioinformatics analysis workflow. New measurement techniques improve the resolution with which cellular and molecular processes can be monitored. While RNA sequencing (RNA-seq) measures mRNA expression, single-cell RNA-seq (scRNA-seq) resolves this on a per-cell basis. Long-read sequencing is increasingly used in genomics. Imaging mass spectrometry (IMS) measures protein levels in tissues with spatial resolution. All these techniques pose specific challenges, which need to be addressed with new computational methods. Collecting knowledge with contextual annotations is important for integrative data analyses. Such knowledge is available through large literature repositories, from which information, such as miRNA-gene interactions, can be extracted using text mining methods. After aggregating this information in new databases, specific questions can be answered with traceable evidence. The combination of experimental data with these databases offers new possibilities for data integrative methods and for answering questions relevant for complex human diseases. Several data sources are made available, such as literature for text mining miRNA-gene interactions (Chapter 2), next- and third-generation sequencing data for genomics and transcriptomics (Chapters 4.1, 5), and IMS for spatially resolved proteomics (Chapter 4.4). For these data sources new methods for information extraction and pre-processing are developed. For instance, third-generation sequencing runs can be monitored and evaluated using the poreSTAT and sequ-into methods. The integrative (downstream) analyses make use of these (heterogeneous) data sources. The cPred method (Chapter 4.2) for cell type prediction from scRNA-seq data was successfully applied in the context of the SARS-CoV-2 pandemic.
The robust differential expression (DE) analysis pipeline RoDE (Chapter 6.1) contains a large set of methods for (differential) data analysis, reporting and visualization of RNA-seq data. Topics of accessibility of bioinformatics software are discussed along with practical applications (Chapter 3). The developed miRNA-gene interaction database gives valuable insights into atherosclerosis-relevant processes and serves as a regulatory network for the prediction of active miRNA regulators in RoDE (Chapter 6.1). The cPred predictions, RoDE results, scRNA-seq and IMS data are unified as input for the 3D-index Aorta3D (Chapter 6.2), which makes atherosclerosis-related datasets browsable. Finally, the scRNA-seq analysis with subsequent cPred cell type prediction, and the robust analysis of bulk RNA-seq datasets, led to novel insights into COVID-19. Taken together, the methods discussed improve integrative analysis for complex human disease contexts at essential points. The dissertation describes methods for processing current high-throughput data, as well as procedures for their further integrative analysis. These are applied primarily in the context of complex human diseases. New measurement techniques allow a more detailed observation of biomedical processes. RNA sequencing (RNA-seq) measures mRNA expression, and modern single-cell RNA-seq (scRNA-seq) does so even for (very many) individual cells. Long-read sequencing is increasingly used to sequence whole genomes. Imaging mass spectrometry (IMS) can quantify proteins in tissues with spatial resolution. These techniques bring specific challenges that must be addressed with new bioinformatics methods. Obtaining suitable contextual knowledge is also important for integrative data analysis.
Scientific findings are published in articles that are accessible through large literature databases. Text mining can extract information from them, e.g. miRNA-gene interactions, which are aggregated in dedicated databases in order to answer specific questions with traceable evidence. In combination with experimental data, this opens new possibilities for integrative methods. By extracting raw data and pre-processing them, several data sources are made accessible, such as literature for text mining of miRNA-gene interactions (Chapter 2), long-read and RNA-seq data for genomics and transcriptomics (Chapters 4.2, 5), and IMS for protein measurements (Chapter 4.4). For example, the poreSTAT and sequ-into methods serve to pre-process and evaluate long-read sequencing runs. The integrative (downstream) analysis uses these (heterogeneous) data sources. For determining cell types in scRNA-seq experiments, the cPred method (Chapter 4.2) was successfully applied in the context of the SARS-CoV-2 pandemic. The robust pipeline RoDE, which provides many methods for (differential) data analysis, reporting, and visualization (Chapter 6.1), was also used there. Topics of usability of (bioinformatics) software are discussed using practical applications (Chapter 3). The developed miRNA-gene interaction database gives valuable insights into atherosclerosis-relevant processes and serves as a regulatory network for predicting active miRNA regulators in RoDE (Chapter 6.1). The cPred method, RoDE results, scRNA-seq and IMS data are brought together in the 3D index Aorta3D (Chapter 6.2), which makes relevant datasets searchable. The methods discussed lead to considerable improvements for integrative data analysis in complex human disease contexts

    Biotechnology to Combat COVID-19

    This book provides an inclusive and comprehensive discussion of the transmission, science, biology, genome sequencing, diagnostics, and therapeutics of COVID-19. It also discusses public and government health measures and the roles of media as well as the impact of society on the ongoing efforts to combat the global pandemic. It addresses almost every topic that has been studied so far in the research on SARS-CoV-2 to gain insights into the fundamentals of the disease and mitigation strategies. This volume is a useful resource for virologists, epidemiologists, biologists, medical professionals, public health and government professionals, and all global citizens who have endured and battled against the pandemic

    Forest genomics and biotechnology

    This Research Topic addresses research in genomics and biotechnology to improve the growth and quality of forest trees for wood, pulp, biorefineries and carbon capture. Forests are the world’s greatest repository of terrestrial biomass and biodiversity. Forests serve critical ecological services, supporting the preservation of fauna and flora, and water resources. Planted forests also offer a renewable source of timber, for pulp and paper production, and the biorefinery. Despite their fundamental role for society, thousands of hectares of forests are lost annually due to deforestation, pests and pathogens and urban development. As a consequence, there is an increasing need to develop trees that are more productive under lower inputs, while understanding how they adapt to the environment and respond to biotic and abiotic stress. Forest genomics and biotechnology, disciplines that study the genetic composition of trees and the methods required to modify them, began over a quarter of a century ago with the development of the first genetic maps and establishment of early methods of genetic transformation. Since then, genomics and biotechnology have impacted all research areas of forestry. Genome analyses of tree populations have uncovered genes involved in adaptation and response to biotic and abiotic stress. Genes that regulate growth and development have been identified, and in many cases their mechanisms of action have been described. Genetic transformation is now widely used to understand the roles of genes and to develop germplasm that is more suitable for commercial tree plantations. However, in contrast to many annual crops that have benefited from centuries of domestication and extensive genomic and biotechnology research, in forestry the field is still in its infancy. Thus, tremendous opportunities remain unexplored. 
This Research Topic aims to briefly summarize recent findings, to discuss long-term goals, and to think ahead about future developments and how these can be applied to improve the growth and quality of forest trees. Mini-review articles are sought in forest genomics and biotechnology, with a focus on future directions applied to (1) genetic engineering, (2) adaptation, (3) genomics of conifers and hardwoods, (4) cell wall and wood formation, (5) development, (6) metabolic engineering, (7) biotic and abiotic resistance, and (8) the biorefinery