7,582 research outputs found

    Gene Editing in pig models of inherited retinal diseases


    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci and have established themselves as a valuable method for unravelling the complex biology of many diseases. Although GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from other highly correlated markers associated by linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method’s value into question. Although thousands of disease susceptibility loci have been reported, causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. ML models have ranged from logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have yielded important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve, an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested on re-application to blood lipid traits.

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. All of these processes must be investigated to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data that is based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.

    Genetic approaches to exploit landraces for improvement of Triticum turgidum ssp. durum in the age of climate change

    Addressing the challenges of climate change and durum wheat production is becoming an important driver for food and nutrition security in the Mediterranean area, where the major producing countries are located (Italy, Spain, France, Greece, Morocco, Algeria, Tunisia, Turkey, and Syria). One of the emerging strategies to cope with durum wheat adaptation is the exploration and exploitation of the existing genetic variability in landrace populations. In this context, this review aims to highlight the important role of durum wheat landraces as a useful genetic resource to improve the sustainability of Mediterranean agroecosystems, with a focus on adaptation to environmental stresses. We describe the most recent molecular techniques and statistical approaches suitable for the identification of beneficial genes/alleles related to the most important traits in landraces and the development of molecular markers for marker-assisted selection. Finally, we outline the state of the art on landrace genetic diversity and the signatures of selection already identified from these accessions for adaptability to the environment.

    The state of quantum computing applications in health and medicine

    Quantum computing hardware and software have made enormous strides over the last years. Questions around quantum computing's impact on research and society have changed from "if" to "when/how". The 2020s have been described as the "quantum decade", and the first production solutions that drive scientific and business value are expected to become available over the next years. Medicine, including fields in healthcare and life sciences, has seen a flurry of quantum-related activities and experiments in the last few years (although medicine and quantum theory have arguably been entangled ever since Schrödinger's cat). The initial focus was on biochemical and computational biology problems; recently, however, clinical and medical quantum solutions have drawn increasing interest. The rapid emergence of quantum computing in health and medicine necessitates a mapping of the landscape. In this review, clinical and medical proof-of-concept quantum computing applications are outlined and put into perspective. These consist of over 40 experimental and theoretical studies from the last few years. The use case areas span genomics, clinical research and discovery, diagnostics, and treatments and interventions. Quantum machine learning (QML) in particular has rapidly evolved and has been shown to be competitive with classical benchmarks in recent medical research. Near-term QML algorithms, for instance quantum support vector classifiers and quantum neural networks, have been trained with diverse clinical and real-world data sets. This includes studies in generating new molecular entities as drug candidates, diagnosing based on medical image classification, predicting patient persistence, forecasting treatment effectiveness, and tailoring radiotherapy. The use cases and algorithms are summarized and an outlook on medicine in the quantum era, including technical and ethical challenges, is provided.

    Discovery and applications of family AA9 lytic polysaccharide monooxygenases

    Auxiliary activity family 9 lytic polysaccharide monooxygenases (abbreviated as AA9s or LPMO9s) are fungal mono-copper enzymes capable of oxidatively cleaving various plant cell wall oligo- and/or polysaccharides. LPMO9s are key components of lignocellulolytic enzyme cocktails used in today’s biorefineries to break down biomass into fermentable sugars. Highly stable enzymes with novel functions are of great interest to improve enzymatic biorefinery processes and their economic feasibility. Genome sequencing of an industrially relevant fungus, Thermothielavioides terrestris LPH172, revealed 411 putative carbohydrate-active enzyme (CAZy) domains. Transcriptomic analysis indicated that the fungus upregulated numerous LPMO9 genes in concert with canonical cellulase and hemicellulase encoding genes to degrade lignocellulose. Nuanced co-upregulation was detected for LPMO9 genes and those encoding other redox-active CAZymes. Six strongly upregulated TtLPMO9 genes were heterologously expressed and functionally characterized using cellulosic and hemicellulosic substrates. These studies showed that the multitude of LPMO9 genes provided the fungus with different functions, including previously unknown cleavage of cellulose-associated spruce arabinoglucuronoxylan and acetylated birch glucuronoxylan. In a related study, xylanolytic LPMO9 activity was revealed or enhanced by debranching xylans enzymatically, which likely assumed a rigid and stretched xylan conformation that associated with cellulose to increase accessibility to LPMO9s. LPMOs have unique oxidative powers which render them advantageous for various biorefinery applications. A C1-oxidizing TtLPMO9G was found to increase the amount of carboxyl groups on sulfated cellulose nanocrystals by 10%, without any extensive degradation of the crystals. The functional groups thus generated were used for proof-of-concept crosslinking, which could aid in the production of bio-based materials. In another application, addition of TaLPMO9A to a benchmark LPMO-poor cellulolytic cocktail was shown to improve saccharification yields of mildly pretreated spruce substrates. The final glucose and xylose yields were increased by up to 1.6- and 1.5-fold, respectively, illustrating how LPMO9s can be exploited in the saccharification of these notoriously recalcitrant substrates.
