
23 research outputs found

Deep phenotyping: symptom annotation made simple with SAMS.

Author: Proft Sebastian
Robinson Peter N
Schalau Tobias
Seelow Dominik
Seelow Evelyn
Steinhaus Robin
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2022

Precision medicine needs precise phenotypes. The Human Phenotype Ontology (HPO) uses clinical signs instead of diagnoses and has become the standard annotation for patients' phenotypes when describing single gene disorders. Use of the HPO beyond human genetics is, however, still limited. With SAMS (Symptom Annotation Made Simple), we want to bring sign-based phenotyping to routine clinical care, to hospital patients as well as to outpatients. Our web-based application provides access to three widely used annotation systems: HPO, OMIM and Orphanet. Whilst data can be stored in our database, phenotypes can also be imported and exported as Global Alliance for Genomics and Health (GA4GH) Phenopackets without using the database. The web interface can easily be integrated into local databases, e.g. clinical information systems. SAMS lets users share their data with others, empowering patients to record their own signs and symptoms (or those of their children) and thus provide their doctors with additional information. We think that our approach will lead to better characterised patients, which is helpful not only for finding disease mutations but also for better understanding the pathophysiology of diseases and for recruiting patients for studies and clinical trials. SAMS is freely available at https://www.genecascade.org/SAMS/

Institutional Repository of the Freie Universität Berlin

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

MutationTaster2021

Author: Cooper David N.
Proft Sebastian
Schuelke Markus
Schwarz Jana Marie
Seelow Dominik
Steinhaus Robin
Publication date: 01/01/2021

Here we present an update to MutationTaster, our DNA variant effect prediction tool. The new version uses a different prediction model and attains higher accuracy than its predecessor, especially for rare benign variants. In addition, we have integrated many sources of data that only became available after the last release (such as gnomAD and ExAC pLI scores) and changed the splice site prediction model. To more easily assess the relevance of detected known disease mutations to the clinical phenotype of the patient, MutationTaster now provides information on the diseases they cause. Further changes represent a major overhaul of the interfaces to increase user-friendliness, whilst many changes under the hood have been designed to accelerate the processing of uploaded VCF files. We also offer an API for the rapid automated query of smaller numbers of variants from within other software. MutationTaster2021 integrates our disease mutation search engine, MutationDistiller, to prioritise variants from VCF files using the patient's clinical phenotype. The new version is available at https://www.genecascade.org/MutationTaster2021/. This website is free and open to all users and there is no login requirement.

Institutional Repository of the Freie Universität Berlin

Online Research @ Cardiff

RegEl corpus: identifying DNA regulatory elements in the scientific literature

Author: Garda Samuele
Hochmuth Stefanie
Lenihan-Geels Freyda
Leser Ulf
Proft Sebastian
Schülke Markus
Seelow Dominik
Publication venue: Humboldt-Universität zu Berlin
Publication date: 05/04/2022

High-throughput technologies led to the generation of a wealth of data on regulatory DNA elements in the human genome. However, results from disease-driven studies are primarily shared in textual form as scientific articles. Information extraction (IE) algorithms allow this information to be (semi-)automatically accessed. Their development, however, is dependent on the availability of annotated corpora. Therefore, we introduce RegEl (Regulatory Elements), the first freely available corpus annotated with regulatory DNA elements comprising 305 PubMed abstracts for a total of 2690 sentences. We focus on enhancers, promoters and transcription factor binding sites. Three annotators worked in two stages, achieving an overall 0.73 F1 inter-annotator agreement and 0.46 for regulatory elements. Depending on the entity type, IE baselines reach F1-scores of 0.48–0.91 for entity detection and 0.71–0.88 for entity normalization. Next, we apply our entity detection models to the entire PubMed collection and extract co-occurrences of genes or diseases with regulatory elements. This generates large collections of regulatory elements associated with 137 870 unique genes and 7420 diseases, which we make openly available. Database URL: https://zenodo.org/record/6418451#.YqcLHvexVqg Peer Reviewed
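The F1-based inter-annotator agreement mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes exact-match comparison of (start, end, entity type) spans between two annotators; the abstract does not state the matching criterion or how the three annotators were combined, so the function name `f1_agreement` and the example spans are hypothetical.

```python
def f1_agreement(spans_a, spans_b):
    """F1 between two annotators' span sets, treating A as the reference.

    Spans are (start, end, entity_type) tuples; only exact matches count
    (an assumption, the corpus may have used a different criterion).
    """
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0  # both annotators marked nothing: full agreement
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(b)
    recall = overlap / len(a)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations of one sentence by two annotators.
ANNOTATOR_A = [(0, 8, "enhancer"), (15, 23, "promoter"), (30, 34, "gene")]
ANNOTATOR_B = [(0, 8, "enhancer"), (15, 23, "gene"), (30, 34, "gene")]

AGREEMENT = f1_agreement(ANNOTATOR_A, ANNOTATOR_B)
print(round(AGREEMENT, 3))
```

With exact matching, F1 is symmetric in the two annotators, which is one reason it is often used for agreement on span annotations where no fixed set of negative items exists.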

PubMed Central

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

MutationTaster2021

Author: Cooper David N.
Proft Sebastian
Schuelke Markus
Schwarz Jana Marie
Seelow Dominik
Steinhaus Robin
Publication venue: Oxford University Press (OUP)
Publication date: 02/07/2021

Online Research @ Cardiff

Photoactivation of titanium-oxo cluster [Ti6O6(OR)6(O2CtBu)6]: mechanism, photoactivated structures, and onward reactivity with O2 to a peroxide complex

Author: Andrews Ryan T.
Barnes Thomas J.
Brown Stephen E.
Cunha Ana V.
De Proft Frank
Lees Martin R.
Mantaloufa Ioanna
Pike Sebastian D.
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 07/12/2022

The molecular titanium-oxo cluster [Ti6O6(OiPr)6(O2CtBu)6] (1) can be photoactivated by UV light, resulting in a deeply coloured mixed valent (photoreduced) Ti(III/IV) cluster, alongside alcohol and ketone (photooxidised) organic products. Mechanistic studies indicate that a two-electron (not free-radical) mechanism occurs in this process, which utilises the cluster structure to facilitate multielectron reactions. The photoreduced products [Ti6O6(OiPr)4(O2CtBu)6(sol)2], sol = iPrOH (2) or pyridine (3), can be isolated in good yield and are structurally characterized, each with two, uniquely arranged, antiferromagnetically coupled d-electrons. 2 and 3 undergo onward oxidation under air, with 3 cleanly transforming into the peroxide complex [Ti6O6(OiPr)4(O2CtBu)6(py)(O2)] (5). 5 reacts with isopropanol to regenerate the initial cluster (1), completing a closed cycle and suggesting opportunities for the deployment of these easily made and tuneable clusters for sustainable photocatalytic processes using air and light. The redox reactivity described here is only possible in a cluster with multiple Ti sites, which can perform multi-electron processes and can adjust its shape to accommodate changes in electron density.

Warwick Research Archives Portal Repository

Diatom DNA metabarcoding for ecological assessment: Comparison among bioinformatics pipelines used in six European countries reveals the need for standardization

Author: Alain Franc
Ana Baričević
Bonnie Bailet
Demetrio Mora
Jean-Marc Frigerio
Jonas Zimmermann
Laure Apothéloz-Perret-Gentil
Maria Kahlert
Martin Pfannkuchen
Martyn Kelly
Mathieu Ramon
Sebastian Proft
Teofana Chonova
Valentin Vasselon
Publication date: 01/01/2020

Ecological assessment of lakes and rivers using benthic diatom assemblages currently requires considerable taxonomic expertise to identify species using light microscopy. This traditional approach is also time-consuming. Diatom metabarcoding is a promising alternative and there is increasing interest in using this approach for routine assessment. However, until now, analysis protocols for diatom metabarcoding have been developed and optimised by research groups working in isolation. The diversity of existing bioinformatics methods highlights the need for an assessment of the performance and comparability of results of different methods. The aim of this study was to test the correspondence of outputs from six bioinformatics pipelines currently in use for diatom metabarcoding in different European countries. Raw sequence data from 29 biofilm samples were treated by each of the bioinformatics pipelines, five of them using the same curated reference database. The outputs of the pipelines were compared in terms of sequence unit assemblages, taxonomic assignment, biotic index score and ecological assessment outcomes. The last three components were also compared to outputs from traditional light microscopy, which is currently accepted for ecological assessment of phytobenthos, as required by the Water Framework Directive. We also tested the performance of the pipelines on the two DNA markers (rbcL and 18S-V4) that are currently used by the working groups participating in this study. The sequence unit assemblages produced by different pipelines showed significant differences in terms of assigned and unassigned read numbers and sequence unit numbers. When comparing the taxonomic assignments at genus and species level, correspondence of the taxonomic assemblages between pipelines was weak. Most discrepancies were linked to differential detection or quantification of taxa, despite the use of the same reference database.
Subsequent calculation of biotic index scores also showed significant differences between approaches, which were reflected in the final ecological assessment. Use of the rbcL marker always resulted in better correlation among molecular datasets and also in results closer to those generated using traditional microscopy. This study shows that decisions made in pipeline design have implications for the dataset's structure and the taxonomic assemblage, which in turn may affect biotic index calculation and ecological assessment. There is a need to define best-practice bioinformatics parameters in order to ensure the best representation of diatom assemblages. Only the use of similar parameters will ensure the compatibility of data from different working groups. The future of diatom metabarcoding for ecological assessment may also lie in the development of new metrics using, for example, presence/absence instead of relative abundance data. © 2020 The Authors. Published by Elsevier B.V.

Epsilon Open Archive

Crossref

INRIA a CCSD electronic archive server

HAL Université de Savoie

Hal-Diderot

Oskar Bordeaux

Discovery of a non-canonical GRHL1 binding site using deep convolutional and recurrent neural networks

Author: Dominik Seelow
Janna Leiz
Kai M. Schmidt-Ott
Maria Rutkiewicz
Sebastian Proft
Udo Heinemann
Publication venue: BMC
Publication date: 01/12/2023

Background: Transcription factors regulate gene expression by binding to transcription factor binding sites (TFBSs). Most models for predicting TFBSs are based on position weight matrices (PWMs), which require a specific motif to be present in the DNA sequence and do not consider interdependencies of nucleotides. Novel approaches such as Transcription Factor Flexible Models or recurrent neural networks consequently provide higher accuracies. However, it is unclear whether such approaches can uncover novel non-canonical, hitherto unexpected TFBSs relevant to human transcriptional regulation. Results: In this study, we trained a convolutional recurrent neural network with HT-SELEX data for GRHL1 binding and applied it to a set of GRHL1 binding sites obtained from ChIP-Seq experiments from human cells. We identified 46 non-canonical GRHL1 binding sites, which were not found by a conventional PWM approach. Unexpectedly, some of the newly predicted binding sequences lacked the CNNG core motif, so far considered obligatory for GRHL1 binding. Using isothermal titration calorimetry, we experimentally confirmed binding between the GRHL1-DNA binding domain and predicted GRHL1 binding sites, including a non-canonical GRHL1 binding site. Mutagenesis of individual nucleotides revealed a correlation between predicted binding strength and experimentally validated binding affinity across representative sequences. This correlation was observed neither with a PWM-based approach nor with another deep learning approach. Conclusions: Our results show that convolutional recurrent neural networks may uncover unanticipated binding sites and facilitate quantitative transcription factor binding predictions.
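The PWM limitation described in this abstract (each position scored independently of its neighbours) can be made concrete with a minimal Python sketch. The count matrix below is invented for illustration and is not GRHL1 data; the pseudocount, the uniform background of 0.25 and the function names are likewise assumptions, not the authors' method.

```python
import math

# Hypothetical counts of each base at four motif positions (not GRHL1 data).
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 1, "T": 8},
]


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, site):
    """Sum of per-position weights: no term couples two positions."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, sequence):
    """Slide the PWM along the sequence; return (best score, start index)."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


PWM = log_odds_pwm(COUNTS)
BEST_SCORE, BEST_START = best_hit(PWM, "TTACGTTT")
print(BEST_START, round(BEST_SCORE, 2))
```

Because the score is a plain sum over positions, changing the first base shifts the score by the same amount whatever the second base is. A model with this form cannot represent nucleotide interdependencies, which is the gap the abstract says neural network approaches aim to close.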

Directory of Open Access Journals