
28 research outputs found

A SARS-CoV-2 sequence submission tool for the European Nucleotide Archive

Author: Backofen Rolf
Coppens Frederik
D'Anna Flora
De Ruyck Kim
Droesbeke Bert
Eguinoa Ignacio
Grüning Björn
Roncoroni Miguel
Yusuf Dilmurat
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Summary: Many aspects of the global response to the COVID-19 pandemic are enabled by the fast and open publication of SARS-CoV-2 genetic sequence data. The European Nucleotide Archive (ENA) is the European recommended open repository for genetic sequences. In this work, we present a tool for submitting raw sequencing reads of SARS-CoV-2 to ENA. The tool features a single-step submission process, a graphical user interface, tabular-formatted metadata and the possibility to remove human reads prior to submission. A Galaxy wrap of the tool allows users with little or no bioinformatic knowledge to do bulk sequencing read submissions. The tool is also packed in a Docker container to ease deployment. Availability: The CLI ENA upload tool is available at github.com/usegalaxy-eu/ena-upload-cli (DOI 10.5281/zenodo.4537621); the Galaxy ENA upload tool at toolshed.g2.bx.psu.edu/view/iuc/ena_upload/382518f24d6d and https://github.com/galaxyproject/tools-iuc/tree/master/tools/ena_upload (development); and the ENA upload Galaxy container at github.com/ELIXIR-Belgium/ena-upload-container (DOI 10.5281/zenodo.4730785).

Ghent University Academic Bibliography

PubMed Central

Generation of pure monocultures of human microglia-like cells from induced pluripotent stem cells

Author: Backofen Rolf
Banerjee Poulomi
Burr Karen
Chandran Siddharthan
Dando Owen
He Xin
James Owen G.
Kenkhuis Boyd
Lloyd Amy F.
Paza Evdokia
Perkins Emma M.
Priller Josef
Story David
Yusuf Dilmurat
Publication venue: 'Elsevier BV'
Publication date: 14/10/2020

Crossref

Edinburgh Research Explorer

Small RNA profiling of low biomass samples: identification and removal of contaminants

Author: de Beaufort Carine
Etheridge Alton
Fritz Joëlle V.
Galas David J.
Ghosal Anubrata
Heintz-Buschart Anna
Kaysen Anne
May Patrick
Upadhyaya Bimal B.
Wilmes Paul
Yusuf Dilmurat
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2018

Background: Sequencing-based analyses of low-biomass samples are known to be prone to misinterpretation due to the potential presence of contaminating molecules derived from laboratory reagents and environments. DNA contamination has been previously reported, yet contamination with RNA is usually considered to be very unlikely due to its inherent instability. Small RNAs (sRNAs) identified in tissues and bodily fluids, such as blood plasma, have implications for physiology and pathology, and therefore the potential to act as disease biomarkers. Thus, the possibility for RNA contaminants demands careful evaluation. Results: Herein, we report on the presence of small RNA (sRNA) contaminants in widely used microRNA extraction kits and propose an approach for their depletion. We sequenced sRNAs extracted from human plasma samples and detected important levels of non-human (exogenous) sequences whose source could be traced to the microRNA extraction columns through a careful qPCR-based analysis of several laboratory reagents. Furthermore, we also detected the presence of artefactual sequences related to these contaminants in a range of published datasets, thereby arguing in particular for a re-evaluation of reports suggesting the presence of exogenous RNAs of microbial and dietary origin in blood plasma. To avoid artefacts in future experiments, we also devise several protocols for the removal of contaminant RNAs, define minimal amounts of starting material for artefact-free analyses, and confirm the reduction of contaminant levels for identification of bona fide sequences using ‘ultra-clean’ extraction kits. Conclusion: This is the first report on the presence of RNA molecules as contaminants in RNA extraction kits. The described protocols should be applied in the future to avoid confounding sRNA studies. Keywords: RNA sequencing; Artefact removal; Exogenous RNA in human blood plasma; Contaminant RNA; Spin column

DSpace@MIT

Crossref

Directory of Open Access Journals

Open Repository and Bibliography - Luxembourg

The RNA workbench: Best practices for RNA and high-throughput sequencing bioinformatics in Galaxy

Author: Akalin A. (Altuna)
Backofen R. (Rolf)
Bagnacani A. (Andrea)
Batut B. (Bérénice)
Eggenhofer F. (Florian)
Erxleben A. (Anika)
Fallmann J. (Jörg)
Grüning B.A. (Björn A.)
Hess W.R. (Wolfgang R.)
Hoffmann S. (Steve)
Hoogstrate Y. (Youri)
Houwaart T. (Torsten)
Lott S.C. (Steffen C.)
Ohler U. (Uwe)
Stadler P.F. (Peter F.)
Videm P. (Pavankumar)
Will S. (Sebastian)
Wolfien M. (Markus)
Wolkenhauer O. (Olaf)
Yusuf D. (Dilmurat)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/06/2017

RNA-based regulation has become a major research topic in molecular biology. The analysis of epigenetic and expression data is therefore incomplete if RNA-based regulation is not taken into account. Thus, it is increasingly important but not yet standard to combine RNA-centric data and analysis tools with other types of experimental data such as RNA-seq or ChIP-seq. Here, we present the RNA workbench, a comprehensive set of analysis tools and consolidated workflows that enable the researcher to combine these two worlds. Based on the Galaxy framework the workbench guarantees simple access, easy extension, flexible adaption to personal and security needs, and sophisticated analyses that are independent of command-line knowledge. Currently, it includes more than 50 bioinformatics tools that are dedicated to different research areas of RNA biology including RNA structure analysis, RNA alignment, RNA annotation, RNA-protein interaction, ribosome profiling, RNA-seq analysis and RNA target prediction. The workbench is developed and maintained by experts in RNA bioinformatics and the Galaxy framework. Together with the growing community evolving around this workbench, we are committed to keep the workbench up-to-date for future standards and needs, providing researchers with a reliable and robust framework for RNA data analysis

Crossref

Erasmus University Digital Repository

Community-Driven Data Analysis Training for Biology

Author: Backofen R. (Rolf)
Bagnacani A. (Andrea)
Baker D. (Dannon)
Batut B. (Bérénice)
Bhardwaj V. (Vivek)
Blank C. (Clemens)
Bretaudeau A. (Anthony)
Brillet-Guéguen L. (Loraine)
Chilton J. (John)
Clements D. (Dave)
Doppelt-Azeroual O. (Olivia)
Erxleben A. (Anika)
Freeberg M.A. (Mallory Ann)
Gladman S. (Simon)
Grüning B. (Björn)
Hiltemann S. (Saskia)
Hoogstrate Y. (Youri)
Hotz H.-R. (Hans-Rudolf)
Houwaart T. (Torsten)
Jagtap P. (Pratik)
Larivière D. (Delphine)
Le Corguillé G. (Gildas)
Manke T. (Thomas)
Mareuil F. (Fabien)
Nekrutenko A. (Anton)
Ramírez F. (Fidel)
Ryan D. (Devon)
Sigloch F.C. (Florian Christoph)
Soranzo N. (Nicola)
Taylor J. (James)
Videm P. (Pavankumar)
Wolff J. (Joachim)
Wolfien M. (Markus)
Wubuli A. (Aisanjiang)
Yusuf D. (Dilmurat)
Čech M. (Martin)
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved collection of tutorials. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at https://training.galaxyproject.org. We developed an infrastructure that facilitates data analysis training in life sciences. It is an interactive learning platform tuned for current types of data and research problems. Importantly, it provides a means for community-wide content creation and maintenance and, finally, enables trainers and trainees to use the tutorials in a variety of situations, such as those where reliable Internet access is unavailable

Erasmus University Digital Repository

Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets using protocol-specific bias modeling

Author: Antje Hirsekorn
Aslıhan Karabacak Calviello
Dilmurat Yusuf
Ricardo Wurmus
Uwe Ohler
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2019

Background: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq. Results: Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints. Conclusions: We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.

Directory of Open Access Journals

MDC Repository

Universal Optimizations of Scoring Functions for Virtual Screening

Author: Kenji Onodera
Darryl Reid
Shunsuke Kamijo
Reiji Teramoto
Jinn-Moon Yang
Dilmurat Yusuf
Publication venue: 'Chem-Bio Informatics Society'
Publication date: 01/01/2010

Crossref

A SARS-CoV-2 sequence submission tool for the European Nucleotide Archive

Author: Backofen Rolf
Coppens Frederik
D'Anna Flora
De Ruyck Kim
Droesbeke Bert
Eguinoa Ignacio
Grüning Björn
Roncoroni Miguel
Yusuf Dilmurat
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Many aspects of the global response to the COVID-19 pandemic are enabled by the fast and open publication of SARS-CoV-2 genetic sequence data. The European Nucleotide Archive (ENA) is the European recommended open repository for genetic sequences. In this work, we present a tool for submitting raw sequencing reads of SARS-CoV-2 to ENA. The tool features a single-step submission process, a graphical user interface, tabular-formatted metadata and the possibility to remove human reads prior to submission. A Galaxy wrap of the tool allows users with little or no bioinformatic knowledge to do bulk sequencing read submissions. The tool is also packed in a Docker container to ease deployment

Crossref

Ghent University Academic Bibliography

PubMed Central

Training data for 'From peaks to gene' tutorial (Galaxy Training Material)

Author: Anne Pajon (5388778)
Björn Grüning (3587165)
Bérénice Batut (5377243)
Clemens Blank (5388781)
Dilmurat Yusuf (2428192)
Nicola Soranzo (95104)

The data provided here are part of a Galaxy Training Network tutorial that analyzes peaks from a study published by Li et al., 2012 (DOI:10.1016/j.stem.2012.04.023) to identify target genes.

FigShare