27 research outputs found

Application of motif scoring algorithms for enhancer prediction in distantly related species

Author: Dolle Dirk-Dominik
Publication date: 13/11/2012

Although many studies have proposed methods for the identification of enhancers, reliable prediction on a genome-wide scale is still an unsolved problem. One of the reasons for this is the highly flexible regulatory logic underlying a detectable enhancer activity. In each cell type or tissue and at any given time, a mostly unknown set of transcription factors activates specific regulatory elements by coordinated binding to the corresponding genomic region. Position, spacing, and orientation of the individual bound factors can thereby vary between different enhancers yet result in a highly similar spatio-temporal activity. Because of this inherent flexibility, so-called “alignment-free” methods have been proposed for enhancer prediction, as they are able to cope with rearrangements by comparing word profiles rather than linear sequence. However, the problems caused by allowing for permutation in sequence comparison have not been investigated so far. In this study I implemented several published alignment-free metrics and analysed which parameters affect their ability to successfully predict regulatory regions. As the results show, single point mutations and the increasing number of spurious matches with decreasing word size pose the biggest challenge to alignment-free techniques, especially when applied on a genome-wide scale. Alignment algorithms usually solve these problems quite efficiently but cannot handle permutation. I therefore implemented a new technique for enhancer prediction that combines the advantages of both algorithm types and used it for the identification of regulatory regions in the teleost fish Oryzias latipes (medaka) based on a set of known and validated human enhancers. Predicted medaka regions and human enhancers were subsequently used in an in vivo enhancer assay and analysed for their activity. In total, 12 predicted regions corresponding to 9 human enhancers showed clear enhancing activity in the fish.
This shows that the principle implemented here is able to predict active enhancers at a high rate on a genome-wide scale even in species as diverged as human and fish. Furthermore, evidence for motif-level conservation between some of the human and medaka enhancers could be found that was invisible to most of the alignment algorithms used for comparison.
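
As a rough illustration of what "comparison of word profiles" means in the abstract above, the following Python sketch scores two sequences by their shared k-mer counts. The abstract does not name the metrics that were implemented, so the D2 statistic used here is an assumption chosen for simplicity, and the sequences and function names are hypothetical.

```python
from collections import Counter

def word_profile(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def d2_score(seq_a, seq_b, k):
    """D2 statistic: sum over all words of count_in_a * count_in_b."""
    prof_a = word_profile(seq_a, k)
    prof_b = word_profile(seq_b, k)
    # Counter returns 0 for words missing from prof_b.
    return sum(n * prof_b[word] for word, n in prof_a.items())

# Two hypothetical motif blocks, joined in either order to mimic
# a rearrangement that a linear alignment would struggle with.
block_1 = "ACGTGACC"
block_2 = "TTGCAAGT"
original = block_1 + block_2
rearranged = block_2 + block_1

print(d2_score(original, original, 4))
print(d2_score(original, rearranged, 4))
```

With a word size of 4, the rearranged sequence keeps 10 of the 13 words of the original; only the words spanning the block junction are lost. A smaller word size would tolerate more change but, as the abstract notes, would also produce more spurious matches.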

Heidelberger Dokumentenserver

Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

Author: Cotten Matthew
Dolle Dirk D
Durbin Richard
Iqbal Zamin
Keane Thomas M
Liu Zhicheng
McCarthy Shane A
Simpson Jared T
Publication venue: Cold Spring Harbor Laboratory
Publication date: 22/06/2016

We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, finding that 3.2 Mbp with population support was lost in the transition from GRCh37, while 13.7 Mbp was added in GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution.
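
To make the BWT and FM-index idea in the abstract above concrete, here is a toy Python sketch that builds the transform of a single short string and counts occurrences of a query by backward search. It is not the population BWT described in the abstract: the construction is naive, nothing is compressed, a single string stands in for a read collection, and the function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (naive construction)."""
    text += "$"  # sentinel; sorts before the letters used here
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)

def count_occurrences(bwt_str, pattern):
    """Count occurrences of pattern using FM-index style backward search."""
    # first[c] = number of characters in the text that sort before c
    first = {}
    total = 0
    for c in sorted(set(bwt_str)):
        first[c] = total
        total += bwt_str.count(c)
    lo, hi = 0, len(bwt_str)  # half-open range of matching sorted rotations
    for c in reversed(pattern):
        if c not in first:
            return 0
        # Rank queries are done by naive counting; a real index
        # would use precomputed occurrence tables instead.
        lo = first[c] + bwt_str[:lo].count(c)
        hi = first[c] + bwt_str[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo

index = bwt("ACGTACGGACG")
print(count_occurrences(index, "ACG"))
print(count_occurrences(index, "TTT"))
```

The query never touches the original string, only the transformed one, which is what allows an index of this kind to answer k-mer presence questions without a reference genome.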

Crossref

LSHTM Research Online

PubMed Central

Oxford University Research Archive

Enlighten

The Light Responsive Transcriptome of the Zebrafish: Function and Regulation

Author: Armant Olivier
Dickmeis Thomas
Dolle Dirk
Ettwiller Laurence
Foulkes Nicholas S.
Geisler Robert
Lahiri Kajori
Mracek Philipp
Otto Georg W.
Sahinbas Meltem
Vallone Daniela
Weger Benjamin D.
Publication venue: Public Library of Science
Publication date: 03/04/2014

KITopen

Genomic and Phenotypic Characterization of a Wild Medaka Population : Towards the Establishment of an Isogenic Population Genetic Resource in Fish

Author: Aizu Tomoyuki
Auer Thomas O.
Birney Ewan
Dolle Dirk
Dunham Ian
Fujiyama Asao
Loosli Felix
Minakuchi Yohei
Naruse Kiyoshi
Peravali Ravindra
Spivakov Mikhail
Toyoda Atsushi
Wittbrodt Joachim
Publication venue: Genetics Society of America
Publication date: 09/01/2014

Oryzias latipes (medaka) has been established as a vertebrate genetic model for more than a century and recently has been rediscovered outside its native Japan. The power of new sequencing methods now makes it possible to reinvigorate medaka genetics, in particular by establishing a near-isogenic panel derived from a single wild population. Here we characterize the genomes of wild medaka catches obtained from a single Southern Japanese population in Kiyosu as a precursor for the establishment of a near-isogenic panel of wild lines. The population is free of significant detrimental population structure and has advantageous linkage disequilibrium properties suitable for the establishment of the proposed panel. Analysis of morphometric traits in five representative inbred strains suggests phenotypic mapping will be feasible in the panel. In addition, high-throughput genome sequencing of these medaka strains confirms their evolutionary relationships along lines of geographic separation and provides further evidence that there has been little significant interbreeding between the Southern and Northern medaka populations since the Southern/Northern population split. The sequence data suggest that the Southern Japanese medaka existed as a larger older population that went through a relatively recent bottleneck approximately 10,000 years ago. In addition, we detect patterns of recent positive selection in the Southern population. These data indicate that the genetic structure of the Kiyosu medaka samples is suitable for the establishment of a vertebrate near-isogenic panel and therefore inbreeding of 200 lines based on this population has commenced. Progress of this project can be tracked at http://www.ebi.ac.uk/birney-srv/medaka-ref-panel

KITopen

PubMed Central

The Light Responsive Transcriptome of the Zebrafish: Function and Regulation

Author: Benjamin D. Weger
Daniela Vallone
Dirk Dolle
Ferenc Mueller
Georg W. Otto
Kajori Lahiri
Laurence Ettwiller
Meltem Sahinbas
Nicholas S. Foulkes
Olivier Armant
Philipp Mracek
Robert Geisler
Thomas Dickmeis
Publication venue: Public Library of Science
Publication date: 01/02/2011

Most organisms possess circadian clocks that are able to anticipate the day/night cycle and are reset or “entrained” by the ambient light. In the zebrafish, many organs and even cultured cell lines are directly light responsive, allowing for direct entrainment of the clock by light. Here, we have characterized light induced gene transcription in the zebrafish at several organizational levels. Larvae, heart organ cultures and cell cultures were exposed to 1- or 3-hour light pulses, and changes in gene expression were compared with controls kept in the dark. We identified 117 light regulated genes, with the majority being induced and some repressed by light. Cluster analysis groups the genes into five major classes that show regulation at all levels of organization or in different subset combinations. The regulated genes cover a variety of functions, and the analysis of gene ontology categories reveals an enrichment of genes involved in circadian rhythms, stress response and DNA repair, consistent with the exposure to visible wavelengths of light priming cells for UV-induced damage repair. Promoter analysis of the induced genes shows an enrichment of various short sequence motifs, including E- and D-box enhancers that have previously been implicated in light regulation of the zebrafish period2 gene. Heterologous reporter constructs with sequences matching these motifs reveal light regulation of D-box elements in both cells and larvae. Morpholino-mediated knock-down studies of two homologues of the D-box binding factor Tef indicate that these are differentially involved in the cell autonomous light induction in a gene-specific manner. These findings suggest that the mechanisms involved in period2 regulation might represent a more general pathway leading to light induced gene expression

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

MPG.PuRe

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

Author: Adam Frankish
Anne Czechanski
Anne Ferguson-Smith
Anthony G. Doran
Beiyuan Fu
Benedict Paten
Binnaz Yalcin
Charles Steward
Chris J. Lelliott
Clayton E. Mathews
Cristina Sisu
Darren W. Logan
David J. Adams
David Thybert
Dent Earl
Dirk-Dominik Dolle
Duncan T. Odom
Fabio C. P. Navarro
Fengtang Yang
Glen Threadgold
Ian T. Fiddes
James Gilbert
James Torrance
Jane Loveland
Jennifer Harrow
Jingtao Lilue
Joanna Collins
Joel Armstrong
Jonathan Flint
Jonathan Wood
Kerstin Howe
Kim Wong
Lars Romoth
Laura Reinholdt
Leo Goodstadt
Lesley Shirley
Marcela Sjoberg-Herrera
Mario Stanke
Mark Diekhans
Mark Gerstein
Mark Thomas
Matt Dunn
Mike Quail
Mikhail Kolmogorov
Monica Abrudan
Naomi Park
Paul Flicek
Paul Muir
Petr Danecek
Richard Durbin
Richard Mott
Ruth Bennett
Sarah Pelan
Son K. Pham
Stefanie Nachtweide
Stephan Collins
Thomas M. Keane
William Chow
Ximena Ibarra-Soria
Publication venue: Nat Genet
Publication date: 01/10/2018

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development

HAL-uB

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

HAL-Inserm

UCL Discovery

eScholarship - University of California

Apollo (Cambridge)

Brunel University Research Archive

Können auch Bilder erzählen? : Visualität und Narrativität im Comic

Author: Dolle-Weinkauff Bernd
Frank Dirk
Publication date: 01/01/2017

Comics are an extremely popular genre, perhaps more so than ever. Manga, and graphic novels too, now have their own shelves in every bookshop. But what are they really: images supplemented with text, or vice versa? Do we read comics or look at them, and why is it worth researching this hybrid genre? Dirk Frank discussed these questions with Bernd Dolle-Weinkauff, literary scholar and comics expert at the Institut für Jugendbuchforschung.

Hochschulschriftenserver - Universität Frankfurt am Main

Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes

Author: Dirk D. Dolle
Jared T. Simpson
Matthew Cotten
Richard Durbin
Shane A. McCarthy
Thomas M. Keane
Zamin Iqbal
Zhicheng Liu
Publication venue: Cold Spring Harbor Laboratory

Crossref

Correction: Handling Permutation in Sequence Comparison: Genome-Wide Enhancer Prediction in Vertebrates by a Novel Non-Linear Alignment Scoring Principle.

Author: Burkhard Höckendorf
Daigo Inoue
Dirk Dolle
Joachim Wittbrodt
Juan L Mateo
Laurence Ettwiller
Lazaro Centanin
Michael P Eichenlaub
Rebecca Sinn
Robert Reinhardt
Publication venue: Public Library of Science (PLoS)

[This corrects the article DOI: 10.1371/journal.pone.0141487.]

Directory of Open Access Journals