129 research outputs found

*-DCC: A platform to collect, annotate, and explore a large variety of sequencing experiments.

Author: Brown James B
Daub Carsten O
Hörtenhuber Matthias
Mukarram Abdul K
Stoiber Marcus H
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Background: Over the past few years the variety of experimental designs and protocols for sequencing experiments increased greatly. To ensure the wide usability of the produced data beyond an individual project, rich and systematic annotation of the underlying experiments is crucial. Findings: We first developed an annotation structure that captures the overall experimental design as well as the relevant details of the steps from the biological sample to the library preparation, the sequencing procedure, and the sequencing and processed files. Through various design features, such as controlled vocabularies and different field requirements, we ensured a high annotation quality, comparability, and ease of annotation. The structure can be easily adapted to a large variety of species. We then implemented the annotation strategy in a user-hosted web platform with data import, query, and export functionality. Conclusions: We present here an annotation structure and user-hosted platform for sequencing experiment data, suitable for lab-internal documentation, collaborations, and large-scale annotation efforts.

eScholarship - University of California

Estimating mutual information using B-spline functions – an improved similarity measure for analysing gene expression data

Author: Daub Carsten O
Kloska Sebastian
Selbig Joachim
Steuer Ralf
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. RESULTS: In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: The significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from [email protected] upon request. CONCLUSION: The utilisation of mutual information as similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended
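The contrast drawn in this abstract, between assigning each measurement to exactly one bin and spreading it over neighbouring bins, can be illustrated with a minimal sketch. It is not the authors' C++ code: the function names are hypothetical, and only an order-2 (linear) B-spline on a clamped uniform knot vector is shown, whereas the paper covers B-splines of general order.

```python
import math

def hard_weights(values, bins):
    """Classical binning: each value belongs to exactly one bin (weight 1)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    rows = []
    for v in values:
        idx = min(int((v - lo) / span * bins), bins - 1)
        row = [0.0] * bins
        row[idx] = 1.0
        rows.append(row)
    return rows

def linear_spline_weights(values, bins):
    """Soft binning with order-2 B-splines (hat functions); needs bins >= 2.

    Each value is shared between two neighbouring bins, and the two
    weights always sum to 1.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    rows = []
    for v in values:
        z = (v - lo) / span * (bins - 1)   # map data range onto [0, bins - 1]
        left = min(int(z), bins - 2)       # index of the left neighbour bin
        frac = z - left                    # distance from the left bin centre
        row = [0.0] * bins
        row[left] = 1.0 - frac
        row[left + 1] = frac
        rows.append(row)
    return rows

def mutual_information(wx, wy):
    """Mutual information in bits from two per-sample weight matrices."""
    n = len(wx)
    bx, by = len(wx[0]), len(wy[0])
    px = [sum(row[i] for row in wx) / n for i in range(bx)]
    py = [sum(row[j] for row in wy) / n for j in range(by)]
    mi = 0.0
    for i in range(bx):
        for j in range(by):
            pxy = sum(wx[s][i] * wy[s][j] for s in range(n)) / n
            if pxy > 0.0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi
```

Calling `mutual_information(linear_spline_weights(x, 5), linear_spline_weights(y, 5))` on two expression profiles `x` and `y` gives the soft-binned estimate; swapping in `hard_weights` gives the classical binned estimate for comparison.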

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

pre-miRNA profiles obtained through application of locked nucleic acids and deep sequencing reveals complex 5′/3′ arm variation including concomitant cleavage and polyuridylation patterns

Author: Ando Yoshinari
Burroughs A. Maxwell
Daub Carsten O.
Hayashizaki Yoshihide
Kawano Mitsuoki
Publication venue: Oxford University Press
Publication date: 01/02/2012

Recent research hints at an underappreciated complexity in pre-miRNA processing and regulation. Global profiling of pre-miRNA and its potential to increase understanding of the pre-miRNA landscape is impeded by overlap with highly expressed classes of other non-coding (nc) RNA. Here, we present a data set excluding these RNA before sequencing through locked nucleic acids (LNA), greatly increasing pre-miRNA sequence counts with no discernable effect on pre-miRNA or mature miRNA sequencing. Analysis of profiles generated in total, nuclear and cytoplasmic cell fractions reveals that pre-miRNAs are subject to a wide range of regulatory processes involving loci-specific 3′- and 5′-end variation entailing complex cleavage patterns with co-occurring polyuridylation. Additionally, examination of nuclear-enriched flanking sequences of pre-miRNA, particularly those derived from polycistronic miRNA transcripts, provides insight into miRNA and miRNA-offset (moRNA) production, specifically identifying novel classes of RNA potentially functioning as moRNA precursors. Our findings point to particularly intricate regulation of the let-7 family in many ways reminiscent of DICER1-independent, pre-mir-451-like processing, introduce novel and unify known forms of pre-miRNA regulation and processing, and shed new light on overlooked products of miRNA processing pathways.

PubMed Central

Hokkaido University Collection of Scholarly and Academic Papers

A comprehensive promoter landscape identifies a novel promoter for CD133 in restricted tissues, cancers, and stem cells

Author: Andreas Behren
Carsten O Daub
Christopher A Maher
Craig Gedye
Elizabeth R Lawlor
Jonathan Cebon
Morana Vitezic
Oliver Marc Hofmann
Otavia L Caballero
Piero Carninci
Ramakrishna Sompallae
Sylvie Devalle
Winston Hide
Yoshihide Hayashizaki
Publication venue: Frontiers Media SA
Publication date: 01/01/2013

PROM1 is the gene encoding prominin-1 or CD133, an important cell surface marker for the isolation of both normal and cancer stem cells. PROM1 transcripts initiate at a range of transcription start sites (TSS) associated with distinct tissue and cancer expression profiles. Using high resolution Cap Analysis of Gene Expression (CAGE) sequencing we characterize TSS utilization across a broad range of normal and developmental tissues. We identify a novel proximal promoter (P6) within CD133+ melanoma cell lines and stem cells. Additional exon array sampling finds P6 to be active in populations enriched for mesenchyme, neural stem cells and within CD133+ enriched Ewing sarcomas. The P6 promoter is enriched with respect to previously characterized PROM1 promoters for a HMGI/Y (HMGA1) family transcription factor binding site motif and exhibits different epigenetic modifications relative to the canonical promoter region of PROM1

Crossref

Harvard University - DASH

Directory of Open Access Journals

Digital Commons@Becker

Frontiers - Publisher Connector

PubMed Central

University of Melbourne Institutional Repository

Methods for analyzing deep sequencing expression data: constructing the human and mouse promoterome with deepCAGE data

Author: Balwierz Piotr J
Beisel Christian
Carninci Piero
Daub Carsten O
Hayashizaki Yoshihide
Kawai Jun
Van Belle Werner
van Nimwegen Erik
Publication venue: BioMed Central
Publication date: 01/01/2009

A set of methods is presented for normalization, quantification of noise and co-expression analysis for gene expression studies using deep sequencing

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

edoc

PubMed Central

Transcriptional features of genomic regulatory blocks

Author: Akalin Altuna
Arner Erik
Bryne Jan Christian
Daub Carsten O
Dong Xianjun
Fredman David
Hayashizaki Yoshihide
Lenhard Boris
Suzuki Harukazu
Publication venue: BioMed Central
Publication date: 01/01/2009

CAGE tag mapping of transcription start sites across different human tissues shows that genomic regulatory blocks have unique features that are the likely cause of their ability to respond to regulatory inputs from very long distances

Springer - Publisher Connector

PubMed Central

Spiral - Imperial College Digital Repository

MDC Repository

Nonimmunoglobulin target loci of activation-induced cytidine deaminase (AID) share unique features with immunoglobulin genes.

Author: Begum Nasim A
Burroughs A Maxwell
Daub Carsten O
Doi Tomomitsu
Hayashizaki Yoshihide
Honjo Tasuku
Kato Lucia
Kawaguchi Takahisa
Kawai Jun
Matsuda Fumihiko
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 30/01/2012

Activation-induced cytidine deaminase (AID) is required for both somatic hypermutation and class-switch recombination in activated B cells. AID is also known to target nonimmunoglobulin genes and introduce mutations or chromosomal translocations, eventually causing tumors. To identify as-yet-unknown AID targets, we screened early AID-induced DNA breaks by using two independent genome-wide approaches. Along with known AID targets, this screen identified a set of unique genes (SNHG3, MALAT1, BCL7A, and CUX1) and confirmed that these loci accumulated mutations as frequently as Ig locus after AID activation. Moreover, these genes share three important characteristics with the Ig gene: translocations in tumors, repetitive sequences, and the epigenetic modification of chromatin by H3K4 trimethylation in the vicinity of cleavage sites

PubMed Central

Kyoto University Research Information Repository

Core promoter structure and genomic context reflect histone 3 lysine 9 acetylation patterns

Author: Arakawa Takahiro
Arner Erik
Carninci Piero
Daub Carsten O
Hayashizaki Yoshihide
Kawai Jun
Kratz Anton
Kubosaki Atsutaka
Saito Rintaro
Suzuki Harukazu
Tomita Masaru
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central