
SAMStat: monitoring biases in next generation sequencing data

Author: C. O. Daub
Carninci
Plessy
T. Lassmann
Trapnell
Y. Hayashizaki
Publication venue: Oxford University Press

Motivation: The sequence alignment/map format (SAM) is a commonly used format to store the alignments between millions of short reads and a reference genome. Often certain positions within the reads are inherently more likely to contain errors due to the protocols used to prepare the samples. Such biases can have adverse effects on both mapping rate and accuracy. To understand the relationship between potential protocol biases and poor mapping, we wrote SAMstat, a simple C program that plots nucleotide overrepresentation and other statistics for mapped and unmapped reads in a concise HTML page. Collecting such statistics also makes it easy to highlight problems in the data processing and enables non-experts to track data quality over time.
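As a rough illustration of the kind of per-position statistic the abstract describes, the Python sketch below tallies the fraction of each nucleotide at every read position. It is not SAMstat's implementation (SAMstat is a C program that works on SAM files); the function name `positional_base_frequencies` and the toy reads are assumptions made purely for illustration.

```python
from collections import Counter


def positional_base_frequencies(reads):
    """Return, for each read position, the fraction of each base seen there.

    Hypothetical helper: takes plain read sequences (strings), not SAM records.
    """
    counts = []  # counts[i] tallies the bases observed at position i
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1

    freqs = []
    for position_counts in counts:
        total = sum(position_counts.values())
        freqs.append({b: n / total for b, n in position_counts.items()})
    return freqs


# Toy reads (made up); reads may differ in length.
reads = ["ACGT", "ACGA", "TCGT", "ACG"]
freqs = positional_base_frequencies(reads)
```

A position whose base fractions differ strongly from the overall composition of the reads would be a candidate for the protocol-induced bias the abstract mentions.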

Cold Spring Harbor Laboratory Institutional Repository

H3S28P Antibody Staining of Okinawan Oikopleura dioica Suggests the Presence of Three Chromosomes [version 2; peer review: 2 approved]

Author: Bliznina A
Liu AW
Luscombe NM
Masunaga A
Plessy C
Tan Y
West C
Publication venue: 'F1000 Research Ltd'
Publication date: 01/03/2021

Oikopleura dioica is a ubiquitous marine zooplankton of biological interest owing to features that include dioecious reproduction, a short life cycle, conserved chordate body plan, and a compact genome. It is an important tunicate model for evolutionary and developmental research, as well as investigations into marine ecosystems. The genome of north Atlantic O. dioica comprises three chromosomes. However, comparisons with the genomes of O. dioica sampled from mainland and southern Japan revealed extensive sequence differences. Moreover, historical studies have reported widely varying chromosome counts. We recently initiated a project to study the genomes of O. dioica individuals collected from the coastline of the Ryukyu (Okinawa) Islands in southern Japan. Given the potentially large extent of genomic diversity, we employed karyological techniques to count individual animals’ chromosomes in situ using centromere-specific antibodies directed against H3S28P, a prophase-metaphase cell cycle-specific marker of histone H3. Epifluorescence and confocal images were obtained of embryos and oocytes stained with two commercial anti-H3S28P antibodies (Abcam ab10543 and Thermo Fisher 07-145). The data lead us to conclude that diploid cells from Okinawan O. dioica contain three pairs of chromosomes, in line with the north Atlantic populations. The finding facilitates the telomere-to-telomere assembly of Okinawan O. dioica genome sequences and gives insight into the genomic diversity of O. dioica from different geographical locations. The data deposited in the EBI BioImage Archive provide representative images of the antibodies’ staining properties for use in epifluorescent and confocal based fluorescent microscopy

UCL Discovery

Multiplicity of 5' Cap Structures Present on Short RNAs

Author: Abdelhamid R. F.
Carninci P.
de Hoon M.
Gingeras T. R.
Isobe T.
Plessy C.
Taoka M.
Yamauchi Y.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 31/07/2014

Most RNA molecules are co- or post-transcriptionally modified to alter their chemical and functional properties to assist in their ultimate biological function. Among these modifications, the addition of a 5' cap structure has been found to regulate turnover and localization. Here we report a study of the cap structure of human short (<200 nt) RNAs (sRNAs), using sequencing of cDNA libraries prepared by enzymatic pretreatment of the sRNAs with cap sensitive-specificity, thin layer chromatographic (TLC) analyses of isolated cap structures and mass spectrometric analyses for validation of TLC analyses. Processed versions of snoRNA and tRNA sequences of less than 50 nt were observed in capped sRNA libraries, indicating additional processing and recapping of these annotated sRNA biotypes. We report for the first time 2,7 dimethylguanosine in human sRNA cap structures and, surprisingly, we find multiple type 0 cap structures (mGpppC, 7mGpppG, GpppG, GpppA, and 7mGpppA) in RNA length fractions shorter than 50 nt. Finally, we find additional uncharacterized cap structures that await determination through the creation of the reference compounds needed for TLC analyses. These studies suggest the existence of novel biochemical pathways leading to the processing of primary and sRNAs and the modification of their RNA 5' ends with a spectrum of chemical modifications.

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

Author: A Ambesi-Impiombato
A Blais
A Eto
A Subramanian
AE Kel
AG Clark
AL Lam
AM McGuire
Anat Reiner
Assif Yitzhaky
B Ren
C Kimura-Yoshida
C Plessy
C Yang
CT Harbison
D Pfeifer
D Wang
DB Allison
E Emberly
E Segal
Eytan Domany
FP Roth
GC Pipes
GC Yuan
GQ Yao
GZ Hertz
H Li
H Lodish
J Zheng
JD Hughes
JL DeRisi
JQ Ling
K Frech
K Quandt
KD MacIsaac
L Amir-Zilberstein
L Elnitski
L Marino-Ramirez
L McCue
M Ashburner
M Kellis
M Milyavsky
MA Nobrega
Mark Koudritsky
MC Frith
ML Howard
ML Whitfield
N Rajewsky
Or Zuk
P Carninci
P Cliften
PM Haverty
PR Buckland
R Elkon
R Liu
R Sharan
Ran Brosh
S Aerts
S Rashi-Elkeles
S Tavazoie
SJ Cooper
SJ Ho Sui
Sui Huang
U Gerland
Varda Rotter
WW Wasserman
X Xie
Y Barash
Y Benjamini
Y Tabach
Yossi Buganim
Yuval Tabach
Z Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs which are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold, independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200bp downstream to 1000bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast.
Comment: 31 pages, including Supplementary Information and figure
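To make the notion of a motif being "statistically over-represented in the gene set" concrete, the Python sketch below computes a one-sided hypergeometric tail probability. This is a generic, textbook enrichment test swapped in for illustration; it is not the authors' method, which additionally optimizes a score threshold per location window. The function name and the toy numbers are assumptions.

```python
from math import comb


def hypergeom_enrichment_pvalue(total_genes, genes_with_motif, set_size, hits_in_set):
    """P(X >= hits_in_set) for a hypergeometric draw.

    Hypothetical helper: `set_size` genes are drawn without replacement from
    `total_genes`, of which `genes_with_motif` carry the motif.
    """
    denom = comb(total_genes, set_size)
    upper = min(genes_with_motif, set_size)
    tail = sum(
        comb(genes_with_motif, k) * comb(total_genes - genes_with_motif, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / denom


# Toy numbers (made up): 10 genes, 5 carry the motif, a gene set of 2, both hit.
p_value = hypergeom_enrichment_pvalue(10, 5, 2, 2)
```

A small tail probability indicates that the motif occurs in the gene set more often than expected by chance; when many motifs and gene sets are screened, such values would still need a multiple-testing correction.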

arXiv.org e-Print Archive

CiteSeerX

Update of the FANTOM web resource: from mammalian transcriptional landscape to its dynamic regulation

Author: A. R. R. Forrest
Adati
Balwierz
C. O. Daub
Carninci
E. van Nimwegen
Faulkner
Forrest
Gaidatzis
H. Kawaji
H. Suzuki
Harbers
J. Severin
K. Irvine
K. Schroder
Kawai
Kawaji
Kratz
Kubosaki
M. Lizio
M. Rehli
Mar
Okazaki
P. Carninci
Plessy
Ravasi
Rayner
RIKEN Genome Exploration Research Group and Genome
Sansone
Severin
Stein
Suzuki
Taft
The FANTOM Consortium
Tsuchiya
Y. Hayashizaki
Publication venue: Oxford University Press
Publication date: 01/01/2011

The international Functional Annotation Of the Mammalian Genomes 4 (FANTOM4) research collaboration set out to better understand the transcriptional network that regulates macrophage differentiation and to uncover novel components of the transcriptome employing a series of high-throughput experiments. The primary and unique technique is cap analysis of gene expression (CAGE), sequencing mRNA 5′-ends with a second-generation sequencer to quantify promoter activities even in the absence of gene annotation. Additional genome-wide experiments complement the setup, including short RNA sequencing, microarray gene expression profiling on large-scale perturbation experiments and ChIP–chip for epigenetic marks and transcription factors. All the experiments are performed in a differentiation time course of the THP-1 human leukemic cell line. Furthermore, we performed a large-scale mammalian two-hybrid (M2H) assay between transcription factors and monitored their expression profile across human and mouse tissues with qRT-PCR to address combinatorial effects of regulation by transcription factors. These interdependent data have been analyzed individually and in combination with each other and are published in related but distinct papers. We provide all data together with systematic annotation in an integrated view as a resource for the scientific community (http://fantom.gsc.riken.jp/4/). Additionally, we assembled a rich set of derived analysis results including published predicted and validated regulatory interactions. Here we introduce the resource and its update after the initial release.

Serveur académique lausannois

edoc

University of Queensland eSpace

Protocol Dependence of Sequencing-Based Gene Expression Measurements

Author: A Goren
A Mortazavi
A Oshlack
BJ Blencowe
C Hart
C Plessy
C Trapnell
CD Armour
D Lipson
Doron Lipson
E Klein
F Ozsolak
GA Heap
I Chepelev
JC Marioni
John F. Thompson
KD Sullivan
L Mamanova
L Shi
LL Baumbach
LT Sam
M Sultan
MJ Fullwood
Najib M. El-Sayed
O Morozova
P Carninci
P Kapranov
PA 't Hoen
Patrice M. Milos
Philipp Kapranov
Q Pan
R Rosenkranz
RD Canales
S Djebali
S Marguerat
SP Mane
Stan Letovsky
T Nagaike
Tal Raz
TR Breitman
YW Asmann
Z Wang
ZJ Wu
Publication venue: Public Library of Science
Publication date: 01/01/2011

RNA-Seq provides unparalleled levels of information about the transcriptome, including precise expression levels over a wide dynamic range. It is essential to understand how technical variation impacts the quality and interpretability of results, how potential errors could be introduced by the protocol, how the source of RNA affects transcript detection, and how all of these variations can impact the conclusions drawn. Multiple human RNA samples were used to assess RNA fragmentation, RNA fractionation, cDNA synthesis, and single versus multiple tag counting. Though protocols employing polyA RNA selection generate the highest number of non-ribosomal reads and the most precise measurements for coding transcripts, such protocols were found to detect only a fraction of the non-ribosomal RNA in human cells. PolyA RNA excludes thousands of annotated and even more unannotated transcripts, resulting in an incomplete view of the transcriptome. Ribosomal-depleted RNA provides a more cost-effective method for generating complete transcriptome coverage. Expression measurements using single tag counting provided advantages for assessing gene expression and for detecting short RNAs relative to multi-read protocols. Detection of short RNAs was also hampered by RNA fragmentation. Thus, this work will help researchers choose from among a range of options when analyzing gene expression, each with its own advantages and disadvantages.

CiteSeerX

Highly Parallel Genome-Wide Expression Analysis of Single Mammalian Cells

We have developed a high-throughput amplification method for generating robust gene expression profiles using single cell or low RNA inputs. The method uses tagged priming and template-switching, resulting in the incorporation of universal PCR priming sites at both ends of the synthesized cDNA for global PCR amplification. Coupled with a whole-genome gene expression microarray platform, we routinely obtain expression correlation values of R²~0.76-0.80 between individual cells and R²~0.69 between 50 pg total RNA replicates. Expression profiles generated from single cells or 50 pg total RNA correlate well with that generated with higher input (1 ng total RNA) (R²~0.80). Also, the assay is sufficiently sensitive to detect, in a single cell, approximately 63% of the number of genes detected with 1 ng input, with approximately 97% of the genes detected in the single-cell input also detected in the higher input. In summary, our method facilitates whole-genome gene expression profiling in contexts where starting material is extremely limiting, particularly in areas such as the study of progenitor cells in early development and tumor stem cell biology.

FigShare

The Contemporary Issues and Supreme Court

Author: . L Harv
A J Gardner
A K Brauer-Rieke
A N Lavinbuk
Atkins V Virginia
Brand X Internet
Breard V Greene
C J Williams
C Kang
C S Yoo
C Savage
C W Kim
C Y Chung
Chinwe
Coker V Georgia
D Douglas
D H Kim
D Mccullagh
D Nunziato
D P Graham
Deshaney V
E Hu
E Ti
E Wyatt
Edmund V Florida
F Berkman
Findlaw
G
G Epps
G G Howard
G Gross
G Nagesh
General Electric
Gideon V Wainwright
Gonzales V Centro
Graham V Florida
Grutter V Bollinger
H G Cohen
Harris V
Hi
Hi Bronia
Hilton V Guyot
Hustler Magazine
I Kant
International War
J Allard
J Brown
J Cheng
J O Mcginnis
J R Feagin
J R Vile
J S Yoon
Judiciary Power
K A Ruane
Kerry Zivitofsky Vs
Kiyoung Kim
L A Graglia
L Greenhouse
Lemon V. Kurtzman
M A Glendon
M H
M Powell
M S Pardo
Maher V. Roe
Miller V. Alabama
Miranda V Arizona
Missouri V Holland
N Weil
Nate A
Org Oyez
P M Wald
Plessy V Ferguson
R Dittmer
R Eveleth
R Howard
R Lachman
Ross V Oklahoma
S Crawford
S Hansell
S J Stern
S Lohr
S R Reinhardt
S Rutherford
S Y Lee
Sherbert V Verner
Staff
Sugarman V
Sullivan
T Mcgonagle
T Owen
U S A Chevron
Van Alstyne
Verizon V Fcc
W Nelson
W Tim
Worldcom
Z Stiegler
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Digital Gene Expression Profiling by 5′-End Sequencing of cDNAs during Reprogramming in the Moss Physcomitrella patens

Stem cells self-renew and repeatedly produce differentiated cells during development and growth. The differentiated cells can be converted into stem cells in some metazoans and land plants with appropriate treatments. After leaves of the moss Physcomitrella patens are excised, leaf cells reenter the cell cycle and commence tip growth, which is characteristic of stem cells called chloronema apical cells. To understand the underlying molecular mechanisms, a digital gene expression profiling method using mRNA 5′-end tags (5′-DGE) was established. The 5′-DGE method produced reproducible data with a dynamic range of four orders of magnitude that correlated well with qRT-PCR measurements. After the excision of leaves, the expression levels of 11% of the transcripts changed significantly within 6 h. Genes involved in stress responses and proteolysis were induced and those involved in metabolism, including photosynthesis, were reduced. The later processes of reprogramming involved photosynthesis recovery and higher macromolecule biosynthesis, including that of RNA and proteins. Auxin and cytokinin signaling pathways, which are activated during stem cell formation via callus in flowering plants, are also activated during reprogramming in P. patens, although no exogenous phytohormone is applied in the moss system, suggesting that an intrinsic phytohormone regulatory system may be used in the moss.