In the FANTOM5 project, transcription initiation events across the human and
mouse genomes were mapped at single base-pair resolution, and their
frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled
with single-molecule sequencing. Approximately three thousand samples,
consisting of a variety of primary cells, tissues, cell lines, and time series
samples during cell activation and development, were subjected to a uniform
pipeline of CAGE data production. The pipeline started with quality
assessment of the RNA extracts and continued with CAGE library production
using a robotic or manual workflow, single-molecule sequencing, and
computational processing to generate frequencies of transcription initiation.
The resulting data represent the consequence of transcriptional regulation in
each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE
profiles, approximately 200,000 for the human genome and 150,000 for the
mouse genome, were identified and annotated to provide the precise locations
of known promoters as well as novel ones, and to quantify their activities.
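
The computational step described above, turning mapped reads into frequencies
of transcription initiation and then into non-overlapping peaks, can be
illustrated with a minimal sketch. It assumes alignments given as 0-based,
half-open (chrom, start, end, strand) tuples, and it groups nearby initiation
sites by simple distance-based merging. This is an illustrative stand-in and
not the peak-identification method used by the project; the function names and
the max_gap parameter are hypothetical.

```python
from collections import Counter


def count_initiation_sites(alignments):
    """Count read 5' ends per (chrom, strand, position).

    Assumes 0-based, half-open coordinates: the 5' end is `start`
    on the plus strand and `end - 1` on the minus strand.
    """
    counts = Counter()
    for chrom, start, end, strand in alignments:
        pos = start if strand == "+" else end - 1
        counts[(chrom, strand, pos)] += 1
    return counts


def tags_per_million(counts):
    """Scale raw counts to tags per million mapped reads."""
    total = sum(counts.values())
    return {site: n * 1e6 / total for site, n in counts.items()}


def cluster_sites(counts, max_gap=20):
    """Merge same-strand sites at most `max_gap` bp apart (hypothetical rule).

    Returns non-overlapping clusters with inclusive start/end positions.
    """
    clusters = []
    # Sorting groups sites by chromosome and strand, then by position.
    for chrom, strand, pos in sorted(counts):
        n = counts[(chrom, strand, pos)]
        last = clusters[-1] if clusters else None
        if (
            last is not None
            and last["chrom"] == chrom
            and last["strand"] == strand
            and pos - last["end"] <= max_gap
        ):
            last["end"] = pos
            last["count"] += n
        else:
            clusters.append(
                {"chrom": chrom, "strand": strand,
                 "start": pos, "end": pos, "count": n}
            )
    return clusters
```

Because each cluster is closed as soon as a gap larger than max_gap appears,
the resulting intervals cannot overlap on the same strand; the summed count
per cluster serves as a simple proxy for promoter activity.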