38 research outputs found

    Transcript Annotation in FANTOM3: Mouse Gene Catalog Based on Physical cDNAs

    The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed by manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.

    Functional annotation of human long noncoding RNAs via molecular phenotyping

    Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.

    The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome.

    X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using The Encyclopedia of DNA Elements (ENCODE) and data from the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/Chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level not the breadth of expression per se. 
    Importantly, a limit to the maximal expression level explains biased tissue of expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale of many peculiar features of X's gene content, gene expression, and evolution.

    Data Descriptor: FANTOM5 CAGE profiles of human and mouse samples

    In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single-base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, respectively, were identified and annotated to provide the precise location of known promoters as well as novel ones, and to quantify their activities.

    Annotation Pipelines for Transcript Description and for GO Terms

    (A) Pipeline for transcript description. Query sequences falling into categories (black boxes) 1–3 were assigned the description of the matched target sequence DNA entry in MGI symbols, and synonyms were also transferred to our annotation database. Queries falling into categories 4–10 were assigned a transcript description corresponding to the matched protein name. For query sequences falling into category 5 or 6, the keyword “homolog” was appended to the matching protein name. Sequences assigned to category 7 or 8 were denoted with the prefix “similar to” attached to the target sequence name. The prefix “weakly similar” was used to identify sequences assigned to category 9 or 10. For all sequences in categories 5–10, the name of the organism corresponding to the matched protein was appended to the assigned transcript description. If a query was assigned to category 14, its transcript description was “hypothetical [InterPro domain name] containing protein.” Query sequences assigned to categories 17 and 19 were annotated as “hypothetical protein” and “unclassifiable,” respectively. Query sequences grouped into category N1 or N2 were assigned the description of the matched target ncRNA entry. For query sequences falling into category N2, the keyword “homolog of” was appended to the matching ncRNA name. (B) Pipeline for GO terms. DB, database.

    Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells

    While it is generally accepted that cellular differentiation requires changes to transcriptional networks, dynamic regulation of promoters and enhancers at specific sets of genes has not been previously studied en masse. Exploiting the fact that active promoters and enhancers are transcribed, we simultaneously measured their activity in 19 human and 14 mouse time courses covering a wide range of cell types and biological stimuli. Enhancer RNAs, then mRNAs encoding transcription factors, dominated the earliest responses. Binding sites for key lineage transcription factors were simultaneously over-represented in enhancers and promoters active in each cellular system. Our data support a highly generalizable model in which enhancer transcription is the earliest event in successive waves of transcriptional change during cellular differentiation or activation.

    Collision detection on transmission lines with optical interferometer

    We analyse the feasibility of detecting and classifying collisions on transmission lines with an optical interferometer. We first provide a brief introduction to interferometry, along with a description of the optical interferometer used for the measurements in this study. We then describe the experimental protocol and the signal processing methodology. The focus is on the implementation of algorithms for signal segmentation and collision classification. Segmentation is performed in the domain of the signal's zero-crossing count, and collisions are classified with a multilayer neural network trained by the backpropagation algorithm. The results demonstrate an average success rate of 77% for segmentation and classification of collisions with five different objects.

    Systematization of the experience of an ICT-enriched learning environment during clinical practice in cardiopulmonary physiotherapy at a level II hospital in the city of Cali

    This research focuses on characterizing the experience of 4 ninth-semester physiotherapy students at the Institución Universitaria Escuela Nacional del Deporte (IUEND) during the implementation of a learning environment enriched with Information and Communication Technologies (ICT) in the clinical care practice in Cardiopulmonary Health, a practice that is grounded in doing and puts to the test the conceptual foundations of the foundational cycle. The aim is to identify the significant experiences that facilitate learning and the development of clinical competencies, and to analyse whether this type of teaching-learning strategy allows the student and the supervising teacher to overcome difficulties inherent to clinical practice, such as optimizing patient care time, independent study and collaborative work, and recalling and integrating a large number of concepts and procedures learned in the fourth semester with new experiences and the patient's reality, while also fulfilling the administrative functions of the clinical physiotherapist's role (statistics, indicators, guideline development, etc.) that hinder the learning process. The study concludes that ICT-mediated environments can overcome these difficulties and ultimately favour meaningful learning (clinical judgment), on which the professional practice cycle is based.