
53 research outputs found

The zebrafish transcriptome during early development

Author: Hovatta Outi
Jiao Hong
Kere Juha
Unneberg Per
Vesterlund Liselotte
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The transition from fertilized egg to embryo is accompanied by a multitude of changes in gene expression, and the transcriptional events that underlie these processes have not yet been fully characterized. In this study RNA-Seq is used to compare the transcription profiles of four early developmental stages in zebrafish (Danio rerio) on a global scale. Results: An average of 79 M total reads were detected from the different stages. Out of the total number of reads, 65% - 73% were successfully mapped and 36% - 44% of those were uniquely mapped. The total number of detected unique gene transcripts was 11187, of which 10096 were present at 1-cell stage. The largest number of common transcripts was observed between 1-cell stage and 16-cell stage. An enrichment of gene transcripts with molecular functions of DNA binding, protein folding and processing as well as metal ion binding was observed with progression of development. The sequence data (accession number ERP000635) is available at the European Nucleotide Archive. Conclusion: Clustering of expression profiles shows that a majority of the detected gene transcripts are present at steady levels, and thus a minority of the gene transcripts clusters as increasing or decreasing in expression over the four investigated developmental stages. The three earliest developmental stages were similar when comparing highly expressed genes, whereas the 50% epiboly stage differed from the other three stages in the identity of highly expressed genes, number of uniquely expressed genes and enrichment of GO molecular functions. Taken together, these observations indicate a major transition in gene regulation and transcriptional activity taking place between the 512-cell and 50% epiboly stages, in accordance with previous studies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Transposon- and Genome Dynamics in the Fungal Genus Neurospora: Insights from Nearly Gapless Genome Assemblies

Author: Jern Patric
Johannesson Hanna
Nguyen Diem
Peona Valentina
Suh Alexander
Unneberg Per
Publication venue: New Prairie Press
Publication date: 18/02/2022

A large portion of nuclear DNA is composed of transposable element (TE) sequences, whose transposition is controlled by diverse host defense strategies in order to maintain genomic integrity. One such strategy is the fungal-specific Repeat-Induced Point mutation (RIP) that hyper-mutates repetitive DNA sequences. While RIP is found across Fungi, it has been shown to vary in efficiency. The filamentous ascomycete Neurospora crassa has been a pioneer in the study of RIP, but data on TEs and RIP from other species in the genus are limited. In this study, we investigated 18 nearly gapless genome assemblies of ten Neurospora species, which diverged from a common ancestor about 7 MYA, to determine and compare genome-wide TE distribution and their associated RIP patterns. Four of these assemblies, generated by PacBio technology, represent new genomic datasets. We showed that TE contents of 8.7-18.9% covary with genome sizes ranging from 37.8 to 43.9 Mb. Degraded copies of Long Terminal Repeat (LTR) retrotransposons were abundant among the identified TEs, and these are distributed across the genome at varying frequencies. In all investigated Neurospora genomes, TE sequences had signs of numerous C-to-T substitutions, suggesting that RIP occurred in all species, and accordingly, RIP signatures correlated with TE-dense regions in all genomes. In conclusion, essentially gapless genome assemblies allowed us to identify TEs in Neurospora genomes, and reveal that TEs contribute to genome size variation in this group. Our study suggests that TEs and RIP are highly correlated in each examined Neurospora species, and hence, the pattern of interaction is conserved over the investigated evolutionary timescale. Finally, with our results, we verify that RIP signatures can be used to facilitate the identification of TE-rich regions in the genome. The comprehensive genomic dataset of Neurospora is a rich resource for further in-depth analyses of fungal genomes by the community.

Kansas State University

Differences in Gene Expression between Mouse and Human for Dynamically Regulated Genes in Early Embryo

Author: Hovatta Outi
Inzunza Jose
Katayama Shintaro
Kere Juha
Madissoon Elo
Tohonen Virpi
Unneberg Per
Vesterlund Liselotte
Publication date: 01/01/2014

Peer reviewed

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Helsingin yliopiston digitaalinen arkisto

FigShare

High-throughput mutational screening adds clinically important information in myelodysplastic syndromes and secondary or therapy-related acute myeloid leukemia

Author: Dimitriou Marios
Hellstrom-Lindberg Eva
Jansson Monika
Karimi Mohsen
Kere Juha
Lehmann Soren
Matsson Hans
Nilsson Christer
Unneberg Per
Publication date: 13/03/2015

Non peer reviewed

Crossref

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Dominant Mutations in GRHL3 Cause Van der Woude Syndrome and Disrupt Oral Periderm Development

Author: Peyrard-Janvid Myriam
Leslie Elizabeth J.
Kousa Youssef A.
Smith Tiffany L.
Dunnwald Martine
Magnusson Måns
Lentz Brian A.
Unneberg Per
Fransson Ingegerd
Koillinen Hannele K.
Rautio Jorma
Pegelow Marie
Karsten Agneta
Basel-Vanagaite Lina
Gordon William
Andersen Bogi
Svensson Thomas
Murray Jeffrey C.
Cornell Robert A.
Kere Juha
Schutte Brian C.
Publication venue: The American Society of Human Genetics. Published by Elsevier Inc.
Publication date: 01/01/2013

Mutations in interferon regulatory factor 6 (IRF6) account for ∼70% of cases of Van der Woude syndrome (VWS), the most common syndromic form of cleft lip and palate. In 8 of 45 VWS-affected families lacking a mutation in IRF6, we found coding mutations in grainyhead-like 3 (GRHL3). According to a zebrafish-based assay, the disease-associated GRHL3 mutations abrogated periderm development and were consistent with a dominant-negative effect, in contrast to haploinsufficiency seen in most VWS cases caused by IRF6 mutations. In mouse, all embryos lacking Grhl3 exhibited abnormal oral periderm and 17% developed a cleft palate. Analysis of the oral phenotype of double heterozygote (Irf6+/−;Grhl3+/−) murine embryos failed to detect epistasis between the two genes, suggesting that they function in separate but convergent pathways during palatogenesis. Taken together, our data demonstrated that mutations in two genes, IRF6 and GRHL3, can lead to nearly identical phenotypes of orofacial cleft. They supported the hypotheses that both genes are essential for the presence of a functional oral periderm and that failure of this process contributes to VWS.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Radboud Repository

Helsingin yliopiston digitaalinen arkisto

Dominant Mutations in GRHL3 Cause Van der Woude Syndrome and Disrupt Oral Periderm Development

Peer reviewed

Elsevier - Publisher Connector

Crossref

PubMed Central

eScholarship - University of California

Helsingin yliopiston digitaalinen arkisto

Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.

Publikationer från Uppsala Universitet

Edinburgh Research Explorer

eScholarship - University of California

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

Author: Amid Clara
Apweiler Rolf
Ashurst Jennifer
Auffray Charles
Barrero Roberto A
Bellgard Matthew
Bonaldo Maria de Fatima
Bono Hidemasa
Bromberg Susan K
Brookes Anthony J
Bruford Elspeth
Carninci Piero
Chakraborty Ranajit
Chelala Claude
Chen Zhu
Couillault Christine
Debily Marie-Anne
Devignes Marie-Dominique
Dubchak Inna
Endo Toshinori
Estreicher Anne
Eveno Eric
Eyras Eduardo
Fujii Yasuyuki
Fukami-Kobayashi Kaoru
Fukuchi Satoshi
Go Mitiko
Gojobori Takashi
Gough Craig
Graudens Esther
Hahn Yoonsoo
Han Michael
Han Ze-Guang
Hanada Kousuke
Hanaoka Hideki
Harada Erimi
Hashimoto Katsuyuki
Hayashizaki Yoshihide
Hide Winston
Hilton Phillip
Hinz Ursula
Hirai Momoki
Hirakawa Mika
Hishiki Teruyoshi
Homma Keiichi
Hopkinson Ian
Ikeo Kazuho
Imanishi Tadashi
Imbeaud Sandrine
Inoko Hidetoshi
Isogai Takao
Itoh Takeshi
Jia Libin
Jin Lihua
Kanapin Alexander
Kanehisa Minoru
Kaneko Yayoi
Karavidopoulou Youla
Kasprzyk Arek
Kasukawa Takeya
Kelso Janet
Kersey Paul
Kikuno Reiko
Kim Sangsoo
Kimura Kouichi
Korn Bernhard
Koyanagi Kanako O
Kuryshev Vladimir
Lenhard Boris
Makalowska Izabela
Makalowski Wojciech
Makino Takashi
Mano Shuhei
Mariage-Samson Regine
Mashima Jun
Matsuda Hideo
Mewes Hans-Werner
Minoshima Shinsei
Miyazaki Satoru
Mulder Nicola
Nagai Keiichi
Nagasaki Hideki
Nagata Naoki
Nakai Kenta
Nakao Mitsuteru
Nigam Rajni
Nishikawa Ken
Nishikawa Tetsuo
Nomura Nobuo
O'Donovan Claire
Ogasawara Osamu
Ohara Osamu
Ohtsubo Masafumi
Oishi Michio
Okada Norihiro
Okazaki Yasushi
Okido Toshihisa
Okubo Kousaku
Oota Satoshi
Ota Motonori
Ota Toshio
Otsuki Tetsuji
Piatier-Tonneau Dominique
Poustka Annemarie
Quackenbush John
R. Gopinath Gopal
Ren Shuang-Xi
Richard Roberts
Saitou Naruya
Sakai Hiroaki
Sakai Katsunaga
Sakaki Yoshiyuki
Sakamoto Shigetaka
Sakate Ryuichi
Schupp Ingo
Servant Florence
Sherry Stephen
Shiba Rie
Shimizu Nobuyoshi
Shimoyama Mary
Simpson Andrew J
Soares Bento
Souza Sandro J. de
Steward Charles
Stodolsky Marvin
Strausberg Robert L
Sugano Sumio
Sugawara Hideaki
Suwa Makiko
Suzuki Mami
Suzuki Yoshiyuki
Suzuki Yutaka
Takagi Toshihisa
Takahashi Aiko
Takeda Jun-ichi
Tamiya Gen
Tamura Takuro
Tanaka Hiroshi
Tanaka Susumu
Tanino Motohiko
Tateno Yoshio
Taylor Todd
Terwilliger Joseph D
Thierry-Mieg Danielle
Thierry-Mieg Jean
Thomas Michael A
Tonellato Peter
Unneberg Per
Veeramachaneni Vamsi
Wagner Lukas
Watanabe Shinya
Wiemann Stefan
Wilming Laurens
Yamaguchi-Kabata Yumi
Yamasaki Chisato
Yasuda Norikazu
Yasuda Tomohiro
Yoo Hyang-Sook
Yura Kei
Publication venue: Public Library of Science
Publication date: 01/01/2004

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Queensland University of Technology ePrints Archive

Research Repository

Hokkaido University Collection of Scholarly and Academic Papers

UPF Digital Repository

White Rose Research Online

MPG.PuRe

Computational approaches for in-depth analysis of cDNA sequence tags

Author: Unneberg Per
Publication venue: Bioteknologi
Publication date: 01/01/2004

Major recent improvements in biotechnology have led to an accelerated production of DNA sequences. The completion of the human genome sequence, along with the genomes of more than two hundred other species, has marked the arrival of the genome era. The ultimate goal is to understand the structure and function of genomes and their genes. This thesis has focused on the computational analysis of complementary DNA (cDNA) sequences. These are copies of mRNA transcripts that correspond to the coding regions of genomes. Studying the expression patterns of genes is essential for understanding gene function. Many gene expression profiling techniques generate short sequence tags that derive from transcripts. A pilot study was performed to assess the feasibility of using the pyrosequencing platform for gene expression analysis. The sequences generated by pyrosequencing in most cases (≈ 85%) were long enough (> 18 nucleotides) to uniquely identify the corresponding transcripts through database searches. Aspects of transcript identification by short sequence tags were further investigated in a number of public databases, revealing that a tag length of 16-17 nucleotides was sufficient for unique identification. Longer transcript representations are obtained from expressed sequence tag (EST) sequencing. Method development for the analysis and maintenance of large EST data sets has been performed on data from poplar, which is a tree of commercial interest to the forest biotechnology industry. In 2003 a large EST sequencing project reached > 100 000 reads, providing a unique resource for tree biology research. ESTs have been grouped into clusters and singletons that represent potential genes. Preliminary analyses have estimated gene content in Populus to be very similar to that of the model organism Arabidopsis thaliana. EST data collections provide a rich source for mining polymorphisms.
A software application was developed and applied to EST data from two Populus species, and candidate single nucleotide polymorphisms (SNPs) were recorded. A study of genetic variation between the species revealed a striking similarity, with orthologous pairs being > 98% identical on the protein level. Keywords: cDNA, EST, gene expression, SNP, SAGE, polymorphism, assembly, clustering, DNA sequencing, pyrosequencing, mRNA transcript, orthology, tree biotechnology, restriction enzyme

Publikationer från KTH

Tentative mapping of transcription-induced interchromosomal interaction using chimeric EST and mRNA data.

Author: Claverie Jean-Michel
Unneberg Per
Publication venue: Public Library of Science (PLoS)
Publication date: 28/02/2007

Recent studies on chromosome conformation show that chromosomes colocalize in the nucleus, bringing together active genes in transcription factories. This spatial proximity of actively transcribing genes could provide a means for RNA interaction at the transcript level. We have screened public databases for chimeric EST and mRNA sequences with the intent of mapping transcription-induced interchromosomal interactions. We suggest that chimeric transcripts may be the result of close encounters of active genes, either as functional products or "noise" in the transcription process, and that they could be used as probes for chromosome interactions. We have found a total of 5,614 chimeric ESTs and 587 chimeric mRNAs that meet our selection criteria. Due to their higher quality, the mRNA findings are of particular interest and we hope that they may serve as food for thought for specialists in diverse areas of molecular biology.

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central