    The Asian arowana (Scleropages formosus) genome provides new insights into the evolution of an early lineage of teleosts

    The Asian arowana (Scleropages formosus), one of the world’s most expensive cultivated ornamental fishes, is an endangered species. It represents an ancient lineage of teleosts: the Osteoglossomorpha. Here, we provide a high-quality chromosome-level reference genome of a female golden-variety arowana using a combination of deep shotgun sequencing and high-resolution linkage mapping. We have also generated two draft genome assemblies for the red and green varieties. Phylogenomic analysis supports a sister group relationship between Osteoglossomorpha (bonytongues) and Elopomorpha (eels and relatives), with the two clades together forming a sister group of Clupeocephala, which includes all the remaining teleosts. The arowana genome retains the full complement of eight Hox clusters, unlike the African butterfly fish (Pantodon buchholzi), another bonytongue fish, which possesses only five Hox clusters. Differential gene expression among the three varieties provides insights into the genetic basis of colour variation. A potential heterogametic sex chromosome is identified in the female arowana karyotype, suggesting that sex is determined by a ZW/ZZ sex chromosomal system. The high-quality reference genome of the golden arowana and the draft assemblies of the red and green varieties are valuable resources for understanding the biology, adaptation and behaviour of Asian arowanas.

    Sequencing and comparative analysis of fugu protocadherin clusters reveal diversity of protocadherin genes among teleosts

    BACKGROUND: The synaptic cell adhesion molecules, protocadherins, are a vertebrate innovation that accompanied the emergence of the neural tube and the elaborate central nervous system. In mammals, the protocadherins are encoded by three closely linked clusters (α, β and γ) of tandem genes and are hypothesized to provide a molecular code for specifying the remarkably diverse neural connections in the central nervous system. Like mammals, the coelacanth, a lobe-finned fish, contains a single protocadherin locus, also arranged into α, β and γ clusters. Zebrafish, however, possesses two protocadherin loci that contain more than twice as many genes as the coelacanth, but arranged only into α and γ clusters. To gain further insight into the evolutionary history of protocadherin clusters, we have sequenced and analyzed protocadherin clusters from the compact genome of the pufferfish, Fugu rubripes. RESULTS: Fugu contains two unlinked protocadherin loci, Pcdh1 and Pcdh2, that collectively consist of at least 77 genes. The fugu Pcdh1 locus has been subject to extensive degeneration, resulting in the complete loss of the Pcdh1γ cluster. The fugu Pcdh genes have undergone lineage-specific regional gene conversion processes that have resulted in a remarkable regional sequence homogenization among paralogs in the same subcluster. Phylogenetic analyses show that most protocadherin genes are orthologous between fugu and zebrafish either individually or as paralog groups. Based on the inferred phylogenetic relationships of fugu and zebrafish genes, we have reconstructed the evolutionary history of protocadherin clusters in the teleost fish lineage. CONCLUSION: Our results demonstrate the exceptional evolutionary dynamism of protocadherin genes in vertebrates in general, and in teleost fishes in particular. Besides the 'fish-specific' whole genome duplication, the evolution of protocadherin genes in teleost fishes is influenced by lineage-specific gene losses, tandem gene duplications and regional sequence homogenization. The dynamic protocadherin clusters might have led to the diversification of neural circuitry among teleosts, and contributed to the behavioral and physiological diversity of teleosts.

    TFCONES: A database of vertebrate transcription factor-encoding genes and their associated conserved noncoding elements

    BACKGROUND: Transcription factors (TFs) regulate gene transcription and play pivotal roles in various biological processes such as development, cell cycle progression, cell differentiation and tumor suppression. Identifying cis-regulatory elements associated with TF-encoding genes is a crucial step in understanding gene regulatory networks. To this end, we have used a comparative genomics approach to identify putative cis-regulatory elements associated with TF-encoding genes in vertebrates. DESCRIPTION: We have created a database named TFCONES (Transcription Factor Genes & Associated COnserved Noncoding ElementS) (http://tfcones.fugu-sg.org) which contains all human, mouse and fugu TF-encoding genes and conserved noncoding elements (CNEs) associated with them. The CNEs were identified by gene-by-gene alignments of orthologous TF-encoding gene loci using MLAGAN. We also predicted putative transcription factor binding sites within the CNEs. A significant proportion of human-fugu CNEs contain experimentally defined binding sites for transcriptional activators and repressors, indicating that a majority of the CNEs may function as transcriptional regulatory elements. The TF-encoding genes that are involved in nervous system development are generally enriched for human-fugu CNEs. Users can retrieve TF-encoding genes and their associated CNEs by conducting a keyword search or by selecting a family of DNA-binding proteins. CONCLUSION: The conserved noncoding elements identified in TFCONES represent a catalog of highly prioritized putative cis-regulatory elements of TF-encoding genes and are candidates for functional assay.

    Characterization of the neurohypophysial hormone gene loci in elephant shark and the Japanese lamprey: origin of the vertebrate neurohypophysial hormone genes

    BACKGROUND: Vasopressin and oxytocin are mammalian neurohypophysial hormones with distinct functions. Vasopressin is involved mainly in osmoregulation and oxytocin is involved primarily in parturition and lactation. Jawed vertebrates contain at least one homolog each of vasopressin and oxytocin, whereas only a vasopressin-family hormone, vasotocin, has been identified in jawless vertebrates. The genes encoding vasopressin and oxytocin are closely linked tail-to-tail in eutherian mammals whereas their homologs in chicken, Xenopus and coelacanth (vasotocin and mesotocin) are linked tail-to-head. In contrast, their pufferfish homologs, vasotocin and isotocin, are located on the same strand of DNA with isotocin located upstream of vasotocin and separated by five genes. These differences in the arrangement of the two genes in different bony vertebrate lineages raise questions about their origin and ancestral arrangement. To trace the origin of these genes, we have sequenced BAC clones from the neurohypophysial gene loci in a cartilaginous fish, the elephant shark (Callorhinchus milii), and in a jawless vertebrate, the Japanese lamprey (Lethenteron japonicum). We have also analyzed the neurohypophysial hormone gene locus in an invertebrate chordate, the amphioxus (Branchiostoma floridae). RESULTS: The elephant shark neurohypophysial hormone genes encode vasotocin and oxytocin, and are linked tail-to-head like their homologs in coelacanth and non-eutherian tetrapods. Besides the hypothalamus, the two genes are also expressed in the ovary. In addition, the vasotocin gene is expressed in the kidney, rectal gland and intestine. These expression profiles indicate a paracrine role for the two hormones. The lamprey locus contains a single neurohypophysial hormone gene, vasotocin. The synteny of genes in the lamprey locus is conserved in elephant shark, coelacanth and tetrapods but disrupted in teleost fishes. The amphioxus locus encodes a single neurohypophysial hormone, designated as [Ile4]vasotocin. CONCLUSION: The vasopressin- and oxytocin-family of neurohypophysial hormones evolved in a common ancestor of jawed vertebrates through tandem duplication of the ancestral vasotocin gene. The duplicated genes were linked tail-to-head like their homologs in elephant shark, coelacanth and non-eutherian tetrapods. In contrast to the conserved linkage of the neurohypophysial genes in these vertebrates, the neurohypophysial hormone gene locus has experienced extensive rearrangements in the teleost lineage.

    Sequence and organization of coelacanth neurohypophysial hormone genes: Evolutionary history of the vertebrate neurohypophysial hormone gene locus

    BACKGROUND: The mammalian neurohypophysial hormones, vasopressin and oxytocin, are involved in osmoregulation and uterine smooth muscle contraction, respectively. All jawed vertebrates contain at least one homolog each of vasopressin and oxytocin whereas jawless vertebrates contain a single neurohypophysial hormone called vasotocin. The vasopressin homolog in non-mammalian vertebrates is vasotocin; and the oxytocin homolog is mesotocin in non-eutherian tetrapods, mesotocin and [Phe2]mesotocin in lungfishes, and isotocin in ray-finned fishes. The genes encoding vasopressin and oxytocin are closely linked in the human and rodent genomes in a tail-to-tail orientation. In contrast, their pufferfish homologs (vasotocin and isotocin) are located on the same strand of DNA, with the isotocin gene located upstream of the vasotocin gene and separated from it by five genes, suggesting that this locus has experienced rearrangements in either the mammalian or the ray-finned fish lineage, or in both lineages. The coelacanths occupy a unique phylogenetic position close to the divergence of the mammalian and ray-finned fish lineages. RESULTS: We have sequenced a coelacanth (Latimeria menadoensis) BAC clone encompassing the neurohypophysial hormone genes and investigated the evolutionary history of the vertebrate neurohypophysial hormone gene locus within a comparative genomics framework. The coelacanth contains vasotocin and mesotocin genes like non-mammalian tetrapods. The coelacanth genes are present on the same strand of DNA with no intervening genes, with the vasotocin gene located upstream of the mesotocin gene. Nucleotide sequences of the second exons of the two genes are under purifying selection, implying a regulatory function. We have also analyzed the neurohypophysial hormone gene locus in the genomes of opossum, chicken and Xenopus tropicalis. The opossum contains two tandem copies of vasopressin and mesotocin genes. The vasotocin and mesotocin genes in chicken and Xenopus, and the vasopressin and mesotocin genes in opossum, are linked tail-to-head, similar to their orthologs in coelacanth and unlike their homologs in human and rodents. CONCLUSION: Our results indicate that the neurohypophysial hormone gene locus has experienced independent rearrangements in both placental mammals and teleost fishes. The coelacanth genome appears to be more stable than mammalian and teleost fish genomes. As such, it serves as a valuable outgroup for studying the evolution of mammalian and teleost fish genomes.
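
The linkage terms used in the two neurohypophysial hormone abstracts (tail-to-tail, tail-to-head, same strand) can be made concrete with a short sketch. It is an illustration only, not a method from either paper: the `Gene` record, the coordinates, and the `head-to-head` label for the divergent case are assumptions introduced here, and the two genes are assumed not to overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are on the '+' strand."""
    name: str
    start: int
    end: int
    strand: str  # '+' or '-'


def relative_orientation(a: Gene, b: Gene) -> str:
    """Classify how two non-overlapping genes on one chromosome are linked."""
    for gene in (a, b):
        if gene.strand not in ("+", "-"):
            raise ValueError(f"unknown strand for {gene.name}: {gene.strand!r}")
    # Order the genes by position along the '+' strand.
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        # Same strand: the 3' end of one gene faces the 5' end of the other.
        return "tail-to-head"
    if left.strand == "+":
        # Left gene reads rightwards, right gene reads leftwards:
        # the two 3' ends face each other.
        return "tail-to-tail"
    # Left gene reads leftwards, right gene reads rightwards:
    # the two 5' ends face each other.
    return "head-to-head"


# Made-up coordinates, chosen only to reproduce the arrangements described.
coelacanth_like = relative_orientation(
    Gene("vasotocin", 1_000, 3_000, "+"), Gene("mesotocin", 5_000, 7_000, "+")
)
eutherian_like = relative_orientation(
    Gene("vasopressin", 1_000, 3_000, "+"), Gene("oxytocin", 5_000, 7_000, "-")
)
print(coelacanth_like, eutherian_like)
```

The function ignores intervening genes, so the pufferfish arrangement (same strand, five genes in between) would also be reported as tail-to-head; distinguishing it from the coelacanth arrangement would require the gene order of the whole locus.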

    Emergence and evolution of the glycoprotein hormone and neurotrophin gene families in vertebrates

    BACKGROUND: The three vertebrate pituitary glycoprotein hormones (GPH) are heterodimers of a common α and a specific β subunit. In humans, they are located on different chromosomes but in a similar genomic environment. We took advantage of the availability of genomic and EST data from two cartilaginous fish species as well as from two lamprey species to identify their repertoire of neurotrophin, lin7 and KCNA gene family members, which are in the close environment of gphβ. Gphα and gphβ are absent outside vertebrates but are related to two genes present in both protostomes and deuterostomes that were named gpa2 and gpb5. Genomic organization and functional characteristics of their protein products suggested that gphα and gphβ might have been generated concomitantly by a duplication of gpa2 and gpb5 just prior to the radiation of vertebrates. To gain better insight into this process, we used new genomic resources and tools to characterize the ancestral environment before the duplication occurred. RESULTS: The repertoire of genes characterized in cartilaginous fishes was largely similar to that in tetrapods. Data in lampreys are either incomplete or the result of specific duplications and/or deletions, but a scenario for the evolution of this genomic environment in vertebrates could be proposed. A number of genes were identified in the amphioxus genome that helped in reconstructing the ancestral environment of gpa2 and gpb5 and in describing the evolution of this environment in vertebrates. CONCLUSION: Our model suggests that vertebrate gphα and gphβ were generated by a specific local duplication of the ancestral forms of gpa2 and gpb5, followed by a translocation of gphβ to a new environment whereas gphα was retained in the gpa2-gpb5 locus. The two rounds of whole genome duplication that occurred early in the evolution of vertebrates generated four paralogues of each gene, but secondary gene losses or lineage-specific duplications together with genomic rearrangements have resulted in the present organization of these genes, which differs between vertebrate lineages.
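
The copy-number arithmetic behind the statement that two rounds of whole genome duplication generate four paralogues of each gene can be sketched as follows. This is a bookkeeping illustration under simple assumptions, not an analysis from the paper: the function names and the example loss and duplication counts are hypothetical.

```python
def max_paralogs(wgd_rounds: int) -> int:
    """Copies of one ancestral gene if every duplicate is retained."""
    if wgd_rounds < 0:
        raise ValueError("number of duplication rounds cannot be negative")
    # Each whole genome duplication doubles the copy number.
    return 2 ** wgd_rounds


def observed_copies(wgd_rounds: int, losses: int = 0, lineage_duplications: int = 0) -> int:
    """Adjust the maximum for secondary losses and lineage-specific duplications."""
    if losses < 0 or lineage_duplications < 0:
        raise ValueError("counts cannot be negative")
    copies = max_paralogs(wgd_rounds) - losses + lineage_duplications
    if copies < 0:
        raise ValueError("more losses than available copies")
    return copies


two_rounds = max_paralogs(2)    # two early vertebrate duplications
three_rounds = max_paralogs(3)  # plus one additional teleost duplication
print(two_rounds, three_rounds)
```

Under the same arithmetic, three rounds give a maximum of eight copies, which matches the eight Hox clusters reported for the arowana in the first abstract; `observed_copies(3, losses=3)` gives five, the number reported there for the African butterfly fish, although the actual history of losses in that lineage is not stated in the abstract.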

    Identification and Comparative Analysis of the Protocadherin Cluster in a Reptile, the Green Anole Lizard

    BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters, which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that, similar to the protocadherin clusters in other vertebrates, the evolution of the anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles. We present a model for the evolutionary history of the protocadherin cluster in tetrapods.

    Early vertebrate chromosome duplications and the evolution of the neuropeptide Y receptor gene regions

    BACKGROUND: One of the many gene families that expanded in early vertebrate evolution is the neuropeptide Y (NPY) receptor family of G-protein coupled receptors. Earlier work by our lab suggested that several of the NPY receptor genes found in extant vertebrates resulted from two genome duplications before the origin of jawed vertebrates (gnathostomes) and one additional genome duplication in the actinopterygian lineage, based on their location on chromosomes sharing several gene families. In this study we have investigated, in five vertebrate genomes, 45 gene families with members close to the NPY receptor genes in the compact genomes of the teleost fishes Tetraodon nigroviridis and Takifugu rubripes. These correspond to Homo sapiens chromosomes 4, 5, 8 and 10. RESULTS: Chromosome regions with conserved synteny were identified and confirmed by phylogenetic analyses in H. sapiens, M. musculus, D. rerio, T. rubripes and T. nigroviridis. Twenty-six gene families, including the NPY receptor genes (plus 3 described recently by other labs), showed a tree topology consistent with duplications in early vertebrate evolution and in the actinopterygian lineage, thereby supporting expansion through block duplications. Eight gene families had complications that precluded analysis (such as short sequence length or variable number of repeated domains) and another eight families did not support block duplications (because the paralogs in these families seem to have originated in another time window than the proposed genome duplication events). RT-PCR carried out with several tissues in T. rubripes revealed that all five NPY receptors were expressed in the brain and subtypes Y2, Y4 and Y8 were also expressed in peripheral organs. CONCLUSION: We conclude that the phylogenetic analyses and chromosomal locations of these gene families support duplications of large blocks of genes or even entire chromosomes. Thus, these results are consistent with two early vertebrate tetraploidizations forming a paralogon comprising human chromosomes 4, 5, 8 and 10 and one teleost tetraploidization. The combination of positional and phylogenetic data further strengthens the identification of orthologs and paralogs in the NPY receptor family.

    Neuropeptide Y-family peptides and receptors in the elephant shark, Callorhinchus milii confirm gene duplications before the gnathostome radiation

    We describe here the repertoire of neuropeptide Y (NPY) peptides and receptors in the elephant shark Callorhinchus milii, a member of the chondrichthyans, which diverged from the rest of the gnathostome (jawed vertebrate) lineage about 450 million years ago, and the first chondrichthyan with a genome project. We have identified two peptide genes that are orthologous to NPY and PYY (peptide YY) in other vertebrates, and seven receptor genes orthologous to the Y1, Y2, Y4, Y5, Y6, Y7 and Y8 subtypes found in tetrapods and teleost fishes. The repertoire of peptides and receptors seems to reflect the ancestral configuration in the predecessor of all gnathostomes, whereas other lineages such as mammals and teleosts have lost one or more receptor genes or have acquired 1–2 additional peptide genes. Both the peptides and receptors showed broad and overlapping mRNA expression, which may explain why some receptor gene losses could take place in some lineages, but leaves open the question of why all the known ancestral receptors have been retained in the elephant shark.