In silico searches for putative growth hormone family homologs in invertebrates
<p>Growth hormone, prolactin and somatolactin-family sequences were sought in diverse invertebrate genome and reference sequence databases using different available strategies. None of these search strategies could identify any putative invertebrate GH/PRL/SL homologs with the characteristic amino acid motif of the family, or indeed any significant sequence similarity. One sequence with high sequence identity to human PRL was identified from a tapeworm species; however, this sequence is likely the result of horizontal gene transfer or contamination.</p>
<p><strong>File information:</strong></p>
<p><strong>In silico searches for putative growth hormone family homologs in invertebrates.pdf</strong><br>Description of methods and results, with references.</p>
<p><strong>130723_Taenia_prl-hit_NJtree.phb</strong><br>Neighbor joining tree file in Newick format including the identified tapeworm sequence. Vertebrate species names are abbreviated to the first letter of the genus followed by the first two letters of the species (Hsa = Homo sapiens, etc.). The tapeworm sequence is identified with the UniProt ID Q8T110_9CEST.</p>
<p><strong>130723_Taenia_prl-hit_NJtree.pdf</strong><br>Image of the neighbor joining tree (PDF). The tapeworm sequence is colored green.</p>
<p><strong>Pairwise_align_hPRL+Taenia_prl-hit.txt</strong><br>Text file with the results of the pairwise alignment between the human PRL sequence and the identified tapeworm sequence.</p>
<p><strong>HsaPRL+Taenia_prl-hit.aln</strong><br>Clustal alignment file containing the human PRL sequence and the identified tapeworm sequence.</p>
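<p>The species abbreviation rule used in the tree file (first letter of the genus followed by the first two letters of the species) can be illustrated with a minimal sketch. The function name <code>abbreviate</code> is hypothetical and not part of the dataset; the rule is applied only to plain binomial names.</p>

```python
def abbreviate(binomial):
    """Abbreviate a species name, e.g. 'Homo sapiens' -> 'Hsa'.

    Hypothetical helper: first letter of the genus (upper case)
    followed by the first two letters of the species (lower case).
    Any further words (e.g. a subspecies) are ignored.
    """
    parts = binomial.split()
    if len(parts) < 2:
        raise ValueError("expected a 'Genus species' name, got: %r" % binomial)
    genus, species = parts[0], parts[1]
    return genus[0].upper() + species[:2].lower()


if __name__ == "__main__":
    print(abbreviate("Homo sapiens"))
```

<p>Names written with a bracketed subgenus or other non-standard forms would need to be handled separately.</p>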
Evolution of the growth hormone, prolactin, prolactin 2 and somatolactin family
<p>This document corresponds to the final published version of this article. You are free to download, print and distribute this document for any purposes under a Creative Commons Attribution 4.0 Unported License provided the original work is cited as specified.</p><p>Cite this article as: Ocampo Daza D and Larhammar D. <b>Evolution of the growth hormone, prolactin, prolactin 2 and somatolactin family</b>, General and Comparative Endocrinology 264 (2018) 94-112. <a href="https://doi.org/10.1016/j.ygcen.2018.01.007">https://doi.org/10.1016/j.ygcen.2018.01.007</a>.</p>
Evolution of the receptors for growth hormone, prolactin, erythropoietin and thrombopoietin in relation to the vertebrate tetraploidizations [DATASET]
Phylogenetic analyses and chromosomal data for the single-chain cytokine class I receptor family (GHR, PRLR, CRFA4, EPOR and TPOR) and 18 neighboring gene families in paralogous chromosome blocks.<div><br></div><div>Supporting data for manuscript GCE-17-19 submitted to <i>General and Comparative Endocrinology</i>.</div><div><br></div><div><b>Abbreviations:</b></div><div><br></div><div>CRFA4: Cytokine receptor family member A4<br></div><div>EPOR: Erythropoietin receptor<br></div><div>GHR: Growth hormone receptor</div><div>PRLR: Prolactin receptor</div><div>LBD: Ligand-binding domain<br></div><div>MPL: Myeloproliferative leukemia proto-oncogene<br></div><div>TPOR: Thrombopoietin receptor</div><div><br></div><div><b>Files:</b></div><div><div><div><br></div><div><b>GHRfam_data.xlxs:</b> Location data, sequence identifiers and prediction/annotation notes for all identified <i>GHR</i>, <i>PRLR</i>, <i>CRFA4</i>, <i>EPOR </i>and <i>TPOR</i> (<i>MPL</i>) sequences. The table also includes species, genome assembly and sequence quality information.</div><div><br></div><div><b>GHRfam_all_seq.fasta:</b> All curated <i>GHR</i>, <i>PRLR</i>, <i>CRFA4</i>, <i>EPOR</i> and <i>TPOR</i> (<i>MPL</i>) amino acid sequences identified in this study. Partial sequences are indicated by an asterisk (*) in the sequence name. Sequences marked "_edited" have one duplicated ligand-binding domain removed. This applies to all <i>TPOR </i>sequences except anole lizard; chicken and anole lizard <i>PRLR</i>; and cartilaginous fish <i>GHR</i>. The corresponding full-length sequences are marked "_full".<br></div></div><div><br></div><div><b>GHRfam_align.fasta:</b> Edited amino acid sequence alignment of <i>GHR, PRLR, CRFA4, EPOR </i>and <i>TPOR </i>(<i>MPL</i>) sequences. Sequence information is shown in '<b>GHRfam_data.xlxs</b>', including species abbreviations used in sequence names. 
</div><div><br></div><div><b>GHRfam_PhyML_tree_raw.phb: </b>Phylogenetic Maximum Likelihood (PhyML) tree analysis of the single-chain cytokine class I receptor family. Output file from Seaview v4.6.1 in PHYLIP/Newick format. <br></div></div><div><br></div><div><b>GHRfam_PhyML_tree_midpoint.phb: </b>Midpoint-rooted version of the phylogenetic tree above. Midpoint identified in FigTree v1.4.3.<br></div><div><br></div><div><b>GHRfam_LBD_align.fasta: </b>Amino acid sequence alignment of only ligand-binding domains of <i>GHR, PRLR, CRFA4, EPOR </i>and <i>TPOR </i>(<i>MPL</i>) sequences.<br></div><div><br></div><div><b>GHRfam_LBD_PhyML_tree_raw.phb:</b> PhyML tree analysis of <i>GHR, PRLR, CRFA4, EPOR </i>and <i>TPOR </i>(<i>MPL</i>) ligand-binding domains. Output file from Seaview v4.6.1 in PHYLIP/Newick format. <br></div><div><br></div><div><b>GHRfam_LBD_PhyML_tree_midpoint.phb:</b> Midpoint-rooted version of the phylogenetic tree above. Midpoint identified in FigTree v1.4.3.<br></div><div><br></div><div><b>GHRfam_Bfl_align.fasta: </b>Amino acid sequence alignment including putative <i>Branchiostoma floridae </i>family member. Edited to include only extracellular domains. Extended N-terminal of <i>B. floridae </i>sequence not included. <br></div><div><br></div><div><b>GHRfam_Bfl_PhyML_tree.phb:</b> PhyML tree analysis of <i>GHR, PRLR, CRFA4, EPOR </i>and <i>TPOR </i>(<i>MPL</i>) extracellular domains, including putative <i>Branchiostoma floridae </i>family member. Output file from Seaview v4.6.1 in PHYLIP/Newick format. <br></div><div><br></div><div><b>Neighboring_family_data.xlsx: </b>Location data, sequence identifiers and prediction/annotation notes for 18 neighboring gene families in the chromosomal regions of <i>GHR, PRLR, EPOR </i>and <i>TPOR </i>(<i>MPL</i>) genes. 
Includes explanations of family abbreviations and gene names used in sequence alignment and phylogenetic tree files.<br><br>Amino acid sequence alignments as well as unrooted and rooted PhyML tree files are included for all 18 neighboring gene families. All tree files are in PHYLIP/Newick format.</div><div><br></div><div>For the FGF3/7/10/22 and ZFR families, two analyses were made for each family owing to the unclear relationships of putative invertebrate family members. </div><div><br></div><div>Outdated files shared before peer-review are included in the archive file <b>Pre-review-data.zip</b>.</div>
Evolution of the growth hormone, prolactin, prolactin 2 and somatolactin family [DATASET]
<div><div>Phylogenetic analyses and chromosomal data for the growth hormone (<i>GH</i>), prolactin (<i>PRL</i>), prolactin 2 (<i>PRL2</i>) and somatolactin (<i>SL</i>) gene family, as well as 31 neighboring gene families in paralogous chromosome blocks.<br></div><div><br></div><div>Supporting data for manuscript GCE_2017_309 submitted to <i>General and Comparative Endocrinology</i>.<br></div><div><br></div><div>Sequence alignment files are in FASTA format. </div><div>Phylogenetic tree files are in PHYLIP/Newick format.</div><div><br></div><div>Phylogenetic trees were made using the IQ-TREE program (http://www.iqtree.org/), supported by Ultra-Fast Bootstrap (UFBoot) and approximate Likelihood Ratio Test (aLRT). </div></div>
Phylogenetic analyses of the vertebrate oxytocin and vasopressin receptor gene family
<p>Sequence-based phylogenetic analyses of vertebrate oxytocin receptor (OTR) and vasopressin receptor (VPR) genes using amino acid sequences predicted primarily from the Ensembl (http://www.ensembl.org) and Pre Ensembl (http://pre.ensembl.org) genome browsers. These analyses are based on our previously published study identifying OTR and VPR sequences in vertebrate genomes, including previously unrecognised subtypes of V2 receptors: <em>Ocampo Daza D., Lewicka M. and Larhammar D. (2012) The oxytocin/vasopressin receptor family has at least five members in the gnathostome lineage, including two distinct V2 subtypes, General and Comparative Endocrinology 175(1):135-143</em> (link below). These updated analyses include more species and suggest an update of VPR gene nomenclature.</p>
<p>Species and genome assembly information, database identifiers, location data and annotation notes for all identified sequences are included in the Excel workbook 'Master_OTR_VPR_sequence_tables.xlsx'. These tables also detail the updated vs. outdated nomenclature. All identified and curated amino acid sequences are included in the FASTA file 'Master_OTR_VPR_sequences.fasta'.</p>
<p><strong>Legends:</strong></p>
<p>Sequences marked * are not full-length, sequences marked # are not full-length and the prediction of the intracellular loop 3 (IL3) is not clear. The sequence marked § is a putative pseudogene. See details in 'Master_OTR_VPR_sequence_tables.xlsx'. Numbers in sequence names indicate the chromosome/linkage group where known. </p>
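<p>The legend above can be applied programmatically when filtering sequences. The following is a minimal sketch, assuming that the marker character appears somewhere in the sequence name; the helper <code>annotate</code> and the example names are hypothetical and not taken from the dataset.</p>

```python
# Marker meanings as given in the legend; the lookup helper is hypothetical.
MARKERS = {
    "*": "not full-length",
    "#": "not full-length; intracellular loop 3 (IL3) prediction unclear",
    "\u00a7": "putative pseudogene",  # the section sign
}


def annotate(sequence_name):
    """Return the legend notes that apply to a sequence name."""
    return [note for marker, note in MARKERS.items() if marker in sequence_name]


if __name__ == "__main__":
    # Hypothetical names, used only to show the behaviour.
    for name in ["Xxx.1_example", "Xxx.2_example*", "Xxx.3_example#"]:
        print(name, annotate(name))
```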
<p><strong>File information 1:</strong></p>
<p>Species included in these analyses, with abbreviations: human (<em>Homo sapiens</em>, Hsa), mouse (<em>Mus musculus</em>, Mmu), grey short-tailed opossum (<em>Monodelphis domestica</em>, Mdo), chicken (<em>Gallus gallus</em>, Gga), Carolina anole lizard (<em>Anolis carolinensis</em>, Aca), Western clawed frog (<em>Xenopus tropicalis</em>, Xtr), coelacanth (<em>Latimeria chalumnae</em>, Lch), spotted gar (<em>Lepisosteus oculatus</em>, Loc), zebrafish (<em>Danio rerio</em>, Dre), three-spined stickleback (<em>Gasterosteus aculeatus</em>, Gac), medaka (<em>Oryzias latipes</em>, Ola), Southern platyfish (<em>Xiphophorus maculatus</em>, Xma), Japanese pufferfish (<em>Takifugu rubripes</em>, Tru) and Elephant shark (<em>Callorhinchus milii</em>, Cmi).</p>
<p>Alignment file included in FASTA-format: 'align_OTR_VPR_edited.fasta'. This file format can be opened by most sequence analysis applications as well as text editors. This alignment has been curated and edited as described in the Methods sections and Supplementary Material 3 of <em>Ocampo Daza D. et al. (2012) Gen. Comp. Endocrinol 175(1)</em> (link below), removing parts of the amino terminal, carboxy terminal and intracellular loop 3. The alignment was created using the MUSCLE algorithm applied through eBioX (http://www.ebioinformatics.org/ebiox/) using standard settings with 16 iterations. The alignment was edited manually in eBioX.</p>
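<p>Because FASTA is plain text, the alignment can also be read without specialised software. The sketch below is a minimal reader written for illustration only; the function names and the toy input are assumptions. It also checks that all sequences have the same length, as expected for an alignment.</p>

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(chunks) for n, chunks in records.items()}


def alignment_length(records):
    """Return the common sequence length, or raise if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; not a valid alignment")
    return lengths.pop()


if __name__ == "__main__":
    toy = ">seqA\nMK-L\nAA\n>seqB\nMKTLAA\n"  # toy data, not from the dataset
    print(alignment_length(read_fasta(toy)))
```

<p>To use it on the real file, the text would be read first, for example with <code>open('align_OTR_VPR_edited.fasta').read()</code>.</p>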
<p>Phylogenetic tree files are included in Phylip/Newick format with the extension '.phb'. This file format can be opened by freely available phylogenetic tree viewers such as FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and TreeView (http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/). All trees were made using the alignment described above. Corresponding figures for each phylogenetic tree are also included as PDF-files. Red nodes and support values indicate values lower than 50%.</p>
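<p>Support values below 50% can also be listed directly from a Newick string. This is a minimal sketch, assuming that support values are stored as internal node labels immediately after a closing parenthesis; the regular expression, the function name and the toy tree are illustrative assumptions, and files that store support differently would need a proper Newick parser.</p>

```python
import re

# Matches a number that directly follows ')' (an internal node label).
SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)")


def low_support(newick, threshold=50.0):
    """Return the node support values in a Newick string below a threshold."""
    values = [float(v) for v in SUPPORT.findall(newick)]
    return [v for v in values if v < threshold]


if __name__ == "__main__":
    toy_tree = "((A:0.1,B:0.2)95:0.05,(C:0.3,D:0.4)42:0.07);"  # toy data
    print(low_support(toy_tree))
```

<p>If support values are reported on a 0 to 1 scale rather than as percentages, the threshold would need to be set to 0.5.</p>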
<p>The neighbor joining (NJ) tree, 'NJ_tree_OPR_VPR.phb', was made using standard settings in ClustalX 2.0 (http://www.clustal.org/clustal2/), supported by a non-parametric bootstrap analysis with 1000 replicates.</p>
<p>Phylogenetic Maximum Likelihood (PhyML) trees were made using the PhyML3.0 algorithm (http://www.atgc-montpellier.fr/phyml/) through the PhyML-aBayes application. One tree is supported by a non-parametric bootstrap analysis with 100 replicates, 'PhyML_tree_OTR_VPR_boot.phb', and one is supported by an SH-like approximate likelihood ratio test (aLRT), 'PhyML_tree_OTR_VPR_aLRT.phb'. Both PhyML trees were made with the following settings: amino acid frequencies (equilibrium frequencies), proportion of invariable sites (with optimised p-invar) and gamma shape parameters were estimated from the alignments, the number of substitution rate categories was set to 8, BIONJ was chosen to create the starting tree, both NNI and SPR tree optimization methods were considered and both tree topology and branch length optimization were chosen. The JTT model of amino acid substitution was chosen using ProtTest 3.0 (https://bitbucket.org/diegodl/prottest3/downloads).</p>
<p><strong>File information 2:</strong></p>
<p>The alignment file '120922_align_Tni.fasta' includes OTR and VPR sequences identified in the spotted green pufferfish (<em>Tetraodon nigroviridis</em>, Tni) genome. The alignment file '120922_align_Psi_Cpi.fasta' includes OTR and VPR sequences identified in the Chinese softshell turtle (<em>Pelodiscus sinensis</em>, Psi) and painted turtle (<em>Chrysemys picta bellii</em>, Cpi) genomes. These alignments are based on the alignment used for the study described in <em>Ocampo Daza D. et al. (2012) Gen. Comp. Endocrinol 175(1)</em> and were made using the ClustalW algorithm in ClustalX 2.0 (http://www.clustal.org/clustal2/) with standard settings (Gonnet weight matrix, gap opening penalty 10.0 and gap extension penalty 0.20).</p>
<p>For the spotted green pufferfish, only the automatic Ensembl predictions were used to verify all family members. For the two turtles, the identified sequences were curated manually in order to rectify erroneous automatic exon predictions and to predict exons or whole gene predictions that had not been identified. Genome assembly information, database identifiers, location data and annotation notes for these sequences are also included in the Excel workbook 'Master_OTR_VPR_sequence_tables.xlsx'. The un-aligned sequence predictions are included in the FASTA file 'Master_OTR_VPR_sequences.fasta'.</p>
<p>These sequences were tested in NJ trees made using standard settings in ClustalX 2.0 (http://www.clustal.org/clustal2/), supported by a non-parametric bootstrap analysis with 1000 replicates. The file '120922_NJ_tree_Tni.phb' includes spotted green pufferfish and the file '121022_NJ_tree_Psi_Cpi.phb' includes the two turtle species. Both tree files are in Phylip/Newick format. Corresponding figures for each phylogenetic tree are also included as PDF-files, with the spotted green pufferfish and turtle sequences marked in color.</p>
Phylogenetic analyses of the vertebrate voltage-gated calcium channel L-type alpha 1 subunit gene family
<p>Sequence-based phylogenetic analyses of vertebrate voltage-gated calcium channel alpha 1 subunits (CACNA1) of L-type - CACNA1S, CACNA1C, CACNA1D and CACNA1F - using amino acid sequences predicted from the Ensembl (http://www.ensembl.org) genome browser. Species and genome assembly information, database identifiers, location data and annotation notes for all identified sequences are included in the Excel workbook 'Supplementary Table CACNA1L.xlsx'.</p>
<p><strong>File information:</strong></p>
<p>Species included in these analyses, with abbreviations: human (<em>Homo sapiens</em>, Hsa), mouse (<em>Mus musculus</em>, Mmu), grey short-tailed opossum (<em>Monodelphis domestica</em>, Mdo), chicken (<em>Gallus gallus</em>, Gga), Carolina anole lizard (<em>Anolis carolinensis</em>, Aca), zebrafish (<em>Danio rerio</em>, Dre), three-spined stickleback (<em>Gasterosteus aculeatus</em>, Gac), medaka (<em>Oryzias latipes</em>, Ola), green spotted pufferfish (<em>Tetraodon nigroviridis</em>, Tni), transparent sea squirt (<em>Ciona intestinalis</em>, Cin) and fruit fly (<em>Drosophila melanogaster</em>, Dme).</p>
<p>Alignment file included in FASTA-format: 'align_CACNA1L.fasta'. This file format can be opened by most sequence analysis applications as well as text editors. The alignment was created using ClustalX 2.0.12 (http://www.clustal.org/clustal2/) with standard settings (Gonnet weight matrix, gap opening penalty 10.0 and gap extension penalty 0.20). For short, incomplete or diverging gene predictions in Ensembl, the nucleotide sequence including the gene prediction (with introns) as well as the flanking sequence was collected and the Genscan gene prediction server (http://genes.mit.edu/GENSCAN.html) was used to identify exons that had not been predicted. Sequences that were still divergent with regard to exon-intron boundaries were curated manually by following consensus for splice donor and acceptor sites as well as sequence homology to other family members. Remaining highly divergent, non-alignable, regions were removed from the final alignment. The alignment was edited using the BioEdit Sequence Alignment Editor (http://www.mbio.ncsu.edu/bioedit/bioedit.html).</p>
<p>Phylogenetic tree files are included in Phylip/Newick format with the extension '.phb'. This file format can be opened by freely available phylogenetic tree viewers such as FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and TreeView (http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/). All trees were made using the alignment described above. Corresponding figures for each phylogenetic tree are also included as PDF-files.</p>
<p>The neighbor joining (NJ) tree, 'NJ_tree_CACNA1L.phb', was made using standard settings in ClustalX 2.0 (http://www.clustal.org/clustal2/), supported by a non-parametric bootstrap analysis with 1000 replicates. The Phylogenetic Maximum Likelihood (PhyML) tree, 'PhyML_tree_CACNA1L.phb', was made using the PhyML3.0 algorithm (http://www.atgc-montpellier.fr/phyml/) with the following settings: amino acid frequencies (equilibrium frequencies), proportion of invariable sites (with optimised p-invar) and gamma shape parameters were estimated from the alignments, the number of substitution rate categories was set to 8, BIONJ was chosen to create the starting tree, both NNI and SPR tree optimization methods were considered and both tree topology and branch length optimization were chosen. The JTT model of amino acid substitution was chosen using ProtTest 3.0 (https://bitbucket.org/diegodl/prottest3/downloads). The tree is supported by a non-parametric bootstrap analysis with 100 replicates.</p>
<p>Both trees are rooted with the fruit fly <em>Ca-α1D</em> sequence.</p>
The oxytocin/vasopressin receptor family has at least five members in the gnathostome lineage, including two distinct V2 subtypes
<p><strong>Research article: The oxytocin/vasopressin receptor family has at least five members in the gnathostome lineage, including two distinct V2 subtypes</strong></p>
<p>Daniel Ocampo Daza*, Michalina Lewicka¹, Dan Larhammar<br>Department of Neuroscience, Science for Life Laboratory, Uppsala Universitet, Box 593, SE-751 24 Uppsala, Sweden</p>
<p>* Corresponding author. E-mail address: [email protected]<br>¹ Current address: Department of Neuroscience, Karolinska Institutet, SE-171 77 Stockholm, Sweden</p>
<p><em>General and Comparative Endocrinology 175(1): 135-143</em><br><em>doi:10.1016/j.ygcen.2011.10.011</em></p>
<p>Accepted October 20, 2011<br>E-pub October 28, 2011<br>Published January 1, 2012</p>
<p>This PDF and Supplementary material correspond to the article as it appeared upon acceptance.</p>
<p>Cite original work as <em>D. Ocampo Daza, M. Lewicka and D. Larhammar. The oxytocin/vasopressin receptor family has at least five members in the gnathostome lineage, including two distinct V2 subtypes. General and Comparative Endocrinology, 175 (1) (2012) 135-143.</em></p>
Phylogenetic Maximum Likelihood tree of the GRIN2 gene family
<p>Published in: Ocampo Daza D, Sundström G, Bergqvist CA, Larhammar D. The evolution of vertebrate somatostatin receptors and their gene regions involves extensive chromosomal rearrangements. BMC Evolutionary Biology 2012, 12:231 doi:10.1186/1471-2148-12-231. Please refer to this article if using this figure.</p>
<p><strong>Figure 3 Phylogenetic Maximum Likelihood tree of the GRIN2 gene family.</strong> The ionotropic glutamate receptor 2 (GRIN2) gene family is a neighboring family of the <em>SSTR2</em>, -<em>3</em> and -<em>5</em> chromosomal regions. Phylogenetic methods, monophyletic clusters and leaf names as in Figure 2.</p>
Phylogenetic maximum likelihood analyses of the voltage-gated sodium channel α subunits
<p>Phylogenetic maximum likelihood analyses of the voltage-gated sodium channel α subunit (SCNα) gene family based on amino acid sequence alignments. The sequences and alignments described in <em>Widmark et al. (2011) Molecular Biology and Evolution 28(1):859-71</em> (1) were used to re-analyze the phylogenetic relationships of vertebrate SCNα subtypes with more powerful methods.</p>
<p><strong>File information:</strong></p>
<p>Two datasets were made: the full subset of identified SCNα sequences in alignment file 1 (<em>9.6_SCNexons.aln</em>) used for the phylogenetic analyses in Figs. 1 and 3; and the identified SCN1A, SCN4A, SCN5A and SCN8A sequences in alignment file 2 (<em>8.14_SCNexons(1,4,5,8).aln</em>) used for the phylogenetic analyses in Figs. 2 and 4. The latter dataset includes only sequences representing each of the four chromosomes harboring SCNα genes in tetrapod genomes (1). Alignment files are provided in the CLUSTAL format. For sequence and alignment curation details see <em>Widmark et al. (2011)</em> (1).</p>
<p>The trees in Figs. 1 and 2 are supported by bootstrap values (see below). Both the final bootstrapped trees and all the bootstrap replicates are provided as txt-files in NEWICK format. The trees in Figs. 3 and 4 are supported by aLRT values (see below). The aLRT-supported trees are also provided in NEWICK format.</p>
<p>In all files the first three letters of the sequence names are abbreviations of the species names, followed by the chromosome assignment of the genes and the abbreviated α subunit name (full names for the human sequences).</p>
<p><strong>Phylogenetic methods:</strong></p>
<p>The phylogenetic analyses were done using the PhyML 3.0 algorithm in PhyML-aBayes (3.0.1 beta) or through the web-based form of the PhyML 3.0 algorithm, both available from http://www.atgc-montpellier.fr/phyml. </p>
<p>Trees supported by SH-like approximate likelihood ratio tests (aLRT) were made with standard settings: LG model of amino acid substitution; 4 substitution rate categories; equilibrium frequencies from the model; fixed proportion of invariable sites (0.0); gamma shape parameters estimated from the alignment; starting tree estimated using BIONJ; NNI method selected for tree topology improvement with both topology and branch length optimization.</p>
<p>Trees supported by non-parametric bootstrap analyses with 100 replicates were done with the following settings: the JTT model of amino acid substitution was chosen based on analysis of the amino acid alignments in ProtTest 1.4 (<em>http://darwin.uvigo.es/software/prottest.html</em>); 8 substitution rate categories; equilibrium frequencies, proportion of invariable sites and gamma shape parameters estimated from the alignment; starting tree estimated using BIONJ; NNI and SPR methods selected for tree topology improvement with both topology and branch length optimization. </p>
<p>Statistical support values are shown at the nodes. The trees were rooted with the identified <em>Drosophila melanogaster</em> sequence.</p>
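<p>For readers who prefer the command line, the two sets of settings described above can be approximated with the PhyML 3.0 executable. The analyses in this dataset were run through PhyML-aBayes or the web form, so the sketch below is only an assumed mapping of those settings onto command-line flags as documented in the PhyML manual; the helper function, the executable name <code>phyml</code> and the input file name <code>alignment.phy</code> (a PHYLIP-format copy of the alignment) are hypothetical. The flags should be checked against the installed PhyML version before use.</p>

```python
def phyml_args(alignment, model, categories, freqs, pinv, search, support):
    """Assemble a PhyML command line as a list of arguments (not executed).

    Assumed flag meanings, to be verified against the PhyML manual:
    -d aa (amino acids), -m (model), -f e/m (empirical/model frequencies),
    -c (rate categories), -v (invariable sites, 'e' = estimate),
    -a e (estimate gamma shape), -s (NNI, SPR or BEST),
    -o tlr (optimise topology, branch lengths, rate parameters),
    -b (N > 0: bootstrap replicates; -4: SH-like support).
    """
    return [
        "phyml", "-i", alignment, "-d", "aa", "-m", model,
        "-f", freqs, "-c", str(categories), "-v", pinv, "-a", "e",
        "-s", search, "-o", "tlr", "-b", str(support),
    ]


# aLRT-supported trees: LG, 4 categories, model frequencies, p-invar fixed at 0.0, NNI.
alrt_cmd = phyml_args("alignment.phy", "LG", 4, "m", "0.0", "NNI", -4)
# Bootstrap-supported trees: JTT, 8 categories, estimated parameters, NNI and SPR, 100 replicates.
boot_cmd = phyml_args("alignment.phy", "JTT", 8, "e", "e", "BEST", 100)

if __name__ == "__main__":
    print(" ".join(alrt_cmd))
    print(" ".join(boot_cmd))
```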
<p><strong>Species abbreviations:</strong></p>
<p>Species abbreviations are applied as follows: Homo sapiens (Hsa, human), Mus musculus (Mmu, mouse), Monodelphis domestica (Mdo, opossum), Gallus gallus (Gga, chicken), Danio rerio (Dre, zebrafish), Oryzias latipes (Ola, medaka), Gasterosteus aculeatus (Gac, stickleback), Tetraodon nigroviridis (Tni, green spotted puffer), Ciona savignyi (Csa, tunicate), Branchiostoma floridae (Bfl, lancelet), Drosophila melanogaster (Dme, fruit fly).</p>
<p><strong>References:</strong></p>
<p>1. Widmark J, Sundström G, Ocampo Daza D, Larhammar D (2011) Differential evolution of voltage-gated sodium channels in tetrapods and teleost fishes. Molecular biology and evolution 28:859-71. DOI: 10.1093/molbev/msq257.</p>
<p>2. Guindon S et al. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic biology 59:307-21. DOI: 10.1093/sysbio/syq010.</p>
<p>3. Abascal F, Zardoya R, Posada D (2005) ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104-5. DOI: 10.1093/bioinformatics/bti263.</p>
<p> </p>
<p><em>Description updated 2012-12-06.</em></p>
Phylogenetic analyses of 47 syntenic gene families in SSTR gene-bearing chromosome regions
<p>Sequence-based phylogenetic analyses of 47 gene families identified in an analysis of conserved synteny around somatostatin receptor gene-bearing chromosome regions. For each gene family amino acid sequences were predicted from the Ensembl genome browser (http://www.ensembl.org) and used to create sequence alignments and phylogenetic trees. Gene families were defined based on Ensembl protein family predictions. Database identifiers, location data, genome assembly information and annotation notes for all identified protein families and sequences are included in 'Supplemental Table 2.xlsx' and 'Supplemental Table 3.xlsx' (Excel spreadsheets). </p>
<p>File information: </p>
<p>Gene families are identified by unique abbreviations based on approved HUGO Gene Nomenclature Committee (HGNC) gene symbols, or known aliases from the NCBI Entrez Gene database. For each gene family an alignment file '...align.fasta', a neighbor joining tree '...NJ_rooted.phb' and a phylogenetic maximum likelihood tree '...PhyML_rooted.phb' are included. </p>
<p>Alignments are included in FASTA format with the extension '.fasta'. This file format can be opened by most sequence analysis applications as well as text editors. Alignments were created using the ClustalWS sequence alignment program with standard settings (Gonnet weight matrix, gap opening penalty 10.0 and gap extension penalty 0.20) through the JABAWS 2 tool in Jalview 2.7 (http://www.jalview.org/).</p>
<p>Phylogenetic tree files are included in Phylip/Newick format with the extension '.phb'. This file format can be opened by freely available phylogenetic tree viewers such as FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and TreeView (http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/). The phylogenetic analyses were carried out based on the included alignments using bootstrap-supported neighbor joining (NJ) as well as phylogenetic maximum likelihood (PhyML) methods. Phylogenetic trees are rooted with identified <em>Drosophila melanogaster </em>(fruit fly) sequences, or with identified <em>Ciona intestinalis</em> or <em>Ciona savignyi</em> (tunicates), <em>Branchiostoma floridae </em>(Florida lancelet, amphioxus), or <em>Caenorhabditis elegans</em> (nematode) sequences if no fruit fly sequence could be found. </p>
The NJ trees are supported by non-parametric bootstrap analyses with 1000 replicates, applied through ClustalX 2.0 (http://www.clustal.org/clustal2/) with standard settings. The PhyML trees are supported by non-parametric bootstrap analyses with 100 replicates made using the PhyML 3.0 algorithm (http://www.atgc-montpellier.fr/phyml/) with the following settings: amino acid frequencies (equilibrium frequencies), proportion of invariable sites (with optimised p-invar) and gamma-shape parameters were estimated from the datasets; the number of substitution rate categories was set to 8; BIONJ was chosen to create the starting tree and the nearest neighbor interchange (NNI) tree improvement method was used to estimate the best topology; both tree topology and branch length optimization were chosen. The LG model of amino acid substitution, which is standard for PhyML 3.0, was chosen.
Species abbreviations are applied as follows:
<em>Homo sapiens</em> (Hsa, human), <em>Mus musculus</em> (Mmu, mouse), <em>Canis familiaris</em> (Cfa, dog), <em>Monodelphis domestica</em> (Mdo, grey short-tailed opossum), <em>Macropus eugenii</em> (Meu, tammar wallaby), <em>Ornithorhynchus anatinus</em> (Oan, platypus), <em>Gallus gallus</em> (Gga, chicken), <em>Taeniopygia guttata</em> (Tgu, zebra finch), <em>Meleagris gallopavo</em> (Mga, turkey), <em>Anolis carolinensis</em> (Aca, Carolina anole lizard), <em>Silurana (Xenopus) tropicalis</em> (Xtr, Western clawed frog), <em>Danio rerio</em> (Dre, zebrafish), <em>Oryzias latipes</em> (Ola, medaka), <em>Gasterosteus aculeatus</em> (Gac, three-spined stickleback), <em>Tetraodon nigroviridis</em> (Tni, green spotted pufferfish), <em>Takifugu rubripes</em> (Tru, fugu), <em>Ciona intestinalis</em> (Cin, tunicate), <em>Ciona savignyi</em> (Csa, tunicate), <em>Branchiostoma floridae</em> (Bfl, amphioxus), <em>Caenorhabditis elegans</em> (Cel, nematode) and <em>Drosophila melanogaster</em> (Dme, fruit fly).
The following gene families are included in this file set:
ABHD12: Abhydrolase domain containing 12
CFL: Cofilin and destrin (actin depolymerizing factor)
FLRT: Fibronectin leucine rich transmembrane protein
FOXA: Forkhead box A
ISM: Isthmin homolog
JAG: Jagged
NIN: Ninein (GSK3B interacting protein)
NKX2: NK2 homeobox 1 and 4
PAX: Paired box 1 and 9
PYG: Glycogen phosphorylase; brain, liver and muscle variants
RALGAPA: Ral GTPase activating protein, alpha subunit
RIN: Ras and Rab interactor
SEC23: Sec23 homologs A and B
SLC24A: Solute carrier family 24 members 3 and 4
SNX: Sorting nexin 5, 6 and 32
SPTLC: Serine palmitoyltransferase, long chain base subunit 2 and 3
VSX: Visual system homeobox
ADAP: ArfGAP with dual PH domains
ATP2A: ATPase, Ca++ transporting, cardiac muscle, fast twitch
C1QTNF: C1q and tumor necrosis factor related protein
CABP: Calcium binding protein 1, 3, 4 and 5
CACNA1: Calcium channel, voltage dependent, T type alpha subunit
CREBBP: CREB binding protein
CYTH: Cytohesin
FAM20: Family with sequence similarity 20
FNG: Fringe homolog
FSCN: Fascin homolog 1 and 2, actin-bundling protein
GLPR: Glucagon, glucagon-like and gastric inhibitory polypeptide receptors
GGA: Golgi-associated, gamma adaptin ear containing, ARF-binding protein
GRIN2: Glutamate receptor, ionotropic, N-methyl D-aspartate 2
KCNJ: Potassium inwardly-rectifying channel, subfamily J member 2, 4, 12 and 14
KCTD: Potassium channel tetramerisation domain containing 2, 5 and 17
METRN: Meteorin, glial cell differentiation regulator
NDE: nudE nuclear distribution gene E homolog
RAB11FIP: RAB11 family interacting protein 3 and 4 (class II)
RADIL: Ras association and DIL domains/Ras interacting protein
RHBDF: Rhomboid 5 homolog
RHOT: Ras homolog gene family, member T1 and T2
RPH3A: Rabphilin 3A homolog/double C2-like domains, alpha
SDK: Sidekick cell adhesion molecule
SOX: Sex-determining region Y-box 8, 9 and 10
TEX2: Testis expressed 2
TNRC6: Trinucleotide repeat containing 6
TOM1: Target of myb1
TTYH: Tweety homolog
USP: Ubiquitin specific peptidase 31 and 43
WFIKKN: WAP, follistatin/kazal, immunoglobulin, kunitz and netrin domain containing