8 research outputs found

    Multigene Phylogeny of Choanozoa and the Origin of Animals

    Animals are evolutionarily related to fungi and to the predominantly unicellular protozoan phylum Choanozoa, together known as opisthokonts. To establish the sequence of events when animals evolved from unicellular ancestors, and understand those key evolutionary transitions, we need to establish which choanozoans are most closely related to animals and also the evolutionary position of each choanozoan group within the opisthokont phylogenetic tree. Here we focus on Ministeria vibrans, a minute bacteria-eating cell with slender radiating tentacles. Single-gene trees suggested that it is either the closest unicellular relative of animals or else sister to choanoflagellates, traditionally considered likely animal ancestors. Sequencing thousands of Ministeria protein genes now reveals about 14 with domains of key significance for animal cell biology, including several previously unknown from deeply diverging Choanozoa, e.g. domains involved in hedgehog, Notch and tyrosine kinase signaling or cell adhesion (cadherin). Phylogenetic trees using 78 proteins show that Ministeria is not sister to animals or choanoflagellates (themselves sisters to animals), but to Capsaspora, another protozoan with thread-like (filose) tentacles. The Ministeria/Capsaspora clade (new class Filasterea) is sister to animals and choanoflagellates, these three groups forming a novel clade (filozoa) whose ancestor presumably evolved filose tentacles well before they aggregated as a periciliary collar in the choanoflagellate/sponge common ancestor. Our trees show ichthyosporean choanozoans as sisters to filozoa; a fusion between ubiquitin and ribosomal small subunit S30 protein genes unifies all holozoa (filozoa plus Ichthyosporea), being absent in earlier branching eukaryotes. Thus, several successive evolutionary innovations occurred among their unicellular closest relatives prior to the origin of the multicellular body-plan of animals

    Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource

    Background To identify as many different transcripts/genes in the Atlantic salmon genome as possible, it is crucial to acquire good cDNA libraries from different tissues and developmental stages, their relevant sequences (ESTs or full length sequences) and attempt to predict function. Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage. This data is important in constructing a microarray chip, identifying SNPs in coding regions, and for future identification of genes in the whole genome sequence. An important factor that determines the usefulness of generated data for biologists is efficient data access. Public searchable databases play a crucial role in providing such service. Description Twenty-three Atlantic salmon cDNA libraries were constructed from 15 tissues, yielding nearly 155,000 clones. From these libraries 58,109 ESTs were generated, of which 57,212 were used for contig assembly. Following deletion of mitochondrial sequences 55,118 EST sequences were submitted to GenBank. In all, 20,019 unique sequences, consisting of 6,424 contigs and 13,595 singlets, were generated. The Norwegian Salmon Genome Project Database has been constructed and annotation performed by the annotation transfer approach. Annotation was successful for 50.3% (10,075) of the sequences and 6,113 sequences (30.5%) were annotated with Gene Ontology terms for molecular function, biological process and cellular component. Conclusion We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts. These sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage. The data has been grouped into datasets according to its source and type of annotation. Various data query options are offered, including searches on function assignments and Gene Ontology terms. Data delivery options include summaries for the datasets and their annotations, detailed self-explanatory annotations, and access to the original BLAST results and Gene Ontology annotation trees. Annotation searches indicated a relatively high number of immune-related genes in the dataset.

    Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource

    Abstract Background To identify as many different transcripts/genes in the Atlantic salmon genome as possible, it is crucial to acquire good cDNA libraries from different tissues and developmental stages, their relevant sequences (ESTs or full length sequences) and attempt to predict function. Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage. This data is important in constructing a microarray chip, identifying SNPs in coding regions, and for future identification of genes in the whole genome sequence. An important factor that determines the usefulness of generated data for biologists is efficient data access. Public searchable databases play a crucial role in providing such service. Description Twenty-three Atlantic salmon cDNA libraries were constructed from 15 tissues, yielding nearly 155,000 clones. From these libraries 58,109 ESTs were generated, of which 57,212 were used for contig assembly. Following deletion of mitochondrial sequences 55,118 EST sequences were submitted to GenBank. In all, 20,019 unique sequences, consisting of 6,424 contigs and 13,595 singlets, were generated. The Norwegian Salmon Genome Project Database has been constructed and annotation performed by the annotation transfer approach. Annotation was successful for 50.3% (10,075) of the sequences and 6,113 sequences (30.5%) were annotated with Gene Ontology terms for molecular function, biological process and cellular component. Conclusion We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts. These sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage. The data has been grouped into datasets according to its source and type of annotation. Various data query options are offered, including searches on function assignments and Gene Ontology terms. Data delivery options include summaries for the datasets and their annotations, detailed self-explanatory annotations, and access to the original BLAST results and Gene Ontology annotation trees. Annotation searches indicated a relatively high number of immune-related genes in the dataset.

    AIR: A batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses

    Background. Large multigene sequence alignments have over recent years been increasingly employed for phylogenomic reconstruction of the eukaryote tree of life. Such supermatrices of sequence data are preferred over single gene alignments as they contain vastly more information about ancient sequence characteristics, and are thus more suitable for resolving deeply diverging relationships. However, as alignments are expanded, increasing numbers of sites with misleading phylogenetic information are also added. Therefore, a major goal in phylogenomic analyses is to maximize the ratio of information to noise; this can be achieved by the reduction of fast evolving sites. Results. Here we present a batch-oriented web-based program package, named AIR, that allows 1) transformation of several single genes to one multigene alignment, 2) identification of evolutionary rates in multigene alignments and 3) removal of fast evolving sites. These three processes can be done with the programs AIR-Appender, AIR-Identifier, and AIR-Remover (AIR), which can be used independently or in a semi-automated pipeline. AIR produces user-friendly output files with filtered and non-filtered alignments where residues are colored according to their evolutionary rates. Other bioinformatics applications linked to the AIR package are available at the Bioportal http://www.bioportal.uio.no, University of Oslo; together these greatly improve the flexibility, efficiency and quality of phylogenomic analyses. Conclusion. The AIR program package allows for efficient creation of multigene alignments and better assessment of evolutionary rates in sequence alignments. Removing fast evolving sites with the AIR programs has been employed in several recent phylogenomic analyses resulting in improved phylogenetic resolution and increased statistical support for branching patterns among the early diverging eukaryotes.
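
The append and remove steps named in the abstract above can be illustrated with a minimal Python sketch. This is not the AIR implementation: the function names `concatenate` and `remove_fast_sites`, the gap-padding of missing taxa, and the idea of passing in precomputed per-site rate categories with a `max_rate` cutoff are all assumptions made for illustration. The rate identification step is not sketched here; the rates are simply taken as given.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # taxon name -> aligned sequence


def concatenate(genes: List[Alignment]) -> Alignment:
    """Append single-gene alignments into one supermatrix.

    Assumes every sequence within a gene has the same length.
    Taxa missing from a gene are padded with gap characters.
    """
    taxa = sorted({taxon for gene in genes for taxon in gene})
    matrix = {taxon: "" for taxon in taxa}
    for gene in genes:
        length = len(next(iter(gene.values())))
        for taxon in taxa:
            matrix[taxon] += gene.get(taxon, "-" * length)
    return matrix


def remove_fast_sites(matrix: Alignment, rates: List[int], max_rate: int) -> Alignment:
    """Drop every column whose (hypothetical) rate category exceeds max_rate."""
    keep = [i for i, rate in enumerate(rates) if rate <= max_rate]
    return {taxon: "".join(seq[i] for i in keep) for taxon, seq in matrix.items()}


# Toy input: two tiny "genes"; taxon B lacks gene 2 and taxon C lacks gene 1.
genes = [{"A": "MK", "B": "ML"}, {"A": "GT", "C": "GS"}]
supermatrix = concatenate(genes)
# One illustrative rate category per supermatrix column (higher = faster).
filtered = remove_fast_sites(supermatrix, rates=[1, 8, 2, 7], max_rate=4)
```

In this toy case the second and fourth columns carry the highest rate categories and are the ones removed; where to set the cutoff is an analysis decision that the abstract does not specify.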

    Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource-3

    Copyright information: Taken from "Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource", http://www.biomedcentral.com/1471-2164/8/209. BMC Genomics 2007;8:209. Published online 2 Jul 2007. PMCID: PMC1913521.
    ts, [GO: Molecular Function], [GO: Biological Process] and [GO: Cellular Component]. The annotation was performed for contig and singlet sequences. Complete SGP GO-GOA annotation is available at SGP data resource > Data and results > Annotations > SGP full annotation > GO term tables

    Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource-2

    Copyright information: Taken from "Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource", http://www.biomedcentral.com/1471-2164/8/209. BMC Genomics 2007;8:209. Published online 2 Jul 2007. PMCID: PMC1913521.
    ptions, libraries and annotations. When a match occurs in any of these data, all three data categories are shown in results. The "Matches in datasets" display provides access to a subset of contigs, singlets and annotations selected in the search. The "Annotations best hits" display is the same for the Clustered data datasets menu and database search results. Clustered data datasets menu provides a current snapshot of the SGP database

    Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource-0

    Copyright information: Taken from "Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource", http://www.biomedcentral.com/1471-2164/8/209. BMC Genomics 2007;8:209. Published online 2 Jul 2007. PMCID: PMC1913521.
    best annotation hits" is available for singlets. Other options are "Contigs length and number of reads" and "Distribution of average length and number of reads in contigs". The Clustered data summary provides a current snapshot of the SGP database

    The Dalton quantum chemistry program system

    Dalton is a powerful general-purpose program system for the study of molecular electronic structure at the Hartree–Fock, Kohn–Sham, multiconfigurational self-consistent-field, Møller–Plesset, configuration-interaction, and coupled-cluster levels of theory. Apart from the total energy, a wide variety of molecular properties may be calculated using these electronic-structure models. Molecular gradients and Hessians are available for geometry optimizations, molecular dynamics, and vibrational studies, whereas magnetic resonance and optical activity can be studied in a gauge-origin-invariant manner. Frequency-dependent molecular properties can be calculated using linear, quadratic, and cubic response theory. A large number of singlet and triplet perturbation operators are available for the study of one-, two-, and three-photon processes. Environmental effects may be included using various dielectric-medium and quantum-mechanics/molecular-mechanics models. Large molecules may be studied using linear-scaling and massively parallel algorithms. Dalton is distributed at no cost from http://www.daltonprogram.org for a number of UNIX platforms