119 research outputs found

    Ensembl Genomes 2013: scaling up access to genome-wide data

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future

    A Gene in the Process of Endosymbiotic Transfer

    BACKGROUND: The endosymbiotic birth of organelles is accompanied by massive transfer of endosymbiont genes to the eukaryotic host nucleus. In the centric diatom Thalassiosira pseudonana the Psb28 protein is encoded in the plastid genome while a second version is nuclear-encoded and possesses a bipartite N-terminal presequence necessary to target the protein into the diatom complex plastid. Thus it can represent a gene captured during endosymbiotic gene transfer. METHODOLOGY/PRINCIPAL FINDINGS: To specify the origin of nuclear- and plastid-encoded Psb28 in T. pseudonana we have performed extensive phylogenetic analyses of both mentioned genes. We have also experimentally tested the intracellular location of the nuclear-encoded Psb28 protein (nuPsb28) through transformation of the diatom Phaeodactylum tricornutum with the gene in question fused to EYFP. CONCLUSIONS/SIGNIFICANCE: We show here that both versions of the psb28 gene in T. pseudonana are transcribed. We also provide experimental evidence for successful targeting of the nuPsb28 fused with EYFP to the diatom complex plastid. Extensive phylogenetic analyses demonstrate that nucleotide composition of the analyzed genes deeply influences the tree topology and that appropriate methods designed to deal with a compositional bias of the sequences and the long branch attraction artefact (LBA) need to be used to overcome this obstacle. We propose that nuclear psb28 in T. pseudonana is a duplicate of a plastid localized version, and that it has been transferred from its endosymbiont

    Evolutionary Origins and Functions of the Carotenoid Biosynthetic Pathway in Marine Diatoms

    Carotenoids are produced by all photosynthetic organisms, where they play essential roles in light harvesting and photoprotection. The carotenoid biosynthetic pathway of diatoms is largely unstudied, but is of particular interest because these organisms have a very different evolutionary history with respect to the Plantae and are thought to be derived from an ancient secondary endosymbiosis between heterotrophic and autotrophic eukaryotes. Furthermore, diatoms have an additional xanthophyll-based cycle for dissipating excess light energy with respect to green algae and higher plants. To explore the origins and functions of the carotenoid pathway in diatoms we searched for genes encoding pathway components in the recently completed genome sequences of two marine diatoms. Consistent with the supplemental xanthophyll cycle in diatoms, we found more copies of the genes encoding violaxanthin de-epoxidase (VDE) and zeaxanthin epoxidase (ZEP) enzymes compared with other photosynthetic eukaryotes. However, the similarity of these enzymes with those of higher plants indicates that they had very probably diversified before the secondary endosymbiosis had occurred, implying that VDE and ZEP represent early eukaryotic innovations in the Plantae. Consequently, the diatom chromist lineage likely obtained all paralogues of ZEP and VDE genes during the process of secondary endosymbiosis by gene transfer from the nucleus of the algal endosymbiont to the host nucleus. Furthermore, the presence of a ZEP gene in Tetrahymena thermophila provides the first evidence for a secondary plastid gene encoded in a heterotrophic ciliate, providing support for the chromalveolate hypothesis. Protein domain structures and expression analyses in the pennate diatom Phaeodactylum tricornutum indicate diverse roles for the different ZEP and VDE isoforms and demonstrate that they are differentially regulated by light. 
    These studies therefore reveal the ancient origins of several components of the carotenoid biosynthesis pathway in photosynthetic eukaryotes and provide information about how they have diversified and acquired new functions in the diatoms.

    Phylogenomic analysis of the Chlamydomonas genome unmasks proteins potentially involved in photosynthetic function and regulation

    Chlamydomonas reinhardtii, a unicellular green alga, has been exploited as a reference organism for identifying proteins and activities associated with the photosynthetic apparatus and the functioning of chloroplasts. Recently, the full genome sequence of Chlamydomonas was generated and a set of gene models, representing all genes on the genome, was developed. Using these gene models, and gene models developed for the genomes of other organisms, a phylogenomic, comparative analysis was performed to identify proteins encoded on the Chlamydomonas genome which were likely involved in chloroplast functions (or specifically associated with the green algal lineage); this set of proteins has been designated the GreenCut. Further analyses of those GreenCut proteins with uncharacterized functions and the generation of mutant strains aberrant for these proteins are beginning to unmask new layers of functionality/regulation that are integrated into the workings of the photosynthetic apparatus

    Transcriptomic response of the red tide dinoflagellate, Karenia brevis, to nitrogen and phosphorus depletion and addition

    BACKGROUND: The role of coastal nutrient sources in the persistence of Karenia brevis red tides in coastal waters of Florida is a contentious issue that warrants investigation into the regulation of nutrient responses in this dinoflagellate. In other phytoplankton studied, nutrient status is reflected by the expression levels of N- and P-responsive gene transcripts. In dinoflagellates, however, many processes are regulated post-transcriptionally. All nuclear encoded gene transcripts studied to date possess a 5' trans-spliced leader (SL) sequence suggestive, based on the trypanosome model, of post-transcriptional regulation. The current study therefore sought to determine if the transcriptome of K. brevis is responsive to nitrogen and phosphorus and is informative of nutrient status. RESULTS: Microarray analysis of N-depleted K. brevis cultures revealed an increase in the expression of transcripts involved in N-assimilation (nitrate and ammonium transporters, glutamine synthetases) relative to nutrient replete cells. In contrast, a transcriptional signal of P-starvation was not apparent despite evidence of P-starvation based on their rapid growth response to P-addition. To study transcriptome responses to nutrient addition, the limiting nutrient was added to depleted cells and changes in global gene expression were assessed over the first 48 hours following nutrient addition. Both N- and P-addition resulted in significant changes in approximately 4% of genes on the microarray, using a significance cutoff of 1.7-fold and p ≤ 10⁻⁴. By far, the earliest responding genes were dominated in both nutrient treatments by pentatricopeptide repeat (PPR) proteins, which increased in expression up to 3-fold by 1 h following nutrient addition. PPR proteins are nuclear encoded proteins involved in chloroplast and mitochondria RNA processing. Correspondingly, other functions enriched in response to both nutrients were photosystem and ribosomal genes. CONCLUSIONS: Microarray analysis provided transcriptomic evidence for N- but not P-limitation in K. brevis. Transcriptomic responses to the addition of either N or P suggest a concerted program leading to the reactivation of chloroplast functions. Even the earliest responding PPR protein transcripts possess a 5' SL sequence that suggests post-transcriptional control. Given the current state of knowledge of dinoflagellate gene regulation, it is currently unclear how these rapid changes in such transcript levels are achieved.

    A bibliography of parasites and diseases of marine and freshwater fishes of India

    With the increasing demand for fish as human food, aquaculture in both freshwater and salt water is developing rapidly around the world. In the developing countries, fishes are being raised as food. In many countries fish farming is a very important economic activity. The most recent branch, mariculture, has shown advances in raising fishes in brackish, estuarine and bay waters, in which marine, anadromous and catadromous fishes have successfully been grown and maintained.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    Whole-genome sequencing of chronic lymphocytic leukemia identifies subgroups with distinct biological and clinical features

    The value of genome-wide over targeted driver analyses for predicting clinical outcomes of cancer patients is debated. Here, we report the whole-genome sequencing of 485 chronic lymphocytic leukemia patients enrolled in clinical trials as part of the United Kingdom’s 100,000 Genomes Project. We identify an extended catalog of recurrent coding and noncoding genetic mutations that represents a source for future studies and provide the most complete high-resolution map of structural variants, copy number changes and global genome features including telomere length, mutational signatures and genomic complexity. We demonstrate the relationship of these features with clinical outcome and show that integration of 186 distinct recurrent genomic alterations defines five genomic subgroups that associate with response to therapy, refining conventional outcome prediction. While requiring independent validation, our findings highlight the potential of whole-genome sequencing to inform future risk stratification in chronic lymphocytic leukemia
