Fe limitation decreases transcriptional regulation over the diel cycle in the model diatom Thalassiosira pseudonana.
Iron (Fe) is an important growth factor for diatoms and its availability is further restricted by changes in the carbonate chemistry of seawater. We investigated the physiological attributes and transcriptional profiles of the diatom Thalassiosira pseudonana grown on a day:night cycle under different CO2/pH and iron concentrations that, in combination, generated available iron (Fe') concentrations of 1160, 233, 58 and 12 pM. We found the light-dark conditions to be the main driver of transcriptional patterns, followed by Fe' concentration and CO2 availability, respectively. At the highest Fe' (1160 pM), 55% of the transcribed genes were differentially expressed between day and night, whereas at the lowest Fe' (12 pM), only 28% of the transcribed genes displayed comparable patterns. While Fe limitation disrupts the diel expression patterns for genes in most central metabolism pathways, the diel expression of light-signaling molecules and glycolytic genes was relatively robust in response to reduced Fe'. Moreover, we identified a non-canonical splicing of transcripts encoding triose-phosphate isomerase, a key enzyme of glycolysis, generating transcript isoforms that would encode proteins with and without an active site. Transcripts that encoded an active enzyme maintained a diel expression at low Fe', while transcripts that encoded the non-active enzyme lost the diel expression. This work illustrates the interplay between nutrient limitation and transcriptional regulation over the diel cycle. Considering that future ocean conditions will reduce the availability of Fe in many parts of the oceans, our work identifies some of the regulatory mechanisms that may shape future ecological communities.
Diel transcriptional oscillations of light-sensitive regulatory elements in open-ocean eukaryotic plankton communities
© The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Coesel, S. N., Durham, B. P., Groussman, R. D., Hu, S. K., Caron, D. A., Morales, R. L., Ribalet, F., & Armbrust, E. V. Diel transcriptional oscillations of light-sensitive regulatory elements in open-ocean eukaryotic plankton communities. Proceedings of the National Academy of Sciences of the United States of America, 118(6), (2021): e2011038118, https://doi.org/10.1073/pnas.2011038118. The 24-h cycle of light and darkness governs daily rhythms of complex behaviors across all domains of life. Intracellular photoreceptors sense specific wavelengths of light that can reset the internal circadian clock and/or elicit distinct phenotypic responses. In the surface ocean, microbial communities additionally modulate nonrhythmic changes in light quality and quantity as they are mixed to different depths. Here, we show that eukaryotic plankton in the North Pacific Subtropical Gyre transcribe genes encoding light-sensitive proteins that may serve as light-activated transcription factors, elicit light-driven electrical/chemical cascades, or initiate secondary messenger-signaling cascades. Overall, the protistan community relies on blue light-sensitive photoreceptors of the cryptochrome/photolyase family, and proteins containing the Light-Oxygen-Voltage (LOV) domain. The greatest diversification occurred within Haptophyta and photosynthetic stramenopiles where the LOV domain was combined with different DNA-binding domains and secondary signal-transduction motifs. Flagellated protists utilize green-light sensory rhodopsins and blue-light helmchromes, potentially underlying phototactic/photophobic and other behaviors toward specific wavelengths of light. Photoreceptors such as phytochromes appear to play minor roles in the North Pacific Subtropical Gyre. 
Transcript abundances of environmental light-sensitive protein-encoding genes that display diel patterns are found to peak primarily at dawn. The exceptions are the LOV-domain transcription factors, with peaks in transcript abundance at different times, and putative phototaxis photoreceptors, which are transcribed throughout the day. Together, these data illustrate the diversity of light-sensitive proteins that may allow disparate groups of protists to respond to light and potentially synchronize patterns of growth, division, and mortality within the dynamic ocean environment. This work was supported by a grant from the Simons Foundation (SCOPE Award 329108 [to E.V.A.]) and XSEDE Grant Allocation OCE160019 (to R.D.G.).
Evolutionary Origins and Functions of the Carotenoid Biosynthetic Pathway in Marine Diatoms
Carotenoids are produced by all photosynthetic organisms, where they play essential roles in light harvesting and photoprotection. The carotenoid biosynthetic pathway of diatoms is largely unstudied, but is of particular interest because these organisms have a very different evolutionary history with respect to the Plantae and are thought to be derived from an ancient secondary endosymbiosis between heterotrophic and autotrophic eukaryotes. Furthermore, diatoms have an additional xanthophyll-based cycle for dissipating excess light energy with respect to green algae and higher plants. To explore the origins and functions of the carotenoid pathway in diatoms we searched for genes encoding pathway components in the recently completed genome sequences of two marine diatoms. Consistent with the supplemental xanthophyll cycle in diatoms, we found more copies of the genes encoding violaxanthin de-epoxidase (VDE) and zeaxanthin epoxidase (ZEP) enzymes compared with other photosynthetic eukaryotes. However, the similarity of these enzymes with those of higher plants indicates that they had very probably diversified before the secondary endosymbiosis had occurred, implying that VDE and ZEP represent early eukaryotic innovations in the Plantae. Consequently, the diatom chromist lineage likely obtained all paralogues of ZEP and VDE genes during the process of secondary endosymbiosis by gene transfer from the nucleus of the algal endosymbiont to the host nucleus. Furthermore, the presence of a ZEP gene in Tetrahymena thermophila provides the first evidence for a secondary plastid gene encoded in a heterotrophic ciliate, providing support for the chromalveolate hypothesis. Protein domain structures and expression analyses in the pennate diatom Phaeodactylum tricornutum indicate diverse roles for the different ZEP and VDE isoforms and demonstrate that they are differentially regulated by light. 
These studies therefore reveal the ancient origins of several components of the carotenoid biosynthesis pathway in photosynthetic eukaryotes and provide information about how they have diversified and acquired new functions in the diatoms.
Regulation of the carotenoid biosynthetic pathway in the green microalga Dunaliella salina and the diatom Phaeodactylum tricornutum
Doctoral thesis, Faculdade de Ciências do Mar e do Ambiente, Universidade do Algarve, 2007. Carotenoids are produced by all photosynthetic organisms, where they play indispensable roles in light-harvesting and photoprotection. This thesis has focused on carotenoid biosynthesis in Dunaliella salina and Phaeodactylum tricornutum, two phylogenetically diverse algae. Both algae are able to maintain high photosynthetic rates under fluctuating light intensities. We examined the effect of several environmental stress conditions on carotenoid biosynthesis in D. salina and found that nutrient level, light intensity and salinity have a differential effect on carotenogenesis. We also found that the steady-state transcript levels of two key enzymes involved in the early steps of carotenoid biosynthesis are coordinately up-regulated in carotenoid-accumulating D. salina cells, indicating that carotenoid biosynthesis in this alga may be partly regulated at the transcriptional level. Analysis of the P. tricornutum genome, as well as that from another diatom, Thalassiosira pseudonana, revealed that the genes involved in xanthophyll biosynthesis and xanthophyll cycling in diatoms have diversified greatly with respect to green algae and higher plants. We showed that the steady-state mRNA levels of the P. tricornutum carotenoid biosynthesis-related genes are increased upon nutrient stress and blue light, and we were able to establish that light of different spectral quality has a differential effect on the mRNA levels of these genes. By using transgenic P. tricornutum cell lines containing elevated levels of a putative diatom blue light cryptochrome photoreceptor, we demonstrated that this protein is involved in the transcriptional regulation of blue light-responsive genes, ultimately resulting in an enhanced accumulation of xanthophyll pigments and a significantly altered chromatic adaptation to blue light. 
In conclusion, the work reported in this thesis will facilitate future work on both the regulatory and biotechnological aspects of the carotenoid biosynthetic pathway in unicellular algae.
MarFERReT: an open-source, version-controlled reference library of marine microbial eukaryote functional genes
<p>Metatranscriptomics generates large volumes of sequence data about transcribed genes in natural environments. Taxonomic annotation of these datasets depends on the availability of curated reference sequences. For marine microbial eukaryotes, current reference libraries are limited by gaps in sequenced organism diversity and barriers to updating libraries with new sequence data, resulting in taxonomic annotation of only about half of eukaryotic environmental transcripts. Here, we introduce version 1.0 of the Marine Functional EukaRyotic Reference Taxa (MarFERReT), an updated marine microbial eukaryotic sequence library with a version-controlled framework designed for taxonomic annotation of eukaryotic metatranscriptomes. We gathered 902 marine eukaryote genomes and transcriptomes from multiple sources and assessed these candidate entries for sequence quality and cross-contamination issues, selecting 800 validated entries for inclusion in the library. MarFERReT v1 contains reference sequences from 800 marine eukaryotic genomes and transcriptomes, covering 453 species- and strain-level taxa, totaling nearly 28 million protein sequences with associated NCBI and PR2 Taxonomy identifiers and Pfam functional annotations. An accompanying MarFERReT project repository hosts containerized build scripts, documentation on installation and use case examples, and information on new versions of MarFERReT: <a href="https://github.com/armbrustlab/marferret">https://github.com/armbrustlab/marferret</a></p>
<p>The raw source data for the 902 candidate entries considered for MarFERReT v1.1.1, including the 800 accepted entries, are available for download from their respective online locations. The source URL for each of the entries is listed here in MarFERReT.v1.1.1.entry_curation.csv, and detailed instructions and code for downloading the raw sequence data from source are available in the MarFERReT code repository (<a href="https://github.com/armbrustlab/marferret/blob/main/docs/process_clean_marmicrodb.log.sh">link</a>). </p>
<p>This repository release contains MarFERReT database files from the v1.1.1 MarFERReT release using the following MarFERReT library build scripts: <strong>assemble_marferret.sh</strong>, <strong>pfam_annotate.sh</strong>, and <strong>build_diamond_db.sh</strong><br><br>The following MarFERReT data products are available in this repository:</p>
<p><strong>MarFERReT.v1.1.1.metadata.csv</strong><br>This CSV file contains descriptors of each of the 902 database entries, including data source, taxonomy, and sequence descriptors. Data fields are as follows:</p>
<ol>
<li><strong>entry_id</strong>: Unique MarFERReT sequence entry identifier.</li>
<li><strong>accepted: </strong>Acceptance into the final MarFERReT build (Y/N). The Y/N values can be adjusted to customize the final build output according to user-specific needs.</li>
<li><strong>marferret_name</strong>: A human and machine friendly string derived from the NCBI Taxonomy organism name; maintaining strain-level designation wherever possible.</li>
<li><strong>tax_id</strong>: The NCBI Taxonomy ID (taxID).</li>
<li><strong>pr2_accession</strong>: Best-matching PR2 accession ID associated with entry</li>
<li><strong>pr2_rank</strong>: The lowest shared rank between the entry and the pr2_accession</li>
<li><strong>pr2_taxonomy</strong>: PR2 Taxonomy classification scheme of the pr2_accession</li>
<li><strong>data_type</strong>: Type of sequence data; transcriptome shotgun assemblies (TSA), gene models from assembled genomes (genome), and single-cell amplified genomes (SAG) or transcriptomes (SAT).</li>
<li><strong>data_source</strong>: Online location of sequence data; the Zenodo data repository (<a href="../">Zenodo</a>), the datadryad.org repository (<a href="http://datadryad.org/">datadryad.org</a>), MMETSP re-assemblies on Zenodo (MMETSP), NCBI GenBank (<a href="https://www.ncbi.nlm.nih.gov/genbank/">NCBI</a>), JGI Phycocosm (<a href="https://phycocosm.jgi.doe.gov/phycocosm/home">JGI-Phycocosm</a>), the TARA Oceans portal on Genoscope (<a href="http://www.genoscope.cns.fr/tara/">TARA</a>), or entries from the Roscoff Culture Collection through the METdb database repository (<a href="https://metdb.sb-roscoff.fr/metdb/">METdb</a>).</li>
<li><strong>source_link</strong>: URL where the original sequence data and/or metadata was collected.</li>
<li><strong>pub_year</strong>: Year of data release or publication of linked reference.</li>
<li><strong>ref_link</strong>: Pubmed URL directs to the published reference for entry, if available.</li>
<li><strong>ref_doi</strong>: DOI of entry data from source, if available.</li>
<li><strong>source_filename</strong>: Name of the original sequence file name from the data source.</li>
<li><strong>seq_type</strong>: Entry sequence data retrieved in nucleotide (nt) or amino acid (aa) alphabets.</li>
<li><strong>n_seqs_raw</strong>: Number of sequences in the original sequence file.</li>
<li><strong>source_name:</strong> Full organism name from entry source</li>
<li><strong>original_taxID</strong>: Original NCBI taxID from entry data source metadata, if available</li>
<li><strong>alias:</strong> Additional identifiers for the entry, if available</li>
</ol>
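<p>As a rough illustration of how the <strong>accepted</strong> field might be used to select entries, the following Python sketch reads a metadata table and keeps only rows marked Y. The two sample rows, the organism names, and the <code>accepted_entries</code> helper are hypothetical and are not taken from the MarFERReT release; only the column names come from the field list above.</p>

```python
import csv
import io

# Hypothetical two-row excerpt; the real metadata file has 902 rows
# and the full set of columns described in the field list.
SAMPLE = """entry_id,accepted,marferret_name,tax_id,data_type
1,Y,Example_organism_A,1111,TSA
2,N,Example_organism_B,2222,genome
"""

def accepted_entries(handle):
    """Return the rows whose 'accepted' field is 'Y'."""
    reader = csv.DictReader(handle)
    return [row for row in reader if row["accepted"].strip().upper() == "Y"]

rows = accepted_entries(io.StringIO(SAMPLE))
for row in rows:
    print(row["entry_id"], row["marferret_name"], row["tax_id"])
```

<p>For the released file, the same function could be called on <code>open("MarFERReT.v1.1.1.metadata.csv", newline="")</code>, assuming the CSV has a header row that matches the field names listed above.</p>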
<p><br><strong>MarFERReT.v1.1.1.curation.csv</strong><br>This CSV file contains curation and quality-control information on the 902 candidate entries considered for incorporation into MarFERReT v1, including curated NCBI Taxonomy IDs and entry validation statistics. Data fields are as follows:</p>
<ol>
<li><strong>entry_id:</strong> Unique MarFERReT sequence entry identifier</li>
<li><strong>marferret_name: </strong>Organism name in human and machine friendly format, including additional NCBI taxonomy strain identifiers if available.</li>
<li><strong>tax_id</strong>: Verified NCBI taxID used in MarFERReT</li>
<li><strong>taxID_status</strong>: Status of the final NCBI taxID (Assigned, Updated, or Unchanged)</li>
<li><strong>taxID_notes</strong>: Notes on the original_taxID</li>
<li><strong>n_seqs_raw</strong>: Number of sequences in the original sequence file</li>
<li><strong>n_pfams</strong>: Number of Pfam domains identified in protein sequences</li>
<li><strong>qc_flag</strong>: Early validation quality control flags: LOW_SEQS, fewer than 1,200 raw sequences; LOW_PFAMS, fewer than 500 Pfam domain annotations.</li>
<li><strong>flag_Lasek</strong>: Flag notes from Lasek-Nesselquist and Johnson (2019); contains the flag 'FLAG_LASEK' indicating ciliate samples reported as contaminated in this study.</li>
<li><strong>VV_contam_pct</strong>: Estimated contamination reported for MMETSP entries in Van Vlierberghe et al., (2021).</li>
<li><strong>flag_VanVlierberghe: </strong>Flag for a high level of estimated contamination, from 'VV_contam_pct' values over 50%: FLAG_VV.</li>
<li><strong>rp63_npfams</strong>: Number of ribosomal protein Pfam domains out of 63 total.</li>
<li><strong>rp63_contam_pct</strong>: Percent of total ribosomal protein sequences with an inferred taxonomic identity in any lineage other than the recorded identity, as described in the Technical Validation section from analysis of 63 Pfam ribosomal protein domains.</li>
<li><strong>flag_rp63</strong>: Flag for a high level of estimated contamination, from 'rp63_contam_pct' values over 50%: FLAG_RP63.</li>
<li><strong>flag_sum: </strong>Number of flag columns (`qc_flag`, `flag_Lasek`, `flag_VanVlierberghe`, and `flag_rp63`) that contain a flag. All entries with one or more flags are nominally rejected ('accepted' = N); entries without any flags are validated and accepted ('accepted' = Y).</li>
<li><strong>accepted: </strong>Acceptance into the final MarFERReT build (Y or N).</li>
</ol>
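<p>The acceptance rule described by the flag fields above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MarFERReT build code: the <code>curate</code> function, its parameter names, and the ";" separator inside <code>qc_flag</code> are hypothetical, the Van Vlierberghe flag is assumed to be derived from <code>VV_contam_pct</code>, and <code>flag_sum</code> is assumed to count flagged columns rather than individual flags.</p>

```python
def curate(n_seqs_raw, n_pfams, lasek_flagged=False,
           vv_contam_pct=None, rp63_contam_pct=None):
    """Derive the flag columns, flag_sum, and accepted for one entry."""
    # Early quality-control flags, using the thresholds from the field list.
    qc = []
    if n_seqs_raw < 1200:
        qc.append("LOW_SEQS")
    if n_pfams < 500:
        qc.append("LOW_PFAMS")

    def over_half(pct):
        # Missing estimates (None) never raise a contamination flag.
        return pct is not None and pct > 50

    columns = {
        "qc_flag": ";".join(qc),  # separator is an assumption
        "flag_Lasek": "FLAG_LASEK" if lasek_flagged else "",
        "flag_VanVlierberghe": "FLAG_VV" if over_half(vv_contam_pct) else "",
        "flag_rp63": "FLAG_RP63" if over_half(rp63_contam_pct) else "",
    }
    # Count the flag columns that are non-empty (assumed meaning of flag_sum).
    flag_sum = sum(1 for value in columns.values() if value)
    columns["flag_sum"] = flag_sum
    columns["accepted"] = "N" if flag_sum else "Y"
    return columns

clean = curate(n_seqs_raw=5000, n_pfams=2000)
small = curate(n_seqs_raw=800, n_pfams=300)
contaminated = curate(n_seqs_raw=5000, n_pfams=2000,
                      vv_contam_pct=62.5, rp63_contam_pct=10.0)
print(clean["accepted"], small["accepted"], contaminated["accepted"])
```

<p>Because the metadata file exposes <strong>accepted</strong> as an editable Y/N value, a rule like this would only give the nominal outcome; users can still override it per entry.</p>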
<p> </p>
<p><strong>MarFERReT.v1.1.1.proteins.faa.gz</strong><br>This Gzip-compressed FASTA file contains the 27,951,013 final translated and clustered protein sequences for all 800 accepted MarFERReT entries. The sequence defline contains the unique identifier for the sequence and its reference (mftX, where 'X' is a ten-digit integer value). </p>
<p> </p>
<p><strong>MarFERReT.v1.1.1.taxonomies.tab.gz</strong><br>This Gzip-compressed tab-separated file is formatted for interoperability with the DIAMOND protein alignment tool commonly used for downstream analyses and contains some columns without any data. Each row contains an entry for one of the MarFERReT protein sequences in MarFERReT.v1.1.1.proteins.faa.gz. Note that 'accession.version' and 'taxid' are populated columns while 'accession' and 'gi' have NA values; the latter columns are required for back-compatibility as input for the DIAMOND alignment software and LCA analysis. </p>
<p>The columns in this file contain the following information:</p>
<ol>
<li><strong>accession</strong>: (NA)</li>
<li><strong>accession.version</strong>: The unique MarFERReT sequence identifier ('mftX').</li>
<li><strong>taxid</strong>: The NCBI Taxonomy ID associated with this reference sequence.</li>
<li><strong>gi</strong>: (NA).</li>
</ol>
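<p>To make the four-column layout above concrete, the sketch below parses a small tab-separated sample into a lookup from sequence identifier to NCBI taxID. The sample rows, the identifiers, and the <code>load_taxid_map</code> helper are hypothetical; the sketch also assumes the file starts with a header row naming the four columns, which should be checked against the released file.</p>

```python
import csv
import io

# Hypothetical sample in the four-column layout described above.
SAMPLE = (
    "accession\taccession.version\ttaxid\tgi\n"
    "NA\tmft0000000001\t1111\tNA\n"
    "NA\tmft0000000002\t2222\tNA\n"
)

def load_taxid_map(handle):
    """Map each 'accession.version' identifier to its integer taxid."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["accession.version"]: int(row["taxid"]) for row in reader}

taxid_map = load_taxid_map(io.StringIO(SAMPLE))
print(taxid_map["mft0000000002"])
```

<p>The compressed release file could be read the same way by passing <code>gzip.open(path, "rt", newline="")</code> from the Python standard library instead of the in-memory sample.</p>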
<p> </p>
<p><strong>MarFERReT.v1.1.1.proteins_info.tab.gz</strong><br>This Gzip-compressed tab-separated file contains a row for each final MarFERReT protein sequence with the following columns:</p>
<ol>
<li><strong>aa_id</strong>: The unique identifier for each MarFERReT protein sequence.</li>
<li><strong>entry_id</strong>: The unique numeric identifier for each MarFERReT entry.</li>
<li><strong>source_defline</strong>: The original, unformatted sequence identifier</li>
</ol>
<p> </p>
<p><strong>MarFERReT.v1.1.1.best_pfam_annotations.csv.gz<br></strong>This Gzip-compressed CSV file contains the best-scoring Pfam annotation for intra-species clustered protein sequences from the 800 validated MarFERReT entries, derived from the hmmsearch annotations against Pfam 34.0 functional domains. This file contains the following fields:</p>
<ol>
<li><strong>aa_id</strong>: The unique MarFERReT protein sequence ID ('mftX').</li>
<li><strong>pfam_name</strong>: The shorthand Pfam protein family name.</li>
<li><strong>pfam_id</strong>: The Pfam identifier.</li>
<li><strong>pfam_eval</strong>: hmm profile match e-value score</li>
<li><strong>pfam_score:</strong> hmm profile match bitscore</li>
</ol>
<p><br><strong>MarFERReT.v1.1.1.dmnd</strong><br>This binary file is the indexed database of the MarFERReT protein library with embedded NCBI taxonomic information, generated by the DIAMOND makedb tool using the build_diamond_db.sh script from the MarFERReT /scripts/ library. It can be used as the reference DIAMOND database for annotating environmental sequences from eukaryotic metatranscriptomes. <br><br></p>
Divergent functions of two clades of flavodoxin in diatoms mitigate oxidative stress and iron limitation
Phytoplankton rely on diverse mechanisms to adapt to the decreased iron bioavailability and oxidative stress-inducing conditions of today’s oxygenated oceans, including replacement of the iron-requiring ferredoxin electron shuttle protein with a less-efficient iron-free flavodoxin under iron-limiting conditions. Yet, diatoms transcribe flavodoxins in high-iron regions in contrast to other phytoplankton. Here, we show that the two clades of flavodoxins present within diatoms exhibit a functional divergence, with only clade II flavodoxins displaying the canonical role in acclimation to iron limitation. We created CRISPR/Cas9 knock-outs of the clade I flavodoxin from the model diatom Thalassiosira pseudonana and found that these cell lines are hypersensitive to oxidative stress, while maintaining a wild-type response to iron limitation. Within natural diatom communities, clade I flavodoxin transcript abundance is regulated over the diel cycle rather than in response to iron availability, whereas clade II transcript abundances increase either in iron-limiting regions or under artificially induced iron limitation. The observed functional specialization of two flavodoxin variants within diatoms reiterates two major stressors associated with contemporary oceans and illustrates diatom strategies to flourish in diverse aquatic ecosystems.