Capturing the ‘ome’: the expanding molecular toolbox for RNA and DNA library construction
All sequencing experiments and most functional genomics screens rely on the generation of libraries to comprehensively capture pools of targeted sequences. In the past decade especially, driven by the progress in the field of massively parallel sequencing, numerous studies have comprehensively assessed the impact of particular manipulations on library complexity and quality, and characterized the activities and specificities of several key enzymes used in library construction. Fortunately, careful protocol design and reagent choice can substantially mitigate many of these biases, and enable reliable representation of sequences in libraries. This review aims to guide the reader through the vast expanse of literature on the subject to promote informed library generation, independent of the application.
Genome dynamics of the human embryonic kidney 293 lineage in response to cell biology manipulations
The HEK293 human cell lineage is widely used in cell biology and biotechnology. Here we use whole-genome resequencing of six 293 cell lines to study the dynamics of this aneuploid genome in response to the manipulations used to generate common 293 cell derivatives, such as transformation and stable clone generation (293T); suspension growth adaptation (293S); and cytotoxic lectin selection (293SG). Remarkably, we observe that copy number alteration detection could identify the genomic region that enabled cell survival under selective conditions (i.e., ricin selection). Furthermore, we present methods to detect human/vector genome breakpoints and a user-friendly visualization tool for the 293 genome data. We also establish that the genome structure composition is in steady state for most of these cell lines when standard cell culturing conditions are used. This resource enables novel and more informed studies with 293 cells, and we will distribute the sequenced cell lines to this effect.
Coexpressed subunits of dual genetic origin define a conserved supercomplex mediating essential protein import into chloroplasts
In photosynthetic eukaryotes, thousands of proteins are translated in the cytosol and imported into the chloroplast through the concerted action of two translocons, termed TOC and TIC, located in the outer and inner membranes of the chloroplast envelope, respectively. The degree to which the molecular composition of the TOC and TIC complexes is conserved over phylogenetic distances has remained controversial. Here, we combine transcriptomic, biochemical, and genetic tools in the green alga Chlamydomonas (Chlamydomonas reinhardtii) to demonstrate that, despite a lack of evident sequence conservation for some of its components, the algal TIC complex mirrors the molecular composition of a TIC complex from Arabidopsis thaliana. The Chlamydomonas TIC complex contains three nuclear-encoded subunits, Tic20, Tic56, and Tic100, and one chloroplast-encoded subunit, Tic214, and interacts with the TOC complex, as well as with several uncharacterized proteins to form a stable supercomplex (TIC-TOC), indicating that protein import across both envelope membranes is mechanistically coupled. Expression of the nuclear and chloroplast genes encoding both known and uncharacterized TIC-TOC components is highly coordinated, suggesting that a mechanism for regulating its biogenesis across compartmental boundaries must exist. Conditional repression of Tic214, the only chloroplast-encoded subunit in the TIC-TOC complex, impairs the import of chloroplast proteins with essential roles in chloroplast ribosome biogenesis and protein folding and induces a pleiotropic stress response, including several proteins involved in the chloroplast unfolded protein response. These findings underscore the functional importance of the TIC-TOC supercomplex in maintaining chloroplast proteostasis.
A point mutation in the nucleotide exchange factor eIF2B constitutively activates the integrated stress response by allosteric modulation.
S. cerevisiae fragment datasets
Results data of two-round SECRiFY screens of human (HEK293) cDNA fragments in S. cerevisiae (3 replicate screens). Sequencing data was mapped on the human GRCh38 transcriptome assembled using known transcripts from protein-coding genes only.
"Sc_resultstable_all.txt" = all in-frame fragments detected in either the unsorted baseline library (merged for 3 replicates), or in all 3 sorted replicate samples.
"Sc_resultstable_enriched.txt" = those with log_FC > 1 in all three replicates (11625)
"Sc_resultstable_depleted.txt" = those with log_FC < -1 in all three replicates (136531)

For each fragment, the following information is provided:

# Ensembl_geneID --> Ensembl gene ID
# Ensembl_txID --> Ensembl transcript ID
# tx_start --> Transcript start position on human genome GRCh38
# tx_end --> Transcript end position on human genome GRCh38
# chr --> chromosome #
# gene_symbol --> official gene symbol
# frag_start --> fragment start position on the transcript, 0-based
# frag_stop --> fragment end position on the transcript, 0-based
# cDNA --> DNA sequence of the fragment
# protein --> translated AA sequence of the fragment in frame 1
# IND_count --> raw count value in the baseline (unsorted) library (merged for 3 replicates), NAs replaced by 0.001
# SORT(1)_count --> raw count value in the sorted sample replicate (1), NAs replaced by 0.001
# IND_FPTM --> normalized FPTM value in the baseline (unsorted) library
# SORT(1)_FPTM --> normalized FPTM value in the sorted sample replicate (1)
# logFC_(1) --> log2(SORT(1)_FPTM/IND_FPTM)
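The logFC definition and the enriched/depleted thresholds described above can be illustrated with a minimal Python sketch. It is not the authors' code (their R/bash scripts are listed separately). The function names `log_fc` and `classify` are hypothetical, and the sketch assumes all FPTM values are strictly positive.

```python
import math


def log_fc(sort_fptm, ind_fptm):
    """log2 ratio of sorted-sample FPTM to baseline FPTM, as in the logFC column."""
    # Assumption: both values are strictly positive.
    return math.log2(sort_fptm / ind_fptm)


def classify(ind_fptm, sort_fptms, threshold=1.0):
    """Label a fragment from its baseline FPTM and its per-replicate sorted FPTMs.

    'enriched' requires logFC > threshold in every replicate,
    'depleted' requires logFC < -threshold in every replicate.
    """
    values = [log_fc(s, ind_fptm) for s in sort_fptms]
    if all(v > threshold for v in values):
        return "enriched"
    if all(v < -threshold for v in values):
        return "depleted"
    return "neither"


# Made-up example values, not taken from the dataset.
print(classify(10.0, [40.0, 80.0, 25.0]))
print(classify(10.0, [1.0, 2.0, 4.0]))
```

Requiring the threshold to hold in every replicate, rather than on an average, means that a single discordant replicate leaves a fragment unclassified.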
SECRiFY R/bash code
R/bash scripts used for sequencing data processing and analysis of SECRiFY fragment sequences.
Domain hits Pfam
Domain hits of human representative fragments identified in S. cerevisiae or P. pastoris SECRiFY screening. 'Common' indicates domains found in both enriched and depleted fragments; 'Unique' indicates domains found exclusively in enriched or exclusively in depleted fragments.
P. pastoris fragment datasets
Results data of two-round SECRiFY screens of human (HEK293) cDNA fragments in P. pastoris (3 replicate screens). Sequencing data was mapped on the human GRCh38 transcriptome assembled using known transcripts from protein-coding genes only.
"Pp_resultstable_all.txt" = all in-frame fragments detected in either the unsorted baseline library (merged for 3 replicates), or in all 3 sorted replicate samples.
"Pp_resultstable_enriched.txt" = those with log_FC > 1 in all three replicates (10404)
"Pp_resultstable_depleted.txt" = those with log_FC < -1 in all three replicates (141357)

For each fragment, the following information is provided:

# Ensembl_geneID --> Ensembl gene ID
# Ensembl_txID --> Ensembl transcript ID
# tx_start --> Transcript start position on human genome GRCh38
# tx_end --> Transcript end position on human genome GRCh38
# chr --> chromosome #
# gene_symbol --> official gene symbol
# frag_start --> fragment start position on the transcript, 0-based
# frag_stop --> fragment end position on the transcript, 0-based
# cDNA --> DNA sequence of the fragment
# protein --> translated AA sequence of the fragment in frame 1
# IND_count --> raw count value in the baseline (unsorted) library (merged for 3 replicates), NAs replaced by 0.001
# SORT(1)_count --> raw count value in the sorted sample replicate (1), NAs replaced by 0.001
# IND_FPTM --> normalized FPTM value in the baseline (unsorted) library
# SORT(1)_FPTM --> normalized FPTM value in the sorted sample replicate (1)
# logFC_(1) --> log2(SORT(1)_FPTM/IND_FPTM)
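As an illustration of how the `protein` column relates to the `cDNA` column, the sketch below translates a DNA sequence in frame 1 using the standard genetic code. It is a hedged example and not the authors' pipeline. The function name `translate_frame1` is hypothetical. Writing stop codons as `*` and unrecognized codons as `X` is an assumption of this sketch, not something the dataset description specifies.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame1(cdna):
    """Translate a DNA sequence starting at its first base (frame 1).

    Trailing bases that do not fill a complete codon are ignored.
    Codons containing characters other than A, C, G, T become 'X' (assumption).
    """
    seq = cdna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Made-up example sequence, not taken from the dataset.
print(translate_frame1("ATGGCCAAGTAA"))
```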
Human transcriptome GRCh38 known protein-coding
Fasta file used for mapping of sequenced SECRiFY fragments.