TReSR: A PCR-compatible DNA sequence design method for engineering proteins containing tandem repeats
Protein tandem repeats (TRs) are motifs composed of near-identical contiguous sequence duplications. They are found in approximately 14% of all proteins and are implicated in diverse biological functions, facilitating both structured and disordered protein-protein and protein-DNA interactions. These functionalities make protein TR domains an attractive component for the modular design of protein constructs. However, the repetitive nature of DNA sequences encoding TR motifs complicates their synthesis and mutagenesis by the traditional molecular biology workflows commonly employed by protein engineers and synthetic biologists. To address this challenge, we developed TReSR (Tandem Repeat DNA Sequence Redesign), a computational protocol that significantly reduces the complementarity of DNA sequences encoding TRs. The utility of TReSR was demonstrated by constructing a novel constitutive repressor, synthesized by duplicating the LacI DNA binding domain into a single-chain TR construct by assembly PCR. Repressor function was evaluated by expression of a fluorescent reporter delivered on a single plasmid encoding a three-component genetic circuit. The successful application of TReSR to construct a novel TR-containing repressor, with a DNA sequence that is amenable to PCR-based construction and manipulation, will enable the incorporation of a wide range of TR-containing proteins in protein engineering and synthetic biology applications.
Supplementary Information for TReSR: A PCR-compatible DNA sequence design method for engineering proteins containing tandem repeats.
Overview of the TR DNA sequence redesign strategy implemented in TReSR.
The design strategy presented in this study is schematically outlined for the construction of a TR containing two identical 20-amino acid segments from the N-terminus of the LacI repressor. (A) The TReSR protocol is initiated by dissection of the 20-amino acid target sequence into contiguous 5-residue segments (labelled with upper-case Roman numerals) for DNA sequence redesign. (B) This is followed by the generation of a sequence list (with individual sequence entries labelled with lower-case Roman numerals) constructed from combinations of synonymous codons that encode the amino acid sequence for each segment. An example codon combination encoding the amino acid sequence for segment I is shown, using the codons highlighted in red. The label for this codon combination gives, for each amino acid, the list position of the codon used (e.g., for the sequence shown, the first codon is used for all amino acids except the last, which uses the 6th codon in the list). Melting temperatures (Tm) of all codon combinations are then calculated using the UNAfold web server (ref. 24) to provide a measure of homodimerization affinities for the forward (TFF) and reverse complement (TRR) sequences, along with the Tm of heterodimerization for the forward sequence with its reverse complement (TFR) and with the reverse complement of the wild-type sequence (TWT). Sequences are then filtered and discarded based on computed hybridization metrics (described in detail in the Materials and Methods), favouring codon combinations that maximize the Tm of heterodimerization (TFR) while minimizing the Tm of homodimerization (TFF and TRR) and of hybridization with the wild-type sequence (TWT). (C) The third step of the TReSR protocol assigns codon combinations to groups according to sequence similarity.
All pair-wise percent sequence identities are calculated (shown as a heat map for codon combinations (i) to (iv)) and used to identify pairs of codon combinations having high sequence complementarity (e.g., codon combination (i) is similar to (ii) and (iii), and dissimilar to (iv)–(vi)). These are plotted in an interaction graph of codon combination space, shown here for the six codon combinations partitioned into four unique clusters (shaded portions) according to their sequence similarity. (Red arrows indicate codon combinations that share a high degree of percent identity and would therefore be assigned to the same group, while green lines indicate codon combinations that are distinct and consequently assigned to different groups.) (D) After group assignment, the fourth step involves the assembly of sequences from two adjacent codon combinations from different groups (shown for codon combinations from the orange, purple and blue groups of the interaction graph). Hybridization metrics are calculated for the joined adjacent segments, and the list of paired codon combinations is then filtered (as was done in the second step, B) to eliminate paired segments that are predicted to have problematic homodimerization behaviours (i.e., high TFF and TRR). (E) The TReSR algorithm concludes with a depth-first search of the remaining adjacent codon combinations to identify sequence paths joining contiguous segments. An example TR sequence path is shown with adjacent segment codon combination pairs connected by green arrows for the first domain and continued with red arrows for the path encoding the second domain. A randomly selected sequence resulting from an assembled path is then evaluated as described in the Materials and Methods to confirm that the DNA sequence would be suitable for aPCR construction of the target gene.
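The steps in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the toy peptide `MKPVTEFGHA` (not the LacI sequence), the 80% identity threshold, the greedy grouping rule in `assign_groups`, and the rule in `find_path` that no group is reused along a path. The hybridization (Tm) filters of steps B and D need a thermodynamic tool such as UNAfold and are represented only by the placeholder `passes_filter` hook.

```python
from itertools import product

# Standard genetic code, restricted to the residues of the toy peptide below.
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "H": ["CAT", "CAC"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}
TRANSLATE = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def split_segments(protein, size=5):
    """Step A: cut the repeat unit into contiguous segments of `size` residues."""
    return [protein[i:i + size] for i in range(0, len(protein), size)]


def codon_combinations(segment, passes_filter=lambda dna: True):
    """Step B: map each label (1-based codon list positions) to its DNA sequence.

    `passes_filter` is a hypothetical stand-in for the Tm-based filter.
    """
    choices = [CODONS[aa] for aa in segment]
    combos = {}
    for idx in product(*(range(len(c)) for c in choices)):
        dna = "".join(c[i] for c, i in zip(choices, idx))
        if passes_filter(dna):
            combos[tuple(i + 1 for i in idx)] = dna
    return combos


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def assign_groups(sequences, threshold=80.0):
    """Step C (assumed greedy rule): join the first group whose founding
    sequence is at least `threshold` percent identical, else found a new group."""
    founders, group_of = [], {}
    for seq in sequences:
        for gid, founder in enumerate(founders):
            if percent_identity(seq, founder) >= threshold:
                group_of[seq] = gid
                break
        else:
            group_of[seq] = len(founders)
            founders.append(seq)
    return group_of


def find_path(slots, group_of, path=(), used=frozenset()):
    """Steps D-E: depth-first search choosing one sequence per segment slot.

    Assumption: a group is never reused along the path, which also keeps
    adjacent segments in different groups.
    """
    if len(path) == len(slots):
        return list(path)
    for dna in slots[len(path)]:
        gid = group_of[dna]
        if gid not in used:
            found = find_path(slots, group_of, path + (dna,), used | {gid})
            if found is not None:
                return found
    return None


def translate(dna):
    """Translate a DNA sequence back to protein to check the encoding."""
    return "".join(TRANSLATE[dna[i:i + 3]] for i in range(0, len(dna), 3))


repeat_unit = "MKPVTEFGHA"  # toy peptide chosen for illustration only
segments = split_segments(repeat_unit)
candidates = [list(codon_combinations(s).values()) for s in segments]
group_of = assign_groups([dna for seg in candidates for dna in seg])
path = find_path(candidates * 2, group_of)  # two tandem copies of the repeat
```

Joining the entries of `path` gives one candidate DNA sequence for the two-copy repeat. Because the two copies of each segment come from different groups, they encode the same amino acids with dissimilar DNA. A real workflow would still need the hybridization checks that this sketch leaves out.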
Architecture of the single-plasmid three-component genetic circuit used to evaluate scDBD repressor constructs.
Plasmid architecture (A) includes the ColE1 origin and ampicillin resistance selection marker (AmpR) in addition to the three genetic circuit components: pDBD(scDBD), pGFP(eGFP), and pLacI(LacIW220F), incorporated at Cloning Sites I, II, and III, respectively. The gene cassettes of Site I (scDBD) and Site II (eGFP) are flanked by identical T7 terminator sequences and near-identical pDBD and pGFP promoter sequences, respectively. The promoter regions (B) of Cloning Sites I and II deliver identical riboJ genetic insulator, hairpin, and ribosome binding site (RBS) sequences upstream of the start codon. These promoter cassettes differ by the identity of their -10 box promoter sequence and their operator sequences placed at positions core and proximal along the cassette, with base pair (BP) position indicated relative to the mRNA transcription start site. The operator elements belonging to the pDBD promoter (Site I) incorporate the lacOsym operator sequence responsible for recruiting the LacIW220F repressor expressed by the pLacI promoter from Cloning Site III. The operator element belonging to the pGFP promoter (Site II) incorporates an operator sequence variant (lacOTTA) recognized by the functional scDBD tandem repeat repressor (scDBDIAN/IAN: LacI DBD triple mutations Y17I/Q18A/R22N).
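The wiring of the three cassettes can be summarized as a small data structure with an idealized on/off model of the circuit. This is only a sketch: the `Cassette` class, its field names, and the Boolean treatment of repression are illustrative assumptions, and real circuit output is graded rather than binary. Whether a given scDBD variant binds lacOTTA is left as an input and not asserted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cassette:
    site: str                    # cloning site on the plasmid
    promoter: str
    gene: str
    operator: Optional[str]      # operator in the promoter; None if none is described
    repressed_by: Optional[str]  # protein recruited to that operator


# Components as described in the caption; names written without superscripts.
CIRCUIT = [
    Cassette("I", "pDBD", "scDBD", "lacOsym", "LacI W220F"),
    Cassette("II", "pGFP", "eGFP", "lacOTTA", "scDBD"),
    Cassette("III", "pLacI", "LacI W220F", None, None),
]


def gfp_expressed(iptg_present: bool, scdbd_binds_operator: bool) -> bool:
    """Idealized logic: IPTG relieves LacI repression of scDBD, and a
    functional scDBD in turn represses eGFP (assumed all-or-nothing)."""
    scdbd_produced = iptg_present
    return not (scdbd_produced and scdbd_binds_operator)
```

Under these assumptions the reporter is switched off only when IPTG is present and the scDBD variant binds its operator. Every other combination leaves eGFP on.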
In vivo evaluation of scDBD repressor function.
Four variants of scDBD repressor protein architecture (A) were inserted into the genetic circuit, enabling quantification of repressor function as reported by cell density-normalized fluorescence resulting from expression of eGFP. These four variants comprise combinatorial pairs of triple mutations rendering each duplicated DBD functional (IAN: Y17I/Q18A/R22N) or non-functional (DFT: Y17D/Q18F/R22T). The repressor function of these four variants was evaluated by measuring genetic circuit output when scDBD production is repressed (0 mM IPTG) and when scDBD is expressed (10 mM IPTG) for full-length (B) and C-terminal truncated (C) constructs. Statistically significant differences (p-value ≤ 0.001) corresponding to a repression event are indicated (two-tailed homoscedastic t-test, n = 4).
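For readers unfamiliar with the test named in this caption, the sketch below computes the pooled-variance (homoscedastic) Student t statistic for two groups using only the Python standard library. The function name and the readings are placeholders invented for illustration, not measurements from the study. The critical value is the standard t-table entry for a two-tailed test at 0.001 with 6 degrees of freedom (n = 4 per group).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Two-tailed critical value for p = 0.001 at df = 6, from standard t tables.
T_CRIT_DF6_P001 = 5.959

# Arbitrary placeholder readings (n = 4 each), NOT data from the study.
without_iptg = [1.0, 1.1, 0.9, 1.0]
with_iptg = [0.5, 0.6, 0.4, 0.5]

t_stat, dof = pooled_t_statistic(without_iptg, with_iptg)
significant = dof == 6 and abs(t_stat) >= T_CRIT_DF6_P001
```

Comparing against a tabulated critical value avoids needing the t-distribution CDF. A statistics package would report an exact p-value instead.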
Assembly PCR oligonucleotide primers for the TReSR designed tandem repeat repressor.
S2 File