Identification of soluble protein fragments by gene fragmentation and genetic selection

By Michael R. Dyson, Rajika L. Perera, S. Paul Shadbolt, Lynn Biderman, Krystyna Bromek, Natalia V. Murzina and John McCafferty


We describe a new method that identifies protein fragments for soluble expression in Escherichia coli from a randomly fragmented gene library. Inhibition of E. coli dihydrofolate reductase (DHFR) by trimethoprim (TMP) prevents growth, but this inhibition can be relieved by murine DHFR (mDHFR). Bacterial strains expressing mDHFR fusions with the soluble proteins green fluorescent protein (GFP) or EphB2 (SAM domain) displayed markedly increased growth rates in the presence of TMP compared to strains expressing fusions with the insoluble EphB2 (TK domain) or ketosteroid isomerase (KSI). Therefore, mDHFR is affected by the solubility of its fusion partner and can act as a reporter of soluble protein expression. Random fragment libraries of the transcription factor Fli1 were generated by deoxyuridine incorporation and endonuclease V cleavage. The fragments were cloned upstream of mDHFR, and TMP-resistant clones expressing soluble protein were identified. These were found to cluster around the DNA-binding ETS domain. A selected Fli1 fragment was expressed independently of mDHFR and was judged to be correctly folded by various biophysical methods, including NMR. Soluble fragments of the cell-surface receptor Pecam1 were also identified. This genetic selection method was shown to generate expression clones useful for both structural studies and antibody generation, and it does not require a priori knowledge of domain architecture.
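
The observation that selected fragments cluster around a domain can be illustrated with a simple coverage count over the parent sequence. The following is a minimal sketch only, not the authors' analysis: the function names, the 0.5 threshold, and the fragment coordinates are all hypothetical and chosen purely for illustration.

```python
def coverage_profile(fragments, gene_length):
    """Count how many fragments cover each residue (1-based, inclusive coordinates)."""
    counts = [0] * gene_length
    for start, end in fragments:
        if not (1 <= start <= end <= gene_length):
            raise ValueError(f"fragment ({start}, {end}) lies outside 1..{gene_length}")
        for pos in range(start, end + 1):
            counts[pos - 1] += 1
    return counts


def most_covered_region(counts, min_fraction=0.5):
    """Return the contiguous region around the first coverage peak where
    coverage stays at or above min_fraction * peak, as 1-based (start, end)."""
    peak = max(counts)
    if peak == 0:
        return None
    threshold = min_fraction * peak
    left = right = counts.index(peak)
    # Extend outwards from the peak while coverage remains above the threshold.
    while left > 0 and counts[left - 1] >= threshold:
        left -= 1
    while right < len(counts) - 1 and counts[right + 1] >= threshold:
        right += 1
    return left + 1, right + 1


# Made-up coordinates for a hypothetical 100-residue protein (NOT Fli1 data).
fragments = [(40, 70), (45, 80), (50, 75), (10, 20)]
profile = coverage_profile(fragments, gene_length=100)
region = most_covered_region(profile)
```

With real data, the fragment coordinates would come from sequencing the TMP-resistant clones, and the resulting region could be compared against annotated domain boundaries.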

Topics: Methods Online
Publisher: Oxford University Press
Provided by: PubMed Central
