35 research outputs found

    ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files

    BACKGROUND: Functional genomics involves parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite for the cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and the completeness of the sequence. The program has a graphical user interface, although it can also be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks that it translates correctly. ORFer accepts user input in the form of single GenBank GI identifiers or accession numbers, or lists of them. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. CONCLUSION: The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information management system (LIMS) with appropriate sequence information.
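The translation check mentioned in the abstract above can be illustrated with a minimal sketch. This is not ORFer's code (ORFer is a Java program and its implementation is not shown here); the function names `translate` and `orf_matches_protein` are hypothetical, and the sketch assumes the standard genetic code, whereas real GenBank entries may specify other translation tables.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an ORF into one-letter amino acids, stopping at the first stop codon."""
    seq = orf.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        # Codons containing ambiguous bases are reported as 'X'.
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def orf_matches_protein(orf: str, protein: str) -> bool:
    """Hypothetical check: does the ORF translate to the annotated protein sequence?"""
    return translate(orf) == protein.upper().rstrip("*")


if __name__ == "__main__":
    print(orf_matches_protein("ATGGCCTAA", "MA"))
```

A check of this kind flags entries where the annotated coding sequence and the protein record disagree, for example because of a frameshift or a non-standard genetic code.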

    An automated method for high-throughput protein purification applied to a comparison of His-tag and GST-tag affinity chromatography

    BACKGROUND: Functional genomics, the systematic characterisation of the functions of an organism's genes, includes the study of the gene products, the proteins. Such studies require methods to express and purify these proteins in a parallel, time- and cost-effective manner. RESULTS: We developed a method for parallel expression and purification of recombinant proteins with a hexahistidine tag (His-tag) or glutathione S-transferase (GST)-tag from bacterial expression systems. Proteins are expressed in 96-well microplates and are purified by a fully automated procedure on a pipetting robot. Up to 90 µg of purified protein can be obtained from 1 ml microplate cultures. The procedure is readily reproducible, and 96 proteins can be purified in approximately three hours. It avoids clearing of crude cellular lysates and the use of magnetic affinity beads and is therefore less expensive than comparable commercial systems. We have used this method to compare purification of a set of human proteins via His-tag or GST-tag. Proteins were expressed as fusions to an N-terminal tandem His- and GST-tag and were purified by metal chelating or glutathione affinity chromatography. The purity of the obtained protein samples was similar, yet His-tag purification resulted in higher yields for some proteins. CONCLUSION: A fully automated, robust and cost-effective method was developed for the purification of proteins that can be used to quickly characterise expression clones in high throughput and to produce large numbers of proteins for functional studies. His-tag affinity purification was found to be more efficient than purification via GST-tag for some proteins.

    Structural genomics of human proteins – target selection and generation of a public catalogue of expression clones

    BACKGROUND: The availability of suitable recombinant protein is still a major bottleneck in protein structure analysis. The Protein Structure Factory, part of the international structural genomics initiative, targets human proteins for structure determination. It has implemented high-throughput procedures for all steps from cloning to structure calculation. This article describes the selection of human target proteins for structure analysis, our high-throughput cloning strategy, and the expression of human proteins in Escherichia coli host cells. RESULTS AND CONCLUSION: Protein expression and sequence data of 1414 E. coli expression clones representing 537 different proteins are presented. 139 human proteins (18%) could be expressed and purified in soluble form and with the expected size. All E. coli expression clones are publicly available to facilitate further functional characterisation of this set of human proteins.

    Fast identification of folded human protein domains expressed in E. coli suitable for structural analysis

    BACKGROUND: High-throughput protein structure analysis of individual protein domains requires analysis of large numbers of expression clones to identify suitable constructs for structure determination. For this purpose, methods need to be implemented for fast and reliable screening of the expressed proteins as early as possible in the overall process from cloning to structure determination. RESULTS: 88 different E. coli expression constructs for 17 human protein domains were analysed using high-throughput cloning, purification and folding analysis to obtain candidates suitable for structural analysis. After expression in 96-well deep-well microplates and automated protein purification, protein domains were directly analysed using 1D ¹H-NMR spectroscopy. In addition, analytical hydrophobic interaction chromatography (HIC) was used to detect natively folded protein. With these two analytical methods, six constructs (representing two domains) were quickly identified as being well folded and suitable for structural analysis. CONCLUSION: The described approach facilitates high-throughput structural analysis. Clones expressing natively folded proteins suitable for NMR structure determination were quickly identified upon small-scale expression screening using 1D ¹H-NMR and/or analytical HIC. This procedure is especially effective as a fast and inexpensive screen for the 'low-hanging fruit' in structural genomics.

    X-ray structure of engineered human Aortic Preferentially Expressed Protein-1 (APEG-1)

    BACKGROUND: Human Aortic Preferentially Expressed Protein-1 (APEG-1) is a novel specific smooth muscle differentiation marker thought to play a role in the growth and differentiation of arterial smooth muscle cells (SMCs). RESULTS: Good-quality crystals suitable for X-ray crystallographic studies were obtained following the truncation of the 14 N-terminal amino acids of APEG-1, a region predicted to be disordered. The truncated protein (termed ΔAPEG-1) consists of a single immunoglobulin (Ig)-like domain which includes an Arg-Gly-Asp (RGD) adhesion recognition motif. The RGD motif is crucial for the interaction of extracellular proteins and plays a role in cell adhesion. The X-ray structure of ΔAPEG-1 was determined and refined to sub-atomic resolution (0.96 Å). This is the best resolution for an immunoglobulin domain structure so far. The structure adopts a Greek-key β-sandwich fold and belongs to the I (intermediate) set of the immunoglobulin superfamily. The residues lying between the β-sheets form a hydrophobic core. The RGD motif folds into a 3₁₀ helix that is involved in the formation of a homodimer in the crystal, which is mainly stabilized by salt bridges. Analytical ultracentrifugation studies revealed a moderate dissociation constant of 20 μM at physiological ionic strength, suggesting that APEG-1 dimerisation is only transient in the cell. The binding constant is strongly dependent on ionic strength. CONCLUSION: Our data suggest that the RGD motif might play a role not only in the adhesion of extracellular proteins but also in intracellular protein-protein interactions. However, it remains to be established whether the rather weak dimerisation of APEG-1 involving this motif is physiologically relevant.
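To make the reported dissociation constant more concrete, the following sketch estimates what fraction of protein would be dimeric at a given total concentration. It assumes a simple two-state monomer-dimer equilibrium with Kd = [M]²/[D] and total protein T = [M] + 2[D]; the abstract does not state which convention was used, the function name `fraction_in_dimer` is hypothetical, and the concentrations in the loop are illustrative values rather than measured cellular concentrations.

```python
import math


def fraction_in_dimer(total_um: float, kd_um: float = 20.0) -> float:
    """Fraction of protomers found in dimers for a monomer-dimer equilibrium.

    Assumes Kd = [M]^2 / [D] and total = [M] + 2[D] (both in µM), which gives
    2[M]^2 / Kd + [M] - total = 0. The positive root is used for [M].
    """
    if total_um <= 0 or kd_um <= 0:
        raise ValueError("concentrations must be positive")
    monomer = (kd_um / 4.0) * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0)
    return 1.0 - monomer / total_um


if __name__ == "__main__":
    # Illustrative total concentrations in µM (assumed, not from the source).
    for total in (0.2, 2.0, 20.0, 200.0):
        print(f"{total:7.1f} µM total -> {fraction_in_dimer(total):.2f} in dimer")
```

Under these assumptions, half of the protein is dimeric when the total concentration equals Kd, and the dimer fraction falls off quickly at lower concentrations, which is in line with the abstract's description of the dimerisation as weak.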

    The solution structure of the SODD BAG domain reveals additional electrostatic interactions in the HSP70 complexes of SODD subfamily BAG domains

    The solution structure of an N-terminally extended construct of the SODD BAG domain was determined by nuclear magnetic resonance spectroscopy. A homology model of the SODD-BAG/HSP70 complex reveals additional possible interactions that are specific for the SODD subfamily of BAG domains, while the overall geometry of the complex remains the same. Relaxation rate measurements show that amino acids N358–S375 of SODD, which were previously assigned to its BAG domain, are not structured in our construct. The SODD BAG domain is thus indeed smaller than the homologous domain in Bag1, defining a new subfamily of BAG domains.