2,655 research outputs found

    Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

    Background: To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results: An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. Conclusion: We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies.

    Genomics and proteomics: a signal processor's tour

    The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes, and the associated problem of identifying ncRNA buried in DNA sequences, are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end.

    Synonymous genome recoding : a tool to explore microbial biology and new therapeutic strategies

    Synthetic genome recoding is a new means of generating designed organisms with altered phenotypes. Synonymous mutations introduced into the protein coding region tolerate modifications in DNA or mRNA without modifying the encoded proteins. Synonymous genome-wide recoding has allowed the synthetic generation of different small-genome viruses with modified phenotypes and biological properties. Recently, a decreased cost of chemically synthesizing DNA and improved methods for assembling DNA fragments (e.g. lambda red recombination and CRISPR-based editing) have enabled the construction of an Escherichia coli variant with a 4-Mb synthetic synonymously recoded genome with a reduced number of sense codons (n = 59) encoding the 20 canonical amino acids. Synonymous genome recoding is increasing our knowledge of microbial interactions with innate immune responses, identifying functional genome structures, and strategically ameliorating cis-inhibitory signaling sequences related to splicing, replication (in eukaryotes), and complex microbe functions, unraveling the relevance of codon usage for the temporal regulation of gene expression and the microbe mutant spectrum and adaptability. New biotechnological and therapeutic applications of this methodology can easily be envisaged. In this review, we discuss how synonymous genome recoding may impact our knowledge of microbial biology and the development of new and better therapeutic methodologies

    New Clox Systems for rapid and efficient gene disruption in Candida albicans

    Acknowledgements: We are grateful to Janet Quinn, Lila Kastora, Joanna Potrykus, Michelle Leach, and others for sharing their experiences with the Clox cassettes. We thank Julia Kohler for her kind gift of the NAT1-flipper plasmid pJK863, Claudia Jacob for her advice with In-fusion cloning, and our colleagues in the Aberdeen Fungal Group for numerous stimulating discussions. Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The sequences of all Clox cassettes are available in GenBank: URA3-Clox (loxP-URA3-MET3p-cre-loxP): GenBank accession number KC999858. NAT1-Clox (loxP-NAT1-MET3p-cre-loxP): GenBank accession number KC999859. LAL (loxP-ARG4-loxP): GenBank accession number DQ015897. LHL (loxP-HIS1-loxP): GenBank accession number DQ015898. LUL (loxP-URA3-loxP): GenBank accession number DQ015899. Funding: This work was supported by the Wellcome Trust (www.wellcome.ac.uk): S.S., F.C.O., N.A.R.G., A.J.P.B. (080088); N.A.R.G., A.J.P.B. (097377). The authors also received support from the European Research Council [http://erc.europa.eu/]: DSC. ERB, AJPB (STRIFE Advanced Grant; C-2009-AdG-249793). The European Commission also provided funding [http://ec.europa.eu/research/fp7]: I.B., A.J.P.B. (FINSysB MC-ITN; PITN-GA-2008-214004). Also the UK Biotechnology and Biological Research Council provided support [www.bbsrc.ac.uk]: N.A.R.G., A.J.P.B. (Research Grant; BB/F00513X/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Peer reviewed.

    Demarcation of coding and non-coding regions of DNA using linear transforms

    Deoxyribonucleic acid (DNA) carries the genetic information in the cell. A strand of DNA consists of nitrogenous molecules called nucleotides. Nucleotide triplets, or codons, code for amino acids. There are two distinct regions in DNA: the genes and the intergenic DNA, or junk DNA. Two regions can be distinguished within a gene: the exons, which code for amino acids, and the introns, which do not. The main aim of the thesis is to study signal processing techniques that help distinguish exons from introns. Previous research has shown that exons can be considered as a sequence of signal and noise, whereas introns are noise-like sequences. The Fourier transform of an exonic sequence exhibits a peak at frequency sample k = N/3, where N is the length of the FFT. This property is referred to as the period-3 property. Unlike exons, introns have a noise-like spectrum. The factor that determines the performance of a transform is the figure of merit, defined as the ratio of the peak value to the arithmetic mean of all the values. A comparative study was conducted of the Discrete Fourier Transform (DFT) and the Karhunen-Loeve Transform (KLT). Though both the DFT and the KLT of an exon sequence produce a higher figure of merit than that of an intron sequence, the difference between the figures of merit of exons and introns was higher when the KLT was applied than when the DFT was applied. The two transforms were also applied to entire sequences in a sliding-window fashion. Finally, the two transforms were applied to a large number of sequences from a variety of organisms. A Neyman-Pearson based detector was used to obtain receiver operating curves, i.e., probability of detection versus probability of false alarm. When a transform is applied as a sliding window, the values for exons and introns are taken separately. The exons and the introns served as the two hypotheses of the detector. The Neyman-Pearson detector indicated that the KLT worked better than the DFT on a variety of organisms.
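The period-3 property and the figure of merit described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a single binary indicator sequence for one nucleotide, subtracts the mean before the transform so that the zero-frequency term does not dominate the average, and trims the input to a multiple of three so that k = N/3 is an integer. The function names `dft_power` and `figure_of_merit` are hypothetical.

```python
import cmath


def dft_power(signal):
    """Return the power |X[k]|^2 for k = 0..N-1 using a direct O(N^2) DFT."""
    n = len(signal)
    power = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        power.append(abs(coeff) ** 2)
    return power


def figure_of_merit(seq, base="G"):
    """Ratio of the power at k = N/3 to the arithmetic mean of all power values.

    Assumptions (not from the abstract): one indicator sequence for `base`,
    mean removal before the DFT, and trimming to a multiple of three.
    """
    n = len(seq) - len(seq) % 3
    if n == 0:
        return 0.0
    indicator = [1.0 if c == base else 0.0 for c in seq[:n].upper()]
    mean = sum(indicator) / n
    centered = [x - mean for x in indicator]
    power = dft_power(centered)
    average = sum(power) / n
    # A flat indicator (base absent or always present) carries no periodicity.
    return power[n // 3] / average if average > 0 else 0.0
```

For a perfectly periodic input such as `"ATG" * 10`, all of the power sits at k = N/3 and its mirror at k = 2N/3, so the ratio is N/2. A sliding-window version would call `figure_of_merit` on successive windows of the sequence.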

    Prion Switching in Response to Environmental Stress

    Evolution depends on the manner in which genetic variation is translated into new phenotypes. There has been much debate about whether organisms might have specific mechanisms for “evolvability,” which would generate heritable phenotypic variation with adaptive value and could act to enhance the rate of evolution. Capacitor systems, which allow the accumulation of cryptic genetic variation and release it under stressful conditions, might provide such a mechanism. In yeast, the prion [PSI+] exposes a large array of previously hidden genetic variation, and the phenotypes it thereby produces are advantageous roughly 25% of the time. The notion that [PSI+] is a mechanism for evolvability would be strengthened if the frequency of its appearance increased with stress. That is, a system that mediates even the haphazard appearance of new phenotypes, which have a reasonable chance of adaptive value would be beneficial if it were deployed at times when the organism is not well adapted to its environment. In an unbiased, high-throughput, genome-wide screen for factors that modify the frequency of [PSI+] induction, signal transducers and stress response genes were particularly prominent. Furthermore, prion induction increased by as much as 60-fold when cells were exposed to various stressful conditions, such as oxidative stress (H2O2) or high salt concentrations. The severity of stress and the frequency of [PSI+] induction were highly correlated. These findings support the hypothesis that [PSI+] is a mechanism to increase survival in fluctuating environments and might function as a capacitor to promote evolvability

    Sequence similarity is more relevant than species specificity in probabilistic backtranslation

    BACKGROUND: Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. RESULTS: This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. CONCLUSION: The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
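The ambiguity this abstract refers to, and the "most used codon" baseline it contrasts with, can be sketched as follows. This is not EasyBack and does not use a Hidden Markov Model. It uses the standard genetic code, and the function names and the `reference_cds` parameter (a caller-supplied list of coding sequences from which codon usage is counted) are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_backtranslations(protein):
    """Number of distinct coding sequences that encode `protein`."""
    return prod(len(SYNONYMS[aa]) for aa in protein)


def backtranslate_most_used(protein, reference_cds):
    """Pick, for each residue, the synonymous codon most frequent in
    `reference_cds` (in-frame coding sequences supplied by the caller)."""
    usage = Counter()
    for cds in reference_cds:
        usable = len(cds) - len(cds) % 3
        usage.update(cds[i:i + 3] for i in range(0, usable, 3))
    return "".join(
        max(SYNONYMS[aa], key=lambda c: usage[c]) for aa in protein
    )
```

For example, `count_backtranslations("MKL")` is 12 (one codon for M, two for K, six for L), which shows how quickly the number of candidate sequences grows with protein length. Ties in `backtranslate_most_used` fall back to the first codon in table order, a choice made here only to keep the sketch deterministic.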