2,655 research outputs found
Gene Composer: database software for protein construct design, codon engineering, and gene synthesis
BACKGROUND: To improve efficiency in high-throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon-engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. RESULTS: An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon-engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. CONCLUSION: We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mismatch-specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies.
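The back-translation step described in this abstract can be illustrated with a minimal Python sketch. It is not Gene Composer's algorithm: the table `CODON_USAGE`, its frequency values, the `min_usage` parameter, and both function names are hypothetical placeholders chosen for illustration. The sketch simply picks the most frequent codon that passes a minimum usage threshold and reports the resulting G+C percentage.

```python
# Hypothetical codon usage table: relative frequency of each synonymous codon.
# The numbers are illustrative placeholders, not measured data.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.47, "TTA": 0.14, "TTG": 0.13,
          "CTT": 0.12, "CTC": 0.10, "CTA": 0.04},
    "*": {"TAA": 0.61, "TGA": 0.30, "TAG": 0.09},
}


def back_translate(protein, usage=CODON_USAGE, min_usage=0.10):
    """Back-translate a protein into DNA using the most frequent allowed codon.

    Codons whose relative usage falls below `min_usage` are excluded
    (an assumed reading of a "minimal codon usage threshold").
    """
    codons = []
    for aa in protein:
        allowed = {c: f for c, f in usage[aa].items() if f >= min_usage}
        if not allowed:
            raise ValueError(f"no codon for {aa!r} passes the threshold")
        # Choose the codon with the highest relative frequency.
        codons.append(max(allowed, key=allowed.get))
    return "".join(codons)


def gc_percent(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


gene = back_translate("MKL*")
print(gene, gc_percent(gene))
```

A real design tool would also balance G+C content and avoid unwanted sequence features while choosing among synonymous codons; this sketch only reports G+C content after the fact.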
Genomics and proteomics: a signal processor's tour
The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes, and the associated problem of identifying ncRNA buried in DNA sequences, are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end
Synonymous genome recoding : a tool to explore microbial biology and new therapeutic strategies
Synthetic genome recoding is a new means of generating designed organisms with altered phenotypes. Synonymous mutations introduced into the protein coding region tolerate modifications in DNA or mRNA without modifying the encoded proteins. Synonymous genome-wide recoding has allowed the synthetic generation of different small-genome viruses with modified phenotypes and biological properties. Recently, a decreased cost of chemically synthesizing DNA and improved methods for assembling DNA fragments (e.g. lambda red recombination and CRISPR-based editing) have enabled the construction of an Escherichia coli variant with a 4-Mb synthetic synonymously recoded genome with a reduced number of sense codons (n = 59) encoding the 20 canonical amino acids. Synonymous genome recoding is increasing our knowledge of microbial interactions with innate immune responses, identifying functional genome structures, and strategically ameliorating cis-inhibitory signaling sequences related to splicing, replication (in eukaryotes), and complex microbe functions, unraveling the relevance of codon usage for the temporal regulation of gene expression and the microbe mutant spectrum and adaptability. New biotechnological and therapeutic applications of this methodology can easily be envisaged. In this review, we discuss how synonymous genome recoding may impact our knowledge of microbial biology and the development of new and better therapeutic methodologies
A single H/ACA small nucleolar RNA mediates tumor suppression downstream of oncogenic RAS.
Small nucleolar RNAs (snoRNAs) are a diverse group of non-coding RNAs that direct chemical modifications at specific residues on other RNA molecules, primarily on ribosomal RNA (rRNA). SnoRNAs are altered in several cancers; however, their role in cell homeostasis as well as in cellular transformation remains poorly explored. Here, we show that specific subsets of snoRNAs are differentially regulated during the earliest cellular response to oncogenic RASG12V expression. We describe a novel function for one H/ACA snoRNA, SNORA24, which guides two pseudouridine modifications within the small ribosomal subunit, in RAS-induced senescence in vivo. We find that in mouse models, loss of Snora24 cooperates with RASG12V to promote the development of liver cancer that closely resembles human steatohepatitic hepatocellular carcinoma (HCC). From a clinical perspective, we further show that human HCCs with low SNORA24 expression display increased lipid content and are associated with poor patient survival. We next asked whether ribosomes lacking SNORA24-guided pseudouridine modifications on 18S rRNA have alterations in their biophysical properties. Single-molecule Fluorescence Resonance Energy Transfer (FRET) analyses revealed that these ribosomes exhibit perturbations in aminoacyl-transfer RNA (aa-tRNA) selection and altered pre-translocation ribosome complex dynamics. Furthermore, we find that HCC cells lacking SNORA24-guided pseudouridine modifications have increased translational miscoding and stop codon readthrough frequencies. These findings highlight a role for specific snoRNAs in safeguarding against oncogenic insult and demonstrate a functional link between H/ACA snoRNAs regulated by RAS and the biophysical properties of ribosomes in cancer
Synthetic gene design - The rationale for codon optimization and implications for molecular pharming in plants.
Degeneracy in the genetic code allows multiple codon sequences to encode the same protein. Codon usage bias in genes is the term given to the preferred use of particular synonymous codons. Synonymous codon substitutions had been regarded as "silent" as the primary structure of the protein was not affected; however, it is now accepted that synonymous substitutions can have a significant effect on heterologous protein expression. Codon optimization, the process of altering codons within the gene sequence to improve recombinant protein expression, has become widely practised. Multiple inter-linked factors affecting protein expression need to be taken into consideration when optimizing a gene sequence. Over the years, various computer programmes have been developed to aid in the gene sequence optimization process. However, as the rulebook for altering codon usage to affect protein expression is still not completely understood, it is difficult to predict which strategy, if any, will design the 'optimal' gene sequence. In this review, codon usage bias and factors affecting codon selection will be discussed and the evidence for codon optimization impact will be reviewed for recombinant protein expression using plants as a case study. These developments will be relevant to all recombinant expression systems, however, molecular pharming in plants is an area which has consistently encountered difficulties with low levels of recombinant protein expression, and should benefit from an evidence based rational approach to synthetic gene design
New Clox Systems for rapid and efficient gene disruption in Candida albicans
Acknowledgements: We are grateful to Janet Quinn, Lila Kastora, Joanna Potrykus, Michelle Leach, and others for sharing their experiences with the Clox cassettes. We thank Julia Kohler for her kind gift of the NAT1-flipper plasmid pJK863, Claudia Jacob for her advice with In-fusion cloning, and our colleagues in the Aberdeen Fungal Group for numerous stimulating discussions. Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The sequences of all Clox cassettes are available in GenBank: URA3-Clox (loxP-URA3-MET3p-cre-loxP): GenBank accession number KC999858. NAT1-Clox (loxP-NAT1-MET3p-cre-loxP): GenBank accession number KC999859. LAL (loxP-ARG4-loxP): GenBank accession number DQ015897. LHL (loxP-HIS1-loxP): GenBank accession number DQ015898. LUL (loxP-URA3-loxP): GenBank accession number DQ015899. Funding: This work was supported by the Wellcome Trust (www.wellcome.ac.uk): S.S., F.C.O., N.A.R.G., A.J.P.B. (080088); N.A.R.G., A.J.P.B. (097377). The authors also received support from the European Research Council [http://erc.europa.eu/]: DSC, ERB, AJPB (STRIFE Advanced Grant; C-2009-AdG-249793). The European Commission also provided funding [http://ec.europa.eu/research/fp7]: I.B., A.J.P.B. (FINSysB MC-ITN; PITN-GA-2008-214004). The UK Biotechnology and Biological Sciences Research Council also provided support [www.bbsrc.ac.uk]: N.A.R.G., A.J.P.B. (Research Grant; BB/F00513X/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Peer reviewed. Publisher PD
Demarcation of coding and non-coding regions of DNA using linear transforms
A deoxyribonucleic acid (DNA) strand carries genetic information in the cell. A strand of DNA consists of nitrogenous molecules called nucleotides. Nucleotide triplets, or codons, code for amino acids. There are two distinct regions in DNA: the gene and the intergenic DNA, or junk DNA. Two regions can be distinguished within the gene: the exons, which code for amino acids, and the introns, which do not. The main aim of the thesis is to study signal processing techniques that help distinguish exons from introns. Previous research has shown that exons can be considered a sequence of signal and noise, whereas introns are noise-like sequences. The Fourier transform of an exonic sequence exhibits a peak at frequency sample k = N/3, where N is the length of the transform. This property is referred to as the period-3 property. Unlike exons, introns have a noise-like spectrum. The factor that determines the performance of a transform is the figure of merit, defined as the ratio of the peak value to the arithmetic mean of all the values. A comparative study was conducted of the Discrete Fourier Transform (DFT) and the Karhunen-Loève Transform (KLT). Though both the DFT and the KLT of an exon sequence produce a higher figure of merit than that of an intron sequence, the difference between the figures of merit of exons and introns was larger when the KLT was applied than when the DFT was applied. The two transforms were also applied to entire sequences in a sliding-window fashion. Finally, the two transforms were applied to a large number of sequences from a variety of organisms. A Neyman-Pearson based detector was used to obtain receiver operating curves, i.e., probability of detection versus probability of false alarm. When a transform is applied as a sliding window, the values for exons and introns are taken separately. The exons and the introns served as the two hypotheses of the detector. The Neyman-Pearson detector indicated that the KLT worked better than the DFT on a variety of organisms
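The period-3 property and the figure of merit described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the thesis's implementation: the sequence is mapped to four binary indicator signals (one per base), their DFT power spectra are summed, the zero-frequency term is excluded from the mean, and the sequence length is assumed to be a multiple of 3. The function names are hypothetical.

```python
import cmath


def power_spectrum(seq):
    """Sum of squared DFT magnitudes over the four base-indicator signals."""
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        # Binary indicator: 1 where this base occurs, 0 elsewhere.
        indicator = [1.0 if b == base else 0.0 for b in seq]
        for k in range(n):
            x_k = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                      for t, v in enumerate(indicator))
            spectrum[k] += abs(x_k) ** 2
    return spectrum


def period3_figure_of_merit(seq):
    """Ratio of the spectral value at k = N/3 to the mean of the spectrum.

    Assumption: the zero-frequency (k = 0) term is left out of the mean,
    because it only reflects base composition.
    """
    n = len(seq)
    if n % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    values = power_spectrum(seq)[1:]      # drop k = 0
    peak = values[n // 3 - 1]             # value at k = N/3
    return peak / (sum(values) / len(values))


# A perfectly 3-periodic toy sequence versus a 2-periodic one.
print(period3_figure_of_merit("ATG" * 20))
print(period3_figure_of_merit("AT" * 30))
```

The toy inputs are artificial; real exons show a weaker but still visible peak at k = N/3, and a sliding window would apply the same calculation to successive segments.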
Prion Switching in Response to Environmental Stress
Evolution depends on the manner in which genetic variation is translated into new phenotypes. There has been much debate about whether organisms might have specific mechanisms for “evolvability,” which would generate heritable phenotypic variation with adaptive value and could act to enhance the rate of evolution. Capacitor systems, which allow the accumulation of cryptic genetic variation and release it under stressful conditions, might provide such a mechanism. In yeast, the prion [PSI+] exposes a large array of previously hidden genetic variation, and the phenotypes it thereby produces are advantageous roughly 25% of the time. The notion that [PSI+] is a mechanism for evolvability would be strengthened if the frequency of its appearance increased with stress. That is, a system that mediates even the haphazard appearance of new phenotypes, which have a reasonable chance of adaptive value, would be beneficial if it were deployed at times when the organism is not well adapted to its environment. In an unbiased, high-throughput, genome-wide screen for factors that modify the frequency of [PSI+] induction, signal transducers and stress response genes were particularly prominent. Furthermore, prion induction increased by as much as 60-fold when cells were exposed to various stressful conditions, such as oxidative stress (H2O2) or high salt concentrations. The severity of stress and the frequency of [PSI+] induction were highly correlated. These findings support the hypothesis that [PSI+] is a mechanism to increase survival in fluctuating environments and might function as a capacitor to promote evolvability
Sequence similarity is more relevant than species specificity in probabilistic backtranslation
BACKGROUND: Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. RESULTS: This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. CONCLUSION: The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of the most-used codon in the same species with a Hidden Markov Model trained with a set of the most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically
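The ambiguity that this abstract attributes to the degeneracy of the genetic code can be made concrete with a short Python sketch. It counts how many distinct coding sequences encode a given protein under the standard genetic code, by multiplying the number of synonymous codons for each residue. This only illustrates the size of the search space; it is not part of EasyBack, and the names `SYNONYMOUS_CODONS` and `count_backtranslations` are hypothetical.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
SYNONYMOUS_CODONS = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_backtranslations(protein):
    """Return the number of DNA sequences that encode `protein`."""
    return prod(SYNONYMOUS_CODONS[aa] for aa in protein)


# Methionine and tryptophan are unambiguous; leucine alone has six options.
print(count_backtranslations("MW"))
print(count_backtranslations("MKL"))
```

Because the count grows multiplicatively with protein length, any backtranslation method needs a criterion, such as codon usage or sequence similarity, to choose among the candidates.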
- …