35 research outputs found

    RepSeq: A database of amino acid repeats present in lower eukaryotic pathogens

    BACKGROUND Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq http://repseq.gugbe.com is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The RepSeq database is accessed via a web-based application which also provides links to related online tools and databases for further analyses.

    RESULTS The RepSeq algorithm typically identifies more than 98% of repeat-containing proteins and is capable of identifying both perfect and mismatch repeats. The proportion of proteins that contain repeat elements varies greatly between different families and even species (3-35% of the total protein content). The most common motif type is the Sequence Repeat Region (SRR), a repeated motif containing multiple different amino acid types. Proteins containing Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs) typically account for 0.5-1.0% of the total protein number. Notable exceptions are P. falciparum and D. discoideum, in which 33.67% and 34.28% respectively of the predicted proteomes consist of repeat-containing proteins. These numbers are due to large insertions of low complexity single and multi-codon repeat regions.

    CONCLUSION The RepSeq database provides a repository for repeat-containing proteins found in parasitic protozoa. The database allows for both individual and cross-species proteome analyses and also allows users to upload sequences of interest for analysis by the RepSeq algorithm. Identification of repeat-containing proteins provides researchers with a defined subset of proteins which can be analysed by expression profiling and functional characterisation, thereby facilitating study of pathogenicity and virulence factors in the parasitic protozoa. While primarily designed for kinetoplastid work, the RepSeq algorithm and database retain full functionality when used to analyse other species.
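
The repeat classes named in the RepSeq abstract (SAARs and DPRs) can be illustrated with a minimal Python sketch. This is not the RepSeq algorithm: it finds perfect repeats only (no mismatch repeats, no SRRs), and the function names and minimum-length thresholds are assumptions chosen for illustration.

```python
import re

# Assumed thresholds (not taken from the RepSeq abstract): a run is
# reported only if the unit occurs at least this many times in a row.
MIN_SAAR_UNITS = 5   # e.g. "QQQQQ"
MIN_DPR_UNITS = 4    # e.g. "SGSGSGSG"


def find_saars(seq, min_units=MIN_SAAR_UNITS):
    """Return (start, residue, run_length) for each perfect single amino acid repeat."""
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_units - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]


def find_dprs(seq, min_units=MIN_DPR_UNITS):
    """Return (start, unit, copies) for each perfect di-peptide repeat."""
    # (?!\2) forces the two residues of the unit to differ, so a
    # single amino acid run is not reported again as a di-peptide repeat.
    pattern = re.compile(r"(([A-Z])(?!\2)[A-Z])\1{%d,}" % (min_units - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // 2) for m in pattern.finditer(seq)]


def fraction_repeat_containing(proteome):
    """Fraction of sequences that contain at least one SAAR or DPR."""
    hits = sum(1 for seq in proteome if find_saars(seq) or find_dprs(seq))
    return hits / len(proteome)


if __name__ == "__main__":
    example = "MKTQQQQQQAASGSGSGSGSGPLV"  # made-up sequence
    print(find_saars(example))
    print(find_dprs(example))
```

Applied to every protein in a proteome, `fraction_repeat_containing` gives a figure of the same kind as the per-species percentages quoted in the abstract, although the values will differ from RepSeq's because the repeat definitions here are simplified.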

    Synthetic long oligonucleotides to generate artificial templates for use as positive controls in molecular assays: drug resistance mutations in influenza virus as an example

    Background: Positive controls are an integral component of any sensitive molecular diagnostic tool, but they can be difficult to obtain when several mutations are being screened during a pandemic or for a newly emerging disease, because not all of the necessary positive controls can readily be acquired from the host. This work describes the development of a synthetic oligo-cassette that serves as a positive control for accurate and highly sensitive diagnosis of several mutations relevant to influenza virus drug resistance.

    Results: Using influenza antiviral drug resistance mutations as an example, we employed synthetic paired long oligonucleotides containing complementary sequences at their 3' ends. Through oligonucleotide dimer formation and DNA polymerization, we generated ~170 bp dsDNA containing several known neuraminidase inhibitor (NAI) resistance mutations. These templates were then cloned and successfully applied as positive controls in downstream assays.

    Conclusion: This approach significantly improved the development of diagnostics for resistance mutations in terms of time, accuracy, efficiency and sensitivity, which are paramount to monitoring the emergence and spread of antiviral drug-resistant influenza strains. It may therefore have significantly broader application in molecular diagnostics, in addition to rapid molecular testing of all relevant mutations in the event of a pandemic.
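
The template construction described in this abstract (two long oligonucleotides that anneal through complementary 3' ends and are then filled in by a polymerase) can be sketched in Python as follows. The function names, the `min_overlap` parameter and the example sequence are hypothetical; the sketch models only the sequence logic, not the authors' protocol or reaction conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_oligo_pair(forward, reverse, min_overlap=15):
    """Anneal two oligos by their complementary 3' ends and fill in.

    Both oligos are given 5'->3'. Returns the top strand (5'->3') of the
    resulting double-stranded product. min_overlap is an assumed design
    parameter, not a value from the abstract.
    """
    # Express the reverse oligo on the top strand; its 3' end becomes
    # the start of this string.
    reverse_rc = reverse_complement(reverse)
    # Look for the longest suffix of the forward oligo that matches a
    # prefix of reverse_rc: that shared stretch is the annealed duplex.
    for k in range(min(len(forward), len(reverse_rc)), min_overlap - 1, -1):
        if forward[-k:] == reverse_rc[:k]:
            # Polymerase extension fills in both single-stranded flanks.
            return forward + reverse_rc[k:]
    raise ValueError("no 3' overlap of at least %d nt" % min_overlap)


if __name__ == "__main__":
    # Made-up 40 nt target; real cassettes in the abstract are ~170 bp.
    target = "ATGGCCAAGTTCTACGGATCCTGAACGTCAGGCTTAGCAT"
    fwd = target[:25]
    rev = reverse_complement(target[15:])
    print(extend_oligo_pair(fwd, rev, min_overlap=8) == target)
```

In such a design, resistance mutations would be written into the oligo sequences themselves, so a single extension product could carry several of them at once.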

    Rapid dissection and model-based optimization of inducible enhancers in human cells using a massively parallel reporter assay

    Learning to read and write the transcriptional regulatory code is of central importance to progress in genetic analysis and engineering. Here we describe a massively parallel reporter assay (MPRA) that facilitates the systematic dissection of transcriptional regulatory elements. In MPRA, microarray-synthesized DNA regulatory elements and unique sequence tags are cloned into plasmids to generate a library of reporter constructs. These constructs are transfected into cells and tag expression is assayed by high-throughput sequencing. We apply MPRA to compare >27,000 variants of two inducible enhancers in human cells: a synthetic cAMP-regulated enhancer and the virus-inducible interferon-β enhancer. We first show that the resulting data define accurate maps of functional transcription factor binding sites in both enhancers at single-nucleotide resolution. We then use the data to train quantitative sequence-activity models (QSAMs) of the two enhancers. We show that QSAMs from two cellular states can be combined to design enhancer variants that optimize potentially conflicting objectives, such as maximizing induced activity while minimizing basal activity.

    Funding: National Human Genome Research Institute (U.S.) (grant R01HG004037); National Science Foundation (U.S.) (NSF grant PHY-0957573); National Science Foundation (U.S.) (NSF grant PHY-1022140); Broad Institut
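
The tag-counting readout described in the MPRA abstract can be illustrated with a short Python sketch. The abstract says only that tag expression is assayed by sequencing, so the normalisation used here (mRNA tag counts divided by plasmid DNA tag counts, with a pseudocount, summarised as the median log2 ratio per variant) is an assumption. The function name and data layout are hypothetical as well.

```python
import math
from statistics import median


def variant_activity(mrna_counts, dna_counts, tag_to_variant, pseudocount=1.0):
    """Median log2(mRNA/DNA) tag ratio for each enhancer variant.

    mrna_counts, dna_counts: dicts mapping tag -> read count.
    tag_to_variant: dict mapping tag -> enhancer variant identifier.
    The pseudocount avoids division by zero for unobserved tags.
    """
    ratios = {}
    for tag, variant in tag_to_variant.items():
        ratio = (mrna_counts.get(tag, 0) + pseudocount) / (
            dna_counts.get(tag, 0) + pseudocount
        )
        ratios.setdefault(variant, []).append(math.log2(ratio))
    return {variant: median(values) for variant, values in ratios.items()}


if __name__ == "__main__":
    # Made-up counts for three tags linked to two variants.
    tags = {"t1": "v1", "t2": "v1", "t3": "v2"}
    mrna = {"t1": 7, "t2": 31, "t3": 3}
    dna = {"t1": 1, "t2": 3, "t3": 3}
    print(variant_activity(mrna, dna, tags))
```

Running the same calculation on counts from induced and uninduced cells would give two activity values per variant, which is the kind of paired measurement the abstract describes combining across cellular states.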

    Discovery of T Cell Antigens by High-Throughput Screening of Synthetic Minigene Libraries

    The identification of novel T cell antigens is central to basic and translational research in autoimmunity, tumor immunology, transplant immunology, and vaccine design for infectious disease. However, current methods for T cell antigen discovery are low throughput and fail to explore a wide range of potential antigen-receptor interactions. To overcome these limitations, we developed a method in which programmable microarrays are used to cost-effectively synthesize complex libraries of thousands of minigenes that collectively encode the content of hundreds of candidate protein targets. Minigene-derived mRNAs are transfected into autologous antigen presenting cells and used to challenge complex populations of purified peripheral blood CD8+ T cells in multiplex, parallel ELISPOT assays. In this proof-of-concept study, we apply synthetic minigene screening to identify two novel pancreatic islet autoantigens targeted in a patient with Type I Diabetes. To our knowledge, this is the first successful screen of a highly complex, synthetic minigene library for identification of a T cell antigen. In principle, responses against the full protein complement of any tissue or pathogen can be assayed by this approach, suggesting that further optimization of synthetic libraries holds promise for high throughput antigen discovery.

    Microarrays, megasynthesis


    Scalable gene synthesis by selective amplification of DNA pools from high-fidelity microchips

    Development of cheap, high-throughput and reliable gene synthesis methods will broadly stimulate progress in biology and biotechnology. Currently, the reliance on column-synthesized oligonucleotides as a source of DNA limits further cost reductions in gene synthesis. Oligonucleotides from DNA microchips can reduce costs by at least an order of magnitude, yet efforts to scale their use have been largely unsuccessful owing to the high error rates and complexity of the oligonucleotide mixtures. Here we use high-fidelity DNA microchips, selective oligonucleotide pool amplification, optimized gene assembly protocols and enzymatic error correction to develop a method for highly parallel gene synthesis. We tested our approach by assembling 47 genes, including 42 challenging therapeutic antibody sequences, encoding a total of ∼35 kilobase pairs of DNA. These assemblies were performed from a complex background containing 13,000 oligonucleotides encoding ∼2.5 megabases of DNA, which is at least 50 times larger than in previously published attempts.