
    N-gram analysis of 970 microbial organisms reveals presence of biological language models

    Background: It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts, such as "signature-style" word usage indicative of authors or topics, and that algorithms originally developed for natural language processing may therefore be applied to genome sequences to draw biologically relevant conclusions. Following this approach of "biological language modeling", statistical n-gram analysis has been applied to the comparative analysis of whole proteome sequences of 44 organisms. It has been shown that a few particular amino acid n-grams are abundant in one organism but occur very rarely in other organisms, thereby serving as genome signatures. At that time proteomes of only 44 organisms were available, which limited the generalization of this hypothesis. Today nearly 1,000 genome sequences and corresponding translated sequences are available, making it feasible to test the existence of biological language models over the evolutionary tree. Results: We studied whole proteome sequences of 970 microbial organisms using n-gram frequencies and cross-perplexity, employing the Biological Language Modeling Toolkit and the Patternix Revelio toolkit. Genus-specific signatures were observed even in a simple unigram distribution. By taking the statistical n-gram model of one organism as reference and computing the cross-perplexity of all other microbial proteomes with it, cross-perplexity was found to be predictive of branch distance in the phylogenetic tree. For example, a 4-gram model from the proteome of Shigella flexneri 2a, which belongs to the Gammaproteobacteria class, showed a self-perplexity of 15.34, while the cross-perplexity of other organisms was in the range of 15.59 to 29.5 and was proportional to their branching distance in the evolutionary tree from S. flexneri. The organisms of this genus, which happen to be pathotypes of E. coli, also have the closest perplexity values with E. coli. Conclusion: Whole proteome sequences of microbial organisms have been shown to contain particular n-gram sequences that are abundant in one organism but occur very rarely in other organisms, thereby serving as proteome signatures. Further, it has also been shown that perplexity, a statistical measure of similarity of n-gram composition, can be used to predict evolutionary distance within a genus in the phylogenetic tree.
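    The cross-perplexity comparison described in this abstract can be illustrated with a minimal sketch. This is not the Biological Language Modeling Toolkit's implementation: the function names, the add-one smoothing, and the toy sequences below are all assumptions made for illustration, and the abstract does not say how the authors smoothed their models.

```python
import math
from collections import Counter

def ngram_counts(sequences, n):
    """Count n-grams and their (n-1)-residue contexts in protein sequences."""
    grams, contexts = Counter(), Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            grams[gram] += 1
            contexts[gram[:-1]] += 1
    return grams, contexts

def perplexity(model, sequences, n, alphabet_size=20):
    """Perplexity of `sequences` under an n-gram model.

    Add-one smoothing is an assumption of this sketch, chosen so that
    n-grams unseen in the reference proteome get a nonzero probability.
    """
    grams, contexts = model
    log_prob, total = 0.0, 0
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            p = (grams[gram] + 1) / (contexts[gram[:-1]] + alphabet_size)
            log_prob += math.log2(p)
            total += 1
    return 2 ** (-log_prob / total)

# Hypothetical toy "proteomes"; real input would be whole-proteome FASTA data.
reference = ["MKVLAAGIVG", "MKVLSAGIVA"]
distant = ["WWPPHHCCYY"]

model = ngram_counts(reference, 2)
self_ppl = perplexity(model, reference, 2)    # reference scored against itself
cross_ppl = perplexity(model, distant, 2)     # unrelated sequence scored against the reference
```

    In this toy case the distant sequence shares no residues with the reference, so every bigram falls back to the uniform probability 1/20 and the cross-perplexity equals the alphabet size, while the self-perplexity is lower. The abstract reports the same qualitative ordering (self-perplexity below cross-perplexity) for real proteomes with a 4-gram model.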

    Colorectal Cancer Stem Cells Are Enriched in Xenogeneic Tumors Following Chemotherapy

    Patients generally die of cancer after the failure of current therapies to eliminate residual disease. A subpopulation of tumor cells, termed cancer stem cells (CSC), appears uniquely able to fuel the growth of phenotypically and histologically diverse tumors. It has been proposed, therefore, that failure to effectively treat cancer may in part be due to preferential resistance of these CSC to chemotherapeutic agents. The subpopulation of human colorectal tumor cells with an ESA(+)CD44(+) phenotype is uniquely responsible for tumorigenesis and has the capacity to generate heterogeneous tumors in a xenograft setting (i.e. CoCSC). We hypothesized that if non-tumorigenic cells are more susceptible to chemotherapeutic agents, then residual tumors might be expected to contain a higher frequency of CoCSC. Xenogeneic tumors initiated with CoCSC were allowed to reach approximately 400 mm³, at which point mice were randomized and chemotherapeutic regimens involving cyclophosphamide or irinotecan were initiated. Data from individual tumor phenotypic analysis and serial transplants performed in limiting dilution show that residual tumors are enriched for cells with the CoCSC phenotype and have increased tumorigenic cell frequency. Moreover, the inherent ability of residual CoCSC to generate tumors appears preserved. Aldehyde dehydrogenase 1 gene expression and enzymatic activity are elevated in CoCSC. Using an in vitro culture system that maintains CoCSC, as demonstrated by serial transplants and lentiviral marking of single cell-derived clones, we further show that ALDH1 enzymatic activity is a major mediator of resistance to cyclophosphamide, a classical chemotherapeutic agent. CoCSC are enriched in colon tumors following chemotherapy and remain capable of rapidly regenerating tumors from which they originated. By focusing on the biology of CoCSC, major resistance mechanisms to specific chemotherapeutic agents can be attributed to specific genes, thereby suggesting avenues for improving cancer therapy.

    Development and Function of CD94-Deficient Natural Killer Cells

    The CD94 transmembrane-anchored glycoprotein forms disulfide-bonded heterodimers with the NKG2A subunit to form an inhibitory receptor, or with the NKG2C or NKG2E subunits to assemble a receptor complex with activating DAP12 signaling proteins. CD94 receptors expressed on human and mouse NK cells and T cells have been proposed to be important in NK cell tolerance to self, to play an important role in NK cell development, and to contribute to NK cell-mediated immunity to certain infections, including human cytomegalovirus. We generated a gene-targeted CD94-deficient mouse to understand the role of CD94 receptors in NK cell biology. CD94-deficient NK cells develop normally and efficiently kill NK cell-susceptible targets. Lack of these CD94 receptors does not alter control of mouse cytomegalovirus, lymphocytic choriomeningitis virus, vaccinia virus, or Listeria monocytogenes. Thus, the expression of CD94 and its associated NKG2A, NKG2C, and NKG2E subunits is dispensable for NK cell development, education, and many NK cell functions.

    A Directed Molecular Evolution Approach to Improved Immunogenicity of the HIV-1 Envelope Glycoprotein

    A prophylactic vaccine is needed to slow the spread of HIV-1 infection. Optimization of the wild-type envelope glycoproteins to create immunogens that can elicit effective neutralizing antibodies is a high priority. Starting with ten genes encoding subtype B HIV-1 gp120 envelope glycoproteins and using in vitro homologous DNA recombination, we created chimeric gp120 variants that were screened for their ability to bind neutralizing monoclonal antibodies. Hundreds of variants were identified with novel antigenic phenotypes that exhibit considerable sequence diversity. Immunization of rabbits with these gp120 variants demonstrated that the majority can induce neutralizing antibodies to HIV-1. One novel variant, called ST-008, induced significantly improved neutralizing antibody responses when assayed against a large panel of primary HIV-1 isolates. Further study of various deletion constructs of ST-008 showed that the enhanced immunogenicity results from a combination of effective DNA priming, an enhanced V3-based response, and an improved response to the constant backbone sequences.

    NK cells and cancer: you can teach innate cells new tricks

    Natural killer (NK) cells are the prototype innate lymphoid cells, endowed with potent cytolytic function, that provide host defence against microbial infection and tumours. Here, we review evidence for the role of NK cells in immune surveillance against cancer and highlight new therapeutic approaches for targeting NK cells in the treatment of cancer.