
    Specialized Hidden Markov Model Databases for Microbial Genomics

    As hidden Markov models (HMMs) become increasingly important in the analysis of biological sequences, so too have databases of HMMs expanded in size, number and importance. While the standard paradigm a short while ago was the analysis of one or a few sequences at a time, it has now become standard procedure to submit an entire microbial genome. In the future, it will be common to submit large groups of completed genomes to run simultaneously against a dozen public databases and any number of internally developed targets. This paper looks at some of the readily available HMM (or HMM-like) algorithms and several publicly available HMM databases, and outlines methods by which the reader may develop custom HMM targets.

    What makes species unique? The contribution of proteins with obscure features

    BACKGROUND: Proteins with obscure features (POFs), which lack currently defined motifs or domains, represent between 18% and 38% of a typical eukaryotic proteome. To evaluate the contribution of this class of proteins to the diversity of eukaryotes, we performed a comparative analysis of the predicted proteomes derived from 10 different sequenced genomes, including budding and fission yeast, worm, fly, mosquito, Arabidopsis, rice, mouse, rat, and human. RESULTS: Only 1,650 protein groups were found to be conserved among these proteomes (BLAST E-value threshold of 10^-6). Of these, only three were designated as POFs. Surprisingly, we found that, on average, 60% of the POFs identified in these 10 proteomes (44,236 in total) were species specific. In contrast, only 7.5% of the proteins with defined features (PDFs) were species specific (17,554 in total). As a group, POFs appear similar to PDFs in their relative contribution to biological functions, as indicated by their expression, participation in protein-protein interactions and association with mutant phenotypes. However, POFs have more predicted disordered structure than PDFs, implying that they may exhibit preferential involvement in species-specific regulatory and signaling networks. CONCLUSION: Because the majority of eukaryotic POFs are not well conserved, and by definition do not have defined domains or motifs upon which to formulate a functional working hypothesis, understanding their biochemical and biological functions will require species-specific investigations.

    Reactive oxygen gene network of plants

    Reactive oxygen species (ROS) control many different processes in plants. However, being toxic molecules, they are also capable of injuring cells. How this conflict is resolved in plants is largely unknown. Nonetheless, it is clear that the steady-state level of ROS in cells needs to be tightly regulated. In Arabidopsis, a network of at least 152 genes is involved in managing the level of ROS. This network is highly dynamic and redundant, and encodes ROS-scavenging and ROS-producing proteins. Although recent studies have unraveled some of the key players in the network, many questions related to its mode of regulation, its protective roles and its modulation of signaling networks that control growth, development and stress response remain unanswered.

    Transcriptome and gene expression analysis in cold-acclimated guayule (Parthenium argentatum) rubber-producing tissue

    Natural rubber biosynthesis in guayule (Parthenium argentatum Gray) is associated with moderately cold night temperatures. To begin to dissect the molecular events triggered by cold temperatures that govern rubber synthesis induction in guayule, the transcriptome of bark tissue, where rubber is produced, was investigated. A total of 11,748 quality expressed sequence tags (ESTs) were obtained. The vast majority of ESTs encoded proteins that are similar to stress-related proteins, whereas those encoding rubber biosynthesis-related proteins comprised just over one percent of the ESTs. Sequence information derived from the ESTs was used to design primers for quantitative analysis of the expression of genes that encode selected enzymes and proteins with potential impact on rubber biosynthesis in field-grown guayule plants, including 3-hydroxy-3-methylglutaryl-CoA synthase, 3-hydroxy-3-methylglutaryl-CoA reductase, farnesyl pyrophosphate synthase, squalene synthase, small rubber particle protein, allene oxide synthase, and cis-prenyl transferase. Gene expression was studied for field-grown plants during the normal course of seasonal variation in temperature (monthly average maximum 41.7 °C to minimum 0 °C, from November 2005 through March 2007) and rubber transferase enzymatic activity was also evaluated. Levels of gene expression did not correlate with air temperatures nor with rubber transferase activity. Interestingly, a sudden increase in night temperature 10 days before harvest took place in advance of the highest CPT gene expression level.

    Annotating Genes of Known and Unknown Function by Large-Scale Coexpression Analysis

    About 40% of the proteins encoded in eukaryotic genomes are proteins of unknown function (PUFs). Their functional characterization remains one of the main challenges in modern biology. In this study we identified the PUF-encoding genes from Arabidopsis (Arabidopsis thaliana) using a combination of sequence similarity, domain-based, and empirical approaches. Large-scale gene expression analyses of 1,310 publicly available Affymetrix chips were performed to associate the identified PUF genes with regulatory networks and biological processes of known function. To generate quality results, the study was restricted to expression sets with replicated samples. First, genome-wide clustering and gene function enrichment analysis of clusters allowed us to associate 1,541 PUF genes with tightly coexpressed genes for proteins of known function (PKFs). Over 70% of them could be assigned to more specific biological process annotations than the ones available in the current Gene Ontology release. The most highly overrepresented functional categories in the obtained clusters were ribosome assembly, photosynthesis, and cell wall pathways. Interestingly, the majority of the PUF genes appeared to be controlled by the same regulatory networks as most PKF genes, because clusters enriched in PUF genes were extremely rare. Second, large-scale analysis of differentially expressed genes was applied to identify a comprehensive set of abiotic stress-response genes. This analysis resulted in the identification of 269 PKF and 104 PUF genes that responded to a wide variety of abiotic stresses, whereas 608 PKF and 206 PUF genes responded predominantly to specific stress treatments. The provided coexpression and differentially expressed gene data represent an important resource for guiding future functional characterization experiments of PUF and PKF genes. Finally, the public Plant Gene Expression Database (http://bioweb.ucr.edu/PED) was developed as part of this project to provide efficient access and mining tools for the vast gene expression data of this study.

    Humoral Immunity Profiling of Subjects with Myalgic Encephalomyelitis Using a Random Peptide Microarray Differentiates Cases from Controls with High Specificity and Sensitivity

    Myalgic encephalomyelitis (ME) is a complex, heterogeneous illness of unknown etiology. The search for biomarkers that can delineate cases from controls is one of the most active areas of ME research; however, little progress has been made in achieving this goal. In contrast to identifying biomarkers that are directly involved in the pathological process, an immunosignature identifies antibodies raised to proteins expressed during, and potentially involved in, the pathological process. Although these proteins might be unknown, it is possible to detect antibodies that react to these proteins using random peptide arrays. In the present study, we probed a custom 125,000 random 12-mer peptide microarray with sera from 21 ME cases and 21 controls from the USA and Europe and used these data to develop a diagnostic signature. We further used these peptide sequences to potentially uncover the naturally occurring candidate antigens to which these antibodies may specifically react in vivo. Our analysis revealed a subset of 25 peptides that distinguished cases and controls with high specificity and sensitivity. Additionally, Basic Local Alignment Search Tool (BLAST) searches suggest that these peptides primarily represent human self-antigens and endogenous retroviral sequences and, to a minor extent, viral and bacterial pathogens.