173 research outputs found

    Composition-based statistics and translated nucleotide searches: Improving the TBLASTN module of BLAST

    BACKGROUND: TBLASTN is a mode of operation for BLAST that aligns protein sequences to a nucleotide database translated in all six frames. We present the first description of the modern implementation of TBLASTN, focusing on new techniques that were used to implement composition-based statistics for translated nucleotide searches. Composition-based statistics use the composition of the sequences being aligned to generate more accurate E-values, which allows for a more accurate distinction between true and false matches. Until recently, composition-based statistics were available only for protein-protein searches. They are now available as a command line option for recent versions of TBLASTN and as an option for TBLASTN on the NCBI BLAST web server. RESULTS: We evaluate the statistical and retrieval accuracy of the E-values reported by a baseline version of TBLASTN and by two variants that use different types of composition-based statistics. To test the statistical accuracy of TBLASTN, we ran 1000 searches using scrambled proteins from the mouse genome and a database of human chromosomes. To test retrieval accuracy, we modernize and adapt to translated searches a test set previously used to evaluate the retrieval accuracy of protein-protein searches. We show that composition-based statistics greatly improve the statistical accuracy of TBLASTN, at a small cost to the retrieval accuracy. CONCLUSION: TBLASTN is widely used, as it is common to wish to compare proteins to chromosomes or to libraries of mRNAs. Composition-based statistics improve the statistical accuracy, and therefore the reliability, of TBLASTN results. The algorithms used by TBLASTN are not widely known, and some of the most important are reported here. The data used to test TBLASTN are available for download and may be useful in other studies of translated search algorithms

    Cognitive Neuropsychology of HIV-Associated Neurocognitive Disorders

    Advances in the treatment of the human immunodeficiency virus (HIV) have dramatically improved survival rates over the past 10 years, but HIV-associated neurocognitive disorders (HAND) remain highly prevalent and continue to represent a significant public health problem. This review provides an update on the nature, extent, and diagnosis of HAND. Particular emphasis is placed on critically evaluating research within the realm of cognitive neuropsychology that aims to elucidate the component processes of HAND across the domains of executive functions, motor skills, speeded information processing, episodic memory, attention/working memory, language, and visuoperception. In addition to clarifying the cognitive mechanisms of HAND (e.g., impaired cognitive control), the cognitive neuropsychology approach may enhance the ecological validity of neuroAIDS research and inform the development of much needed novel, targeted cognitive and behavioral therapies

    Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic

    BACKGROUND: Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence. Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families. RESULTS: We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT). 
CONCLUSION: These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods

    Predicting protein linkages in bacteria: Which method is best depends on task

    BACKGROUND: Applications of computational methods for predicting protein functional linkages are increasing. In recent years, several bacteria-specific methods for predicting linkages have been developed. The four major genomic context methods are: Gene cluster, Gene neighbor, Rosetta Stone, and Phylogenetic profiles. These methods have been shown to be powerful tools and this paper provides guidelines for when each method is appropriate by exploring different features of each method and potential improvements offered by their combination. We also review many previous treatments of these prediction methods, use the latest available annotations, and offer a number of new observations. RESULTS: Using Escherichia coli K12 and Bacillus subtilis, linkage predictions made by each of these methods were evaluated against three benchmarks: functional categories defined by COG and KEGG, known pathways listed in EcoCyc, and known operons listed in RegulonDB. Each evaluated method had strengths and weaknesses, with no one method dominating all aspects of predictive ability studied. For functional categories, as previous studies have shown, the Rosetta Stone method was individually best at detecting linkages and predicting functions among proteins with shared KEGG categories, while the Phylogenetic profile method was best for linkage detection and function prediction among proteins with common COG functions. Differences in performance under COG versus KEGG may be attributable to the presence of paralogs. Better function prediction was observed when using a weighted combination of linkages based on reliability versus using a simple unweighted union of the linkage sets. For pathway reconstruction, 99 complete metabolic pathways in E. coli K12 (out of the 209 known, non-trivial pathways) and 193 pathways with 50% of their proteins were covered by linkages from at least one method. Gene neighbor was most effective individually on pathway reconstruction, with 48 complete pathways reconstructed. For operon prediction, Gene cluster completely predicted 59% of the known operons in E. coli K12 and 88% (333/418) in B. subtilis. Comparing two versions of the E. coli K12 operon database, many of the unannotated predictions in the earlier version were updated to true predictions in the later version. Using only linkages found by both Gene cluster and Gene neighbor improved the precision of operon predictions. Additionally, as previous studies have shown, combining features based on intergenic region and protein function improved the specificity of operon prediction. CONCLUSION: A common problem for computational methods is the generation of a large number of false positives that might be caused by an incomplete source of validation. By comparing two versions of a database, we demonstrated the dramatic differences in reported results. We used several benchmarks on which we have shown the comparative effectiveness of each prediction method, as well as provided guidelines as to which method is most appropriate for a given prediction task.

    ErbB2, EphrinB1, Src Kinase and PTPN13 Signaling Complex Regulates MAP Kinase Signaling in Human Cancers

    In non-cancerous cells, phosphorylated proteins exist transiently, becoming de-phosphorylated by specific phosphatases that terminate propagation of signaling pathways. In cancers, compromised phosphatase activity and/or expression occur and contribute to tumor phenotype. The non-receptor phosphatase, PTPN13, has recently been dubbed a putative tumor suppressor. Its decreased expression in breast cancer correlates with decreased overall survival. Here we show that PTPN13 regulates a new signaling complex in breast cancer consisting of ErbB2, Src, and EphrinB1. To our knowledge, this signaling complex has not been previously described. Co-immunoprecipitation and localization studies demonstrate that EphrinB1, a PTPN13 substrate, interacts with ErbB2. In addition, the oncogenic V660E ErbB2 mutation enhances this interaction, while Src kinase mediates EphrinB1 phosphorylation and subsequent MAP Kinase signaling. Decreased PTPN13 function further enhances signaling. The association of oncogene kinases (ErbB2, Src), a signaling transmembrane ligand (EphrinB1) and a phosphatase tumor suppressor (PTPN13) suggests that EphrinB1 may be a relevant therapeutic target in breast cancers harboring ErbB2-activating mutations and decreased PTPN13 expression.

    Hsf1 Activation Inhibits Rapamycin Resistance and TOR Signaling in Yeast Revealed by Combined Proteomic and Genetic Analysis

    TOR kinases integrate environmental and nutritional signals to regulate cell growth in eukaryotic organisms. Here, we describe results from a study combining quantitative proteomics and comparative expression analysis in the budding yeast, S. cerevisiae, to gain insights into TOR function and regulation. We profiled protein abundance changes under conditions of TOR inhibition by rapamycin treatment, and compared this data to existing expression information for corresponding gene products measured under a variety of conditions in yeast. Among proteins showing abundance changes upon rapamycin treatment, almost 90% of them demonstrated homodirectional (i.e., in similar direction) transcriptomic changes under conditions of heat/oxidative stress. Because the known downstream responses regulated by Tor1/2 did not fully explain the extent of overlap between these two conditions, we tested for novel connections between the major regulators of heat/oxidative stress response and the TOR pathway. Specifically, we hypothesized that activation of regulator(s) of heat/oxidative stress responses phenocopied TOR inhibition and sought to identify these putative TOR inhibitor(s). Among the stress regulators tested, we found that cells (hsf1-R206S, F256S and ssa1-3 ssa2-2) constitutively activated for heat shock transcription factor 1, Hsf1, inhibited rapamycin resistance. Further analysis of the hsf1-R206S, F256S allele revealed that these cells also displayed multiple phenotypes consistent with reduced TOR signaling. Among the multiple Hsf1 targets elevated in hsf1-R206S, F256S cells, deletion of PIR3 and YRO2 suppressed the TOR-regulated phenotypes. In contrast to our observations in cells activated for Hsf1, constitutive activation of other regulators of heat/oxidative stress responses, such as Msn2/4 and Hyr1, did not inhibit TOR signaling. 
Thus, we propose that activated Hsf1 inhibits rapamycin resistance and TOR signaling via elevated expression of specific target genes in S. cerevisiae. Additionally, these results highlight the value of comparative expression analyses between large-scale proteomic and transcriptomic datasets to reveal new regulatory connections

    A Cross-Study Transcriptional Analysis of Parkinson's Disease

    The study of Parkinson's disease (PD), like other complex neurodegenerative disorders, is limited by access to brain tissue from patients with a confirmed diagnosis. Alternatively, the study of peripheral tissues may offer some insight into the molecular basis of disease susceptibility and progression, but this approach still relies on brain tissue to benchmark relevant molecular changes against. Several studies have reported whole-genome expression profiling in post-mortem brain, but reported concordance between these analyses is lacking. Here we apply a standardised pathway analysis to seven independent case-control studies, and demonstrate increased concordance between data sets. Moreover, data convergence increased when the analysis was limited to the five substantia nigra (SN) data sets; this highlighted the downregulation of dopamine receptor signaling and insulin-like growth factor 1 (IGF1) signaling pathways. We also show that case-control comparisons of affected post-mortem brain tissue are more likely to reflect terminal cytoarchitectural differences rather than primary pathogenic mechanisms. The implementation of a correction factor for dopaminergic neuronal loss predictably resulted in the loss of significance of the dopamine signaling pathway, while axon guidance pathways increased in significance. Interestingly, the IGF1 signaling pathway was also over-represented when data from non-SN areas, unaffected or only terminally affected in PD, were considered. Our findings suggest that there is greater concordance in PD whole-genome expression profiling when standardised pathway membership rather than a ranked gene list is used for comparison.

    Stability of Metabolic Correlations under Changing Environmental Conditions in Escherichia coli – A Systems Approach

    Background: Biological systems adapt to changing environments by reorganizing their cellular and physiological program, with metabolites representing one important response level. Different stresses lead to both conserved and specific responses on the metabolite level, which should be reflected in the underlying metabolic network. Methodology/Principal Findings: Starting from experimental data obtained by a GC-MS based high-throughput metabolic profiling technology, we here develop an approach that: (1) extracts network representations from metabolic condition-dependent data by using pairwise correlations, (2) determines the sets of stable and condition-dependent correlations based on a combination of statistical significance and homogeneity tests, and (3) can identify metabolites related to the stress response, which goes beyond simple observations about the changes of metabolic concentrations. The approach was tested with Escherichia coli as a model organism observed under four different environmental stress conditions (cold stress, heat stress, oxidative stress, lactose diauxie) and control unperturbed conditions. By constructing the stable network component, which displays a scale-free topology and small-world characteristics, we demonstrated that: (1) metabolite hubs in these reconstructed correlation networks are significantly enriched for those contained in biochemical networks such as EcoCyc, (2) particular components of the stable network are enriched for functionally related biochemical pathways, and (3) independently of the response scale, based on their importance in the reorganization of the correlation network, a set of metabolites can be identified which represent hypothetical candidates for adjusting to a stress-specific response. Conclusions/Significance: Network-based tools allowed the identification of stress-dependent and general metabolic correlation networks.
This correlation-network-based approach does not rely on major changes in concentration to identify metabolites important for stress adaptation, but rather on the changes in network properties with respect to metabolites. This should represent a useful complementary technique in addition to more classical approaches
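
    As a rough illustration of steps (1) and (2) in the abstract above, the sketch below labels one metabolite pair as stable, condition-dependent, or uncorrelated. The abstract does not say which significance or homogeneity tests were used; the Fisher z-transformation, the 5% thresholds, and the names `pearson` and `classify_pair` are assumptions made only for this example.

```python
import math

# Assumed thresholds (not taken from the abstract): 5% critical values.
Z_CRIT_05 = 1.96  # two-sided standard normal
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}  # keyed by degrees of freedom


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_pair(samples):
    """Label one metabolite pair from per-condition measurements.

    samples: list of (x, y) tuples, one per condition, each with more
    than 3 observations; supports 2 to 5 conditions.
    """
    zs, ws, significant = [], [], []
    for x, y in samples:
        r = max(min(pearson(x, y), 0.999999), -0.999999)  # keep atanh finite
        z = math.atanh(r)   # Fisher z-transformation
        w = len(x) - 3      # inverse variance of z
        zs.append(z)
        ws.append(w)
        # Significance test: z * sqrt(n - 3) is approximately N(0, 1) under r = 0.
        significant.append(abs(z) * math.sqrt(w) > Z_CRIT_05)
    # Homogeneity test: weighted spread of z around the pooled mean,
    # approximately chi-square with (conditions - 1) degrees of freedom.
    zbar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - zbar) ** 2 for w, z in zip(ws, zs))
    homogeneous = q <= CHI2_CRIT_05[len(samples) - 1]
    if all(significant) and homogeneous:
        return "stable"
    if any(significant):
        return "condition-dependent"
    return "uncorrelated"


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    control = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2]
    stress = [7.9, 7.1, 6.1, 4.9, 4.2, 2.8, 2.1, 0.9]
    print(classify_pair([(x, control), (x, stress)]))
```

    Pairs labelled stable would become edges of the stable network component; the remaining significant pairs are candidates for condition-dependent edges. A real analysis would also need multiple-testing correction across all metabolite pairs, which this sketch omits.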