216 research outputs found

    Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

    Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. Gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood that investigators will identify the biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools currently available in the community are collected in this survey. Tools are categorized into three major classes according to their underlying enrichment algorithms. The comprehensive collection, the tool classification and the associated questions/issues provide an up-to-date view of the advantages, pitfalls and recent trends at the simpler tool-class level rather than tool by tool. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.
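The abstract above does not spell out any enrichment algorithm, so the following is only a minimal sketch of one widely used over-representation test, the one-sided hypergeometric test. It is not claimed to be the method of any particular tool in the survey. The function name `hypergeom_enrichment_p` and its parameter names are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, list_size, hits):
    """One-sided hypergeometric p-value for term over-representation.

    population: number of genes in the background
    annotated:  background genes carrying the annotation term
    list_size:  number of genes in the user's gene list
    hits:       genes in the list that carry the term

    Returns P(X >= hits) when list_size genes are drawn without
    replacement from the background.
    """
    upper = min(annotated, list_size)
    total = comb(population, list_size)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the call.
    print(hypergeom_enrichment_p(population=20000, annotated=300, list_size=150, hits=12))
```

A small p-value suggests the term appears in the list more often than chance would predict. In practice the p-values for many terms would also need a multiple-testing correction, which this sketch omits.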

    PubMatrix: a tool for multiplex literature mining

    BACKGROUND: Molecular experiments using multiplex strategies such as cDNA microarrays or proteomic approaches generate large datasets requiring biological interpretation. Text-based data mining tools have recently been developed to query large biological datasets of this type. PubMatrix is a web-based tool that allows simple text-based mining of the NCBI literature search service PubMed using any two lists of keyword terms, resulting in a frequency matrix of term co-occurrence. RESULTS: For example, a simple term selection procedure allows automatic pairwise comparisons of approximately 1–100 search terms versus approximately 1–10 modifier terms, resulting in up to 1,000 pairwise comparisons. The matrix table of pairwise comparisons can then be surveyed, queried individually, and archived. Lists of keywords can include any terms currently capable of being searched in PubMed. In the context of cDNA microarray studies, this may be used for the annotation of gene lists from clusters of genes that are expressed coordinately. An associated PubMatrix public archive provides previous searches using common, useful lists of keyword terms. CONCLUSIONS: In this way, lists of terms, such as gene names or functional assignments, can be assigned genetic, biological, or clinical relevance in a rapid, flexible, systematic fashion.
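The term-by-term matrix described above can be illustrated with a short sketch. It is not PubMatrix's implementation. The query format `(term) AND (modifier)`, the function names, and the toy in-memory "abstracts" are all assumptions made for illustration, and the hit counter is passed in by the caller so that no real PubMed call is made.

```python
from itertools import product


def build_pairwise_queries(search_terms, modifier_terms):
    """Map each (search term, modifier term) pair to a boolean query string."""
    return {
        (s, m): f"({s}) AND ({m})"
        for s, m in product(search_terms, modifier_terms)
    }


def cooccurrence_matrix(search_terms, modifier_terms, count_hits):
    """Rows are search terms, columns are modifier terms.

    count_hits(search_term, modifier_term) is supplied by the caller,
    for example a function that asks a literature database for a hit count.
    """
    return [
        [count_hits(s, m) for m in modifier_terms]
        for s in search_terms
    ]


# Toy stand-in for a literature database (made-up sentences, not real records).
TOY_ABSTRACTS = [
    "BRCA1 and apoptosis in breast cancer",
    "TP53 regulates apoptosis",
    "TP53 in breast cancer",
    "TP53 loss and apoptosis in colon cancer",
]


def toy_count_hits(search_term, modifier_term):
    """Count toy abstracts that mention both terms, ignoring case."""
    s, m = search_term.lower(), modifier_term.lower()
    return sum(1 for text in TOY_ABSTRACTS if s in text.lower() and m in text.lower())


if __name__ == "__main__":
    genes = ["BRCA1", "TP53"]
    modifiers = ["apoptosis", "cancer"]
    for gene, row in zip(genes, cooccurrence_matrix(genes, modifiers, toy_count_hits)):
        print(gene, row)
```

With 100 search terms and 10 modifier terms, `build_pairwise_queries` yields 1,000 queries, matching the upper bound quoted in the abstract.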

    Gene duplications in prokaryotes can be associated with environmental adaptation

    BACKGROUND: Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. RESULTS: Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with, for example, energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. CONCLUSIONS: Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment.

    Interferon stimulated exonuclease gene 20 kDa links psychiatric events to distinct hepatitis C virus responses in human immunodeficiency virus positive patients: ISG20 Links Psychiatric Events to HCV Clearance

    Hepatitis C Virus (HCV) infection occurs frequently in patients with preexisting mental illness. Treatment for chronic hepatitis C using interferon formulations often increases risk for neuropsychiatric symptoms. Pegylated-Interferon-α (PegIFN-α) remains crucial for attaining sustained virologic response (SVR); however, PegIFN-α based treatment is associated with psychiatric adverse effects, which require dose reduction and/or interruption. This study's main objective was to identify genes induced by PegIFN-α and expressed in the central nervous system and immune system, which could mediate the development of psychiatric toxicity in association with antiviral outcome. Using peripheral blood mononuclear cells from Human Immunodeficiency Virus (HIV)/HCV co-infected donors (N=28), DNA microarray analysis was performed and 21 differentially regulated genes were identified in patients with psychiatric toxicity vs. those without. Using these 21 expression profiles a two-way-ANOVA was performed to select genes based on antiviral outcome and occurrence of neuropsychiatric adverse events. Microarray analysis demonstrated that Interferon-stimulated-exonuclease-gene 20kDa (ISG20) and Interferon-alpha-inducible-protein 27 (IFI27) were the most regulated genes (P<0.05) between three groups that were built by combining antiviral outcome and neuropsychiatric toxicity. Validation by bDNA assay confirmed that ISG20 expression levels were significantly associated with these outcomes (P<0.035). Baseline levels and induction of ISG20 correlated independently with no occurrence of psychiatric adverse events and non-response to therapy (P<0.001). Among the 21 genes that were associated with psychiatric adverse events and 20 Interferon-inducible genes (IFIGs) used as controls, only ISG20 expression was able to link PegIFN-α related neuropsychiatric toxicity to distinct HCV-responses in patients co-infected with HIV and HCV in vivo

    Pre-gastrula expression of zebrafish extraembryonic genes

    BACKGROUND: Many species form extraembryonic tissues during embryogenesis, such as the placenta of humans and other viviparous mammals. Extraembryonic tissues have various roles in protecting, nourishing and patterning embryos. Prior to gastrulation in zebrafish, the yolk syncytial layer - an extraembryonic nuclear syncytium - produces signals that induce mesoderm and endoderm formation. Mesoderm and endoderm precursor cells are situated in the embryonic margin, an external ring of cells along the embryo-yolk interface. The yolk syncytial layer initially forms below the margin, in a domain called the external yolk syncytial layer (E-YSL). RESULTS: We hypothesize that key components of the yolk syncytial layer's mesoderm and endoderm inducing activity are expressed as mRNAs in the E-YSL. To identify genes expressed in the E-YSL, we used microarrays to compare the transcription profiles of intact pre-gastrula embryos with pre-gastrula embryonic cells that we had separated from the yolk and yolk syncytial layer. This identified a cohort of genes with enriched expression in intact embryos. Here we describe our whole mount in situ hybridization analysis of sixty-eight of them. This includes ten genes with E-YSL expression (camsap1l1, gata3, znf503, hnf1ba, slc26a1, slc40a1, gata6, gpr137bb, otop1 and cebpa), four genes with expression in the enveloping layer (EVL), a superficial epithelium that protects the embryo (zgc:136817, zgc:152778, slc14a2 and elovl6l), three EVL genes whose expression is transiently confined to the animal pole (elovl6l, zgc:136359 and clica), and six genes with transient maternal expression (mtf1, wu:fj59f04, mospd2, rftn2, arrdc1a and pho). We also assessed the requirement of Nodal signaling for the expression of selected genes in the E-YSL, EVL and margin. Margin expression was Nodal dependent for all genes we tested, including the concentrated margin expression of an EVL gene: zgc:110712. All other instances of EVL and E-YSL expression that we tested were Nodal independent. CONCLUSION: We have devised an effective strategy for enriching and identifying genes expressed in the E-YSL of pre-gastrula embryos. To our surprise, maternal genes and genes expressed in the EVL were also enriched by this strategy. A number of these genes are promising candidates for future functional studies on early embryonic patterning.

    DAVID-WS: a stateful web service to facilitate gene/protein list analysis

    Summary: The database for annotation, visualization and integrated discovery (DAVID), which can be freely accessed at http://david.abcc.ncifcrf.gov/, is a web-based online bioinformatics resource that aims to provide tools for the functional interpretation of large lists of genes/proteins. It has been used by researchers from more than 5000 institutes worldwide, with a daily submission rate of ∼1200 gene lists from ∼400 unique researchers, and has been cited by more than 6000 scientific publications. However, the current web interface does not support programmatic access to DAVID, and the uniform resource locator (URL)-based application programming interface (API) has a limit on URL size and is stateless: it communicates with the server through URL request and response messages without keeping any state-related details. DAVID-WS (web service) has been developed to automate user tasks by providing stateful web services that access DAVID programmatically, without the need for human interaction.