661 research outputs found

    A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks

    Abstract Background The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. Results The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists. 
Conclusions The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks.
http://deepblue.lib.umich.edu/bitstream/2027.42/112478/1/12918_2013_Article_1166.pd
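The three-matrix pipeline described in this abstract (gene-article, normalized gene-MeSH, gene-gene) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the gene names, article IDs and MeSH assignments are toy values, unit-length scaling stands in for the unspecified normalization, and cosine dissimilarity stands in for the optimized score, which the abstract does not name.

```python
import math
from collections import Counter


def gene_mesh_matrix(gene_articles, article_mesh):
    """Count MeSH terms over each gene's articles, then scale rows to unit length."""
    matrix = {}
    for gene, articles in gene_articles.items():
        counts = Counter(
            term for article in articles for term in article_mesh.get(article, ())
        )
        norm = math.sqrt(sum(c * c for c in counts.values()))
        matrix[gene] = {t: c / norm for t, c in counts.items()} if norm else {}
    return matrix


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity for two unit-length sparse vectors."""
    return 1.0 - sum(w * v.get(term, 0.0) for term, w in u.items())


def gene_gene_matrix(gene_mesh):
    """Pairwise dissimilarity for every unordered gene pair."""
    genes = sorted(gene_mesh)
    return {
        (a, b): cosine_dissimilarity(gene_mesh[a], gene_mesh[b])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }


# Toy inputs (hypothetical): which articles mention each gene,
# and which MeSH terms index each article.
gene_articles = {"geneA": {"a1", "a2"}, "geneB": {"a2", "a3"}, "geneC": {"a4"}}
article_mesh = {
    "a1": {"Iron"},
    "a2": {"Iron", "Virulence"},
    "a3": {"Virulence"},
    "a4": {"DNA Repair"},
}

dist = gene_gene_matrix(gene_mesh_matrix(gene_articles, article_mesh))
for pair, d in sorted(dist.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

In this toy case geneA and geneB never need to be mentioned in the same sentence to score as close: they share MeSH terms, which is the sense in which such a relationship is implicit.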

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data of Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    Ontology-based literature mining and class effect analysis of adverse drug reactions associated with neuropathy-inducing drugs

    Abstract Background Adverse drug reactions (ADRs), also called drug adverse events (AEs), are reported in the FDA drug labels; however, it is challenging to properly retrieve and analyze the ADRs and their potential relationships from textual data. Previously, we identified and ontologically modeled over 240 drugs that can induce peripheral neuropathy through mining public drug-related databases and drug labels. However, the ADR mechanisms of these drugs are still unclear. In this study, we aimed to develop an ontology-based literature mining system to identify ADRs from drug labels and to elucidate potential mechanisms of the neuropathy-inducing drugs (NIDs). Results We developed and applied an ontology-based SciMiner literature mining strategy to mine ADRs from the drug labels provided in the Text Analysis Conference (TAC) 2017, which included drug labels for 53 NIDs. We identified an average of 243 ADRs per NID and constructed an ADR-ADR network, which consists of 29 ADR nodes and 149 edges, including only those ADR-ADR pairs found in at least 50% of NIDs. Comparison to the ADR-ADR network of non-NIDs revealed that ADRs such as pruritus, pyrexia, thrombocytopenia, nervousness, asthenia, and acute lymphocytic leukaemia were highly enriched in the NID network. Our ChEBI-based ontology analysis identified three benzimidazole NIDs (i.e., lansoprazole, omeprazole, and pantoprazole), which were associated with 43 ADRs. Based on the ontology-based drug class effect definition, the benzimidazole drug group has a drug class effect on all of these 43 ADRs. Many of these 43 ADRs also exist in the enriched NID ADR network. Our Ontology of Adverse Events (OAE) classification further found that these 43 benzimidazole-related ADRs were distributed in many systems, primarily in behavioral and neurological, digestive, skin, and immune systems. 
Conclusions Our study demonstrates that ontology-based literature mining and network analysis can efficiently identify and study specific groups of drugs and their associated ADRs. Furthermore, our analysis of drug class effects identified 3 benzimidazole drugs sharing 43 ADRs, leading to new hypothesis generation and possible mechanism understanding of drug-induced peripheral neuropathy.
https://deepblue.lib.umich.edu/bitstream/2027.42/144217/1/13326_2018_Article_185.pd
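The rule of keeping only ADR-ADR pairs found in at least 50% of drugs can be sketched as a simple co-occurrence count. This is an illustrative reading of that one sentence, not the SciMiner pipeline: the drug names and ADR assignments below are invented toy data, and `adr_network` and `min_fraction` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def adr_network(drug_adrs, min_fraction=0.5):
    """Return ADR pairs that co-occur on at least min_fraction of the drugs."""
    pair_counts = Counter()
    for adrs in drug_adrs.values():
        # Sorting gives each unordered pair a single canonical form.
        pair_counts.update(combinations(sorted(adrs), 2))
    cutoff = min_fraction * len(drug_adrs)
    return {pair for pair, n in pair_counts.items() if n >= cutoff}


# Toy label data (hypothetical assignments, not results from the study).
drug_adrs = {
    "drug1": {"pruritus", "pyrexia", "asthenia"},
    "drug2": {"pruritus", "pyrexia"},
    "drug3": {"pruritus", "asthenia"},
    "drug4": {"pyrexia"},
}

edges = adr_network(drug_adrs)
nodes = {adr for pair in edges for adr in pair}
print(sorted(edges))
print(sorted(nodes))
```

With four toy drugs the cutoff is two, so the pair seen on only one label is dropped while the two pairs seen twice become edges.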

    OHMI: The Ontology of Host-Microbiome Interactions

    Host-microbiome interactions (HMIs) are critical for the modulation of biological processes and are associated with several diseases, and extensive HMI studies have generated large amounts of data. We propose that the logical representation of the knowledge derived from these data and the standardized representation of experimental variables and processes can foster integration of data and reproducibility of experiments and thereby further HMI knowledge discovery. A community-based Ontology of Host-Microbiome Interactions (OHMI) was developed following the OBO Foundry principles. OHMI leverages established ontologies to create logically structured representations of microbiomes, microbial taxonomy, host species, host anatomical entities, and HMIs under different conditions and associated study protocols and types of data analysis and experimental results.

    Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Abstract Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms was learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes was identified. The asserted and inferred VO hierarchies provide semantic support for inferring novel knowledge of association of vaccines and genes from the retrieved data. New hypotheses were generated based on this analysis approach. 
Conclusion VO-SciMiner can be used to improve the efficiency of PubMed searching in the vaccine domain.

    Emerging Vaccine Informatics

    Vaccine informatics is an emerging research area that focuses on development and applications of bioinformatics methods that can be used to facilitate every aspect of the preclinical, clinical, and postlicensure vaccine enterprises. Many immunoinformatics algorithms and resources have been developed to predict T- and B-cell immune epitopes for epitope vaccine development and protective immunity analysis. Vaccine protein candidates are predictable in silico from genome sequences using reverse vaccinology. Systematic transcriptomics and proteomics gene expression analyses facilitate rational vaccine design and identification of gene responses that are correlates of protection in vivo. Mathematical simulations have been used to model host-pathogen interactions and improve vaccine production and vaccination protocols. Computational methods have also been used for development of immunization registries or immunization information systems, assessment of vaccine safety and efficacy, and immunization modeling. Computational literature mining and databases effectively process, mine, and store large amounts of vaccine literature and data. Vaccine Ontology (VO) has been initiated to integrate various vaccine data and support automated reasoning.

    Omics‐Based Systems Vaccinology for Vaccine Target Identification

    Preclinical Research Omics technologies include genomics, transcriptomics, proteomics, metabolomics, and immunomics. These technologies have been used in vaccine research, which can be summarized using the term “vaccinomics.” These omics technologies combined with advanced bioinformatics analysis form the core of “systems vaccinology.” Omics technologies provide powerful methods in vaccine target identification. The genomics‐based reverse vaccinology starts with predicting vaccine protein candidates through in silico bioinformatics analysis of genome sequences. The VIOLIN Vaxign vaccine design program (http://www.violinet.org/vaxign) is the first web‐based vaccine target prediction software based on the reverse vaccinology strategy. Systematic transcriptomics and proteomics analyses facilitate rational vaccine target identification by detecting genome‐wide gene expression profiles. Immunomics is the study of the set of antigens recognized by host immune systems and has also been used for efficient vaccine target prediction. With the large amount of omics data available, it is necessary to integrate various vaccine data using ontologies, including the Gene Ontology (GO) and Vaccine Ontology (VO), for more efficient vaccine target prediction and assessment. All these omics technologies, combined with advanced bioinformatics analysis methods, form a systems biology‐based vaccine target prediction strategy. This article reviews the various omics technologies and how they can be used in vaccine target identification.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/94576/1/ddr21049.pd

    Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN

    Abstract Background Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. Results VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. 
Conclusions Bioinformatics curation and ontological representation of Brucella vaccines promote classification and analysis of existing Brucella vaccines and vaccine candidates. Computational prediction of Brucella vaccine targets provides more candidates for rational vaccine development. The use of VIOLIN provides a general approach that can be applied for analyses of vaccines against other pathogens and infectious diseases.
http://deepblue.lib.umich.edu/bitstream/2027.42/78263/1/1745-7580-6-S1-S5.xml
http://deepblue.lib.umich.edu/bitstream/2027.42/78263/2/1745-7580-6-S1-S5.pdf
Peer Reviewed
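The selection logic reported in the Results (proteins conserved across all virulent strains, then checked for absence in a species that is not pathogenic in humans) reduces to set intersection and difference. The sketch below is only an illustration of that logic, not Vaxign itself: `conserved_candidates`, the strain labels and the protein identifiers are all hypothetical placeholders.

```python
def conserved_candidates(proteomes, required, excluded=()):
    """Proteins present in every required strain and in none of the excluded ones."""
    shared = set.intersection(*(proteomes[strain] for strain in required))
    for strain in excluded:
        shared -= proteomes[strain]
    return shared


# Toy proteomes keyed by strain (placeholder identifiers, not real annotations).
proteomes = {
    "virulent_1": {"omp_a", "omp_b", "omp_c"},
    "virulent_2": {"omp_a", "omp_b", "omp_c", "omp_d"},
    "nonpathogenic": {"omp_c", "omp_d"},
}

conserved = conserved_candidates(proteomes, ["virulent_1", "virulent_2"])
specific = conserved_candidates(
    proteomes, ["virulent_1", "virulent_2"], excluded=["nonpathogenic"]
)
print(sorted(conserved))
print(sorted(specific))
```

A real comparison would first need an orthology or sequence-similarity step to decide that two proteins from different strains count as the same entry; the sketch assumes that step has already produced shared identifiers.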

    Module-based subnetwork alignments reveal novel transcriptional regulators in malaria parasite Plasmodium falciparum

    Background Malaria causes over one million deaths annually, posing an enormous health and economic burden in endemic regions. The completion of genome sequencing of the causative agents, a group of parasites in the genus Plasmodium, revealed potential drug and vaccine candidates. However, genomics-driven target discovery has been significantly hampered by our limited knowledge of the cellular networks associated with parasite development and pathogenesis. In this paper, we propose an approach based on aligning neighborhood PPI subnetworks across species to identify network components in the malaria parasite P. falciparum. Results Instead of only relying on sequence similarities to detect functional orthologs, our approach measures the conservation between the neighborhood subnetworks in protein-protein interaction (PPI) networks in two species, P. falciparum and E. coli. 1,082 P. falciparum proteins were predicted as functional orthologs of known transcriptional regulators in the E. coli network, including general transcriptional regulators, parasite-specific transcriptional regulators in the ApiAP2 protein family, and other potential regulatory proteins. They are implicated in a variety of cellular processes involving chromatin remodeling, genome integrity, secretion, invasion, protein processing, and metabolism. Conclusions In this proof-of-concept study, we demonstrate that a subnetwork alignment approach can reveal previously uncharacterized members of the subnetworks, which opens new opportunities to identify potential therapeutic targets and provide new insights into parasite biology, pathogenesis and virulence. This approach can be extended to other systems, especially those with poor genome annotation and a paucity of knowledge about cellular networks.
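One simple way to picture "conservation between neighborhood subnetworks" is to map a protein's interaction partners into the other species through known orthologs and measure the overlap with a candidate's partners. The abstract does not state the scoring function used, so the Jaccard overlap below is an assumed stand-in, and the protein identifiers and ortholog map are hypothetical.

```python
def neighborhood_conservation(neighbors_a, neighbors_b, ortholog_of):
    """Jaccard overlap between species-A neighbors (mapped to species B) and species-B neighbors."""
    mapped = {ortholog_of[n] for n in neighbors_a if n in ortholog_of}
    union = mapped | neighbors_b
    return len(mapped & neighbors_b) / len(union) if union else 0.0


# Toy example: PPI partners of one parasite protein and of one bacterial regulator.
pf_neighbors = {"pf1", "pf2", "pf3"}
ec_neighbors = {"ec1", "ec2", "ec4"}
ortholog_of = {"pf1": "ec1", "pf2": "ec2"}  # pf3 has no known ortholog

score = neighborhood_conservation(pf_neighbors, ec_neighbors, ortholog_of)
print(round(score, 3))
```

Under this reading, a parasite protein with little sequence similarity to a regulator could still score highly if the two sit in similar interaction neighborhoods.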