
Evaluation of genomic island predictors using a comparative genomics approach

Author: Brinkman Fiona SL
Hsiao William WL
Langille Morgan GI
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Genomic islands (GIs) are clusters of genes in prokaryotic genomes of probable horizontal origin. GIs are disproportionately associated with microbial adaptations of medical or environmental interest. Recently, multiple programs for automated detection of GIs have been developed that utilize sequence composition characteristics, such as G+C ratio and dinucleotide bias. To robustly evaluate the accuracy of such methods, we propose that a dataset of GIs be constructed using criteria that are independent of sequence composition-based analysis approaches. Results: We developed a comparative genomics approach (IslandPick) that identifies both very probable islands and non-island regions. The approach involves 1) flexible, automated selection of comparative genomes for each query genome, using a distance function that picks appropriate genomes for identification of GIs, 2) identification of regions unique to the query genome, compared with the chosen genomes (positive dataset) and 3) identification of regions conserved across all genomes (negative dataset). Using our constructed datasets, we investigated the accuracy of several sequence composition-based GI prediction tools. Conclusion: Our results indicate that AlienHunter has the highest recall, but the lowest measured precision, while SIGI-HMM is the most precise method. SIGI-HMM and IslandPath/DIMOB have comparable overall highest accuracy. Our comparative genomics approach, IslandPick, was the most accurate, compared with a curated list of GIs, indicating that we have constructed suitable datasets. This represents the first evaluation, using diverse and independent datasets that were not artificially constructed, of the accuracy of several sequence composition-based GI predictors. The caveats associated with this analysis and proposals for optimal island prediction are discussed.
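The abstract compares predictors by precision, recall and accuracy against a positive (island) and a negative (non-island) dataset. The sketch below shows one plausible way to compute those three measures at the nucleotide level. It is an illustration only, not the authors' evaluation code: the function names, the half-open `(start, end)` interval representation, and the example coordinates are all assumptions, and the intervals within each list are assumed not to overlap one another.

```python
def overlap(a, b):
    """Bases shared by two lists of half-open (start, end) intervals.

    Assumes intervals within each list do not overlap each other.
    """
    return sum(max(0, min(e1, e2) - max(s1, s2))
               for s1, e1 in a for s2, e2 in b)


def total(intervals):
    """Total number of bases covered by a list of intervals."""
    return sum(end - start for start, end in intervals)


def evaluate(predicted, positive, negative):
    """Score predicted islands against positive and negative reference regions.

    Bases outside both reference datasets are ignored (an assumption of
    this sketch), so only positions with a known label are counted.
    """
    tp = overlap(predicted, positive)   # island bases that were predicted
    fp = overlap(predicted, negative)   # non-island bases that were predicted
    fn = total(positive) - tp           # island bases that were missed
    tn = total(negative) - fp           # non-island bases correctly left out
    labelled = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / labelled if labelled else 0.0,
    }


# Hypothetical coordinates, chosen only to make the arithmetic easy to follow.
scores = evaluate(
    predicted=[(100, 200), (500, 600)],
    positive=[(150, 250)],
    negative=[(550, 700), (900, 1000)],
)
print(scores)
```

With these made-up inputs, a tool that predicts many bases tends to gain recall at the cost of precision, which is the kind of trade-off the abstract reports for AlienHunter versus SIGI-HMM.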

Springer - Publisher Connector

Identification of the Regulatory Logic Controlling Salmonella Pathoadaptation by the SsrA-SsrB Two-Component System

Author: Brinkman Fiona
Coombes Brian
Mulder David
Tomljenovic-Berube
Whiteside Matthew
Publication date: 01/01/2010

Sequence data from the past decade has laid bare the significance of horizontal gene transfer in creating genetic diversity in the bacterial world. Regulatory evolution, in which non-coding DNA is mutated to create new regulatory nodes, also contributes to this diversity to allow niche adaptation and the evolution of pathogenesis. To survive in the host environment, Salmonella enterica uses a type III secretion system and effector proteins, which are activated by the SsrA-SsrB two-component system in response to the host environment. To better understand the phenomenon of regulatory evolution in S. enterica, we defined the SsrB regulon and asked how this transcription factor interacts with the cis-regulatory region of target genes. Using ChIP-on-chip, cDNA hybridization, and comparative genomics analyses, we describe the SsrB-dependent regulon of ancestral and horizontally acquired genes. Further, we used a genetic screen and computational analyses integrating experimental data from S. enterica and sequence data from an orthologous regulatory system in the insect endosymbiont, Sodalis glossinidius, to identify the conserved yet flexible palindrome sequence that defines DNA recognition by SsrB. Mutational analysis of a representative promoter validated this palindrome as the minimal architecture needed for regulatory input by SsrB. These data provide a high-resolution map of a regulatory network and the underlying logic enabling pathogen adaptation to a host

Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

Author: Brinkman Fiona S. L.
Foroushani Amir B. K.
Lynn David J
Publication venue: 'PeerJ'
Publication date: 01/12/2013

Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members as equally informative. As a result, at times spurious and misleading pathways are inappropriately identified as statistically significant, solely due to components that they share with the more relevant pathways. Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS) as pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation of several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results. Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories, is available for download from http://sigora.googlecode.com/svn/
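The idea of a gene-pair signature can be illustrated with a small sketch: a pair of genes counts as a signature when both genes occur together in exactly one pathway, and a pathway is scored by how many of its signatures fall inside the query gene list. This is a simplified illustration, not the SIGORA R package or its actual statistics: the function names, the toy pathways, and the unweighted hypergeometric tail used for scoring are all assumptions made here.

```python
from collections import defaultdict
from itertools import combinations
from math import comb


def gene_pair_signatures(pathways):
    """Map each pathway to the gene pairs that co-occur in that pathway only."""
    owners = defaultdict(set)
    for name, genes in pathways.items():
        for pair in combinations(sorted(genes), 2):
            owners[pair].add(name)
    signatures = defaultdict(set)
    for pair, names in owners.items():
        if len(names) == 1:                      # pair is specific to one pathway
            signatures[next(iter(names))].add(pair)
    return signatures


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement."""
    top = sum(comb(successes, i) * comb(population - successes, draws - i)
              for i in range(k, min(successes, draws) + 1))
    return top / comb(population, draws)


def signature_enrichment(pathways, query):
    """Per-pathway p-value for over-representation of its signatures in query."""
    signatures = gene_pair_signatures(pathways)
    all_pairs = sum(len(pairs) for pairs in signatures.values())
    hits = {name: sum(1 for a, b in signatures.get(name, ())
                      if a in query and b in query)
            for name in pathways}
    drawn = sum(hits.values())                   # signatures present in the query
    return {name: hypergeom_tail(hits[name], all_pairs,
                                 len(signatures.get(name, ())), drawn)
            for name in pathways}


# Hypothetical toy pathways: g2 and g3 are shared by A and B, so the pair
# (g2, g3) is not a signature of either pathway.
toy = {"A": {"g1", "g2", "g3"}, "B": {"g2", "g3", "g4"}, "C": {"g4", "g5"}}
print(signature_enrichment(toy, query={"g1", "g2", "g3"}))
```

In the toy example the shared pair (g2, g3) contributes nothing to pathway B, which is the behavior the abstract motivates: a pathway is not flagged merely because it shares members with a more relevant one.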

T-Stór

InnateDB: systems biology of innate immunity and beyond—recent updates and continuing curation

Author: Breuer Karin
Brinkman Fiona S. L.
Chen Carol
Foroushani Amir K.
Hancock Robert E.W.
Laird Matthew R
Lo Raymond
Lynn David J
Sribnaia Anastasia
Winsor Geoffrey L
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

InnateDB (http://www.innatedb.com) is an integrated analysis platform that has been specifically designed to facilitate systems-level analyses of mammalian innate immunity networks, pathways and genes. In this article, we provide details of recent updates and improvements to the database. InnateDB now contains >196 000 human, mouse and bovine experimentally validated molecular interactions and 3000 pathway annotations of relevance to all mammalian cellular systems (i.e. not just immune relevant pathways and interactions). In addition, the InnateDB team has, to date, manually curated in excess of 18 000 molecular interactions of relevance to innate immunity, providing unprecedented insight into innate immunity networks, pathways and their component molecules. More recently, InnateDB has also initiated the curation of allergy- and asthma-related interactions. Furthermore, we report a range of improvements to our integrated bioinformatics solutions including web service access to InnateDB interaction data using Proteomics Standards Initiative Common Query Interface, enhanced Gene Ontology analysis for innate immunity, and the availability of new network visualization tools. Finally, the recent integration of bovine data makes InnateDB the first integrated network analysis platform for this agriculturally important model organism.
This work was supported by Genome BC through the Pathogenomics of Innate Immunity (PI2) project and by the Foundation for the National Institutes of Health and the Canadian Institutes of Health Research under the Grand Challenges in Global Health Research Initiative [Grand Challenges ID: 419]. Further funding was also provided by AllerGen grants 12ASI1 and 12B&B2. D.J.L. was funded in part during this project by a postdoctoral trainee award from the Michael Smith Foundation for Health Research (MSFHR). F.S.L.B. is a MSFHR Senior Scholar and R.E.W.H. holds a Canada Research Chair (CRC). Funding to enable bovine systems biology in InnateDB is provided by Teagasc [RMIS6018] and the Teagasc Walsh Fellowship scheme. IMEx is funded by the European Commission under the PSIMEx project [contract number FP7-HEALTH-2007-223411]. Funding for open access charge: Teagasc [RMIS6018].

PSORTdb: a protein subcellular localization database for bacteria

Author: Acab Michael
Brinkman Fiona S. L.
deFays Katalin
Gardy Jennifer L.
Laird Matthew R.
Lambert Christophe
Rey Sébastien
Publication venue: Oxford University Press
Publication date: 17/12/2004

Information about bacterial subcellular localization (SCL) is important for protein function prediction and identification of suitable drug/vaccine/diagnostic targets. PSORTdb (http://db.psort.org/) is a web-accessible database of SCL for bacteria that contains both information determined through laboratory experimentation and computational predictions. The dataset of experimentally verified information (∼2000 proteins) was manually curated by us and represents the largest dataset of its kind. Earlier versions have been used for training SCL predictors, and its incorporation now into this new PSORTdb resource, with its associated additional annotation information and dataset version control, should aid researchers in future development of improved SCL predictors. The second component of this database contains computational analyses of proteins deduced from the most recent NCBI dataset of completely sequenced genomes. Analyses are currently calculated using PSORTb, the most precise automated SCL predictor for bacterial proteins. Both datasets can be accessed through the web using a very flexible text search engine, a data browser, or using BLAST, and the entire database or search results may be downloaded in various formats. Features such as GO ontologies and multiple accession numbers are incorporated to facilitate integration with other bioinformatics resources. PSORTdb is freely available under GNU General Public License

CiteSeerX

Repository of the University of Namur

The Burkholderia Genome Database: facilitating flexible queries and comparative analyses

Author: Bhavjinder Khaira
Dmitrij Frishman
Fiona S. L. Brinkman
Geoffrey L. Winsor
Matthew D. Whiteside
Raymond Lo
Thea Van Rossum
Publication venue: Oxford University Press
Publication date: 01/01/2008

Summary: As the genome sequences of multiple strains of a given bacterial species are obtained, more generalized bacterial genome databases may be complemented by databases that are focused on providing more information geared for a distinct bacterial phylogenetic group and its associated research community. The Burkholderia Genome Database represents a model for such a database, providing a powerful, user-friendly search and comparative analysis interface that contains features not found in other genome databases. It contains continually updated, curated and tracked information about Burkholderia cepacia complex genome annotations, plus other Burkholderia species genomes for comparison, providing a high-quality resource for its targeted cystic fibrosis research community

CiteSeerX

Public Library of Science (PLOS)

Evidence of a Large Novel Gene Pool Associated with Prokaryotic Genomic Islands

Author: B. Brett Finlay
Claire Fraser
Dana Aeschliman
Fiona S. L Brinkman
Jenny Bryan
Korine Ung
William W. L Hsiao
Publication venue: Public Library of Science
Publication date: 01/01/2005

Microbial genes that are “novel” (no detectable homologs in other species) have become of increasing interest as environmental sampling suggests that there are many more such novel genes in yet-to-be-cultured microorganisms. By analyzing known microbial genomic islands and prophages, we developed criteria for systematic identification of putative genomic islands (clusters of genes of probable horizontal origin in a prokaryotic genome) in 63 prokaryotic genomes, and then characterized the distribution of novel genes and other features. All but a few of the genomes examined contained significantly higher proportions of novel genes in their predicted genomic islands compared with the rest of their genome (paired t test, p = 4.43E-14 to 1.27E-18, depending on method). Moreover, the reverse observation (i.e., higher proportions of novel genes outside of islands) never reached statistical significance in any organism examined. We show that this higher proportion of novel genes in predicted genomic islands is not due to less accurate gene prediction in genomic island regions, but likely reflects a genuine increase in novel genes in these regions for both bacteria and archaea. This represents the first comprehensive analysis of novel genes in prokaryotic genomic islands and provides clues regarding the origin of novel genes. Our collective results imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited. Moreover, there are more novel genes within the gene pool associated with genomic islands. Since genomic islands are frequently associated with a particular microbial adaptation, such as antibiotic resistance, pathogen virulence, or metal resistance, this suggests that microbes may have access to a larger “arsenal” of novel genes for adaptation than previously thought.

The Francis Crick Institute

The Association of Virulence Factors with Genomic Islands

Author: Amber Fedynak
Fiona S. L. Brinkman
Morgan G. I. Langille
Niyaz Ahmed
Shannan J. Ho Sui
William W. L. Hsiao
Publication venue: Public Library of Science
Publication date: 01/01/2009

Background: It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced palindromic repeats (CRISPRs) proposed to confer resistance to phage. Methodology/Principal Findings: We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p < 1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more “offensive” functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study. Conclusions/Significance: This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. “Offensive” virulence factors, as opposed to host-interaction factors, may more ofte

CiteSeerX

Public Library of Science (PLOS)

Testing and healthcare seeking behavior preceding HIV diagnosis among migrant and non-migrant individuals living in the Netherlands: Directions for early-case finding

Author: Bedert Maarten
Bil Janneke P
Brinkman Kees
Burns Fiona
Davidovich Udi
Leyten Eliane
Prins Jan M
Prins Maria
van Bilsen Ward PH
van Sighem Ard
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2022

OBJECTIVES: To assess differences in socio-demographics, HIV testing and healthcare seeking behavior between individuals diagnosed late and those diagnosed early after HIV-acquisition. DESIGN: Cross-sectional study among recently HIV-diagnosed migrant and non-migrant individuals living in the Netherlands. METHODS: Participants self-completed a questionnaire on socio-demographics, HIV-testing and healthcare seeking behavior preceding HIV diagnosis between 2013 and 2015. Using multivariable logistic regression, socio-demographic determinants of late diagnosis were explored. Variables on HIV-infection, testing and access to care preceding HIV diagnosis were compared between those diagnosed early and those diagnosed late using descriptive statistics. RESULTS: We included 143 individuals with early and 101 with late diagnosis, of whom respectively 59/143 (41%) and 54/101 (53%) were migrants. Late diagnosis was significantly associated with older age and being heterosexual. Before HIV diagnosis, 89% of those with early and 62% of those with late diagnosis had ever been tested for HIV-infection (p<0.001), and respectively 99% and 97% reported healthcare usage in the Netherlands in the two years preceding HIV diagnosis (p = 0.79). Individuals diagnosed late most frequently visited a general practitioner (72%) or dentist (62%), and 20% had been hospitalized preceding diagnosis. In these settings, HIV testing was discussed in only 20%, 2%, and 6% of cases, respectively. CONCLUSION: A large proportion of people diagnosed late had previously tested for HIV and had high levels of healthcare usage. For earlier-case finding of HIV it therefore seems feasible to successfully roll out interventions within the existing healthcare system. Simultaneously, efforts should be made to encourage future repeated or routine HIV testing among individuals whenever they undergo an HIV test.

UCL Discovery