GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences

Author: Gao Feng
Zhang Chun-Ting
Publication venue: Oxford University Press
Publication date: 14/07/2006

In order to understand the evolution, structure and function of genomes, it is important to know the general compositional features of DNA sequences. Based on the quadratic divergence, a new segmentation algorithm to partition a given genome or DNA sequence into compositionally distinct domains has been put forward. With the aid of the technique of cumulative GC profile, the distribution of segmentation points can be displayed intuitively. We have therefore developed them into GC-Profile, an interactive web-based software system, which can be used to segment prokaryotic and eukaryotic genomes. GC-Profile provides a quantitative and qualitative view of genome organization. Based on the obtained results, the relationships between the G+C content and other genomic features, such as distributions of genes and CpG islands, can be analyzed in a perceivable manner. It shows that GC-Profile would be an appropriate starting point for analyzing the isochore structure of higher eukaryotic genomes, and an intuitive tool for identifying genomic islands in prokaryotic genomes. GC-Profile is freely available at the website . In addition, precompiled binaries, together with examples and documentation, can also be freely downloaded for a local execution
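
The idea of reading compositional domains off a cumulative curve can be illustrated with a minimal Python sketch. It is not the GC-Profile algorithm: the quadratic-divergence segmentation is not reproduced, and the function names (`gc_fraction`, `windowed_gc`, `cumulative_gc_deviation`) are hypothetical. The cumulative curve here is a simplified stand-in that sums each base's deviation from the sequence-wide mean G+C, so that a G+C-rich stretch appears as a rising run and a turning point suggests a domain boundary.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def windowed_gc(seq, window, step):
    """G+C fraction of each full window, moving along seq by `step` bases."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


def cumulative_gc_deviation(seq):
    """Running sum of (is_GC - mean GC); rises in GC-rich stretches.

    Simplified stand-in for a cumulative GC profile (an assumption,
    not the published definition).
    """
    seq = seq.upper()
    mean_gc = gc_fraction(seq)
    profile, running = [], 0.0
    for base in seq:
        running += (1.0 if base in "GC" else 0.0) - mean_gc
        profile.append(running)
    return profile


if __name__ == "__main__":
    toy = "GGCCAATT"
    print(windowed_gc(toy, window=4, step=4))
    print(cumulative_gc_deviation(toy))
```

On the toy sequence the cumulative curve peaks at the last base of the G+C-rich half, which is where a boundary between the two domains would be placed.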

Crossref

PubMed Central

Towards pathogenomics: a web-based resource for pathogenicity islands

Author: Choi Doil
Hur Cheol-Goo
Kim Jihyun F.
Lee Soohyun
Oh Tae Kwang
Park Young-Kyu
Yoon Sung Ho
Publication venue: Oxford University Press
Publication date: 01/01/2006

Pathogenicity islands (PAIs) are genetic elements whose products are essential to the process of disease development. They have been horizontally (laterally) transferred from other microbes and are important in evolution of pathogenesis. In this study, a comprehensive database and search engines specialized for PAIs were established. The pathogenicity island database (PAIDB) is a comprehensive relational database of all the reported PAIs and potential PAI regions which were predicted by a method that combines feature-based analysis and similarity-based analysis. Also, using the PAI Finder search application, a multi-sequence query can be analyzed onsite for the presence of potential PAIs. As of April 2006, PAIDB contains 112 types of PAIs and 889 GenBank accessions containing either partial or all PAI loci previously reported in the literature, which are present in 497 strains of pathogenic bacteria. The database also offers 310 candidate PAIs predicted from 118 sequenced prokaryotic genomes. With the increasing number of prokaryotic genomes without functional inference and sequenced genetic regions of suspected involvement in diseases, this web-based, user-friendly resource has the potential to be of significant use in pathogenomics. PAIDB is freely accessible at

CiteSeerX

Crossref

PubMed Central

The Wavelet-Based Cluster Analysis for Temporal Gene Expression Data

Author: Duan KM
Song JZ
Surette M
Ware T
Publication venue: BioMed Central
Publication date: 01/01/2007

A variety of high-throughput methods have made it possible to generate detailed temporal expression data for a single gene or large numbers of genes. Common methods for analysis of these large data sets can be problematic. One challenge is the comparison of temporal expression data obtained from different growth conditions where the patterns of expression may be shifted in time. We propose the use of wavelet analysis to transform the data obtained under different growth conditions to permit comparison of expression patterns from experiments that have time shifts or delays. We demonstrate this approach using detailed temporal data for a single bacterial gene obtained under 72 different growth conditions. This general strategy can be applied in the analysis of data sets of thousands of genes under different conditions

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Application of Wavelet Packet Transform to detect genetic polymorphisms by the analysis of inter-Alu PCR patterns

Author: Bazzani Armando
Cardelli Maurizio
Franceschi Claudio
Nicoli Matteo
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The analysis of inter-Alu PCR patterns obtained from human genomic DNA samples is a promising technique for the simultaneous analysis of many genomic loci flanked by Alu repetitive sequences in order to detect the presence of genetic polymorphisms. Inter-Alu PCR products may be separated and analyzed by capillary electrophoresis using an automatic sequencer that generates a complex pattern of peaks. We propose an algorithmic method based on the Haar-Walsh Wavelet Packet Transformation (WPT) for the efficient detection of fingerprint-type patterns generated by PCR-based methodologies. We have tested our algorithmic approach on inter-Alu patterns obtained from the genomic DNA of three pairs of monozygotic twins, expecting that the inter-Alu patterns within each twin pair will show differences due only to unavoidable experimental variability. In contrast, the differences among samples from different twin pairs are assumed to originate from genetic variability. Our goal is to automatically detect regions in the inter-Alu pattern likely to be associated with the presence of genetic polymorphisms. Results: We show that the WPT algorithm provides a reliable tool to identify sample-to-sample differences in complex peak patterns, reducing the possible errors and limits associated with a subjective evaluation. The redundant decomposition of the WPT algorithm allows for a best-basis selection procedure that maximizes the pattern differences at the lowest possible scale. Our analysis points out a few classifying signal regions that could indicate the presence of genetic polymorphisms. Conclusions: The WPT algorithm based on the Haar-Walsh wavelet is an efficient tool for unsupervised pattern classification of inter-Alu signals provided by a genetic analyzer, although it was not possible to estimate the power and false positive rate owing to the lack of a suitable database. The identification of non-reproducible peaks is usually accomplished by comparing different experimental replicates of each sample. Moreover, we remark that, although we developed and optimized an algorithm to analyze patterns obtained through inter-Alu PCR, the method is theoretically applicable to any fingerprint-type pattern obtained by analyzing anonymous DNA fragments through capillary electrophoresis, and it could be usefully applied to a wide range of fingerprint-type methodologies.
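
The redundant decomposition mentioned in the abstract can be sketched in Python as a full Haar wavelet packet tree, in which both the average and the difference branch are split again at every level. This is an illustrative sketch only: `haar_step` and `haar_packet` are hypothetical names, the nodes are returned in natural (not sequency) order, and the best-basis selection used in the paper is not implemented.

```python
from math import sqrt


def haar_step(signal):
    """One orthonormal Haar split into (averages, differences)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / scale for i in pairs]
    return approx, detail


def haar_packet(signal, levels):
    """Full packet tree: every node is split at every level.

    Returns the 2**levels leaf nodes; the signal length must be
    divisible by 2**levels.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


if __name__ == "__main__":
    trace_a = [4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 0.0, 2.0]
    trace_b = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
    for node_a, node_b in zip(haar_packet(trace_a, 2), haar_packet(trace_b, 2)):
        # Coefficient-wise differences show where the two traces diverge.
        print([round(a - b, 3) for a, b in zip(node_a, node_b)])
```

Because each split is orthonormal, the total energy of the coefficients equals that of the input trace, so differences between two traces are redistributed across nodes rather than lost.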

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Genome Scale Comparison of Mycobacterium avium subsp. paratuberculosis with Mycobacterium avium subsp. avium Reveals Potential Diagnostic Sequences

Author: Baechler Emily
Bannantine John
Kapur Vivek
Li LingLing
Zhang Qing
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2002

The genetic similarity between Mycobacterium avium subsp. paratuberculosis and other mycobacterial species has confounded the development of M. avium subsp. paratuberculosis-specific diagnostic reagents. Random shotgun sequencing of the M. avium subsp. paratuberculosis genome in our laboratories has shown >98% sequence identity with Mycobacterium avium subsp. avium in some regions. However, an in silico comparison of the largest annotated M. avium subsp. paratuberculosis contigs, totaling 2,658,271 bp, with the unfinished M. avium subsp. avium genome has revealed 27 predicted M. avium subsp. paratuberculosis coding sequences that do not align with M. avium subsp. avium sequences. BLASTP analysis of the 27 predicted coding sequences (genes) shows that 24 do not match sequences in public sequence databases, such as GenBank. These novel sequences were examined by PCR amplification with genomic DNA from eight mycobacterial species and ten independent isolates of M. avium subsp. paratuberculosis. From these analyses, 21 genes were found to be present in all M. avium subsp. paratuberculosis isolates and absent from all other mycobacterial species tested. One region of the M. avium subsp. paratuberculosis genome contains a cluster of eight genes, arranged in tandem, that is absent in other mycobacterial species. This region spans 4.4 kb and is separated from other predicted coding regions by 1,408 bp upstream and 1,092 bp downstream. The gene upstream of this eight-gene cluster has strong similarity to mycobacteriophage integrase sequences. The GC content of this 4.4-kb region is 66%, which is similar to the rest of the genome, indicating that this region was not horizontally acquired recently. Southern hybridization analysis confirmed that this gene cluster is present only in M. avium subsp. paratuberculosis. Collectively, these studies suggest that a genomics approach will help in identifying novel M. avium subsp. paratuberculosis genes as candidate diagnostic sequences

DigitalCommons@University of Nebraska

PubMed Central

A Benchmark of Parametric Methods for Horizontal Transfers Detection

Author: A Carbone
A Tsirigos
B Wang
C Dufraigne
C Dutta
C Medigue
C Regeard
Cécile Churlaud
DQ Cortez
E Lerat
G Perriere
H Ochman
J Hacker
J Hacker
J Mrazek
JA Eisen
Jennifer Becq
JG Lawrence
JG Lawrence
JG Lawrence
JP Gogarten
JP Gogarten
L Koski
L Ruiting
M Hamady
M Ip
M Letek
M Poptsova
MA Ragan
MA Ragan
MA Ragan
MGI Langille
MGI Langille
MW van Passel
N Sueoka
N Sueoka
Olivier Neyrolles
P Deschavanne
P Lio
P Lio
Patrick Deschavanne
PJ Deschavanne
Q Tu
R Merkl
R Rolfe
RK Azad
RK Azad
S Garcia-Vallve
S Garcia-Vallvé
S Guindon
S Karlin
S Karlin
S Karlin
S Schjorring
S Waack
SD Hooper
SH Yoon
V Daubin
V Daubin
W Hsiao
WF Doolittle
WS Hayes
Y Nakamura
Publication venue: Public Library of Science
Publication date: 01/04/2010

Horizontal gene transfer (HGT) appears to be important for the evolution of prokaryotic species. As a consequence, numerous parametric methods, which use only the information embedded in the genomes, have been designed to detect HGTs. Numerous reports have been published of incongruent results when different methods are applied to the same genomes. The use of artificial genomes in which all HGT parameters are controlled allows different methods to be tested under the same conditions. The results of this benchmark of 16 representative parametric methods showed a wide range of efficiencies. Some methods perform very poorly whatever the type of HGT, and some depend on the conditions or on the metrics used. The best methods in terms of total errors were those using tetranucleotides as the criterion for window methods, and those using codon usage with the Kullback-Leibler divergence metric for gene-based methods. Window methods are very sensitive but less specific, and they detect lone isolated genes poorly. Gene-based methods, on the other hand, are often very specific but lack sensitivity. We propose using two methods in combination to get the best of each category: a gene-based one for specificity and a window-based one for sensitivity
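
A window method of the kind benchmarked here can be sketched in Python as follows. The sketch scores each window by the Kullback-Leibler divergence between its tetranucleotide frequencies and those of the whole genome. Pairing tetranucleotides with this metric is a choice made for the example (the abstract names the metric for the gene-based methods), the function names are hypothetical, and the add-one pseudocount, window size and step are assumptions rather than values taken from the benchmark.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_freqs(seq, k=4, pseudocount=1.0):
    """Smoothed k-mer frequencies over the ACGT alphabet.

    The pseudocount (an assumption) keeps every frequency non-zero so
    the divergence below is always defined.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(p[m] * log(p[m] / q[m]) for m in p)


def window_scores(genome, window, step, k=4):
    """(start, divergence from the genome-wide composition) per window."""
    background = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        local = kmer_freqs(genome[start:start + window], k)
        scores.append((start, kl_divergence(local, background)))
    return scores


if __name__ == "__main__":
    # Toy genome: an AT-rich host with a GC-rich insert at position 400.
    genome = "AT" * 200 + "GC" * 50 + "AT" * 200
    for start, score in window_scores(genome, window=100, step=100):
        print(start, round(score, 3))
```

Windows with the highest scores are candidate horizontally transferred regions. A practical tool would also need a threshold, which the abstract does not specify.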

Public Library of Science (PLOS)

Crossref

PubMed Central

A computational approach for identifying pathogenicity islands in prokaryotic genomes

Author: Hur Cheol-Goo
Kang Ho-Young
Kim Jihyun F
Kim Yeoun Hee
Oh Tae Kwang
Yoon Sung Ho
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Pathogenicity islands (PAIs), distinct genomic segments of pathogens encoding virulence factors, represent a subgroup of genomic islands (GIs) that have been acquired by horizontal gene transfer events. Up to now, computational approaches for identifying PAIs have focused on the detection of genomic regions that differ from the rest of the genome only in their base composition and codon usage. These approaches often lead to the identification of genomic islands rather than PAIs. RESULTS: We present a computational method for detecting potential PAIs in complete prokaryotic genomes by combining sequence similarities and abnormalities in genomic composition. We first collected 207 GenBank accessions containing either part or all of the reported PAI loci. In sequenced genomes, strips of PAI-homologs were defined based on the proximity of the homologs of genes in the same PAI accession. An algorithm reminiscent of a sequence-assembly procedure was then devised to merge overlapping or adjacent genomic strips into a large genomic region. Among the defined genomic regions, PAI-like regions were identified by the presence of homolog(s) of virulence genes. Also, GIs were postulated by calculating G+C content anomalies and codon usage bias. Of 148 prokaryotic genomes examined, 23 pathogenic and 6 non-pathogenic bacteria contained 77 candidate PAIs that partly or entirely overlap GIs. CONCLUSION: Supporting the validity of our method, the list of candidate PAIs included thirty-four PAIs previously identified in genome sequencing papers. Furthermore, in some instances, our method was able to detect entire PAIs for which only partial sequences are available. Our method proved to be efficient for demarcating the potential PAIs in our study. Also, the function(s) and origin(s) of a candidate PAI can be inferred by investigating the PAI queries comprising it. Identification and analysis of potential PAIs in prokaryotic genomes will broaden our knowledge of the structure and properties of PAIs and the evolution of bacterial pathogenesis
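
The step that merges overlapping or adjacent genomic strips into larger regions can be sketched in Python as a standard interval merge. This is a guess at the general shape of such a procedure, not the authors' implementation: `merge_strips` is a hypothetical name, strips are assumed to be `(start, end)` coordinate pairs, and `max_gap` (how far apart two strips may lie and still count as adjacent) is an assumed parameter that the abstract does not specify.

```python
def merge_strips(strips, max_gap=0):
    """Merge (start, end) strips that overlap or lie within max_gap bases.

    Strips are sorted by start; each one either extends the current
    region or opens a new one.
    """
    merged = []
    for start, end in sorted(strips):
        if merged and start <= merged[-1][1] + max_gap:
            # Overlapping or adjacent: extend the current region.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


if __name__ == "__main__":
    strips = [(1200, 3400), (3000, 5200), (5300, 6100), (20000, 21000)]
    print(merge_strips(strips))               # overlap only
    print(merge_strips(strips, max_gap=500))  # also join nearby strips
```

In the method described above, a merged region would then be kept as PAI-like only if it contains at least one homolog of a virulence gene. That filter is not shown here.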

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PIPS: Pathogenicity Island Prediction Software

The adaptability of pathogenic bacteria to hosts is influenced by the genomic plasticity of the bacteria, which can be increased by such mechanisms as horizontal gene transfer. Pathogenicity islands play a major role in this type of gene transfer because they are large, horizontally acquired regions that harbor clusters of virulence genes that mediate the adhesion, colonization, invasion, immune system evasion, and toxigenic properties of the acceptor organism. Currently, pathogenicity islands are mainly identified in silico based on various characteristic features: (1) deviations in codon usage, G+C content or dinucleotide frequency and (2) insertion sequences and/or tRNA genetic flanking regions together with transposase coding genes. Several computational techniques for identifying pathogenicity islands exist. However, most of these techniques are only directed at the detection of horizontally transferred genes and/or the absence of certain genomic regions of the pathogenic bacterium in closely related non-pathogenic species. Here, we present a novel software suite designed for the prediction of pathogenicity islands (pathogenicity island prediction software, or PIPS). In contrast to other existing tools, our approach is capable of utilizing multiple features for pathogenicity island detection in an integrative manner. We show that PIPS provides better accuracy than other available software packages. As an example, we used PIPS to study the veterinary pathogen Corynebacterium pseudotuberculosis, in which we identified seven putative pathogenicity islands

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

MPG.PuRe

The Influence of Recombination on Human Genetic Diversity

Author: Bernard Silverman
Chris C. A Spencer
David Bentley
Gil McVean
International Human Genome Sequencing Consortium
Jeffrey D Wall
Jim Mullikin
Panos Deloukas
Peter Donnelly
Sarah Hunt
Simon Myers
The International HapMap Consortium
Publication venue: Public Library of Science
Publication date: 01/01/2005

In humans, the rate of recombination, as measured on the megabase scale, is positively associated with the level of genetic variation, as measured at the genic scale. Despite considerable debate, it is not clear whether these factors are causally linked or, if they are, whether this is driven by the repeated action of adaptive evolution or molecular processes such as double-strand break formation and mismatch repair. We introduce three innovations to the analysis of recombination and diversity: fine-scale genetic maps estimated from genotype experiments that identify recombination hotspots at the kilobase scale, analysis of an entire human chromosome, and the use of wavelet techniques to identify correlations acting at different scales. We show that recombination influences genetic diversity only at the level of recombination hotspots. Hotspots are also associated with local increases in GC content and the relative frequency of GC-increasing mutations but have no effect on substitution rates. Broad-scale association between recombination and diversity is explained through covariance of both factors with base composition. To our knowledge, these results are the first evidence of a direct and local influence of recombination hotspots on genetic variation and the fate of individual mutations. However, that hotspots have no influence on substitution rates suggests that they are too ephemeral on an evolutionary time scale to have a strong influence on broader scale patterns of base composition and long-term molecular evolution

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Bacterial genomic G + C composition-eliciting environmental adaptation

Author: Alberts
Allen
Altschul
Ashelford
Beiko
Beiko
Beiko
Beiko
Bentley
Boc
Bridges
Budd
Canchaya
Chen
Chen
Chen
Connell
Daubin
Deschavanne
Dobrindt
Eden
Ermolaeva
Fichant
Foerstner
Fouts
Frost
Garcia-Vallve
Garcia-Vallve
Glass
Greub
Guttman
Hacker
Hacker
Hagberg
Hamady
Hein
Hill
Ikemura
Jain
Kagawa
Karaolis
Karlin
Kurland
Lee
Leslie
Lin
Lindahl
Lio
Lobry
Louarn
Lu
Mann
Mantri
Middendorf
Miller
Mitchell
Moran
Musto
Musto
Muto
Nakabachi
Nakhleh
Naya
Nishio
Ochman
Oliver
Pace
Parham
Perna
Perriere
Peshkin
Rocha
Romeu
Rothstein
Schmidt
Schneider
Schouls
Scott Mann
Sharp
Sharp
Shigenobu
Suchard
Sueoka
Than
Toh
Tsirigos
Tsirigos
van Ham
van Passel
Varki
Vernikos
Wang
Wang
Wixon
Xia
Yap
Yi-Ping Phoebe Chen
Yoon
Yoon
Zhang
Zhao
Publication venue: Elsevier BV
Publication date: 01/01/2010

Bacterial genomes reflect their adaptation strategies through nucleotide usage trends found in their chromosome composition. Bacteria, unlike eukaryotes, contain a wide range of genomic G + C. This wide variability may be viewed as a response to environmental adaptation. Two overarching trends are observed across bacterial genomes: the first correlates genomic G + C with environmental niches and lifestyle, while the other utilizes intra-genomic G + C incongruence to delineate horizontally transferred material. In this review, we focus on the influence of several properties, including biochemical, genetic flows, selection biases, and the biochemical-energetic properties shaping genome composition. Outcomes indicate a trend toward high G + C and larger genomes in free-living organisms, as a result of more complex and varied environments (higher chance for horizontal gene transfer). Conversely, nutrient-limiting and nutrient-poor environments dictate smaller genomes of low GC in attempts to conserve replication expense. Varied processes, including translesion repair mechanisms, phage insertion and cytosine degradation, have been shown to introduce higher AT in genomic sequences. We conclude the review with an analysis of current bioinformatics tools seeking to elicit compositional variances and highlight the practical implications when using such techniques

Deakin Research Online

Elsevier - Publisher Connector

Crossref