460 research outputs found
The genetic organisation of prokaryotic two-component system signalling pathways
Background: Two-component systems (TCSs) are modular and diverse signalling pathways, involving a stimulus-responsive transfer of phosphoryl groups from transmitter to partner receiver domains. TCS gene and domain organisation are both potentially informative regarding biological function, interaction partnerships and molecular mechanisms. However, there is currently little understanding of the relationships between domain architecture, gene organisation and TCS pathway structure. Results: Here we classify the gene and domain organisation of TCS gene loci from 1405 prokaryotic replicons (>40,000 TCS proteins). We find that 200 bp is the most appropriate distance cut-off for defining whether two TCS genes are functionally linked. More than 90% of all TCS gene loci encode just one or two transmitter and/or receiver domains; however, numerous other geometries exist, often with large numbers of encoded TCS domains. Such information provides insights into the distribution of TCS domains between genes, and within genes. As expected, the organisation of TCS genes and domains is affected by phylogeny, and plasmid-encoded TCSs exhibit differences in organisation from their chromosomally-encoded counterparts. Conclusions: We provide here an overview of the genomic and genetic organisation of TCS domains, as a resource for further research. We also propose novel metrics that build upon TCS gene/domain organisation data and allow comparisons between genomic complements of TCSs. In particular, 'percentage orphaned TCS genes' (or 'Dissemination') and 'percentage of complex loci' (or 'Sophistication') appear to be useful discriminators, and to reflect mechanistic aspects of TCS organisation not captured by existing metrics.
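As a rough illustration of how a 200 bp cut-off might be used to group TCS genes into loci, the Python sketch below clusters genes by intergenic gap and reports the percentage of genes left alone in their locus. It is a minimal sketch, not the authors' procedure: the `Gene` record, the function names, the treatment of a replicon as linear and strand-agnostic, and the reading of "orphaned" as "the only TCS gene in its locus" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical record for one TCS gene on a replicon."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def group_into_loci(genes, max_gap=200):
    """Group genes into loci when the gap to the current locus is <= max_gap bp.

    Assumption: the replicon is treated as linear and strand is ignored.
    """
    loci = []
    for gene in sorted(genes, key=lambda g: g.start):
        if loci:
            locus_end = max(g.end for g in loci[-1])
            gap = gene.start - locus_end - 1  # bases between the two genes
            if gap <= max_gap:
                loci[-1].append(gene)
                continue
        loci.append([gene])
    return loci


def percent_orphaned(loci):
    """Percentage of genes that are alone in their locus (assumed definition)."""
    total = sum(len(locus) for locus in loci)
    orphans = sum(1 for locus in loci if len(locus) == 1)
    return 100.0 * orphans / total if total else 0.0


if __name__ == "__main__":
    demo = [Gene("A", 100, 1000), Gene("B", 1100, 1800), Gene("C", 5000, 6000)]
    demo_loci = group_into_loci(demo)
    print([[g.name for g in locus] for locus in demo_loci])
    print(percent_orphaned(demo_loci))
```

In this toy input, genes A and B are separated by fewer than 200 bp and fall into one locus, while C stands alone. A faithful reimplementation would also need to handle circular replicons and the domain-level definitions given in the paper itself.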
RegExpBlasting (REB), a Regular Expression Blasting algorithm based on multiply aligned sequences
Background: One of the most frequent uses of bioinformatics tools
concerns functional characterization of a newly produced nucleotide
sequence (a query sequence) by applying Blast or FASTA against a set of
sequences (the subject sequences).
However, in some specific contexts, it is useful to compare the query
sequence against a cluster such as a MultiAlignment (MA). We present
here the RegExpBlasting (REB) algorithm, which compares an unclassified
sequence with a dataset of patterns defined by application of Regular
Expression rules to an input dataset of MAs.
The REB algorithm workflow consists of:
i. the definition of a dataset of multialignments;
ii. the association of each MA to a pattern, defined by application of
regular expression rules;
iii. automatic characterization of a submitted biosequence according to
the function of the sequences described by the pattern best matching the
query sequence.
Results: An application of this algorithm is used in the "characterize
your sequence" tool available in the PPNEMA resource. PPNEMA is a
resource of Ribosomal Cistron sequences from various species, grouped
according to nematode genera. It allows the retrieval of plant nematode
multialigned sequences or the classification of new nematode rDNA
sequences by applying REB. The same algorithm also supports automatic
updating of the PPNEMA database. The present paper gives examples of the
use of REB within PPNEMA.
Conclusion: The use of REB in PPNEMA updating and in the PPNEMA
"characterize your sequence" option demonstrates the power of the method.
REB can also rapidly solve other bioinformatics problems in which
a new sequence must be added to a pre-existing cluster.
The statistical tests carried out here show the flexibility of
the method.
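The three-step REB workflow described in this abstract (a dataset of multialignments, one pattern per MA, classification by best-matching pattern) can be sketched in Python as follows. This is only a toy illustration under stated assumptions: the abstract does not give the actual regular expression rules, so the column-wise rule used here (conserved column becomes a literal, variable column becomes a character class, gapped column becomes optional) and the "longest match wins" scoring are stand-ins, and all function names are hypothetical.

```python
import re


def column_to_regex(column):
    """Turn one alignment column into a regex token (assumed rule, not REB's)."""
    residues = sorted(set(column) - {"-"})
    if not residues:
        return ""  # column contains only gaps
    token = residues[0] if len(residues) == 1 else "[" + "".join(residues) + "]"
    # A gap in the column means the position may be absent in a sequence.
    return token + "?" if "-" in column else token


def alignment_to_pattern(alignment):
    """Build one regex from a multiple alignment of equal-length rows.

    Assumes plain nucleotide letters plus '-' for gaps.
    """
    return "".join(column_to_regex(col) for col in zip(*alignment))


def classify(query, alignments):
    """Return the label whose pattern gives the longest match in the query."""
    best_label, best_length = None, 0
    for label, alignment in alignments.items():
        match = re.search(alignment_to_pattern(alignment), query)
        if match and len(match.group()) > best_length:
            best_label, best_length = label, len(match.group())
    return best_label


if __name__ == "__main__":
    # Invented toy alignments; the labels are placeholders, not PPNEMA data.
    toy = {
        "genusA": ["ACGT-A", "ACGTTA", "ACCT-A"],
        "genusB": ["GGGCCC", "GGGCCT"],
    }
    print(alignment_to_pattern(toy["genusA"]))
    print(classify("TTACGTAGG", toy))
```

A query that matches no pattern returns `None`, which is one simple way to flag sequences that would need manual inspection before being added to a cluster.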
Tumor taxonomy for the developmental lineage classification of neoplasms
BACKGROUND: The new "Developmental lineage classification of neoplasms" was described in a prior publication. The classification is simple (the entire hierarchy is described with just 39 classifiers), comprehensive (providing a place for every tumor of man), and consistent with recent attempts to characterize tumors by cytogenetic and molecular features. A taxonomy is a list of the instances that populate a classification. The taxonomy of neoplasia attempts to list every known term for every known tumor of man. METHODS: The taxonomy provides each concept with a unique code and groups synonymous terms under the same concept. A Perl script validated successive drafts of the taxonomy, ensuring that: 1) each term occurs only once in the taxonomy; 2) each term occurs in only one tumor class; 3) each concept code occurs in one and only one hierarchical position in the classification; and 4) the file containing the classification and taxonomy is a well-formed XML (eXtensible Markup Language) document. RESULTS: The taxonomy currently contains 122,632 different terms encompassing 5,376 neoplasm concepts. Each concept has, on average, 23 synonyms. The taxonomy populates "The developmental lineage classification of neoplasms," and is available as an XML file, currently 9+ Megabytes in length. A representation of the classification/taxonomy listing each term followed by its code, followed by its full ancestry, is available as a flat-file, 19+ Megabytes in length. The taxonomy is the largest nomenclature of neoplasms, with more than twice the number of neoplasm names found in other medical nomenclatures, including the 2004 version of the Unified Medical Language System, the Systematized Nomenclature of Medicine Clinical Terminology, the National Cancer Institute's Thesaurus, and the International Classification of Diseases Oncology version.
CONCLUSIONS: This manuscript describes a comprehensive taxonomy of neoplasia that collects synonymous terms under a unique code number and assigns each tumor to a single class within the tumor hierarchy. The entire classification and taxonomy are available as open access files (in XML and flat-file formats) with this article.
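The four validation checks listed in the METHODS of this abstract can be illustrated with a short script. The original validation was done in Perl; the sketch below uses Python instead, and the XML layout it expects (a root element containing `class` elements with a `name` attribute, each holding `concept` elements with a `code` attribute and `term` children) is a hypothetical schema invented for the example, not the layout of the published file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def validate_taxonomy(xml_text):
    """Return a list of problems found; an empty list means all checks pass.

    Assumed (hypothetical) layout: root > class[@name] > concept[@code] > term.
    """
    # Check 4: the document must be well-formed XML.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    term_counts = Counter()
    term_classes = {}
    code_counts = Counter()
    for tumor_class in root.findall("class"):
        for concept in tumor_class.findall("concept"):
            code_counts[concept.get("code")] += 1
            for term in concept.findall("term"):
                text = (term.text or "").strip().lower()
                term_counts[text] += 1
                term_classes.setdefault(text, set()).add(tumor_class.get("name"))

    problems = []
    # Check 1: each term occurs only once in the taxonomy.
    for text, count in term_counts.items():
        if count > 1:
            problems.append(f"term occurs {count} times: {text}")
    # Check 2: each term occurs in only one tumor class.
    for text, classes in term_classes.items():
        if len(classes) > 1:
            problems.append(f"term occurs in several classes: {text}")
    # Check 3: each concept code occurs in exactly one position.
    for code, count in code_counts.items():
        if count > 1:
            problems.append(f"concept code occurs {count} times: {code}")
    return problems


if __name__ == "__main__":
    sample = (
        '<taxonomy><class name="epithelial"><concept code="C1">'
        "<term>adenoma</term></concept></class></taxonomy>"
    )
    print(validate_taxonomy(sample))
```

Returning a list of messages rather than raising on the first failure makes it easier to review every inconsistency in a draft at once, which suits an iterative drafting process like the one described.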
Automatically extracting functionally equivalent proteins from SwissProt
In summary, FOSTA provides an automated analysis of annotations in UniProtKB/Swiss-Prot that enables the extraction of groups of proteins already annotated as functionally equivalent. Our results demonstrate that the vast majority of UniProtKB/Swiss-Prot functional annotations are of high quality, and that FOSTA can interpret annotations successfully. Where FOSTA is not successful, we are able to highlight inconsistencies in UniProtKB/Swiss-Prot annotation. Most of these would have presented equal difficulties for manual interpretation of annotations. We discuss limitations and possible future extensions to FOSTA, and recommend changes to the UniProtKB/Swiss-Prot format that would facilitate text-mining of UniProtKB/Swiss-Prot.
Detection of putative new mutacins by bioinformatic analysis using available web tools
To characterise new bacteriocins produced by Streptococcus mutans, we performed a complete bioinformatic analysis by scanning the genome sequences of strains UA159 and NN2025. By searching the genomic context adjacent to the two-component signal transduction system, we predicted the existence of many putative new bacteriocin maturation pathways, some of which were exclusive to a group of Streptococcus. Computational genomic and proteomic analysis combined with predictive functional analysis represents an alternative way to rapidly identify new putative bacteriocins, as well as new potential antimicrobial drugs, compared with the more traditional methods of drug discovery using antagonism tests.
Novel cyclic di-GMP effectors of the YajQ protein family control bacterial virulence
Bis-(3′,5′)-cyclic di-guanylate (cyclic di-GMP) is a key bacterial second messenger that is implicated in the regulation of many critical processes that include motility, biofilm formation and virulence. Cyclic di-GMP influences diverse functions through interaction with a range of effectors. Our knowledge of these effectors and their different regulatory actions is far from complete, however. Here we have used an affinity pull-down assay using cyclic di-GMP-coupled magnetic beads to identify cyclic di-GMP binding proteins in the plant pathogen Xanthomonas campestris pv. campestris (Xcc). This analysis identified XC_3703, a protein of the YajQ family, as a potential cyclic di-GMP receptor. Isothermal titration calorimetry showed that the purified XC_3703 protein bound cyclic di-GMP with a high affinity (Kd ≈ 2 μM). Mutation of XC_3703 led to reduced virulence of Xcc to plants and alteration in biofilm formation. Yeast two-hybrid and far-western analyses showed that XC_3703 was able to interact with XC_2801, a transcription factor of the LysR family. Mutation of XC_2801 and XC_3703 had partially overlapping effects on the transcriptome of Xcc, and both affected virulence. Electromobility shift assays showed that XC_3703 positively affected the binding of XC_2801 to the promoters of target virulence genes, an effect that was reversed by cyclic di-GMP. Genetic and functional analysis of YajQ family members from the human pathogens Pseudomonas aeruginosa and Stenotrophomonas maltophilia showed that they also specifically bound cyclic di-GMP and contributed to virulence in model systems. The findings thus identify a new class of cyclic di-GMP effector that regulates bacterial virulence.