Position dependencies in transcription factor binding sites
Motivation: Most of the available tools for transcription factor binding site prediction are based on methods which assume no sequence dependence between the base positions of a binding site. Our primary objective was to investigate the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and to use the resulting data to develop improved scoring functions for binding-site prediction. Results: Using three statistical tests, we analyzed the number of binding sites showing dependent positions. We also analyzed transcription factor-DNA crystal structures for evidence of position dependence. Our final conclusion was that some factors show evidence of dependencies whereas others do not. We observed that the conformational energy (Z-score) of the transcription factor-DNA complexes was lower (better) for sequences that showed dependency than for those that did not (P < 0.02). We suggest that where evidence exists for dependencies, these should be modeled to improve binding-site predictions. However, when no significant dependency is found, this correction should be omitted. This may be done by converting any existing scoring function which assumes independence into a form which includes a dependency correction. We present an example of such an algorithm and its implementation as a web tool. Availability: http://promoterplot.fmi.ch/cgi-bin/dep.html Contact: [email protected] Supplementary information: Supplementary data (1, 2, 3, 4, 5, 6, 7 and 8) are available at Bioinformatics online.
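The abstract does not name its three statistical tests, so the following is only an illustrative sketch of how a dependency between two positions of aligned binding sites might be tested. It assumes a chi-square statistic with a permutation p-value, which is a stand-in and not necessarily one of the tests used in the paper. The function names `chi_square_stat` and `position_dependency_pvalue` and the toy sites are hypothetical.

```python
import random
from collections import Counter


def chi_square_stat(col_a, col_b):
    """Pearson chi-square statistic for the contingency table of two columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    stat = 0.0
    for a, na in count_a.items():
        for b, nb in count_b.items():
            expected = na * nb / n  # expected count under independence
            observed = count_ab.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def position_dependency_pvalue(sites, i, j, n_perm=2000, seed=0):
    """Permutation test (an assumed choice) for dependence between positions i and j.

    Returns (observed statistic, permutation p-value).
    """
    col_a = [s[i] for s in sites]
    col_b = [s[j] for s in sites]
    observed = chi_square_stat(col_a, col_b)
    rng = random.Random(seed)
    shuffled = col_b[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association between the columns
        if chi_square_stat(col_a, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy alignment: positions 0 and 1 are perfectly coupled.
toy_sites = ["ACGT"] * 10 + ["TGGT"] * 10
stat_01, p_01 = position_dependency_pvalue(toy_sites, 0, 1)
stat_23, p_23 = position_dependency_pvalue(toy_sites, 2, 3)
```

In this toy alignment, positions 0 and 1 would be flagged as dependent, while the invariant positions 2 and 3 would not. A real analysis would also need a correction for testing many position pairs.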
Quality estimation of multiple sequence alignments by Bayesian hypothesis testing
Summary: In this work we present a web-based tool for estimating multiple alignment quality using Bayesian hypothesis testing. The proposed method is very simple, easily implemented and fast, with linear complexity. We evaluated the method against a series of different alignments (a set of random and biologically derived alignments) and compared the results with tools based on classical statistical methods (such as sFFT and csFFT). Taking the correlation coefficient as an objective criterion of the true quality, we found that Bayesian hypothesis testing performed better on average than the classical methods we tested. This approach may be used independently or as a component of any tool in computational biology which is based on the statistical estimation of alignment quality. Availability: http://www.fmi.ch/groups/functional.genomics/tool.htm Contact: [email protected] Supplementary information: Supplementary data are available from http://www.fmi.ch/groups/functional.genomics/tool-Supp.ht
Computational Structural Analysis: Multiple Proteins Bound to DNA
BACKGROUND: With increasing numbers of crystal structures of protein:DNA and protein:protein:DNA complexes publicly available, it is now possible to extract sufficient structural, physical-chemical and thermodynamic parameters to make general observations and predictions about their interactions. In particular, the properties of macromolecular assemblies of multiple proteins bound to DNA have not previously been investigated in detail. METHODOLOGY/PRINCIPAL FINDINGS: We have performed computational structural analyses on macromolecular assemblies of multiple proteins bound to DNA using a variety of different computational tools: PISA; PROMOTIF; X3DNA; ReadOut; DDNA and DCOMPLEX. Additionally, we have developed and employed an algorithm for approximate collision detection and overlapping volume estimation of two macromolecules. An implementation of this algorithm is available at http://promoterplot.fmi.ch/Collision1/. The results obtained are compared with structural, physical-chemical and thermodynamic parameters from protein:protein and single protein:DNA complexes. Many of the interface properties of multiple protein:DNA complexes were found to be very similar to those observed in binary protein:DNA and protein:protein complexes. However, the conformational change of the DNA upon protein binding is significantly greater when multiple proteins bind to it than when single proteins bind. Water-mediated contacts are less important (fewer are found) at the interfaces between components in ternary (protein:protein:DNA) complexes than in those of binary complexes (protein:protein and protein:DNA). The thermodynamic stability of ternary complexes is also higher than that of the binary interactions. Greater specificity and affinity of multiple proteins binding to DNA in comparison with binary protein-DNA interactions were observed.
However, protein-protein binding affinities are stronger in complexes without the presence of DNA. CONCLUSIONS/SIGNIFICANCE: Our results indicate that the interface properties (interface area; number of interface residues/atoms and hydrogen bonds; and the distribution of interface residues, hydrogen bonds, van der Waals contacts and secondary structure motifs) are independent of whether a protein is in a binary or ternary complex with DNA. However, changes in the shape of the DNA reduce the off-rate of the proteins, which greatly enhances the stability and specificity of ternary complexes compared to binary ones.
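The abstract does not describe how the collision detection and overlap volume algorithm works, so the following is only a rough sketch of one common approach, not the authors' implementation. It assumes each macromolecule is modelled as a set of atom spheres given as `((x, y, z), radius)`. It first rejects pairs whose bounding boxes do not intersect, then estimates the shared volume by Monte Carlo sampling. All names and the sample count are hypothetical.

```python
import random


def bounding_box(spheres):
    """Axis-aligned box enclosing all spheres, each given as ((x, y, z), radius)."""
    lo = tuple(min(c[k] - r for c, r in spheres) for k in range(3))
    hi = tuple(max(c[k] + r for c, r in spheres) for k in range(3))
    return lo, hi


def inside(point, spheres):
    """True if the point lies within at least one sphere of the molecule."""
    return any(
        sum((point[k] - c[k]) ** 2 for k in range(3)) <= r * r
        for c, r in spheres
    )


def overlap_volume(mol_a, mol_b, n_samples=20000, seed=0):
    """Approximate overlap volume of two sphere sets (assumed Monte Carlo scheme)."""
    lo_a, hi_a = bounding_box(mol_a)
    lo_b, hi_b = bounding_box(mol_b)
    lo = tuple(max(lo_a[k], lo_b[k]) for k in range(3))
    hi = tuple(min(hi_a[k], hi_b[k]) for k in range(3))
    # Cheap rejection: disjoint bounding boxes mean no collision is possible.
    if any(lo[k] >= hi[k] for k in range(3)):
        return 0.0
    box_volume = (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        p = tuple(rng.uniform(lo[k], hi[k]) for k in range(3))
        if inside(p, mol_a) and inside(p, mol_b):
            hits += 1
    # Fraction of sampled points inside both molecules, scaled by the box volume.
    return box_volume * hits / n_samples


# Hypothetical examples: identical unit spheres, and two well-separated spheres.
unit = [((0.0, 0.0, 0.0), 1.0)]
far = [((10.0, 0.0, 0.0), 1.0)]
diagonal = [((1.5, 1.5, 1.5), 1.0)]
v_same = overlap_volume(unit, unit)
v_far = overlap_volume(unit, far)
v_diag = overlap_volume(unit, diagonal)
```

A non-zero estimate indicates a collision. The `diagonal` case shows why the bounding-box check alone is not enough: the boxes intersect but the spheres do not, so the sampled overlap is zero.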
Computational analysis of promoters and DNA-protein interactions
The investigation of promoter activity and DNA-protein interactions is very important for
understanding many crucial cellular processes, including transcription, recombination and
replication. Promoter activity and DNA-protein interactions can be studied in the lab (in
vitro or in vivo) or using computational methods (in silico). Computational approaches
for analysing promoters and DNA-protein interactions have become more powerful as
more and more complete genome sequences, 3D structural data, and high-throughput data
(such as ChIP-chip and expression data) have become available. Modern scientific
research into promoters and DNA-protein interactions relies on close cooperation
between computational and laboratory methods.
This thesis covers several aspects of the computational analysis of promoters and DNA-protein
interactions: analysis of transcription factor binding sites (investigating position
dependencies in transcription factor binding sites); computational prediction of
transcription factor binding sites (a new scanning method for the in silico prediction of
transcription factor binding sites is described); computational analysis of crystal
structures of DNA-protein interactions (multiple proteins bound to DNA); and
computational predictions of transcription factor co-operations (investigating
dependencies between transcription factors in human, mouse and rat genomes, and a new
method of in silico prediction of cis-regulatory motifs and transcription start sites is
described). In addition, this thesis reports how one statistical method for the analysis of
transcription factor binding sites can be used for estimating the quality of multiple
sequence alignments.
The main finding reported in this thesis is that it is wrong to assume, a priori, that
positions in transcription factor binding sites are all either independent or dependent on
one another. Position dependencies should be tested using rigorous statistical methods on
a case-by-case basis. When dependencies are detected, they can be modelled in a very
simple way that requires neither complex mathematical tools with many parameters
nor additional data. An example of such a model, including a web-based implementation of
the algorithm, is reported in this thesis. It has also been shown that the conformational
energy (indirect readout) of DNA in complexes with transcription factors which have
dependent positions in their binding sites is significantly higher than in those with
transcription factors which do not have dependent positions in their binding sites.
The structural analysis of multiple protein-DNA interactions showed that the formation
of interactions between multiple proteins and DNA results in a decrease in protein-protein
affinity and an increase in protein-DNA affinity, with a net gain in overall
stability of complexes where multiple proteins are bound to DNA. This effect is clearly
important for modelling transcription factor co-operativity. In addition, the physical
overlap of two factors does not simply relate to the region on the DNA where the binding
site is found. Two factors may lie very close together but possibly not physically overlap
because their side-chains can interlink with one another. In this way, it is possible to find
a large overlap between two transcription factor binding sites, but from a 3D perspective
it is still possible for both factors to bind simultaneously. It may also be that one
transcription factor binds to the minor and another to the major groove of DNA. That
information is also useful for modelling transcription factor co-operativity.
Moreover, this thesis reports the results from a computational prediction of dependencies
(co-operativities) between transcription factors which usually act together in gene
regulation in human, mouse and rat genomes. It is shown that the computational
analysis of transcription factor site dependencies is a valuable complement to
experimental approaches for discovering transcription regulatory interactions and
networks. Scanning promoter sequences with dependent groups of transcription factor binding sites improves the quality of transcription factor predictions. It has also been
demonstrated that modelling transcription factor co-operativities improves the quality of
transcription start site predictions. For three genes (ctmp, gap-43 and ngfrap), in vivo
validation of the predicted transcription start sites was performed.
Finally, the Bayesian method for the detection of dependencies between positions in
transcription factor binding sites can easily be converted into a method for estimating the
quality of multiple sequence alignments. The method is simple, has linear complexity,
is easy to implement and performs better than other, more complex state-of-the-art
methods.
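The thesis abstract says that detected dependencies can be modelled in a very simple way, but it does not give the model. As a hedged illustration only, the sketch below adds a pairwise log-ratio correction to a scoring function that otherwise assumes independent positions. The correction formula, the pseudocounts, the uniform background and the name `score_site` are all assumptions, not the method reported in the thesis.

```python
import math
from collections import Counter

BASES = "ACGT"


def column_probs(sites, i, pseudo=1.0):
    """Base probabilities at position i, with a pseudocount per base."""
    counts = Counter(s[i] for s in sites)
    total = len(sites) + 4 * pseudo
    return {b: (counts.get(b, 0) + pseudo) / total for b in BASES}


def pair_probs(sites, i, j, pseudo=1.0):
    """Joint probabilities of base pairs at positions (i, j), with pseudocounts."""
    counts = Counter((s[i], s[j]) for s in sites)
    total = len(sites) + 16 * pseudo
    return {
        (a, b): (counts.get((a, b), 0) + pseudo) / total
        for a in BASES
        for b in BASES
    }


def score_site(seq, sites, dependent_pairs=(), background=0.25):
    """Independent log-odds score plus an assumed correction per dependent pair."""
    width = len(sites[0])
    cols = [column_probs(sites, i) for i in range(width)]
    total = sum(math.log(cols[i][seq[i]] / background) for i in range(width))
    for i, j in dependent_pairs:
        joint = pair_probs(sites, i, j)
        # log P(a, b) / (P(a) P(b)): zero when the positions are independent.
        total += math.log(
            joint[(seq[i], seq[j])] / (cols[i][seq[i]] * cols[j][seq[j]])
        )
    return total


# Hypothetical training sites in which positions 0 and 1 co-vary.
train = ["ACGT"] * 10 + ["TGGT"] * 10
indep_seen = score_site("ACGT", train)
indep_unseen = score_site("AGGT", train)
dep_seen = score_site("ACGT", train, dependent_pairs=[(0, 1)])
dep_unseen = score_site("AGGT", train, dependent_pairs=[(0, 1)])
```

Under the independence assumption, the observed combination `ACGT` and the unobserved combination `AGGT` score identically. The correction raises the first and lowers the second. It should only be applied to position pairs for which a dependency was actually detected.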
The Role of Nucleases Cleaving TLR3, TLR7/8 and TLR9 Ligands, Dicer RNase and miRNA/piRNA Proteins in Functional Adaptation to the Immune Escape and Xenophagy of Prostate Cancer Tissue
The prototypic sensors for the induction of innate and adaptive immune responses are the Toll-like receptors (TLRs). Unusually high expression of TLRs in prostate carcinoma (PC), associated with less differentiated, more aggressive and more propagating forms of PC, has changed the previous paradigm that TLRs act strictly in the immune defense system. Our data reveal an entirely novel role of nucleic acid-sensing Toll-like receptors (NA-TLRs) in the functional adaptation of malignant cells for the supply and digestion of surrounding metabolic substrates from dead cells, as a specific mechanism of cancer cell survival, through accelerated degradation of the corresponding ligands and the purine/pyrimidine salvage pathway. The spectrophotometric measurement protocols used for the determination of the activity of RNases and DNase II have been optimized in our laboratory, as has the enzyme-linked immunosorbent method for the determination of NF-κB p65 in prostate tissue samples. The protocols used to determine Dicer RNase, AGO2, TARBP2 and PIWIL4 were based on enzyme-linked immunosorbent assay. The amount of pre-existing acid-soluble oligonucleotides was measured and expressed as a coefficient of absorbance. The activities of acid DNase II and RNase T2, and the activities of nucleases cleaving TLR3, TLR7/8 and TLR9 ligands (poly I:C, poly U and unmethylated CpG), increased several times in PC compared to the corresponding tumor-adjacent and control tissue, with very high sensitivity and specificity of above 90%. Consequently, higher levels of hypoxanthine and NF-κB p65 were reported in PC, whereas the opposite results were observed for the miRNA biogenesis enzyme (Dicer RNase), the miRNA processing protein (TARBP2), the miRNA-induced silencing complex protein (Argonaute, AGO) and the PIWI-interacting RNA (piRNA) transposon-silencing protein.
Considering the crucial role of purine and pyrimidine nucleotides as energy carriers, subunits of nucleic acids and nucleotide cofactors, future work will aim to design novel anti-cancer immune strategies based on specific inhibition of acid endolysosomal nucleases.