111 research outputs found
Itt1p, a novel protein inhibiting translation termination in Saccharomyces cerevisiae
BACKGROUND: Termination of translation in eukaryotes is controlled by two interacting polypeptide chain release factors, eRF1 and eRF3. eRF1 recognizes nonsense codons UAA, UAG and UGA, while eRF3 stimulates polypeptide release from the ribosome in a GTP- and eRF1-dependent manner. Recent studies have shown that proteins interacting with these release factors can modulate the efficiency of nonsense codon readthrough. RESULTS: We have isolated a nonessential yeast gene which, when present in multiple copies, causes suppression of nonsense mutations. This gene encodes a protein designated Itt1p, possessing a zinc finger domain characteristic of the TRIAD proteins of higher eukaryotes. Overexpression of Itt1p decreases the efficiency of translation termination, resulting in the readthrough of all three types of nonsense codons. Itt1p interacts in vitro with both eRF1 and eRF3. Overexpression of eRF1, but not of eRF3, abolishes the nonsense suppressor effect of overexpressed Itt1p. CONCLUSIONS: The data obtained demonstrate that Itt1p can modulate the efficiency of translation termination in yeast. This protein possesses a zinc finger domain characteristic of the TRIAD proteins of higher eukaryotes, and this is the first observation of such a protein being involved in translation
Infectious Disease Ontology
Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain
Amyloid-Mediated Sequestration of Essential Proteins Contributes to Mutant Huntingtin Toxicity in Yeast
BACKGROUND: Polyglutamine expansion is responsible for several neurodegenerative disorders, among which Huntington disease is the best known. Studies in the yeast model demonstrated that both aggregation and toxicity of a huntingtin (htt) protein with an expanded polyglutamine region strictly depend on the presence of the prion form of Rnq1 protein ([PIN+]), which has a glutamine/asparagine-rich domain. PRINCIPAL FINDINGS: Here, we showed that aggregation and toxicity of mutant htt depended on [PIN+] only quantitatively: the presence of [PIN+] elevated the toxicity and the levels of htt detergent-insoluble polymers. In cells lacking [PIN+], toxicity of mutant htt was due to the polymerization and inactivation of the essential glutamine/asparagine-rich Sup35 protein and the related inactivation of another essential protein, Sup45, most probably via its sequestration into Sup35 aggregates. However, inhibition of growth of [PIN+] cells depended on Sup35/Sup45 depletion only partially, suggesting that there are other sources of mutant htt toxicity in yeast. CONCLUSIONS: The data obtained suggest that induced polymerization of essential glutamine/asparagine-rich proteins and the related sequestration of other proteins that interact with these polymers represent an essential source of htt toxicity
Facilitated sequence assembly using densely labeled optical DNA barcodes: A combinatorial auction approach
The output from whole genome sequencing is a set of contigs, i.e. short non-overlapping DNA sequences (sizes 1-100 kilobasepairs). Piecing the contigs together is an especially difficult task for previously unsequenced DNA, and may not be feasible due to factors such as the lack of sufficient coverage or larger repetitive regions which generate gaps in the final sequence. Here we propose a new method for scaffolding such contigs. The proposed method uses densely labeled optical DNA barcodes from competitive binding experiments as scaffolds. On these scaffolds we position theoretical barcodes which are calculated from the contig sequences. This allows us to construct longer DNA sequences from the contig sequences. This proof-of-principle study extends previous studies which use sparsely labeled DNA barcodes for scaffolding purposes. Our method applies a probabilistic approach that allows us to discard “foreign” contigs from mixed samples with contigs from different types of DNA. We satisfy the contig non-overlap constraint by formulating the contig placement challenge as a combinatorial auction problem. Our exact algorithm for solving this problem reduces computational costs compared to previous methods in the combinatorial auction field. We demonstrate the usefulness of the proposed scaffolding method both for synthetic contigs and for contigs obtained using Illumina sequencing for a mixed sample with plasmid and chromosomal DNA.
Occupancy Classification of Position Weight Matrix-Inferred Transcription Factor Binding Sites
BACKGROUND: Computational prediction of Transcription Factor Binding Sites (TFBS) from sequence data alone is difficult and error-prone. Machine learning techniques utilizing additional environmental information about a predicted binding site (such as distances from the site to particular chromatin features) to determine its occupancy/functionality class show promise as methods to achieve more accurate prediction of true TFBS in silico. We evaluate the Bayesian Network (BN) and Support Vector Machine (SVM) machine learning techniques on four distinct TFBS data sets and analyze their performance. We describe the features that are most useful for classification and compare these feature sets between the factors. RESULTS: Our results demonstrate good performance of classifiers both on TFBS for transcription factors used for initial training and on TFBS for other factors in cross-classification experiments. We find that distances to chromatin modifications (specifically, histone modification islands), as well as distances between such modifications, are effective predictors of TFBS occupancy, though the impact of individual predictors is largely TF specific. In our experiments, Bayesian network classifiers outperform SVM classifiers. CONCLUSIONS: Our results demonstrate good performance of machine learning techniques on the problem of occupancy classification, and demonstrate that effective classification can be achieved using distances to chromatin features. We additionally demonstrate that cross-classification of TFBS is possible, suggesting the possibility of constructing a generalizable occupancy classifier capable of handling TFBS for many different transcription factors
G+C content dominates intrinsic nucleosome occupancy
BACKGROUND: The relative preference of nucleosomes to form on individual DNA sequences plays a major role in genome packaging. A wide variety of DNA sequence features are believed to influence nucleosome formation, including periodic dinucleotide signals, poly-A stretches and other short motifs, and sequence properties that influence DNA structure, including base content. It was recently shown by Kaplan et al. that a probabilistic model using composition of all 5-mers within a nucleosome-sized tiling window accurately predicts intrinsic nucleosome occupancy across an entire genome in vitro. However, the model is complicated, and it is not clear which specific DNA sequence properties are most important for intrinsic nucleosome-forming preferences. RESULTS: We find that a simple linear combination of only 14 simple DNA sequence attributes (G+C content, two transformations of dinucleotide composition, and the frequency of eleven 4-bp sequences) explains nucleosome occupancy in vitro and in vivo in a manner comparable to the Kaplan model. G+C content and frequency of AAAA are the most important features. G+C content is dominant, alone explaining ~50% of the variation in nucleosome occupancy in vitro. CONCLUSIONS: Our findings provide a dramatically simplified means to predict and understand intrinsic nucleosome occupancy. G+C content may dominate because it both reduces frequency of poly-A-like stretches and correlates with many other DNA structural characteristics. Since G+C content is enriched or depleted at many types of features in diverse eukaryotic genomes, our results suggest that variation in nucleotide composition may have a widespread and direct influence on chromatin structure.
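As a rough illustration of the two features the abstract above names as most important, the following Python sketch computes G+C content and AAAA frequency over sliding windows. It is not the authors' model: the function names, the 147 bp default window, the single-strand counting of AAAA, and the use of overlapping matches are all assumptions made for this example, and no regression coefficients are given because the abstract does not state them.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequency(seq: str, kmer: str) -> float:
    """Overlapping occurrences of kmer divided by the number of start positions.

    Assumption: only the given strand is counted (no reverse complement).
    """
    seq, kmer = seq.upper(), kmer.upper()
    positions = len(seq) - len(kmer) + 1
    if positions <= 0:
        return 0.0
    hits = sum(1 for i in range(positions) if seq.startswith(kmer, i))
    return hits / positions


def window_features(
    seq: str, window: int = 147, step: int = 1
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, gc_fraction, aaaa_frequency) for each full window.

    The 147 bp default is an assumed nucleosome-sized window, not a value
    taken from the abstract.
    """
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), kmer_frequency(chunk, "AAAA")
```

Calling `list(window_features(sequence))` on a DNA string gives one feature tuple per window position; these values could then serve as inputs to a linear model of the kind the abstract describes.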
GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge
BACKGROUND: Position-specific priors (PSP) have been used with success to boost EM and Gibbs sampler-based motif discovery algorithms. PSP information has been computed from different sources, including orthologous conservation, DNA duplex stability, and nucleosome positioning. Prior information has not yet been used in the context of combinatorial algorithms. Moreover, priors have been used only independently, and the gain of combining priors from different sources has not yet been studied. RESULTS: We extend RISOTTO, a combinatorial algorithm for motif discovery, by post-processing its output with a greedy procedure that uses prior information. PSPs from different sources are combined into a scoring criterion that guides the greedy search procedure. The resulting method, called GRISOTTO, was evaluated over 156 yeast TF ChIP-chip sequence-sets commonly used to benchmark prior-based motif discovery algorithms. Results show that GRISOTTO is at least as accurate as twelve other state-of-the-art approaches for the same task, even without combining priors. Furthermore, by considering combined priors, GRISOTTO is considerably more accurate than the state-of-the-art approaches for the same task. We also show that PSPs improve GRISOTTO's ability to retrieve motifs from mouse ChIP-seq data, indicating that the proposed algorithm can be applied to data from a different technology and for a higher eukaryote. CONCLUSIONS: The conclusions of this work are twofold. First, post-processing the output of combinatorial algorithms by incorporating prior information leads to a very efficient and effective motif discovery method. Second, combining priors from different sources is even more beneficial than considering them separately.
Genome-wide binding of the orphan nuclear receptor TR4 suggests its general role in fundamental biological processes
BACKGROUND: The orphan nuclear receptor TR4 (human testicular receptor 4 or NR2C2) plays a pivotal role in a variety of biological and metabolic processes. With no known ligand and few known target genes, the mode of TR4 function was unclear. RESULTS: We report the first genome-wide identification and characterization of TR4 in vivo binding. Using chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), we identified TR4 binding sites in 4 different human cell types and found that the majority of target genes were shared among different cells. TR4 target genes are involved in fundamental biological processes such as RNA metabolism and protein translation. In addition, we found that a subset of TR4 target genes exerts cell-type-specific functions. Analysis of the TR4 binding sites revealed that less than 30% of the peaks from any of the cell types contained the DR1 motif previously derived from in vitro studies, suggesting that TR4 may be recruited to the genome via interaction with other proteins. A bioinformatics analysis of the TR4 binding sites predicted a cis regulatory module involving TR4 and ETS transcription factors. To test this prediction, we performed ChIP-seq for the ETS factor ELK4 and found that 30% of TR4 binding sites were also bound by ELK4. Motif analysis of the sites bound by both factors revealed a lack of the DR1 element, suggesting that TR4 binding at a subset of sites is facilitated through the ETS transcription factor ELK4. Further studies will be required to investigate the functional interdependence of these two factors. CONCLUSIONS: Our data suggest that TR4 plays a pivotal role in fundamental biological processes across different cell types. In addition, the identification of cell-type-specific TR4 binding sites enables future studies of the pathways underlying TR4 action and its possible role in metabolic diseases.
- …