
    An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome

    In eukaryotic genomes, it is challenging to accurately determine the target sites of transcription factors (TFs) from sequence information alone. Previous efforts tackled this task by exploiting two observations: TF binding sites tend to be more conserved than other functional sites, and the binding sites of several TFs are often clustered. Recently, ChIP-chip and ChIP-sequencing data have accumulated that identify TF binding sites and survey chromatin modification patterns at regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) that incorporates sequence motif information, TF-DNA interaction data and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs (CREB, E2F1, MAX, and YY1) in 1% of the human genome. We then trained the HMM to identify the labels of the CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data were used to predict the functional sites and to further remove false positives. Cross-validation showed that our integrated HMM outperformed other existing methods at predicting CRMs. Incorporating histone signature information successfully penalized false predictions and improved overall performance. The dataset we used and the software are available at http://nash.ucsd.edu/CIS/
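
To make the HMM labelling step in the abstract above more concrete, here is a minimal sketch of Viterbi decoding for a two-state HMM. It is a generic textbook decoder, not the authors' model: the state names (`bg`, `crm`), the discretized `low`/`high` ChIP signal, and all probabilities are assumptions chosen purely for illustration.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for a list of observations."""
    # best[t][s]: log-probability of the best path that ends in state s at position t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of state s at position t on that best path
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def logs(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Hypothetical parameters (assumptions, not values from the paper).
states = ["bg", "crm"]
log_start = {"bg": math.log(0.9), "crm": math.log(0.1)}
log_trans = logs({"bg": {"bg": 0.9, "crm": 0.1}, "crm": {"bg": 0.2, "crm": 0.8}})
log_emit = logs({"bg": {"low": 0.9, "high": 0.1}, "crm": {"low": 0.2, "high": 0.8}})

signal = ["low", "low", "high", "high", "high", "low", "low"]
path = viterbi(signal, states, log_start, log_trans, log_emit)
print(list(zip(signal, path)))
```

A model along the lines the abstract describes would emit several observation types per position (motif scores, ChIP-chip ratios, chromatin marks); the sketch keeps a single discrete signal so that the decoding logic stays visible.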

    The conserved C-terminus of the PcrA/UvrD helicase interacts directly with RNA polymerase

    Copyright: © 2013 Gwynn et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: This work was supported by a Wellcome Trust project grant to MD (Reference: 077368), an ERC starting grant to MD (Acronym: SM-DNA-REPAIR) and a BBSRC project grant to PM, NS and MD (Reference: BB/I003142/1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Peer reviewed.

    Arbovirus-Derived piRNAs Exhibit a Ping-Pong Signature in Mosquito Cells

    The siRNA pathway is an essential antiviral mechanism in insects. Whether other RNA interference pathways are involved in antiviral defense remains unclear. Here, we report that cells derived from the two main vectors for arboviruses, Aedes albopictus and Aedes aegypti, produce viral small RNAs that exhibit the hallmarks of ping-pong derived piwi-associated RNAs (piRNAs) after infection with positive or negative sense RNA viruses. Furthermore, these cells produce endogenous piRNAs that map to transposable elements. Our results show that these mosquito cells can initiate de novo piRNA production and recapitulate the ping-pong dependent piRNA pathway upon viral infection. The mechanism of viral-piRNA production is discussed.

    GibbsST: a Gibbs sampling method for motif discovery with enhanced resistance to local optima

    BACKGROUND: Computational discovery of transcription factor binding sites (TFBS) is a challenging but important problem in bioinformatics. In this study, we attempt to improve a Gibbs sampling based technique for TFBS discovery through an approach that is widely known but has not been investigated before: reducing the effect of local optima. RESULTS: To alleviate the vulnerability of Gibbs sampling to trapping in local optima, we propose combining it with a thermodynamic method called simulated tempering. The resulting algorithm, GibbsST, is then validated using synthetic data and actual promoter sequences extracted from Saccharomyces cerevisiae. Notably, the marked improvement in efficiency presented in this paper is attributable solely to the improved search method. CONCLUSION: Simulated tempering is a powerful solution for the local optima problems found in pattern discovery. Extending simulated tempering to other bioinformatic problems is promising as a robust remedy for local optima.

    Structure-guided selection of specificity determining positions in the human kinome

    Background: The human kinome contains many important drug targets. It is well known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families. The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach that analyzes key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. Results: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. The main purpose of the algorithm is to give experimentalists possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm’s performance is demonstrated using an extensive dataset for the human kinome. Conclusion: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important.
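
The greedy selection idea in the abstract above can be sketched as a forward search over alignment columns. This is an illustrative stand-in, not the published algorithm: the scoring function (`purity`, the share of sequences that match the majority label of their residue group), the binary binder/non-binder labels, and the toy alignment are all assumptions. The paper scores structural and chemical variation, which this sketch replaces with plain residue identity.

```python
from collections import Counter, defaultdict

def purity(seqs, labels, positions):
    """Fraction of sequences matching the majority label of their group,
    where a group is defined by the residues at the chosen positions."""
    groups = defaultdict(Counter)
    for seq, label in zip(seqs, labels):
        key = tuple(seq[p] for p in positions)
        groups[key][label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(seqs)

def greedy_positions(seqs, labels, max_positions=6):
    """Add one alignment column at a time, keeping the column that most
    improves purity, and stop when no column helps or the cap is reached."""
    chosen = []
    best = purity(seqs, labels, chosen)
    candidates = set(range(len(seqs[0])))
    while candidates and len(chosen) < max_positions:
        pos = max(sorted(candidates),
                  key=lambda p: purity(seqs, labels, chosen + [p]))
        score = purity(seqs, labels, chosen + [pos])
        if score <= best:  # no strict improvement, so stop
            break
        chosen.append(pos)
        best = score
        candidates.remove(pos)
    return chosen, best

# Toy aligned binding-site sequences (hypothetical) and binary labels:
# 1 = binds the inhibitor, 0 = does not.
seqs = ["AKT", "AES", "AKS", "GET", "GES", "GKS", "GKT"]
labels = [1, 1, 1, 0, 0, 1, 1]

chosen, score = greedy_positions(seqs, labels)
print(chosen, score)
```

The default cap of six positions mirrors the "at most six residue positions" mentioned in the abstract. As with any greedy search, the result can miss position pairs that are only informative together.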

    DNA Display Selection of Peptide Ligands for a Full-Length Human G Protein-Coupled Receptor on CHO-K1 Cells

    The G protein-coupled receptors (GPCRs), which form the largest group of transmembrane proteins involved in signal transduction, are major targets of currently available drugs. Thus, the search for cognate and surrogate peptide ligands for GPCRs is of both basic and therapeutic interest. Here we describe the application of an in vitro DNA display technology to screening libraries of peptide ligands for full-length GPCRs expressed on whole cells. We used the human angiotensin II (Ang II) type-1 receptor (hAT1R) as a model GPCR. Under improved selection conditions using hAT1R-expressing Chinese hamster ovary (CHO)-K1 cells as bait, we confirmed that the Ang II gene could be enriched more than 10,000-fold after four rounds of selection. Further, we successfully selected diverse Ang II-like peptides from randomized peptide libraries. The results provide more precise information on the sequence-function relationships of hAT1R ligands than can be obtained by conventional alanine-scanning mutagenesis. Completely in vitro DNA display can overcome the limitations of current display technologies and is expected to prove widely useful for screening diverse libraries of mutant peptide and protein ligands for receptors that can be expressed functionally on the surface of CHO-K1 cells.

    Search for Gravitational Waves from Primordial Black Hole Binary Coalescences in the Galactic Halo

    We use data from the second science run of the LIGO gravitational-wave detectors to search for the gravitational waves from primordial black hole (PBH) binary coalescence with component masses in the range $0.2$--$1.0\,M_\odot$. The analysis requires a signal to be found in the data from both LIGO observatories, according to a set of coincidence criteria. No inspiral signals were found. Assuming a spherical halo with core radius 5 kpc extending to 50 kpc containing non-spinning black holes with masses in the range $0.2$--$1.0\,M_\odot$, we place an observational upper limit on the rate of PBH coalescence of 63 per year per Milky Way halo (MWH) with 90% confidence. Comment: 7 pages, 4 figures, to be submitted to Phys. Rev.

    Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage

    Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, as well as either the CCR5 (R5) or CXCR4 (X4) co-receptors, which interact primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure is used to generate a feature vector representation for each variant whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute.
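
For readers less familiar with the evaluation measures listed in the abstract above, the following sketch computes them from a binary confusion matrix, treating X4-capable as the positive class. The counts are hypothetical: the abstract reports no confusion matrix, so the numbers below are only sized to match the 204 X4-capable and 989 R5-utilizing sequences and should not be read as the paper's results.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Balanced error rate: mean of the two per-class error rates.
    ber = 0.5 * (fn / (tp + fn) + fp / (tn + fp))
    # Matthews correlation coefficient, defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "balanced_error_rate": ber,
        "mcc": mcc,
    }

# Hypothetical counts (assumption): 204 positives, 989 negatives.
m = binary_metrics(tp=173, fp=4, fn=31, tn=985)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The abstract's "best reported values" may come from different classifiers and validation schemes, so a single confusion matrix need not reproduce all of them at once.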

    Amelogenesis imperfecta

    Amelogenesis imperfecta (AI) represents a group of developmental conditions, genomic in origin, which affect the structure and clinical appearance of the enamel of all or nearly all the teeth in a more or less equal manner, and which may be associated with morphologic or biochemical changes elsewhere in the body. The prevalence varies from 1:700 to 1:14,000, according to the populations studied. The enamel may be hypoplastic, hypomineralised or both, and affected teeth may be discoloured, sensitive or prone to disintegration. AI exists in isolation or associated with other abnormalities in syndromes. It may show autosomal dominant, autosomal recessive, sex-linked and sporadic inheritance patterns. In families with an X-linked form it has been shown that the disorder may result from mutations in the amelogenin gene, AMELX. The enamelin gene, ENAM, is implicated in the pathogenesis of the dominant forms of AI. Autosomal recessive AI has been reported in families with known consanguinity. Diagnosis is based on the family history, pedigree plotting and meticulous clinical observation. Genetic diagnosis is presently only a research tool. The condition presents problems of socialisation, function and discomfort but may be managed by early vigorous intervention, both preventively and restoratively, with treatment continued throughout childhood and into adult life. In infancy, the primary dentition may be protected by the use of preformed metal crowns on posterior teeth. The longer-term care involves either crowns or, more frequently these days, adhesive, plastic restorations.

    HMMSplicer: A Tool for Efficient and Sensitive Discovery of Known and Novel Splice Junctions in RNA-Seq Data

    Background: High-throughput sequencing of an organism’s transcriptome, or RNA-Seq, is a valuable and versatile new strategy for capturing snapshots of gene expression. However, transcriptome sequencing creates a new class of alignment problem: mapping short reads that span exon-exon junctions back to the reference genome, especially in the case where a splice junction is previously unknown. Methodology/Principal Findings: Here we introduce HMMSplicer, an accurate and efficient algorithm for discovering canonical and non-canonical splice junctions in short read datasets. HMMSplicer identifies more splice junctions than currently available algorithms when tested on publicly available A. thaliana, P. falciparum, and H. sapiens datasets without a reduction in specificity. Conclusions/Significance: HMMSplicer was found to perform especially well in compact genomes and on genes with low expression levels, alternative splice isoforms, or non-canonical splice junctions. Because HMMSplicer does not rely on prebuilt gene models, the products of inexact splicing are also detected. For H. sapiens, we find 3.6% of 3' splice sites and 1.4% of 5' splice sites are inexact, typically differing by 3 bases in either direction. In addition, HMMSplicer provides a score for every predicted junction, allowing the user to set a threshold to tune false positive rates depending on the needs of the experiment. HMMSplicer is implemented in Python. Code and documentation are freely available a