19 research outputs found

    MEMOFinder: combining _de novo_ motif prediction methods with a database of known motifs

    *Background:* Methods for finding overrepresented sequence motifs are useful in several key areas of computational biology. They aim to detect very weak signals responsible for biological processes that require robust sequence identification, such as transcription-factor binding to DNA or docking sites in proteins. Currently, the general performance of model-based motif-finding methods is unsatisfactory; however, different methods are successful in different cases. This leads to the practical problem of combining the results of different motif-finding tools while taking into account the current knowledge collected in motif databases.
*Results:* We propose a new, complete service that allows researchers to submit their sequences for analysis by four different motif-finding methods, followed by clustering and comparison with a reference motif database. It is tailored for regulatory motif detection; however, it allows for a substantial amount of configuration regarding the sequence background, the motif database, and the parameters of the motif-finding methods.
*Availability:* The method is available online as a webserver at: http://bioputer.mimuw.edu.pl/software/mmf/. In addition, the source code is released under the GNU General Public License.

    Challenges for modeling global gene regulatory networks during development: Insights from Drosophila

    Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, ranging from qualitative “coarse-grain” models operating at the gene level to very “fine-grain” quantitative models operating at the biophysical “transcription factor-DNA level”. Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity of GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties of the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium-scale, well-characterized networks to more poorly characterized global networks. We suggest integrating coarse- and fine-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans.

    RECORD: Reference-Assisted Genome Assembly for Closely Related Genomes

    Background. Next-generation sequencing technologies now produce total reads amounting to multiple times the genome size from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome studied in the experiment and the reference genome of the species. However, in most typical protocols, this information is disregarded and the reference genome is used. Results. We provide a new approach that allows researchers to reconstruct genomes very closely related to the reference genome (e.g., mutants of the same species) directly from the reads used in the experiment. Our approach applies de novo assembly software to experimental reads and so-called pseudoreads and uses the resulting contigs to generate a modified reference sequence. In this way, it can very quickly, and at no additional sequencing cost, generate a new, modified reference sequence that is closer to the actual sequenced genome and has full coverage. In this paper, we describe our approach and test its implementation, called RECORD. We evaluate RECORD on both simulated and real data. We have made our software publicly available on SourceForge. Conclusion. Our tests show that on closely related sequences RECORD outperforms more general assisted-assembly software.

    Finding evolutionarily conserved cis-regulatory modules with a universal set of motifs

    Background. Finding functional regulatory elements in DNA sequences is a very important problem in computational biology, and providing a reliable algorithm for this task would be a major step towards understanding regulatory mechanisms on a genome-wide scale. The major obstacles are that the amount of non-coding DNA is vast and that methods for predicting functional transcription factor binding sites tend to produce results with a high percentage of false positives. This makes the problem of finding regions significantly enriched in binding sites difficult. Results. We develop a novel method for predicting regulatory regions in DNA sequences, which is designed to exploit the evolutionary conservation of regulatory elements between species without assuming that the order of motifs is preserved across species. We have implemented our method and tested its predictive abilities on various datasets from different organisms. Conclusion. We show that our approach enables us to find a majority of the known CRMs using only sequence information from different species together with currently publicly available motif data. Also, our method is robust enough to perform well in predicting CRMs, despite differences in tissue specificity and even across species, provided that the evolutionary distances between compared species do not change substantially. The complexity of the proposed algorithm is polynomial, and the observed running times show that it may be readily applied.

    Arabidopsis SWI/SNF chromatin remodeling complex binds both promoters and terminators to regulate gene expression

    ATP-dependent chromatin remodeling complexes are important regulators of gene expression in eukaryotes. In plants, SWI/SNF-type complexes have been shown to be critical for transcriptional control of key developmental processes, growth and stress responses. To gain insight into the mechanisms underlying these roles, we performed whole-genome mapping of the SWI/SNF catalytic subunit BRM in Arabidopsis thaliana, combined with transcript profiling experiments. Our data show that BRM occupies thousands of sites in the Arabidopsis genome, most of which are located within or close to genes. Among the identified direct BRM transcriptional targets, almost equal numbers were up- and downregulated upon BRM depletion, suggesting that BRM can act as both an activator and a repressor of gene expression. Interestingly, in addition to genes showing the canonical pattern of BRM enrichment near the transcription start site, many other genes showed a transcription termination site-centred BRM occupancy profile. We found that BRM-bound 3′ gene regions have promoter-like features, including the presence of TATA boxes and high H3K4me3 levels, and possess high antisense transcriptional activity, which is subject to both activation and repression by the SWI/SNF complex. Our data suggest that binding to gene terminators and controlling transcription of non-coding RNAs is another way through which the SWI/SNF complex regulates expression of its targets.

    Additional file 1 of Taking promoters out of enhancers in sequence based predictions of tissue-specific mammalian enhancers

    Supplementary Tables and Figures. This file contains additional tables and figures, such as a table of the datasets used for training, a feature importance table, and predictions for recently added VISTA sequences. (PDF 236 kb)