    MEME-LaB : motif analysis in clusters

    Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. While there are tools for ab initio discovery of transcription factor binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding, and then easily browse and condense the results, facilitating better interpretation of large-scale datasets.

    Sequence information gain based motif analysis

    Background: The detection of regulatory regions in candidate sequences is essential for understanding the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information-theoretic metrics for finding regulatory sequences in promoter regions. Results: This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information-theoretic detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show that, in 70% of the studied transcription factor binding sites, the SIGMA detector performs better and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Conclusions: Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of cis-regulatory sequence detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
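As a rough illustration of the kind of information-theoretic quantities the abstract refers to, the sketch below computes per-position information content and the mutual information between two positions of a set of aligned binding sites. It is not the SIGMA method itself, whose details are not given here; the function names and the toy `sites` list are assumptions made for this example.

```python
import math
from collections import Counter

def column_entropy(sites, i):
    """Shannon entropy (bits) of the bases observed at position i."""
    n = len(sites)
    counts = Counter(s[i] for s in sites)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_content(sites, i):
    """Information content of position i for a 4-letter DNA alphabet:
    maximum entropy (2 bits) minus the observed entropy."""
    return 2.0 - column_entropy(sites, i)

def mutual_information(sites, i, j):
    """Mutual information (bits) between positions i and j,
    a simple measure of co-variability between motif positions."""
    n = len(sites)
    pair_counts = Counter((s[i], s[j]) for s in sites)
    counts_i = Counter(s[i] for s in sites)
    counts_j = Counter(s[j] for s in sites)
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical aligned binding sites (toy data, not from the paper).
sites = ["ACGT", "ACGA", "TCGT", "TCGA"]
for pos in range(4):
    print(pos, round(information_content(sites, pos), 3))
print("MI(0,3) =", round(mutual_information(sites, 0, 3), 3))
```

A fully conserved position scores 2 bits, and a pair of positions that vary independently has zero mutual information; position-independent models such as a plain weight matrix ignore the second quantity.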

    Single-Molecule Analysis of i-motif Within Self-Assembled DNA Duplexes and Nanocircles

    The cytosine (C)-rich sequences that can fold into tetraplex structures known as i-motif are prevalent in genomic DNA. Recent studies of i-motif–forming sequences have shown increasing evidence of their roles in gene regulation. However, most of these studies have been performed in short single-stranded oligonucleotides, far from the intracellular environment. In cells, i-motif–forming sequences are flanked by DNA duplexes and packed in the genome. Therefore, exploring the conformational dynamics and kinetics of i-motif under such topologically constrained environments is highly relevant in predicting their biological roles. Using single-molecule fluorescence analysis of self-assembled DNA duplexes and nanocircles, we show that the topological environments play a key role in i-motif stability and dynamics. While the human telomere sequence (C3TAA)3C3 assumes i-motif structure at pH 5.5 regardless of topological constraint, it undergoes conformational dynamics among unfolded, partially folded and fully folded states at pH 6.5. The lifetimes of i-motif and the partially folded state at pH 6.5 were determined to be 6 ± 2 and 31 ± 11 s, respectively. Consistent with the partially folded state observed in fluorescence analysis, interrogation of current versus time traces obtained from nanopore analysis at pH 6.5 shows long-lived shallow blockades with a mean lifetime of 25 ± 6 s. Such lifetimes are sufficient for the i-motif and partially folded states to interact with proteins to modulate cellular processes.

    MEMOFinder: combining _de novo_ motif prediction methods with a database of known motifs

    *Background:* Methods for finding overrepresented sequence motifs are useful in several key areas of computational biology. They aim at detecting very weak signals responsible for biological processes requiring robust sequence identification like transcription-factor binding to DNA or docking sites in proteins. Currently, general performance of the model-based motif-finding methods is unsatisfactory; however, different methods are successful in different cases. This leads to the practical problem of combining results of different motif-finding tools, taking into account current knowledge collected in motif databases.
*Results:* We propose a new complete service allowing researchers to submit their sequences for analysis by four different motif-finding methods for clustering and comparison with a reference motif database. It is tailored for regulatory motif detection; however, it allows a substantial amount of configuration regarding sequence background, motif database and parameters for motif-finding methods.
*Availability:* The method is available online as a webserver at: http://bioputer.mimuw.edu.pl/software/mmf/. In addition, the source code is released under the GNU General Public License.

    The evolution and variety of RFamide-type neuropeptides: insights from deuterostomian invertebrates

    Five families of neuropeptides that have a C-terminal RFamide motif have been identified in vertebrates: (1) gonadotropin-inhibitory hormone (GnIH), (2) neuropeptide FF (NPFF), (3) pyroglutamylated RFamide peptide (QRFP), (4) prolactin-releasing peptide (PrRP), and (5) Kisspeptin. Experimental demonstration of neuropeptide–receptor pairings combined with comprehensive analysis of genomic and/or transcriptomic sequence data indicates that, with the exception of the deuterostomian PrRP system, the evolutionary origins of these neuropeptides can be traced back to the common ancestor of bilaterians. Here, we review the occurrence of homologs of vertebrate RFamide-type neuropeptides and their receptors in deuterostomian invertebrates – urochordates, cephalochordates, hemichordates, and echinoderms. Extending analysis of the occurrence of the RFamide motif in other bilaterian neuropeptide families reveals RFamide-type peptides that have acquired modified C-terminal characteristics in the vertebrate lineage (e.g., NPY/NPF), neuropeptide families where the RFamide motif is unique to protostomian members (e.g., CCK/sulfakinins), and RFamide-type peptides that have been lost in the vertebrate lineage (e.g., luqins). Furthermore, the RFamide motif is also a feature of neuropeptide families with a more restricted phylogenetic distribution (e.g., the prototypical FMRFamide-related neuropeptides in protostomes). Thus, the RFamide motif is both an ancient and a convergent feature of neuropeptides, with conservation, acquisition, or loss of this motif occurring in different branches of the animal kingdom.

    Motif Clustering and Overlapping Clustering for Social Network Analysis

    Motivated by applications in social network community analysis, we introduce a new clustering paradigm termed motif clustering. Unlike classical clustering, motif clustering aims to minimize the number of clustering errors associated with both edges and certain higher order graph structures (motifs) that represent "atomic units" of social organizations. Our contributions are two-fold: We first introduce motif correlation clustering, in which the goal is to agnostically partition the vertices of a weighted complete graph so that certain predetermined "important" social subgraphs mostly lie within the same cluster, while "less relevant" social subgraphs are allowed to lie across clusters. We then proceed to introduce the notion of motif covers, in which the goal is to cover the vertices of motifs via the smallest number of (near) cliques in the graph. Motif cover algorithms provide a natural solution for overlapping clustering and they also play an important role in latent feature inference of networks. For both motif correlation clustering and its extension introduced via the covering problem, we provide hardness results, algorithmic solutions and community detection results for two well-studied social networks
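To make the notion of clustering errors on edges and motifs more concrete, here is a minimal sketch that counts them for a given partition. It is an unweighted simplification written for illustration, not the authors' algorithm: the function names, the treatment of edges and triangles as plain vertex tuples, and the toy data are all assumptions. A group marked "positive" is counted as an error when it is split across clusters, and a group marked "negative" is counted as an error when it lies entirely inside one cluster.

```python
def same_cluster(labels, nodes):
    """True if every vertex in `nodes` carries the same cluster label."""
    return len({labels[v] for v in nodes}) == 1

def clustering_errors(labels, positive, negative):
    """Count disagreements between a partition and labelled vertex groups.

    labels:   dict mapping vertex -> cluster id
    positive: groups (edges or motifs, as tuples) that should stay together
    negative: groups that should not sit inside a single cluster
    """
    errors = 0
    for group in positive:
        if not same_cluster(labels, group):
            errors += 1  # an "important" edge or motif was split
    for group in negative:
        if same_cluster(labels, group):
            errors += 1  # a "less relevant" group was kept together
    return errors

# Hypothetical toy instance: pairs are edges, triples are triangle motifs.
labels = {"a": 0, "b": 0, "c": 0, "d": 1}
positive = [("a", "b"), ("a", "b", "c"), ("c", "d")]
negative = [("a", "d"), ("b", "c")]
print(clustering_errors(labels, positive, negative))
```

In this toy instance the partition splits the positive edge (c, d) and keeps the negative edge (b, c) together, so two errors are counted. A weighted version would sum per-group weights instead of adding one per error.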