MEME-LaB : motif analysis in clusters
Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. While there are tools for ab initio discovery of transcription factor binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding, and then easily browse and condense the output, facilitating better interpretation of the results from large-scale datasets.
Sequence information gain based motif analysis
Background: The detection of regulatory regions in candidate sequences is essential for understanding the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. Results: This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show that, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector performs better and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Conclusions: Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
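The abstract refers to information theoretic metrics and to co-variability between motif positions without giving formulas. As a rough illustration only, and not the SIGMA method itself, the sketch below computes two standard quantities on a set of aligned binding sites: per-column information content and the mutual information between two columns. The function names, the DNA alphabet default, and the toy sites are assumptions made for this example.

```python
import math
from collections import Counter


def column_information(sites, alphabet="ACGT"):
    """Information content (bits) of each aligned column.

    Computed as log2(|alphabet|) minus the Shannon entropy of the column.
    Hypothetical helper for illustration; not taken from SIGMA.
    """
    max_entropy = math.log2(len(alphabet))
    info = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(max_entropy - entropy)
    return info


def mutual_information(sites, i, j):
    """Mutual information (bits) between columns i and j of the aligned sites.

    A simple way to quantify co-variability between two motif positions.
    """
    n = len(sites)
    p_i = Counter(site[i] for site in sites)
    p_j = Counter(site[j] for site in sites)
    p_ij = Counter((site[i], site[j]) for site in sites)
    return sum(
        (c / n) * math.log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )


# Toy aligned sites (made up): columns 2 and 3 always change together.
toy_sites = ["AAGT", "AAGT", "AACA", "AACA"]
```

On the toy sites, the two invariant columns carry 2 bits each, the two variable columns carry 1 bit each, and columns 2 and 3 share 1 bit of mutual information because they vary together. A model that treats positions as independent would miss that dependence.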
Single-Molecule Analysis of i-motif Within Self-Assembled DNA Duplexes and Nanocircles
The cytosine (C)-rich sequences that can fold into tetraplex structures known as i-motif are prevalent in genomic DNA. Recent studies of i-motif–forming sequences have shown increasing evidence of their roles in gene regulation. However, most of these studies have been performed in short single-stranded oligonucleotides, far from the intracellular environment. In cells, i-motif–forming sequences are flanked by DNA duplexes and packed in the genome. Therefore, exploring the conformational dynamics and kinetics of i-motif under such topologically constrained environments is highly relevant in predicting their biological roles. Using single-molecule fluorescence analysis of self-assembled DNA duplexes and nanocircles, we show that the topological environments play a key role in i-motif stability and dynamics. While the human telomere sequence (C3TAA)3C3 assumes i-motif structure at pH 5.5 regardless of topological constraint, it undergoes conformational dynamics among unfolded, partially folded and fully folded states at pH 6.5. The lifetimes of i-motif and the partially folded state at pH 6.5 were determined to be 6 ± 2 and 31 ± 11 s, respectively. Consistent with the partially folded state observed in fluorescence analysis, interrogation of current versus time traces obtained from nanopore analysis at pH 6.5 shows long-lived shallow blockades with a mean lifetime of 25 ± 6 s. Such lifetimes are sufficient for the i-motif and partially folded states to interact with proteins to modulate cellular processes.
MEMOFinder: combining _de novo_ motif prediction methods with a database of known motifs
*Background:* Methods for finding overrepresented sequence motifs are useful in several key areas of computational biology. They aim at detecting very weak signals responsible for biological processes requiring robust sequence identification like transcription-factor binding to DNA or docking sites in proteins. Currently, general performance of the model-based motif-finding methods is unsatisfactory; however, different methods are successful in different cases. This leads to the practical problem of combining results of different motif-finding tools, taking into account current knowledge collected in motif databases.
*Results:* We propose a new complete service allowing researchers to submit their sequences for analysis by four different motif-finding methods, with clustering and comparison against a reference motif database. It is tailored for regulatory motif detection; however, it allows a substantial amount of configuration regarding sequence background, motif database and parameters for motif-finding methods.
*Availability:* The method is available online as a webserver at: http://bioputer.mimuw.edu.pl/software/mmf/. In addition, the source code is released under the GNU General Public License.
The evolution and variety of RFamide-type neuropeptides: insights from deuterostomian invertebrates
Five families of neuropeptides that have a C-terminal RFamide motif have been identified in vertebrates: (1) gonadotropin-inhibitory hormone (GnIH), (2) neuropeptide FF (NPFF), (3) pyroglutamylated RFamide peptide (QRFP), (4) prolactin-releasing peptide (PrRP), and (5) Kisspeptin. Experimental demonstration of neuropeptide–receptor pairings combined with comprehensive analysis of genomic and/or transcriptomic sequence data indicates that, with the exception of the deuterostomian PrRP system, the evolutionary origins of these neuropeptides can be traced back to the common ancestor of bilaterians. Here, we review the occurrence of homologs of vertebrate RFamide-type neuropeptides and their receptors in deuterostomian invertebrates – urochordates, cephalochordates, hemichordates, and echinoderms. Extending analysis of the occurrence of the RFamide motif in other bilaterian neuropeptide families reveals RFamide-type peptides that have acquired modified C-terminal characteristics in the vertebrate lineage (e.g., NPY/NPF), neuropeptide families where the RFamide motif is unique to protostomian members (e.g., CCK/sulfakinins), and RFamide-type peptides that have been lost in the vertebrate lineage (e.g., luqins). Furthermore, the RFamide motif is also a feature of neuropeptide families with a more restricted phylogenetic distribution (e.g., the prototypical FMRFamide-related neuropeptides in protostomes). Thus, the RFamide motif is both an ancient and a convergent feature of neuropeptides, with conservation, acquisition, or loss of this motif occurring in different branches of the animal kingdom.
Motif Clustering and Overlapping Clustering for Social Network Analysis
Motivated by applications in social network community analysis, we introduce
a new clustering paradigm termed motif clustering. Unlike classical clustering,
motif clustering aims to minimize the number of clustering errors associated
with both edges and certain higher order graph structures (motifs) that
represent "atomic units" of social organizations. Our contributions are
two-fold: We first introduce motif correlation clustering, in which the goal is
to agnostically partition the vertices of a weighted complete graph so that
certain predetermined "important" social subgraphs mostly lie within the same
cluster, while "less relevant" social subgraphs are allowed to lie across
clusters. We then proceed to introduce the notion of motif covers, in which the
goal is to cover the vertices of motifs via the smallest number of (near)
cliques in the graph. Motif cover algorithms provide a natural solution for
overlapping clustering and they also play an important role in latent feature
inference of networks. For both motif correlation clustering and its extension
introduced via the covering problem, we provide hardness results, algorithmic
solutions and community detection results for two well-studied social networks
- …
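The abstract describes motif correlation clustering as minimizing clustering errors on both edges and higher order structures, but the excerpt does not give the exact objective. A minimal sketch of one way such an error count could look is below. It assumes that each edge and each triangle carries a label saying whether it "should" lie within a single cluster, and that a disagreement between label and assignment counts as one error. The function name, the label encoding, and the toy data are hypothetical.

```python
def clustering_errors(assignment, edge_labels, triangle_labels):
    """Count disagreements between a clustering and edge/triangle labels.

    assignment: dict mapping vertex -> cluster id.
    edge_labels: dict mapping (u, v) -> True if the edge should be
        inside one cluster, False if it should cross clusters.
    triangle_labels: dict mapping (u, v, w) -> same convention, applied
        to the whole triangle (an assumption made for this sketch).
    """
    errors = 0
    for (u, v), positive in edge_labels.items():
        together = assignment[u] == assignment[v]
        if positive != together:
            errors += 1
    for (u, v, w), positive in triangle_labels.items():
        together = assignment[u] == assignment[v] == assignment[w]
        if positive != together:
            errors += 1
    return errors


# Toy instance (made up): four vertices, three labeled edges, two labeled triangles.
edges = {("a", "b"): True, ("a", "d"): True, ("c", "d"): False}
triangles = {("a", "b", "c"): True, ("a", "b", "d"): False}

split = {"a": 0, "b": 0, "c": 0, "d": 1}   # d separated from the rest
merged = {"a": 0, "b": 0, "c": 0, "d": 0}  # everything in one cluster
```

Here `clustering_errors(split, edges, triangles)` counts one error (the positive edge between a and d is cut), while the single-cluster assignment counts two (the negative edge and the negative triangle both end up inside one cluster). An optimization procedure would search over assignments to minimize this count.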