
    Comparative Methods for Gene Structure Prediction in Homologous Sequences

    The increasing number of sequenced genomes motivates the use of evolutionary patterns to detect genes. We present a series of comparative methods for gene finding in homologous prokaryotic or eukaryotic sequences. Based on a model of legal genes and a similarity measure between genes, we find the pair of legal genes of maximum similarity. We develop methods based on gene models and alignment-based similarity measures of increasing complexity, which take into account many details of real gene structures, e.g. the similarity of the proteins encoded by the exons. When using a similarity measure based on an existing alignment, the methods run in linear time. When the alignment and prediction processes are integrated, which allows for more fine-grained similarity measures, the methods run in quadratic time. We evaluate the methods in a series of experiments on synthetic and real sequence data, which show that all methods are competitive but that taking the similarity of the encoded proteins into account really boosts the performance.

    Analysis of computational approaches for motif discovery

    Recently, we performed an assessment of 13 popular computational tools for discovery of transcription factor binding sites (M. Tompa, N. Li, et al., "Assessing Computational Tools for the Discovery of Transcription Factor Binding Sites", Nature Biotechnology, Jan. 2005). This paper contains follow-up analysis of the assessment results, and raises and discusses some important issues concerning the state of the art in motif discovery methods: 1. We categorize the objective functions used by existing tools, and design experiments to evaluate whether any of these objective functions is the right one to optimize. 2. We examine various features of the data sets that were used in the assessment, such as sequence length and motif degeneracy, and identify which features make data sets hard for current motif discovery tools. 3. We identify an important feature that has not yet been used by existing tools and propose a new objective function that incorporates this feature.

    Selenoprofiles: profile-based scanning of eukaryotic genome sequences for selenoprotein genes

    Motivation: Selenoproteins are a group of proteins that contain selenocysteine (Sec), a rare amino acid inserted co-translationally into the protein chain. The Sec codon is UGA, which is normally a stop codon. In selenoproteins, UGA is recoded to Sec in the presence of specific features on selenoprotein gene transcripts. Due to the dual role of the UGA codon, selenoprotein prediction and annotation are difficult tasks, and even known selenoproteins are often misannotated in genome databases.

    SVM clustering


    Emergence of Superlattice Dirac Points in Graphene on Hexagonal Boron Nitride

    The Schrödinger equation dictates that the propagation of nearly free electrons through a weak periodic potential results in the opening of band gaps near points of the reciprocal lattice known as Brillouin zone boundaries. However, in the case of massless Dirac fermions, it has been predicted that the chirality of the charge carriers prevents the opening of a band gap and instead new Dirac points appear in the electronic structure of the material. Graphene on hexagonal boron nitride (hBN) exhibits a rotation dependent Moiré pattern. In this letter, we show experimentally and theoretically that this Moiré pattern acts as a weak periodic potential and thereby leads to the emergence of a new set of Dirac points at an energy determined by its wavelength. The new massless Dirac fermions generated at these superlattice Dirac points are characterized by a significantly reduced Fermi velocity. The local density of states near these Dirac cones exhibits hexagonal modulations indicating an anisotropic Fermi velocity. Comment: 16 pages, 6 figures.

    Fusion of the human gene for the polyubiquitination coeffector UEV1 with Kua, a newly identified gene

    UEV proteins are enzymatically inactive variants of the E2 ubiquitin-conjugating enzymes that regulate noncanonical elongation of ubiquitin chains. In Saccharomyces cerevisiae, UEV is part of the RAD6-mediated error-free DNA repair pathway. In mammalian cells, UEV proteins can modulate c-FOS transcription and the G2-M transition of the cell cycle. Here we show that the UEV genes from phylogenetically distant organisms present a remarkable conservation in their exon-intron structure. We also show that the human UEV1 gene is fused with the previously unknown gene Kua. In Caenorhabditis elegans and Drosophila melanogaster, Kua and UEV are in separate loci, and are expressed as independent transcripts and proteins. In humans, Kua and UEV1 are adjacent genes, expressed either as separate transcripts encoding independent Kua and UEV1 proteins, or as a hybrid Kua-UEV transcript, encoding a two-domain protein. Kua proteins represent a novel class of conserved proteins with juxtamembrane histidine-rich motifs. Experiments with epitope-tagged proteins show that UEV1A is a nuclear protein, whereas both Kua and Kua-UEV localize to cytoplasmic structures, indicating that the Kua domain determines the cytoplasmic localization of Kua-UEV. Therefore, the addition of a Kua domain to UEV in the fused Kua-UEV protein confers new biological properties to this regulator of variant polyubiquitination.

    Expression of ribosomal protein L22e family members in Drosophila melanogaster: rpL22-like is differentially expressed and alternatively spliced

    Several ribosomal protein families contain paralogues whose roles may be equivalent or specialized to include extra-ribosomal functions. RpL22e family members rpL22 and rpL22-like are differentially expressed in Drosophila melanogaster: rpL22-like mRNA is gonad specific whereas rpL22 is expressed ubiquitously, suggesting distinctive paralogue functions. To determine if RpL22-like has a divergent role in gonads, rpL22-like expression was analysed by qRT-PCR and western blots, respectively, showing enrichment of rpL22-like mRNA and a 34 kDa (predicted) protein in testis, but not in ovary. Immunohistochemistry of the reproductive tract corroborated testis-specific expression. RpL22-like detection in 80S/polysome fractions from males establishes a role for this tissue-specific paralogue as a ribosomal component. Unexpectedly, expression profiles revealed a low-abundance, alternative mRNA variant (designated ‘rpL22-like short’) that would encode a novel protein lacking the C-terminal ribosomal protein signature but retaining part of the N-terminal domain. This variant results from splicing of a retained intron (defined by non-canonical splice sites) within rpL22-like mRNA. Polysome association and detection of a low-abundance 13.5 kDa (predicted) protein in testis extracts suggest variant mRNA translation. Collectively, our data show that alternative splicing of rpL22-like generates structurally distinct protein products: ribosomal component RpL22-like and a novel protein with a role distinct from RpL22-like.

    SEARCHPATTOOL: a new method for mining the most specific frequent patterns for binding sites with application to prokaryotic DNA sequences

    Background: Computational methods to predict transcription factor binding sites (TFBS) based on exhaustive algorithms are guaranteed to find the best patterns but are often limited to short ones or impose some constraints on the pattern type. Many patterns for binding sites in prokaryotic species are not well characterized but are known to be large, between 16 and 30 base pairs (bp), and contain at least 2 conserved bases. The length of prokaryotic species promoters (about 400 bp) and our interest in studying a small set of genes that could be a cluster of co-regulated genes from microarray experiments led to the development of a new exhaustive algorithm targeting these large patterns. Results: We present Searchpattool, a new method to search for and select the most specific (conservative) frequent patterns. This method does not impose restrictions on the density or the structure of the pattern. The best patterns (motifs) are selected using several statistics, including a new application of a z-score based on the number of matching sequences. We compared Searchpattool against other well known algorithms on a Bacillus subtilis group of 14 input sequences and found that in our experiments Searchpattool always performed the best based on performance scores. Conclusion: Searchpattool is a new method for pattern discovery relative to transcription factor binding sites for species or genes with short promoters. It outputs the most specific significant patterns and helps the biologist to choose the best candidates.