Search CORE

15 research outputs found

Finding Approximate Tandem Repeats in Genomic Sequences

Author: Dan Geiger
Ydo Wexler
Yechezkel Kashi
Zohar Yakhini
Publication venue: Mary Ann Liebert Inc

TRStalker: an Efficient Heuristic for Finding NP-Complete Tandem Repeats

Author: Pellegrini Marco
Renda Maria Elena
Vecchio Alessio

Genomic sequences in higher eukaryotic organisms contain a substantial amount of (almost) repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that originate via phenomena such as replication slippage, are characterized by close spatial contiguity, and play an important role in several molecular regulatory mechanisms. Certain types of tandem repeats are highly polymorphic and constitute a fingerprint feature of individuals. Abnormal TRs are known to be linked to several diseases. Over the last 20 years, researchers in bioinformatics have proposed many formal definitions for the rather loose notion of a Tandem Repeat and have proposed exact or heuristic algorithms to detect TRs in genomic sequences. The general trend has been to use formal (implicit or explicit) definitions of TR for which verifying a solution is easy (with complexity linear or polynomial in the TR's length and substitution+indel rates), while the effort was directed towards efficiently identifying the substrings of the input to submit to the verification phase (either implicitly or explicitly). In this paper we take a step forward: we use a definition of TR for which the verification step is also difficult (in effect, NP-complete) and we develop new filtering techniques for coping with high error levels. The resulting heuristic algorithm, christened TRStalker, is approximate, since it cannot guarantee that all NP-Complete Tandem Repeats satisfying the target definition in the input string will be found. However, in synthetic experiments with 30% of errors allowed, TRStalker has demonstrated a very high recall (ranging from 100% to 60%, depending on motif length and repetition number) for the NP-complete TRs. TRStalker has consistently better performance than some state-of-the-art methods for a large range of parameters on the class of NP-complete Tandem Repeats.
TRStalker aims at improving the capability of TR detection for classes of TRs for which existing methods do not perform well.

PUblication MAnagement

TRStalker: an efficient heuristic for finding fuzzy tandem repeats

Author: Alessio Vecchio
M. Elena Renda
Marco Pellegrini
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that originate via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While the current methods are rather effective for TRs with a low or medium level of divergence, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. Detecting fuzzy TRs is a prerequisite for enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events.

CiteSeerX

Crossref

PubMed Central

Archivio della Ricerca - Università di Pisa

TReaDS: Tandem Repeats Discovery Service

Author: Pellegrini Marco
Renda Maria Elena
Vecchio Alessio

Tandem repeats (TRs) are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). The analysis of TRs is an important genetic profiling technique. In fact, TRs can be used, for instance, to detect evolutionary phenomena in populations, to identify the cause of several diseases, and to help in determining parentage. There are several web-based resources and downloadable packages for finding TRs, but such tools rarely give exactly the same result for a given input. Thus, biologists could be interested in a tool that not only lets them query multiple systems at the same time, but also eases the burden of comparing and merging the results. TReaDS (Tandem Repeats Discovery Service) is a tandem repeat meta search engine that finds exact, approximate, short and long TRs. TReaDS queries several web-based tools and merges their outcome into a single report, providing a global, synthetic, and comparative view of the different results. Availability: TReaDS, the Tandem Repeats Discovery Service, is a web application free and open to all users without login requirement at the following URL: http://bioalgo.iit.cnr.it/treads
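
To make the definition above concrete, the following is a minimal sketch of a naive scanner for exact, strictly contiguous tandem repeats. It is not the method used by TReaDS or by any of the tools it queries; the function name `find_exact_tandem_repeats` and its parameters `min_copies` and `max_period` are hypothetical, and mutated (approximate) repeats are deliberately out of scope.

```python
from typing import List, Tuple


def find_exact_tandem_repeats(
    seq: str, min_copies: int = 2, max_period: int = 6
) -> List[Tuple[int, int, int]]:
    """Return (start, period, copies) for exact contiguous tandem repeats.

    Illustrative assumption: only perfect copies are counted, so repeats
    with substitutions, insertions or deletions are not reported.
    A run such as "AAAA" is reported once per period that fits it
    (period 1 with 4 copies and period 2 with 2 copies).
    """
    hits = []
    n = len(seq)
    for period in range(1, max_period + 1):
        i = 0
        # At least two full units must fit starting at position i.
        while i + 2 * period <= n:
            unit = seq[i:i + period]
            copies = 1
            # Count how many further copies of the unit follow directly.
            while seq[i + copies * period:i + (copies + 1) * period] == unit:
                copies += 1
            if copies >= min_copies:
                hits.append((i, period, copies))
                i += copies * period  # jump past the whole repeat
            else:
                i += 1
    return hits
```

For example, with `min_copies=3` and `max_period=3`, the string `"TTACGACGACGTT"` yields the single hit `(2, 3, 3)`: the motif `ACG` starting at index 2 and repeated three times.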

PUblication MAnagement

VNTRDB: a bacterial variable number tandem repeat locus database

Author: Chang Chia-Hung
Chang Yu-Chung
Chiou Chien-Shun
Kao Cheng-Yan
Underwood Anthony
Publication venue: Oxford University Press
Publication date: 14/12/2006

Variable number tandem repeat-PCR (VNTR-PCR) is a novel method developed for molecular typing of microorganisms. This method has proven useful in epidemiological studies in medical microbiology. Although hundreds of bacterial genomes have been sequenced, variable number tandem repeats (TRs) derived from comparative genome analyses are scarce. This may hamper their application to the surveillance of bacteria in molecular epidemiology. Here, we present a freely accessible variable number tandem repeat database (VNTRDB) that is intended to be a resource for helping in the discovery of putatively polymorphic tandem repeat loci and to aid with assay design by providing the flanking sequences that can be used in subsequent PCR primer design. In order to reveal possible polymorphism, each TR locus was obtained by comparing the sequences between different sets of bacterial genera, species or strains. Through this comparison, TRs which are unique to a genus can also be identified. Moreover, a visualization tool is provided to ensure that the copy number and locus length of repeats are correct. The VNTRDB is available at

Crossref

PubMed Central

National Taiwan University Repository

NTRFinder: a software tool to find nested tandem repeats

Author: A. A. Matroud
C. P. Tuffley
M. D. Hendy
Publication venue: Oxford University Press

We introduce the software tool NTRFinder to search for a complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that NTRs can be used as phylogenetic and population markers. We have tested our algorithm on both real and simulated data, and present some real NTRs of interest. NTRFinder can be downloaded from http://www.maths.otago.ac.nz/~aamatroud/

Crossref

PubMed Central

A Monte Carlo Method for Assessing the Quality of Duplication-Aware Alignment Algorithms

Author: Bogliolo Alessandro
Freschi Valerio
Publication venue: Libertas Academica
Publication date: 01/01/2011

The increasing availability of high-throughput sequencing technologies poses several challenges concerning the analysis of genomic data. Within this context, duplication-aware sequence alignment that takes complex mutation events into account is regarded as an important problem, particularly in light of recent evolutionary bioinformatics research that highlighted the role of tandem duplications as one of the most important mutation events. Traditional sequence comparison algorithms do not take these events into account, resulting in alignments of poor biological significance, mainly because they assume statistical independence among contiguous residues. Several duplication-aware algorithms have been proposed in recent years, which differ either in the type of duplications they consider or in the methods adopted to identify and compare them. However, there is no solution which clearly outperforms the others, and no methods exist for assessing the reliability of the resulting alignments. This paper proposes a Monte Carlo method for assessing the quality of duplication-aware alignment algorithms and for driving the choice of the most appropriate alignment technique to be used in a specific context.

Archivio istituzionale della ricerca - Università di Urbino

Crossref

Directory of Open Access Journals

PubMed Central

A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

Author: Bogliolo Alessandro
Freschi Valerio
Publication venue: Libertas Academica
Publication date: 01/01/2012

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at a time, since such events are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem, and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment.
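
The idea of iteratively collapsing repeats of increasing length can be illustrated with a short sketch. This is not the authors' compression scheme, whose details the abstract does not give; it assumes exact repeats only, and the names `collapse_period`, `collapse_tandem_repeats` and `max_period` are hypothetical.

```python
def collapse_period(seq: str, period: int) -> str:
    """Replace each run of exact tandem copies of a `period`-long unit by one copy."""
    out = []
    i = 0
    n = len(seq)
    while i < n:
        unit = seq[i:i + period]
        # A repeat starts here if a full unit is immediately followed by itself.
        if len(unit) == period and seq[i + period:i + 2 * period] == unit:
            out.append(unit)
            i += period
            # Skip every further contiguous copy of the unit.
            while seq[i:i + period] == unit:
                i += period
        else:
            out.append(seq[i])
            i += 1
    return "".join(out)


def collapse_tandem_repeats(seq: str, max_period: int) -> str:
    """Collapse exact tandem repeats, shortest periods first (lossy)."""
    for period in range(1, max_period + 1):
        seq = collapse_period(seq, period)
    return seq
```

With `max_period=2`, `"AAACGCGCGT"` first becomes `"ACGCGCGT"` (period 1) and then `"ACGT"` (period 2). The copy counts are discarded, which is what makes the representation lossy; the collapsed strings could then be passed to a conventional alignment routine.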

Archivio istituzionale della ricerca - Università di Urbino

Crossref

Directory of Open Access Journals

PubMed Central

HERRAMIENTA WEB PARA LA CLASIFICACIÓN DE MICROSATÉLITES POLIMÓRFICOS EN GENOMAS BACTERIANOS (Web Tool for the Classification of Polymorphic Microsatellites in Bacterial Genomes)

Author: Dra. Yordanka Cuza Ferrer
Ing. Yinette Wisdom Viña
MsC. Carlos M. Martínez Ortiz
MsC. Miguel Sautié Castellanos
Publication venue: ECIMED
Publication date: 01/09/2015

Tandem repeat sequences, specifically mini- and microsatellites, have proven very effective in the classification of pathogenic bacteria such as B. anthracis, M. tuberculosis and P. aeruginosa, among others. In humans their involvement is evident: they are associated with more than eighty diseases, many of them neurodegenerative or muscular disorders, as well as some types of cancer. The web tool we present is the result of the computational detection of these sequences in complete bacterial genomes and of their annotation within the genomic structure according to the regions where they are located. The primary aim of the tool is to provide a relational system that allows the researcher to locate the microsatellites of different bacterial species with more than one sequenced genome, in order to infer their possible polymorphic character within the context of the genomic structure, and thus to provide a first approximation of the putative role that microsatellites play from a functional point of view. The tool can be applied not only in taxonomic and epidemiological studies but also in detecting possible relationships between these sequences and molecular functions, biological processes and, ultimately, the various forms of evolution of these species. The website offers a query service over the bacterial microsatellite database according to its system of relational tables and their attributes. It also provides the services typical of a site of this kind, such as an authentication system, a forum, surveys, links, and documentation on the methodology used and on the subject matter. KEYWORDS: Microsatellites, Tandem Repeats, Bacteria, Database System

Directory of Open Access Journals

Probabilistic approaches to alignment with tandem repeats

Author: Broňa Brejová
Michal Nánási
Tomáš Vinař
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Springer - Publisher Connector