17 research outputs found

    InSatDb: a microsatellite database of fully sequenced insect genomes

    InSatDb presents an interactive interface for querying the microsatellite characteristics of five fully sequenced insect genomes (fruit fly, honeybee, malarial mosquito, red flour beetle and silkworm). InSatDb allows users to obtain microsatellites annotated with size (in base pairs and repeat units); genomic location (exon, intron, upstream or transposon); nature (perfect or imperfect); and sequence composition (repeat motif and GC%). Users can also access microsatellite cluster (compound repeat) information and a list of microsatellites with conserved flanking sequences (microsatellite families or paralogs). InSatDb also includes information on the insects, web links to further details, the methodology and a tutorial. A separate ‘Analysis’ section illustrates the comparative genomic analysis that can be carried out using the output. InSatDb is available at
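The annotations listed in this abstract (size in base pairs and repeat units, repeat motif, GC%) can be illustrated with a small scan for perfect microsatellites. This is only a sketch under stated assumptions, not InSatDb's actual pipeline: the function name, the motif length range of 1 to 6 bp, and the minimum of three copies are all hypothetical choices made for the example.

```python
import re

def find_perfect_microsatellites(seq, min_motif=1, max_motif=6, min_repeats=3):
    """Return perfect tandem repeats of short motifs found in seq.

    All thresholds are illustrative assumptions, not values taken from InSatDb.
    """
    seq = seq.upper()
    # The lazy group captures the shortest motif that works; the backreference
    # \1{n,} then requires at least (min_repeats - 1) further exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        tract = match.group(0)
        hits.append({
            "start": match.start(),                      # 0-based position
            "motif": motif,                              # repeat motif
            "size_bp": len(tract),                       # size in base pairs
            "repeat_units": len(tract) // len(motif),    # size in repeat units
            "gc_percent": 100.0 * (tract.count("G") + tract.count("C")) / len(tract),
        })
    return hits

hits = find_perfect_microsatellites("TTGACACACACATTG")
```

For the toy input above, the scan reports a single (AC)4 tract of 8 bp. Imperfect repeats, compound repeats, and genomic location would each need additional logic that this sketch does not attempt.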

    Improving accuracy of gene prediction programs of the GeneMark family by means of genome segmentation

    Issued as final report. National Institutes of Health (U.S.)

    TReaDS: Tandem Repeats Discovery Service

    Tandem repeats (TRs) are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). The analysis of TRs is an important genetic profiling technique: TRs can be used, for instance, to detect evolutionary phenomena in populations, to identify the cause of several diseases, and to help in determining parentage. There are several web-based resources and downloadable packages for finding TRs, but such tools rarely give exactly the same result for a given input. Biologists may therefore be interested in a tool that not only lets them query multiple systems at the same time but also eases the burden of comparing and merging the results. TReaDS (Tandem Repeats Discovery Service) is a tandem repeat meta search engine that finds exact, approximate, short and long TRs. TReaDS queries several web-based tools and merges their outcome into a single report, providing a global, synthetic, and comparative view of the different results. Availability: TReaDS is a web application, free and open to all users without login requirement, at the following URL: http://bioalgo.iit.cnr.it/treads
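The merging step described in this abstract can be pictured as combining overlapping intervals reported by different tools while recording which tools support each region. The following is a minimal sketch of that idea only; the function name, the tool names, and the (start, end) interval format are assumptions for illustration and do not reflect how TReaDS is actually implemented.

```python
def merge_tr_reports(reports):
    """Merge per-tool tandem repeat intervals into combined regions.

    `reports` maps a (hypothetical) tool name to a list of half-open
    (start, end) intervals. Returns a list of (start, end, tools) tuples,
    where `tools` is the sorted list of tools that reported the region.
    """
    tagged = sorted(
        (start, end, tool)
        for tool, intervals in reports.items()
        for start, end in intervals
    )
    merged = []
    for start, end, tool in tagged:
        if merged and start < merged[-1][1]:
            # Interval overlaps the current region: extend it and add the tool.
            prev_start, prev_end, tools = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), tools | {tool})
        else:
            # No overlap: open a new region.
            merged.append((start, end, {tool}))
    return [(start, end, sorted(tools)) for start, end, tools in merged]

report = merge_tr_reports({
    "tool_a": [(10, 40), (100, 130)],
    "tool_b": [(35, 60)],
    "tool_c": [(200, 220)],
})
```

Here the overlapping hits from `tool_a` and `tool_b` collapse into one region spanning 10 to 60, which makes disagreements between tools easy to spot. A real comparison would also need to reconcile differing motifs, copy numbers, and scores.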

    Trinucleotide repeat diseases - anticipation diseases

    Dynamic mutations involve expansion of the number of repeat units consisting of three or more nucleotides in tandem (i.e. adjacent to one another) present in a gene or in its neighborhood. These repeats may occur in different genes and may code for different amino acids. Depending on the expansion size, unaffected individuals may be carriers of a pre-mutation. Instability of triplet repeat size can lead to gradual expansion through generations, a phenomenon called anticipation. Genetic anticipation is characterized by a reduction in the age of disease onset and by a worsening of symptoms in affected individuals in successive generations. This work describes dynamic mutations, with emphasis on triplet repeat diseases and their parallel with disease anticipation. Treatment strategies developed in recent years are also discussed.

    Experimental DNA- or RNA-Directed Therapies for Trinucleotide Repeat Disease

    Some repeats of three or more nucleotides in tandem, which are present in a gene or in its vicinity, tend to increase in number and for this reason are called dynamic mutations. These triplet repeats are unstable and can expand from one generation to the next. According to the expansion size, an unaffected individual can carry a pre-mutation that will expand through generations leading to the development of triplet repeat expansion diseases. The increase in the number of repeats over time leads to earlier development and increased severity of symptoms in affected individuals in successive generations. Although there is still no treatment for this type of disease, several strategies are under investigation. Here, we describe treatment approaches for triplet repeat expansion diseases that have been developed over recent years, using DNA or RNA molecules as targets. Some of these strategies have the potential for future use in gene therapy for trinucleotide repeat disorders.

    MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters

    Open-access publication funded by: Universidade da Coruña/CISUG. [Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, with each copy adjacent to the next. In the last few years, TRs have gained significant attention as they are thought to be related to certain human diseases. Identifying and classifying TRs has therefore become a highly important task in bioinformatics for analyzing their relationships with disorders and illnesses. Dot2dot, a tool recently developed to find TRs, provides more accurate results than the previous state of the art, but it requires a long execution time even when using multiple threads. This work presents MPI-dot2dot, a novel version of this tool that combines MPI and OpenMP so that it can be executed on a cluster of multicore nodes and thus reduce its execution time. The performance of this new parallel implementation has been tested using different real datasets. It obtains the same biological results as Dot2dot and, depending on the characteristics of the input genomes, runs more than 100 times faster on a 16-node multicore cluster (384 cores). MPI-dot2dot is publicly available to download from https://sourceforge.net/projects/mpi-dot2dot. This work was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00 / AEI / 10.13039/501100011033), and by Xunta de Galicia and FEDER funds (Centro de Investigación de Galicia accreditation 2019-2022 and Consolidation Program of Competitive Reference Groups, under Grants ED431G 2019/01 and ED431C 2021/30, respectively). The authors would like to thank the Galician Supercomputing Center (CESGA) for providing access to the Finis Terrae II supercomputer. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

    Ribonucleocapsid assembly/packaging signals in the genomes of the coronaviruses SARS-CoV and SARS-CoV-2: Detection, comparison and implications for therapeutic targeting

    The genomic ssRNA of coronaviruses is packaged within a helical nucleocapsid. Due to transitional symmetry of a helix, weakly specific cooperative interaction between ssRNA and nucleocapsid proteins leads to the natural selection of specific quasi-periodic assembly/packaging signals in the related genomic sequence. Such signals, coordinated with the nucleocapsid helical structure, were detected and reconstructed in the genomes of the coronaviruses SARS-CoV and SARS-CoV-2. The main period of the signals for both viruses was about 54 nt, which implies 6.75 nt per N protein. Complete coverage of an ssRNA genome of about 30,000 nt by the nucleocapsid would need 4,400 N proteins, which makes them the most abundant among the structural proteins. The repertoires of motifs for SARS-CoV and SARS-CoV-2 were divergent but nearly coincided for different isolates of SARS-CoV-2. We obtained the distributions of assembly/packaging signals over the genomes with non-overlapping windows of width 432 nt. Finally, using the spectral entropy, we compared the load from point mutations and indels during virus age for SARS-CoV and SARS-CoV-2. We found a higher mutational load on SARS-CoV. In this sense, SARS-CoV-2 can be treated as a "newborn" virus. These observations may be helpful in practical medical applications and are of basic interest. Comment: 31 pages, 6 figures, 3 tables
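The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The input values below are taken from the abstract; the derived quantities (N proteins per period and periods per window) are inferences from those numbers and are not stated by the authors.

```python
# Values quoted in the abstract.
period_nt = 54            # main period of the packaging signal
nt_per_n_protein = 6.75   # nucleotides covered by one N protein
genome_nt = 30_000        # approximate genome length
window_nt = 432           # width of the non-overlapping analysis windows

# Derived quantities (inferred here, not stated in the abstract).
proteins_per_period = period_nt / nt_per_n_protein
proteins_for_genome = genome_nt / nt_per_n_protein
periods_per_window = window_nt / period_nt

print(proteins_per_period, round(proteins_for_genome), periods_per_window)
```

Under these numbers one period corresponds to 8 N proteins, each 432 nt window spans 8 periods, and full coverage needs roughly 4,444 proteins, consistent with the rounded figure of 4,400 in the abstract.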

    TRStalker: an Efficient Heuristic for Finding NP-Complete Tandem Repeats

    Genomic sequences in higher eukaryotic organisms contain a substantial amount of (almost) repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that originate via phenomena such as replication slippage, are characterized by close spatial contiguity, and play an important role in several molecular regulatory mechanisms. Certain types of tandem repeats are highly polymorphic and constitute a fingerprint feature of individuals. Abnormal TRs are known to be linked to several diseases. Over the last 20 years, researchers in bioinformatics have proposed many formal definitions for the rather loose notion of a Tandem Repeat, along with exact or heuristic algorithms to detect TRs in genomic sequences. The general trend has been to use formal (implicit or explicit) definitions of TR for which verification of the solution was easy (with complexity linear or polynomial in the TR's length and substitution+indel rates), while the effort was directed towards efficiently identifying the substrings of the input to submit to the verification phase (either implicitly or explicitly). In this paper we take a step forward: we use a definition of TR for which the verification step is also difficult (in effect, NP-complete) and we develop new filtering techniques for coping with high error levels. The resulting heuristic algorithm, christened TRStalker, is approximate since it cannot guarantee that all NP-complete Tandem Repeats satisfying the target definition in the input string will be found. However, in synthetic experiments with 30% of errors allowed, TRStalker has demonstrated a very high recall (ranging from 100% to 60%, depending on motif length and repetition number) for the NP-complete TRs. TRStalker has consistently better performance than some state-of-the-art methods for a large range of parameters on the class of NP-complete Tandem Repeats.
    TRStalker aims at improving the capability of TR detection for classes of TRs for which existing methods do not perform well.
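To make the "easy verification" case mentioned in this abstract concrete, the sketch below checks a candidate region against a known motif in linear time, counting substitutions only. It is deliberately not TRStalker's NP-complete definition or its filtering heuristic: the function names, the substitution-only error model, and the reuse of 30% as a default threshold are assumptions for illustration.

```python
def hamming_error_rate(region, motif):
    """Fraction of positions where region differs from motif tiled end to end."""
    if not region or not motif:
        raise ValueError("region and motif must be non-empty")
    mismatches = sum(
        1 for i, base in enumerate(region) if base != motif[i % len(motif)]
    )
    return mismatches / len(region)

def is_approximate_tr(region, motif, max_error=0.30, min_copies=2):
    """Accept region as an approximate TR of motif under a substitution-only model.

    max_error and min_copies are hypothetical thresholds chosen for this example.
    """
    copies = len(region) / len(motif)
    return copies >= min_copies and hamming_error_rate(region, motif) <= max_error

rate = hamming_error_rate("ACGTACGAACGT", "ACGT")
accepted = is_approximate_tr("ACGTACGAACGT", "ACGT")
rejected = is_approximate_tr("ACGTTTTTGGGG", "ACGT")
```

The first region has one substitution in twelve positions and is accepted, while the second differs at half of its positions and is rejected. Once insertions and deletions are allowed and the motif is unknown, verification becomes much harder, which is the regime the abstract addresses.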

    Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases

    Background: Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been extensively studied for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies on the output of different tools for finding tandem repeats highlighted significant differences among the sets of detected tandem repeats, while many authors pointed out how critical the right choice of parameters is.
    Results: In this paper we present TReaDS - Tandem Repeats Discovery Service, a tandem repeat meta search engine. TReaDS forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their outcome into a single report, providing a global, synthetic, and comparative view of the results. In particular, TReaDS allows the user to (i) simultaneously run different algorithms on the same data set, (ii) choose a different parameter setting for each algorithm, and (iii) obtain a report that can be downloaded for further off-line investigation. We used TReaDS to investigate sequences associated with repeat expansion diseases.
    Conclusions: Using TReaDS, we discover that, for 27 repeat expansion diseases out of a currently known set of 29, long fuzzy tandem repeats cover the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats can be a new class of cis-acting elements involved in the mechanisms leading to the expansion instability. We strongly believe that biologists can be interested in a tool that not only lets them use multiple search algorithms at the same time, with the same effort required to use just one of the systems, but also eases the burden of comparing and merging the results, thus expanding our capabilities in detecting important phenomena related to tandem repeats.