
13 research outputs found

PLOTREP: a web tool for defragmentation and visual analysis of dispersed genomic repeats

Author: Barta Endre
Deák Gábor
Kiss György B.
Tóth Gábor
Publication venue: Oxford University Press
Publication date: 01/01/2006

Identification of dispersed or interspersed repeats, most of which are derived from transposons, retrotransposons or retrovirus-like elements, is an important step in genome annotation. Software tools that compare genomic sequences with precompiled repeat reference libraries using sensitive similarity-based methods provide reliable means of finding the positions of fragments homologous to known repeats. However, their output is often incomplete and fragmented owing to the mutations (nucleotide substitutions, deletions or insertions) that can result in considerable divergence from the reference sequence. Merging these fragments to identify the whole region that represents an ancient copy of a mobile element is challenging, particularly if the element is large and suffered multiple deletions or insertions. Here we report PLOTREP, a tool designed to post-process results obtained by sequence similarity search and merge fragments belonging to the same copy of a repeat. The software allows rapid visual inspection of the results using a dot-plot like graphical output. The web implementation of PLOTREP is available at
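The fragment-merging idea described in this abstract can be illustrated with a minimal sketch. This is not PLOTREP's algorithm: the `Hit` record, the `merge_fragments` function, the `max_gap` threshold and the sample coordinates are all hypothetical, and the sketch assumes every hit lies on the same strand. It groups hits to the same reference repeat when they are close together on the genome and appear in the same order on the reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    repeat: str   # name of the reference repeat (hypothetical field names)
    q_start: int  # start on the genomic sequence
    q_end: int    # end on the genomic sequence
    r_start: int  # start on the reference repeat
    r_end: int    # end on the reference repeat


def merge_fragments(hits: List[Hit], max_gap: int = 500) -> List[List[Hit]]:
    """Group hits that plausibly belong to one copy of a repeat.

    Assumption: two hits belong together if they match the same repeat,
    lie within max_gap bases of each other on the genome, and are
    collinear on the reference (the later hit does not start earlier).
    """
    groups: List[List[Hit]] = []
    for hit in sorted(hits, key=lambda h: (h.repeat, h.q_start)):
        last = groups[-1][-1] if groups else None
        if (last is not None
                and last.repeat == hit.repeat
                and hit.q_start - last.q_end <= max_gap
                and hit.r_start >= last.r_start):
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return groups


# Made-up coordinates, for illustration only.
hits = [
    Hit("L1", 100, 300, 1, 200),
    Hit("L1", 350, 600, 250, 500),
    Hit("L1", 5000, 5200, 1, 200),
    Hit("Alu", 620, 700, 1, 80),
]
groups = merge_fragments(hits)
for group in groups:
    print(group[0].repeat, group[0].q_start, group[-1].q_end, len(group))
```

A fixed gap threshold is the simplest possible choice; a real tool would likely need to handle strand, nested insertions and large internal deletions, which this sketch ignores.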

CiteSeerX

Crossref

University of Debrecen Electronic Archive

PubMed Central

Bioinformatics: Strategies, Trends, and Perspectives

Author: Adriane Beatriz de Souza Serapião
Carlos Norberto Fischer
Publication venue: IntechOpen
Publication date: 01/03/2010

IntechOpen


Transposable Element Abundance and Variability in 28 Different Species in the Family Solanaceae

Author: Mendieta John P
Publication venue: CU Scholar
Publication date: 01/01/2015

Transposable Elements (TEs) are small nucleic acid parasites that replicate and reinsert themselves into the genome of their host organism. These small genetic parasites have in recent times been seen as possible evolutionary drivers in the development and evolution of genomic adaptations as well as genomic architecture. While much is known about the possible effects of TEs on an individual organism, little is known about their dynamics on a family-level scale. To investigate this relationship, TE types and abundances were analyzed for 28 species in the highly diverse plant family Solanaceae. Transposable Elements were identified and investigated by running the program RepeatExplorer on whole genome shotgun data sets from 28 different species in the Physaleae and Solaneae tribes of the Solanaceae family. I identified the genomic proportion of repetitive elements in all species and found that, on a family level, two TE types, LTR gypsy and unclassified repetitive content, were the most abundant for all species. On a family level, Class II TEs were found to be far less numerous in genomic proportion, but were far more variable on an individual level. These results indicated that while LTR gypsy and unclassified TEs are more important for long-term genomic dynamics, Class II TEs act more significantly in the short term. Clade membership also appears to be related to TE abundance, with more closely related species having similar genomic percentages of TEs, but because the phylogeny lacked branch lengths I was unable to calculate this metric. Finally, while these results are interesting, there is currently no all-encompassing biological explanation as to exactly why these family-level genomic trends are being exhibited.

CU Scholar Institutional Repository

T-lex: a program for fast and accurate assessment of transposable element presence using next-generation sequencing data

Author: Abad
Ackerman
Adams
Agrawal
Anna-Sophie Fiston-Lavier
Biemont
Buisine
Charlesworth
Cordaux
Craig
Dmitri A. Petrov
Du
Gonzalez
Gonzalez
Gonzalez
Josefa González
Juretic
Jurka
Kaminker
Kidwell
Kordis
Lander
Levis
Lexa
Li
Lockton
Matthew Carrigan
Naito
Naito
Petrov
Rozen
Rumble
Slotkin
Wang
Weigel
Wicker
Wicker
Yang
Publication venue: Oxford University Press
Publication date: 01/01/2011

Transposable elements (TEs) are repetitive DNA sequences that are ubiquitous, extremely abundant and dynamic components of practically all genomes. Much effort has gone into annotation of TE copies in reference genomes. The sequencing cost reduction and the newly available next-generation sequencing (NGS) data from multiple strains within a species offer an unprecedented opportunity to study population genomics of TEs in a range of organisms. Here, we present a computational pipeline (T-lex) that uses NGS data to detect the presence/absence of annotated TE copies. T-lex can use data from a large number of strains and returns estimates of population frequencies of individual TE insertions in a reasonable time. We experimentally validated the accuracy of T-lex detecting presence or absence of 768 previously identified TE copies in two resequenced Drosophila melanogaster strains. Approximately 95% of the TE insertions were detected with 100% sensitivity and 97% specificity. We show that even at low levels of coverage T-lex produces accurate results for TE copies that it can identify reliably but that the rate of ‘no data’ calls increases as the coverage falls below 15×. T-lex is a broadly applicable and flexible tool that can be used in any genome provided the availability of the reference genome, individual TE copy annotation and NGS data

Crossref

PubMed Central

Digital.CSIC

Organization and evolution of two SIDER retroposon subfamilies and their impact on the Leishmania genome

Author: Bringaud Frédéric
Papadopoulou Barbara
Smith Martin
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

Global Identification and Characterization of Transcriptionally Active Regions in the Rice Genome

Author: B Lehner
BJ Haas
Brian Dilkes
CM Farrell
DD Shoemaker
DW Selinger
H Matsumura
H Matsumura
H Wang
Hang He
HE Kauffman
J Cheng
J Yu
J Yu
Jan Korbel
JM Johnson
JM Rouillard
JT Lee
K Jabbari
K Yamada
L Li
L Li
Lei Li
M Nakano
M Ronemus
Mark Gerstein
N Jiang
N Juretic
N Juretic
N Kitagawa
N Osato
O Borsani
P Bertone
P Kapranov
Pamela Ronald
Q Feng
Q Yuan
R Yelin
Rajkumar Sasidharan
Runsheng Chen
S Kikuchi
S Singh-Gasson
S Washietl
SA Goff
SH Munroe
T Sasaki
TC Mockler
TE Royce
TR Hughes
V Stolc
Viktor Stolc
W Deng
Waraporn Tongprasit
Wei Deng
X Wang
Xiangfeng Wang
Xing Wang Deng
XJ Wang
Xuewei Chen
Z Lippman
Publication venue: Public Library of Science
Publication date: 01/01/2007

Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Combined Evidence Annotation of Transposable Elements in Genome Sequences

Author: Allen
Altschul
Altschul
Andrieu
Ashurst
Bao
Bedell
Benson
Biedler
Casey M. Bergman
Celniker
Chao
Danielle Nouaud
Delphine Autard
Ding
Dominique Anxolabehere
Durbin
Edgar
Gish
Gusfield
Haas
Hadi Quesneville
Hoskins
Juretic
Jurka
Kaminker
Kidwell
Kolpakov
Lander
Lewis
Locke
McCarthy
Meyerowitz
Michael Ashburner
Misra
Mungall
Olivier Andrieu
Potter
Price
Quesneville
Sagot
Wilder
Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2005

Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated “TE models” in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments of a few hundred nucleotides long and highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other species in the genus Drosophila.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

HAL-Inserm

PubMed Central

HAL Descartes

The University of Manchester - Institutional Repository

Hal-Diderot

Abundant Degenerate Miniature Inverted-Repeat Transposable Elements in Genomes of Epichloid Fungal Endophytes of Grasses

Author: Anar K. Khan
Barry Scott
Benjak
Benson
Bergemann
Blankenship
Bureau
Bureau
Bureau
Byrd
Cambareri
Carolyn A. Young
Christensen
Christopher L. Schardl
Chung
Damien J. Fleetwood
DeMarco
Dufresne
Eddy
Edgar
Feschotte
Feschotte
Fierro
Fleetwood
Fleetwood
Hua-Van
Jiang
Jiang
Juretic
Khaldi
Kidwell
Kurtz
Mao
Marek
Marek
Martin
Mieczkowski
Moon
Moon
Morgulis
Notredame
Ohmori
Ramussen
Richard D. Johnson
Ruth E. Wrenn
Santiago
Scott
Shaaban
Shipra Mittal
Simon J. Foster
Spanu
Sung
Tsai
Tu
Uljana Hesse
van Dongen
Waterhouse
Wicker
Yang
Yang
Yeadon
Young
Young
Young
Publication venue: Oxford University Press
Publication date: 01/01/2011

Miniature inverted-repeat transposable elements (MITEs) are abundant repeat elements in plant and animal genomes; however, there are few analyses of these elements in fungal genomes. Analysis of the draft genome sequence of the fungal endophyte Epichloë festucae revealed 13 MITE families that make up almost 1% of the E. festucae genome, and relics of putative autonomous parent elements were identified for three families. Sequence and DNA hybridization analyses suggest that at least some of the MITEs identified in the study were active early in the evolution of Epichloë but are not found in closely related genera. Analysis of MITE integration sites showed that these elements have a moderate integration site preference for 5′ genic regions of the E. festucae genome and are particularly enriched near genes for secondary metabolism. Copies of the EFT-3m/Toru element appear to have mediated recombination events that may have abolished synthesis of two fungal alkaloids in different epichloae. This work provides insight into the potential impact of MITEs on epichloae evolution and provides a foundation for analysis in other fungal genomes

Crossref

PubMed Central

University of Kentucky

“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

Author: Annabelle Haudry
Emmanuelle Lerat
Marc Bailly-Bechet
Publication venue: Springer Nature
Publication date: 01/05/2014

Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
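As a rough illustration of what reading a RepeatMasker .out file involves, the following minimal Python sketch parses the whitespace-separated columns and sums the bases annotated per class/family. It is not the perl tool described above and does not attempt its copy-reconstruction step. The function names `parse_out` and `bases_per_family` and the three sample lines are illustrative assumptions, and overlapping hits would be counted twice.

```python
from collections import defaultdict


def parse_out(text):
    """Parse RepeatMasker .out text into a list of dicts (hypothetical helper)."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Data lines start with an integer Smith-Waterman score;
        # the header lines and blank lines do not, so they are skipped.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        records.append({
            "query": fields[4],
            "begin": int(fields[5]),
            "end": int(fields[6]),
            # RepeatMasker writes "C" for hits on the complementary strand.
            "strand": "-" if fields[8] == "C" else "+",
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return records


def bases_per_family(records):
    """Sum annotated bases per class/family; overlapping hits are double-counted."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["class_family"]] += rec["end"] - rec["begin"] + 1
    return dict(totals)


# Illustrative sample in the .out column layout (values are for demonstration).
sample = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin  end   (left)  ID

 1306 15.6  6.2  0.0 HSU08988  6563  6781 (22462) C MER7A    DNA/MER2_type    (0)   336   103  1
12204 10.0  2.4  1.8 HSU08988  6782  7714 (21529) C TIGGER1  DNA/MER2_type    (0)  2418  1493  1
  279  3.0  0.0  0.0 HSU08988  7719  7751 (21492) + (TTTTA)n Simple_repeat      1    33   (0)  2
"""
records = parse_out(sample)
totals = bases_per_family(records)
print(totals)
```

Counting hits this way tends to overestimate copy number, which is the problem the abstract raises: several hits can belong to a single fragmented copy.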

Crossref

Springer - Publisher Connector

INRIA a CCSD electronic archive server

PubMed Central

Hal-Diderot

Tiling microarray analysis of rice chromosome 10 to identify the transcriptome and relate its expression to chromosomal architecture

Author: Deng Xing Wang
Li Lei
Li Songgang
Peng Zhiyu
Stolc Viktor
Su Ning
Tongprasit Waraporn
Wang Jun
Wang Xiangfeng
Wang Xiping
Xia Mian
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Sequencing and annotation of the genome of rice (Oryza sativa) have generated gene models in numbers that top all other fully sequenced species, with many lacking recognizable sequence homology to known genes. Experimental evaluation of these gene models and identification of new models will facilitate rice genome annotation and the application of this knowledge to other more complex cereal genomes. RESULTS: We report here an analysis of the chromosome 10 transcriptome of the two major rice subspecies, japonica and indica, using oligonucleotide tiling microarrays. This analysis detected expression of approximately three-quarters of the gene models without previous experimental evidence in both subspecies. Cloning and sequence analysis of the previously unsupported models suggests that the predicted gene structure of nearly half of those models needs improvement. Coupled with comparative gene model mapping, the tiling microarray analysis identified 549 new models for the japonica chromosome, representing an 18% increase in the annotated protein-coding capacity. Furthermore, an asymmetric distribution of genome elements along the chromosome was found that coincides with the cytological definition of the heterochromatin and euchromatin domains. The heterochromatin domain appears to associate with distinct chromosome-level transcriptional activities under normal and stress conditions. CONCLUSION: These results demonstrated the utility of genome tiling microarrays in evaluating annotated rice gene models and in identifying novel transcriptional units. The tiling microarray analysis further revealed a chromosome-wide transcription pattern that suggests a role for transposable element-enriched heterochromatin in shaping global transcription in response to environmental changes in rice.

Springer - Publisher Connector

PubMed Central

University of Southern Denmark Research Output