4 research outputs found

    Variability of the Pr77 sequence of L1Tc retrotransposon among six T. cruzi strains belonging to different discrete typing units (DTUs)

    All trypanosomatid genomes are colonized by non-LTR retrotransposons which exhibit a highly conserved 77-nt sequence at their 5′ ends, known as the Pr77-hallmark (Pr77). The wide distribution of Pr77 is expected to be related to the gene regulation processes in these organisms as it has promoter and HDV-like ribozyme activities at the DNA and RNA levels, respectively. Pr77 hallmark-bearing retrotransposons were identified, and the associations of mobile elements with relevant genes were studied, in the genomes of six strains of Trypanosoma cruzi belonging to different discrete typing units (DTUs) and with different geographical origins and hosts/vectors. The genomes were sequenced, assembled and annotated. BUSCO analyses indicated good quality for the assemblies that were used in comparative analyses. The results show differences among the six genomes in the copy number of genes related to virulence processes, the abundance of retrotransposons bearing the Pr77 sequence and the presence of Pr77 hallmarks not associated with retroelements. The analyses also show frequent associations of Pr77-bearing retrotransposons and single Pr77 hallmarks with genes coding for trans-sialidases, RHS, MASP or hypothetical proteins, in proportions that vary with the type of retroelement, gene class and parasite strain. These differences in the genomic distribution of active retroelements and other Pr77-containing elements have shaped the genome architecture of these six strains and might be contributing to the phenotypic variability existing among the

    Leishmania infantum (JPCM5) transcriptome, gene models and resources for an active curation of gene annotations

    Leishmania infantum is one of the causative agents of visceral leishmaniasis, the most severe form of leishmaniasis. An improved assembly for the L. infantum genome was published five years ago, yet delineation of its transcriptome remained to be accomplished. In this work, the transcriptome annotation was attained by a combination of both short and long RNA-seq reads. The good agreement between the results derived from both methodologies confirmed that transcript assembly based on Illumina RNA-seq and further delimitation according to the positions of spliced leader (SAS) and poly-A (PAS) addition sites is an adequate strategy to annotate the transcriptomes of Leishmania, a procedure previously used for transcriptome annotation in other Leishmania species and related trypanosomatids. These analyses also confirmed that the boundaries of Leishmania transcripts are relatively slippery, showing extensive heterogeneity at the 5′- and 3′-ends. However, the use of RNA-seq reads derived from the PacBio technology (referred to as Iso-Seq) allowed the authors to uncover some complex transcription patterns occurring at particular loci that would go unnoticed with short RNA-seq reads alone. Thus, Iso-Seq analysis provided evidence that transcript processing at particular loci would be more dynamic than expected. Another noticeable finding was the observation of a case of allelic heterozygosity based on the existence of chimeric Iso-Seq reads that might be generated by an event of intrachromosomal recombination. In addition, we provide the L. infantum gene models, including both UTRs and CDS regions, which should be helpful for undertaking whole-genome expression studies. 
Moreover, we have built the foundations of a communal database for the active curation of both gene/transcript models and functional annotations for genes and proteins. This research was supported by the Spanish Ministerio de Ciencia, Innovación (MICINN), Agencia Estatal de Investigación (AEI), grant number PID2020-117916RB-I00, and Instituto de Salud Carlos III, grant CB21/13/00018 (CIBERINFEC). An institutional grant from Fundación Ramón Areces is also acknowledged.

    ARAMIS: From systematic errors of NGS long reads to accurate assemblies

    NGS long-read (third-generation) sequencing technologies such as Pacific BioSciences (PacBio) have revolutionized the sequencing field over the last decade, improving multiple genomic applications like de novo genome assemblies. However, their error rate, mostly involving insertions and deletions (indels), is currently an important concern that requires special attention. Multiple algorithms are available to fix these sequencing errors using short reads (such as Illumina), although they require long processing times and some errors may persist. Here, we present Accurate long-Reads Assembly correction Method for Indel errorS (ARAMIS), the first NGS long-read indel correction pipeline that combines several correction tools in just one step using accurate short reads. As a proof of concept, six organisms were selected based on their different GC content, size and genome complexity, and their PacBio-assembled genomes were corrected thoroughly by this pipeline. We found systematic sequencing errors in PacBio long-read sequences affecting homopolymeric regions, and found that the type of indel error introduced during PacBio sequencing is related to the GC content of the organism. Because this has gone unrecognized, numerous published studies contain such errors, which should be resolved since they may convey incorrect biological information. ARAMIS yields better results with fewer computational resources than other correction tools and makes it possible to detect the nature of the indel errors found and their distribution along the genome. The source code of ARAMIS is available at https://github.com/genomics-ngsCBMSO/ARAMIS.gi

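    Secuencia parcial del genoma del maxicírculo de Leishmania braziliensis, comparación con otros tripanosomátidos (Partial sequence of the maxicircle genome of Leishmania braziliensis, comparison with other trypanosomatids)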

    Maxicircle genome partial sequence of Leishmania braziliensis: assembling and comparison with other trypanosomatids. Objective. With the aim of providing new insights for genotyping and phylogenetic studies of the Leishmania genus, in this study the sequence of the maxicircle in Leishmania braziliensis, strain MHOM-BR-75-M2904, was determined and compared with those reported in other trypanosomatid species. Materials and methods. Searches for maxicircle sequences were performed in the unassembled sequences of GeneDB database version 2.1, as well as in GenBank, using the ND8 and RPS12 genes of L. braziliensis as the initial probes. These sequences were assembled and compared with the homologous sequences of trypanosomatids using the bioinformatics tools LALIGN and ClustalW2. The size of the maxicircle was determined by Southern blot assays. Results. Two maxicircle fragments of 6535 and 4257 nucleotides were assembled. The sequences of these genes showed high synteny and similarity with the sequences in other Leishmania species. This similarity even extended to the editing patterns of these molecules. Conclusions. Although L. braziliensis is the most divergent species of the Leishmania genus in its nuclear genome, the maxicircle is highly conserved. This result suggests that the pattern of editing present in the different Leishmania species studied has also been conserved in the subgenus Viannia. These results indicate a high conservation in the editing of mitochondrial transcripts at the genus level.