911 research outputs found

    Progressive Mauve: Multiple alignment of genomes with gene flux and rearrangement

    Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms. We describe a method to align two or more genomes that have undergone large-scale recombination, particularly genomes that have undergone substantial amounts of gene gain and loss (gene flux). The method utilizes a novel alignment objective score, referred to as a sum-of-pairs breakpoint score. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The progressive genome alignment algorithm demonstrates markedly improved accuracy over previous approaches in situations where genomes have undergone realistic amounts of genome rearrangement, gene gain, loss, and duplication. We apply the progressive genome alignment algorithm to a set of 23 completely sequenced genomes from the genera Escherichia, Shigella, and Salmonella. The 23 enterobacteria have an estimated 2.46 Mbp of genomic content conserved among all taxa and total unique content of 15.2 Mbp. We document substantial population-level variability among these organisms driven by homologous recombination, gene gain, and gene loss. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve. Comment: Revision dated June 19, 200

    Structuring the bacterial genome: Y1-transposases associated with REP-BIME sequences

    REPs are highly repeated intergenic palindromic sequences often clustered into structures called BIMEs, which include two individual REPs separated by a short linker of variable length. They play a variety of key roles in the cell. REPs also resemble the sub-terminal hairpins of the atypical IS200/605 family of insertion sequences which encode Y1 transposases (TnpAIS200/IS605). These belong to the HUH endonuclease family, carry a single catalytic tyrosine (Y) and promote single-strand transposition. Recently, a new clade of Y1 transposases (TnpAREP) was found associated with REP/BIME in structures called REPtrons. It has been suggested that TnpAREP is responsible for REP/BIME proliferation over genomes. We analysed and compared REP distribution and REPtron structure in numerous available E. coli and Shigella strains. Phylogenetic analysis clearly indicated that tnpAREP was acquired early in the species radiation and was lost later in some strains. To understand REP/BIME behaviour within the host genome, we also studied E. coli K12 TnpAREP activity in vitro and demonstrated that it catalyses cleavage and recombination of BIMEs. While TnpAREP shared the same general organization and similar catalytic characteristics with TnpAIS200/IS605 transposases, it exhibited distinct properties potentially important in the creation of BIME variability and in their amplification. TnpAREP may therefore be one of the first examples of transposase domestication in prokaryotes.

    Small Open Reading Frames, Non-Coding RNAs and Repetitive Elements in Bradyrhizobium japonicum USDA 110

    Small open reading frames (sORFs) and genes for non-coding RNAs are poorly investigated components of most genomes. Our analysis of 1391 ORFs recently annotated in the soybean symbiont Bradyrhizobium japonicum USDA 110 revealed that 78% of them contain fewer than 80 codons. Twenty-one of these sORFs are conserved in or outside Alphaproteobacteria and most of them are similar to genes found in transposable elements, in line with their broad distribution. Stabilizing selection was demonstrated for sORFs with proteomic evidence and bll1319_ISGA which is conserved at the nucleotide level in 16 alphaproteobacterial species, 79 species from other taxa and 49 other Proteobacteria. Further, we used Northern blot hybridization to validate ten small RNAs (BjsR1 to BjsR10) belonging to new RNA families. We found that BjsR1 and BjsR3 have homologs outside the genus Bradyrhizobium, and BjsR5, BjsR6, BjsR7, and BjsR10 have up to four imperfect copies in Bradyrhizobium genomes. BjsR8, BjsR9, and BjsR10 are present exclusively in nodules, while the other sRNAs are also expressed in liquid cultures. We also found that the level of BjsR4 decreases after exposure to tellurite and iron, and this down-regulation contributes to survival under high iron conditions. Analysis of additional small RNAs overlapping with 3'-UTRs revealed two new repetitive elements named Br-REP1 and Br-REP2. These REP elements may play roles in genomic plasticity and gene regulation and could be useful for strain identification by PCR fingerprinting. Furthermore, we studied two potential toxin genes in the symbiotic island and confirmed toxicity of the yhaV homolog bll1687 but not of the newly annotated higB homolog blr0229_ISGA in E. coli. Finally, we revealed transcription interference resulting in an antisense RNA complementary to blr1853, a gene induced in symbiosis. The presented results expand our knowledge on sORFs, non-coding RNAs and repetitive elements in B. japonicum and related bacteria.

    The Influence of a Human Repetitive DNA on Genome Stability

    A uniquely human interspersed repetitive DNA sequence family, the L2Hs, are highly polymorphic in human genomes. Several features of interspersed repeated DNA may contribute to the instability observed. Certain motifs (direct repeats, palindromes, and inverted repeats) comprising L2Hs elements may adopt unusual secondary structures such as cruciforms or hairpins. These motifs have been associated with features of genome instability in recombination, insertions and deletions. The L2Hs elements also are AT-rich (76%) compared to the bulk of human DNA (52%). That their dynamic nature (i.e. polymorphisms) may arise from recombination, insertions and deletions has led to the hypothesis that the L2Hs element is intrinsically dynamic and may influence the stability of the surrounding genome. Thus, the stability of the L2Hs element was tested in a bacterial model system. A cloned 0.6 kb L2Hs element forms non-B-form structures in recombinant plasmids pN6 and pN2, which differ only in insert orientation. Instability of pN6 and pN2 plasmids was observed in serial propagation studies in which E. coli cells containing the plasmids were cultured every 24 hours for 28 days. The vector plasmid pTZ19U, as control, was found to be stable in all passages while the two L2Hs recombinants developed deletions of the L2Hs insert as well as adjacent vector sequences. The isolated deletion mutants have been characterized via restriction cleavage studies and sequencing to map the boundaries of the deletions. Direct repeats and potential stem-loop structures have been discovered at or within close proximity to the deletion boundaries. The data demonstrate that the L2Hs recombinants' unusual sequence features, with potential for non-B-form secondary structures, influence genome stability via their involvement in generating errors during DNA replication and DNA repair.

    Genetic diversity of Escherichia coli isolates from surface water and groundwater in a rural environment

    The genetic characteristics among Escherichia coli strains can be grouped by origin of isolation. Then, it is possible to use the genotypes as a tool to determine the source of water contamination. The aim of this study was to define water aptitude for human consumption in a rural basin and to assess the diversity of E. coli water populations. Thus, it was possible to identify the main sources of fecal contamination and to explore linkages with the hydrogeological environment and land uses. The bacteriological analysis showed that more than 50% of samples were unfit for human consumption. DNA fingerprinting analysis by BOX-PCR indicated low genotypic diversity of E. coli isolates taken from surface water and groundwater. The results suggested the presence of a dominant source of fecal contamination. The relationship between low genotypic diversity and land use would prove that water contamination comes from livestock. The genetic diversity of E. coli isolated from surface water was less than that identified in groundwater because of the different hydraulic features of both environments. Furthermore, each one of the two big strain groups identified in this basin is located in different sub-basins, showing that hydrological dynamics exerts selective pressure on bacterial DNA.
    Affiliation: Gambero, Maria Laura. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas, Fisicoquímicas y Naturales. Departamento de Microbiología e Inmunología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Blarasin, Mónica Teresa. Universidad Nacional de Rio Cuarto. Facultad de Cs.exactas Fisicoquimicas y Naturales. Departamento de Geologia. Cat.de Hidrogeologia; Argentina
    Affiliation: Bettera, Susana Gertrudis. Universidad Nacional de Río Cuarto. Facultad de Ciencias Exactas, Fisicoquímicas y Naturales. Departamento de Microbiología e Inmunología; Argentina
    Affiliation: Giuliano Albo, María Jesica. Universidad Nacional de Rio Cuarto. Facultad de Cs.exactas Fisicoquimicas y Naturales. Departamento de Geologia. Cat.de Hidrogeologia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

    progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss and Rearrangement

    Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms. We describe a new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss (flux). We demonstrate that the new method can accurately align regions conserved in some, but not all, of the genomes, an important case not handled by our previous work. The method uses a novel alignment objective score called a sum-of-pairs breakpoint score, which facilitates accurate detection of rearrangement breakpoints when genomes have unequal gene content. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The new genome alignment algorithm demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss. We apply the new algorithm to a set of 23 genomes from the genera Escherichia, Shigella, and Salmonella. Analysis of whole-genome multiple alignments allows us to extend the previously defined concepts of core- and pan-genomes to include not only annotated genes, but also non-coding regions with potential regulatory roles. The 23 enterobacteria have an estimated core-genome of 2.46 Mbp conserved among all taxa and a pan-genome of 15.2 Mbp. We document substantial population-level variability among these organisms driven by segmental gain and loss. Interestingly, much variability lies in intergenic regions, suggesting that the Enterobacteriaceae may exhibit regulatory divergence. The multiple genome alignments generated by our software provide a platform for comparative genomic and population genomic studies. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve.
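The core- and pan-genome figures quoted in the abstract above can be pictured with a small sketch. It is not the progressiveMauve implementation: it assumes the alignment has already been reduced to hypothetical segment identifiers per genome, each with a representative length in base pairs, and it treats the core-genome as the segments shared by all genomes and the pan-genome as the segments present in at least one. The names `core_and_pan`, `segments_by_genome`, and `segment_length` are illustrative assumptions.

```python
from typing import Dict, Set, Tuple


def core_and_pan(
    segments_by_genome: Dict[str, Set[str]],
    segment_length: Dict[str, int],
) -> Tuple[int, int]:
    """Return (core_bp, pan_bp) for a toy presence/absence model.

    segments_by_genome maps a genome name to the set of aligned segment
    identifiers found in it (hypothetical identifiers, not Mauve output).
    segment_length maps each segment identifier to a length in base pairs.
    """
    genomes = list(segments_by_genome.values())
    if not genomes:
        return 0, 0
    # Core: segments present in every genome.
    core = set.intersection(*genomes)
    # Pan: segments present in at least one genome.
    pan = set.union(*genomes)
    core_bp = sum(segment_length[s] for s in core)
    pan_bp = sum(segment_length[s] for s in pan)
    return core_bp, pan_bp


if __name__ == "__main__":
    toy_segments = {
        "genome_A": {"s1", "s2", "s3"},
        "genome_B": {"s1", "s3"},
        "genome_C": {"s1", "s3", "s4"},
    }
    toy_lengths = {"s1": 100, "s2": 50, "s3": 25, "s4": 10}
    print(core_and_pan(toy_segments, toy_lengths))
```

Because segments here are not restricted to annotated genes, the same bookkeeping covers non-coding regions, which is the extension of the core/pan concept the abstract describes.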

    Virulence Characteristics and Genetic Affinities of Multiple Drug Resistant Uropathogenic Escherichia coli from a Semi Urban Locality in India

    Extraintestinal pathogenic Escherichia coli (ExPEC) are of significant health concern. The emergence of drug resistant E. coli with high virulence potential is alarming. Lack of sufficient data on transmission dynamics, virulence spectrum and antimicrobial resistance of certain pathogens such as the uropathogenic E. coli (UPEC) from countries with high infection burden, such as India, hinders infection control and management efforts. In this study, we extensively genotyped and phenotyped a collection of 150 UPEC obtained from patients belonging to a semi-urban, industrialized setting near Pune, India. The isolates representing different clinical categories were analyzed in comparison with 50 commensal E. coli isolates from India as well as 50 ExPEC strains from Germany. Virulent strains were identified based on hemolysis, haemagglutination, cell surface hydrophobicity, serum bactericidal activity as well as with the help of O serotyping. We generated antimicrobial resistance profiles for all the clinical isolates and carried out phylogenetic analysis based on repetitive extragenic palindromic (rep)-PCR. E. coli from urinary tract infection cases expressed higher percentages of type I (45%) and P fimbriae (40%) when compared to fecal isolates (25% and 8% respectively). The hemolytic group comprised 60% of UPEC and only 2% of E. coli from feces. Additionally, we found that serum resistance and cell surface hydrophobicity were not significantly (p = 0.16/p = 0.51) associated with UPEC from clinical cases. Moreover, clinical isolates exhibited highest resistance against amoxicillin (67.3%) and least against nitrofurantoin (57.3%). We also observed that 31.3% of UPEC were extended-spectrum beta-lactamase (ESBL) producers belonging to serotype O25, of which four were also positive for O25b subgroup that is linked to B2-O25b-ST131-CTX-M-15 virulent/multiresistant type. Furthermore, isolates from India and Germany (as well as global sources) were found to be genetically distinct, with no evidence to support expansion of E. coli from India to the west or vice versa.

    Improvement of PCR primer design (PCRi praimeridisaini parendamine)

    The electronic version of the thesis does not include the publications. Polymerase chain reaction (PCR) is a method in molecular biology that enables amplification of specific DNA regions. It is a cyclic process comprising heat denaturation of double-stranded DNA (the target sequence), hybridization of two short oligonucleotides (called PCR primers) to the denatured target sequence at a temperature specific to the PCR reaction (termed melting temperature Tm), and extension of the hybridized primers by the enzyme DNA polymerase. The amplified DNA can be identified either on an electrophoresis gel by its product length or in real time by monitoring the signal generated through activation of primer-attached fluorescent labels during amplification. Thus, PCR enables identification of the DNA sequence of interest from an arbitrary DNA sample (e.g. clinical, veterinary, food, or environmental sample), and it is therefore widely applied in different areas for identification of particular species or strains. The design of specific and sensitive PCR primers is very important for the success of PCR. PCR primer design comprises different steps, among which the choice of the target sequence (e.g. a certain DNA region in a bacterial genome) and the design of the primer sequences are crucial. This thesis focuses on improving the different steps of PCR primer design. First, enhancements to the widely used PCR primer design program Primer3 are introduced. These enhancements make it possible to calculate primer melting temperature more accurately (they include modernization of the primer melting temperature formula as well as of the auxiliary formulas that take the concentrations of salt ions in the PCR reaction buffer into consideration). Second, we provide a methodology for finding prokaryotic species-specific repeats and applying them as PCR targets in primer design. The positive impact on PCR sensitivity of including repetitive sequences as PCR targets is also ascertained. The last part of the thesis covers the characterization of species-specific repetitive sequences in 613 different prokaryotic species. The results of the thesis facilitate the design of new and more reliable tests for molecular diagnostics for the following reasons: first, new PCR target sequences are introduced; second, we have shown that primers designed to species-specific repetitive sequences increase the sensitivity of PCR; and third, the enhanced primer design program makes it possible to design primers that follow the preset conditions of molecular diagnostic tests more precisely.
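To make the role of base composition and salt concentration in a melting temperature estimate concrete, here is a rough sketch. It does not reproduce the formula used by Primer3 or the one developed in the thesis; it swaps in two simple textbook approximations instead: the Wallace rule for short oligonucleotides, and one commonly cited empirical salt-adjusted formula whose constants vary between sources. The function names and the default sodium concentration are assumptions made for illustration.

```python
import math


def _clean(primer: str) -> str:
    """Upper-case the primer and reject anything that is not A, C, G or T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string over A, C, G, T")
    return seq


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = _clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> float:
    """Wallace rule: 2 degrees C per A/T and 4 degrees C per G/C."""
    seq = _clean(primer)
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc
    return 2.0 * at + 4.0 * gc


def salt_adjusted_tm(primer: str, na_molar: float = 0.05) -> float:
    """Empirical salt-adjusted estimate (illustrative constants):

    Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%GC) - 675 / N
    """
    seq = _clean(primer)
    if na_molar <= 0:
        raise ValueError("na_molar must be positive")
    percent_gc = 100.0 * gc_fraction(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * percent_gc - 675.0 / len(seq)


if __name__ == "__main__":
    primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer
    print(wallace_tm(primer))
    print(round(salt_adjusted_tm(primer, na_molar=0.05), 1))
```

Nearest-neighbor thermodynamic models are generally considered more accurate than either approximation, so these functions are only meant to show which inputs a Tm estimate depends on.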

    Detection of cruciform DNA in vivo


    Molecular characterization of Bacillus thuringiensis using rep-PCR

    The genetic divergence of 65 strains of Bacillus thuringiensis (Bt) was determined using rep-PCR. Among the primers based on repetitive sequences, the BOX primer was the most informative with 26 fragments, followed by ERIC (19) and REP (10), generating a total of 55 fragments. The dendrogram shows that ten groups were formed at an average population distance of 45%: group 1 contained 41.5% of the isolates, 33.8% of the isolates were distributed in other groups, and 24.6% did not form a distinct group. Of the isolates from Embrapa, 53.2% are in group 1 and 29.8% are distributed in other groups. Bt strains from USDA and Institut Pasteur showed more variability.
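A dendrogram like the one described above is typically built from pairwise distances between band profiles. The sketch below shows one way such distances could be computed, assuming each strain is scored as a presence/absence vector over the amplified fragments and assuming the Jaccard coefficient as the similarity measure; the abstract does not state which coefficient or clustering method was used, and the strain names and profiles are made up.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard similarity between two 0/1 band-presence vectors."""
    if len(a) != len(b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two profiles with no bands at all are treated as identical.
    return shared / either if either else 1.0


def pairwise_distances(
    profiles: Dict[str, Sequence[int]],
) -> Dict[Tuple[str, str], float]:
    """Distance (1 - Jaccard similarity) for every pair of strains."""
    return {
        (p, q): 1.0 - jaccard(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Hypothetical profiles over five fragments (1 = band present).
    toy_profiles = {
        "strain_1": [1, 1, 0, 0, 1],
        "strain_2": [1, 0, 1, 0, 1],
        "strain_3": [0, 0, 1, 1, 0],
    }
    for pair, dist in pairwise_distances(toy_profiles).items():
        print(pair, round(dist, 2))
```

The resulting distance table could then be passed to a hierarchical clustering routine to group strains at a chosen distance threshold.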