9 research outputs found

    Comparing Elastic-Degenerate Strings: Algorithms, Lower Bounds, and Applications

    An elastic-degenerate (ED) string T is a sequence of n sets T[1], ..., T[n] containing m strings in total whose cumulative length is N. We call n, m, and N the length, the cardinality, and the size of T, respectively. The language of T is defined as L(T) = {S1 ··· Sn : Si ∈ T[i] for all i ∈ [1, n]}. ED strings were introduced to represent a set of closely related DNA sequences, also known as a pangenome. The basic question we investigate here is: given two ED strings, how fast can we check whether the two languages they represent have a nonempty intersection? We call the underlying problem the ED String Intersection (EDSI) problem. For two ED strings T1 and T2 of lengths n1 and n2, cardinalities m1 and m2, and sizes N1 and N2, respectively, we show the following. There is no O((N1 N2)^(1−ε))-time algorithm, thus no O((N1 m2 + N2 m1)^(1−ε))-time algorithm and no O((N1 n2 + N2 n1)^(1−ε))-time algorithm, for any constant ε > 0, for EDSI even when T1 and T2 are over a binary alphabet, unless the Strong Exponential-Time Hypothesis is false. There is no combinatorial O((N1 + N2)^(1.2−ε) f(n1, n2))-time algorithm, for any constant ε > 0 and any function f, for EDSI even when T1 and T2 are over a binary alphabet, unless the Boolean Matrix Multiplication conjecture is false. We give an O(N1 log N1 log n1 + N2 log N2 log n2)-time algorithm for outputting a compact (RLE) representation of the intersection language of two unary ED strings; in the case when T1 and T2 are given in a compact representation, we show that the problem is NP-complete. We give an O(N1 m2 + N2 m1)-time algorithm for EDSI. We give an Õ(N1^(ω−1) n2 + N2^(ω−1) n1)-time algorithm for EDSI, where ω is the exponent of matrix multiplication; the Õ notation suppresses factors that are polylogarithmic in the input size. We also show that the techniques we develop have applications outside of ED string comparison.
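To make the definitions above concrete, here is a minimal Python sketch of an ED string, its language, and the EDSI question. It is a brute-force illustration only: it enumerates both languages, which takes exponential time in general, and it is not any of the algorithms from the abstract. The function names, the list-of-sets representation, and the toy inputs are assumptions made for this example, as is the choice to count an empty string as contributing 0 to the size N.

```python
from itertools import product


def ed_params(T):
    """Return (n, m, N): length, cardinality, and size of the ED string T.

    T is modelled as a list of sets of strings (an assumption of this sketch).
    An empty string contributes 0 to N here, which is also an assumption.
    """
    n = len(T)
    m = sum(len(segment) for segment in T)
    N = sum(len(s) for segment in T for s in segment)
    return n, m, N


def ed_language(T):
    """Return L(T): every concatenation S1...Sn with Si taken from T[i]."""
    return {"".join(parts) for parts in product(*T)}


def ed_intersect_bruteforce(T1, T2):
    """Return L(T1) ∩ L(T2) by explicit enumeration (exponential in general)."""
    return ed_language(T1) & ed_language(T2)


# Two toy ED strings over the DNA alphabet (hypothetical data).
T1 = [{"A"}, {"C", "CG", ""}, {"T"}]
T2 = [{"AC", "A"}, {"GT", "T"}]

common = ed_intersect_bruteforce(T1, T2)
print(ed_params(T1))     # (3, 5, 5)
print(sorted(common))    # ['ACGT', 'ACT', 'AT']
print(len(common) > 0)   # True: the EDSI answer for this pair is "yes"
```

The two inputs segment their strings differently (three sets versus two), yet the languages still overlap, which is why EDSI cannot be decided by comparing the ED strings position by position.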

    Antimicrobial resistance and plasmid profiles of Aeromonas hydrophila isolated from River Njoro, Kenya

    The purpose of this study was to investigate the presence of Aeromonas hydrophila at commonly used water collection points on the River Njoro and to determine the in-vitro antimicrobial susceptibility and plasmid profiles of the isolates. In total, 126 samples were collected and 36.5% of them were positive for A. hydrophila. The A. hydrophila isolates were recovered on membrane filters and cultured on Trypticase Soy agar, Bile aesculin agar and Aeromonas Medium agar. They were further characterized using cytochrome oxidase and API 20E tests. Drug susceptibility was determined using a modified disc diffusion method for ampicillin (25 μg), cefaclor (30 μg), ceftizoxime (30 μg), cefixime (5 μg), cefazidime (30 μg), gentamicin (200 μg), streptomycin (25 μg), chloramphenicol (50 μg), nalidixic acid (30 μg) and ciprofloxacin (1 μg). Most of the isolates showed multi-drug resistance (to two or more antibiotics). Chloramphenicol, nalidixic acid, ciprofloxacin, cefazidime and cefixime were the most effective drugs, with 100% efficacy, whereas ampicillin, cefaclor and streptomycin met the most resistance, at 100%, 67% and 50%, respectively. There was low resistance against ceftizoxime (16.7%) and gentamicin (23.3%). These results indicate that all A. hydrophila isolated from River Njoro had complete resistance to ampicillin and showed variable resistance to cefaclor, streptomycin, gentamicin and ceftizoxime. R-plasmids were extracted from multi-drug resistant strains and separated by agarose gel (0.8%) electrophoresis for profiling. Plasmid profiling revealed that most of the multi-drug resistant isolates contained one plasmid of 21.0 kb. Although some strains exhibited different antimicrobial resistance patterns, all of their plasmids were of the same size (21.0 kb). However, there were no plasmids in the antimicrobial-sensitive isolates. This study also indicates that a 21.0 kb plasmid is common in A. hydrophila and is important for antimicrobial resistance and virulence. Further studies are required to ascertain the role of this plasmid as a virulence marker. Key words: Aeromonas hydrophila, antimicrobial resistance, plasmid profile

    Fast Exact String to D-Texts Alignments

    In recent years, aligning a sequence to a pangenome has become a central problem in genomics and pangenomics. A fast and accurate solution to this problem can serve as a tool for many crucial tasks such as read correction, multiple sequence alignment (MSA), genome assembly and variant calling, to name a few. In this paper we propose a new, fast and exact method to align a string to a D-string, the latter possibly representing an MSA, a pangenome or a partial assembly. An implementation of our tool dsa is publicly available at https://github.com/urbanslug/ds

    A draft human pangenome reference

    Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals¹. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
