
    Faster algorithms for 1-mappability of a sequence

    In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n / log log n) and space O(n). We present two algorithms that require worst-case time O(mn) and O(n log^2 n), respectively, and space O(n), thus greatly improving the state of the art. Moreover, we present an algorithm that requires average-case time and space O(n) for integer alphabets if m = Ω(log n / log σ), where σ is the alphabet size.
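To make the problem statement above concrete, here is a minimal brute-force sketch of k-mappability. It is only a naive O(n^2 m) baseline for checking small inputs, not any of the algorithms the abstract describes. The function names `hamming_at_most` and `naive_mappability` are hypothetical, and "other factors" is assumed to mean length-m windows starting at a different position.

```python
def hamming_at_most(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def naive_mappability(x: str, m: int, k: int = 1) -> list[int]:
    """For each length-m window of x, count the other windows within Hamming distance k.

    Assumption: windows are identified by start position, so two equal
    factors at different positions count as distinct occurrences.
    """
    starts = range(len(x) - m + 1)  # empty when m > len(x)
    counts = []
    for i in starts:
        y = x[i:i + m]
        counts.append(
            sum(1 for j in starts if j != i and hamming_at_most(y, x[j:j + m], k))
        )
    return counts
```

For example, under these assumptions `naive_mappability("aaab", 2, 1)` compares the windows `aa`, `aa` and `ab`; each one is within one mismatch of both others.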

    Multi-domain comparison of safety standards

    This paper presents an analysis of safety standards and their implementation in certification strategies from different domains such as aeronautics, automation, automotive, nuclear, railway and space. This work, performed in the context of the CG2E ("Club des Grandes Entreprises de l'Embarqué"), aims at identifying the main similarities and dissimilarities, for potential cross-domain harmonization. We strive to find the most comprehensive 'trans-sectorial' approach, within a large number of industrial domains. Exhibiting the 'true goals' of their numerous applicable standards, related to the safety of system and software, is a first important step towards harmonization, sharing common approaches, methods and tools whenever possible.

    A robust SNP barcode for typing Mycobacterium tuberculosis complex strains

    Strain-specific genomic diversity in the Mycobacterium tuberculosis complex (MTBC) is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Several systems have been proposed to classify MTBC strains into distinct lineages and families. Here, we investigate single-nucleotide polymorphisms (SNPs) as robust (stable) markers of genetic variation for phylogenetic analysis. We identify ~92k SNPs across a global collection of 1,601 genomes. The SNP-based phylogeny is consistent with the gold-standard regions of difference (RD) classification system. Of the ~7k strain-specific SNPs identified, 62 markers are proposed to discriminate known circulating strains. This SNP-based barcode is the first to cover all main lineages, and classifies a greater number of sublineages than current alternatives. It may be used to classify clinical isolates to evaluate tools to control the disease, including therapeutics and vaccines whose effectiveness may vary by strain type.

    Longest Common Prefixes with k-Errors and Applications

    Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. In this paper, we study the problem of computing the longest prefix of each suffix of a given string of length n over a constant-sized alphabet that occurs elsewhere in the string with k-errors. This problem has already been studied under the Hamming distance model. Our first result is an improvement upon the state-of-the-art average-case time complexity for non-constant k and using only linear space under the Hamming distance model. Notably, we show that our technique can be extended to the edit distance model with the same time and space complexities. Specifically, our algorithms run in O(n log^k n log log n) time on average using O(n) space. We show that our technique is applicable to several algorithmic problems in computational biology and elsewhere.
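As an illustration of the problem defined above, the following is a naive quadratic-pair sketch for the Hamming distance model only. It is a brute-force reference for small inputs, not the average-case technique of the paper, and it does not cover the edit distance model. The function name `naive_lcp_k_mismatches` is hypothetical, and "occurs elsewhere" is assumed to mean an alignment starting at a different position and truncated at the end of the string.

```python
def naive_lcp_k_mismatches(x: str, k: int) -> list[int]:
    """For each suffix start i, return the length of the longest prefix of x[i:]
    that matches the text at some other start j with at most k mismatches.

    Assumption: alignments stop at the end of the string for either position.
    """
    n = len(x)
    result = [0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            mismatches = 0
            length = 0
            # Extend the alignment until the (k+1)-th mismatch or the string end.
            while i + length < n and j + length < n:
                if x[i + length] != x[j + length]:
                    if mismatches == k:
                        break
                    mismatches += 1
                length += 1
            result[i] = max(result[i], length)
    return result
```

With k = 0 this reduces to the longest repeated prefix of each suffix; raising k lets each alignment absorb that many mismatched characters before it stops.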

    Unique cellular organization in the oldest root meristem

    Roots and shoots of plant bodies develop from meristems (cell populations that self-renew and produce cells that undergo differentiation) located at the apices of axes [1]. The oldest preserved root apices in which cellular anatomy can be imaged are found in nodules of permineralized fossil soils called coal balls [2], which formed in the Carboniferous coal swamp forests over 300 million years ago [3, 4, 5, 6, 7, 8 and 9]. However, no fossil root apices described to date were actively growing at the time of preservation [3, 4, 5, 6, 7, 8, 9 and 10]. Because the cellular organization of meristems changes when root growth stops, it has been impossible to compare cellular dynamics as stem cells transition to differentiated cells in extinct and extant taxa [11]. We predicted that meristems of actively growing roots would be preserved in coal balls. Here we report the discovery of the first fossilized remains of an actively growing root meristem from permineralized Carboniferous soil with detail of the stem cells and differentiating cells preserved. The cellular organization of the meristem is unique. The position of the Körper-Kappe boundary, discrete root cap, and presence of many anticlinal cell divisions within a broad promeristem distinguish it from all other known root meristems. This discovery is important because it demonstrates that the same general cellular dynamics are conserved between the oldest extinct and extant root meristems. However, its unique cellular organization demonstrates that extant root meristem organization and development represents only a subset of the diversity that has existed since roots first evolved.

    Application of bio-based solvents for biocatalysed synthesis of amides with Pseudomonas stutzeri lipase (PSL)

    Bio-based solvents were investigated for the biocatalysed amidation reactions of various ester-amine combinations by Pseudomonas stutzeri lipase (PSL). Reactions were undertaken in a range of green and potentially bio-based solvents including terpinolene, p-cymene, limonene, 2-methyl THF, γ-valerolactone, propylene carbonate, dimethyl isosorbide, glycerol triacetate and water. Solvent screenings demonstrated the importance and potential of using non-polar bio-based solvents for favouring aminolysis over hydrolysis, whilst substrate screenings highlighted the unfavourable impact of reactants bearing bulky para- or 4-substituents. Renewable terpene-based solvents (terpinolene, p-cymene, D-limonene) were demonstrated to be suitable bio-based media for PSL amidation reactions. Such solvents could provide a greener and more sustainable alternative to traditional petrochemical-derived non-polar solvents. Importantly, once the enzyme (either PSL or CALB) binds with a bulky para-substituted substrate, only small reagents are able to access the active site. This limits the possibility for aminolysis to take place, thereby promoting hydrolysis. This mechanism of binding supports the widely accepted 'Ping Pong - Bi Bi' mechanism used to describe enzyme kinetics. The work highlights the need to further investigate enzyme activity in relation to para- or 4-substituted substrates. A priority in PSL chemistry remains a methodology to tackle the competing hydrolysis reaction.

    The landscape of Neandertal ancestry in present-day humans

    Analyses of Neandertal genomes have revealed that Neandertals have contributed genetic variants to modern humans [1–2]. The antiquity of Neandertal gene flow into modern humans means that regions that derive from Neandertals in any one human today are usually less than a hundred kilobases in size. However, Neandertal haplotypes are also distinctive enough that several studies have been able to detect Neandertal ancestry at specific loci [1, 3–8]. Here, we have systematically inferred Neandertal haplotypes in the genomes of 1,004 present-day humans [12]. Regions that harbor a high frequency of Neandertal alleles in modern humans are enriched for genes affecting keratin filaments, suggesting that Neandertal alleles may have helped modern humans adapt to non-African environments. Neandertal alleles also continue to shape human biology, as we identify multiple Neandertal-derived alleles that confer risk for disease. We also identify regions of millions of base pairs that are nearly devoid of Neandertal ancestry and enriched in genes, implying selection to remove genetic material derived from Neandertals. Neandertal ancestry is significantly reduced in genes specifically expressed in testis, and there is an approximately 5-fold reduction of Neandertal ancestry on chromosome X, which is known to harbor a disproportionate fraction of male hybrid sterility genes [20–22]. These results suggest that part of the reduction in Neandertal ancestry near genes is due to Neandertal alleles that reduced fertility in males when moved to a modern human genetic background.