1,011 research outputs found
Faster algorithms for 1-mappability of a sequence
In the k-mappability problem, we are given a string x of length n and
integers m and k, and we are asked to count, for each length-m factor y of x,
the number of other factors of length m of x that are at Hamming distance at
most k from y. We focus here on the version of the problem where k = 1. The
fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and
space O(n). We present two algorithms that require worst-case time O(mn) and
O(n log^2 n), respectively, and space O(n), thus greatly improving the state of
the art. Moreover, we present an algorithm that requires average-case time and
space O(n) for integer alphabets if m = Ω(log n / log σ), where
σ is the alphabet size.
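To make the problem statement concrete, here is a minimal brute-force sketch of k-mappability in Python. It is not any of the algorithms described in the abstract: it simply compares every pair of length-m factors, taking O(n^2 m) time. The function names `hamming` and `k_mappability_naive` are illustrative, and "other factors" is assumed to mean factors starting at other positions.

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance between two strings of equal length."""
    return sum(ca != cb for ca, cb in zip(a, b))


def k_mappability_naive(x: str, m: int, k: int) -> list[int]:
    """For each length-m factor of x, count the factors starting at other
    positions that are within Hamming distance k of it (brute force)."""
    factors = [x[i:i + m] for i in range(len(x) - m + 1)]
    return [
        sum(1 for j, z in enumerate(factors) if j != i and hamming(y, z) <= k)
        for i, y in enumerate(factors)
    ]
```

Calling `k_mappability_naive(x, m, 1)` gives the k = 1 case that the abstract focuses on. The sketch is only meant as a reference for checking faster implementations on small inputs.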
Multi-domain comparison of safety standards
This paper presents an analysis of safety standards and their implementation in certification strategies from different domains such as aeronautics, automation, automotive, nuclear, railway and space. This work, performed in the context of the CG2E ("Club des Grandes Entreprises de l'Embarqué"), aims at identifying the main similarities and dissimilarities, for potential cross-domain harmonization. We strive to find the most comprehensive 'trans-sectorial' approach, within a large number of industrial domains. Exhibiting the 'true goals' of their numerous applicable standards, related to the safety of system and software, is a first important step towards harmonization, sharing common approaches, methods and tools whenever possible.
A robust SNP barcode for typing Mycobacterium tuberculosis complex strains
Strain-specific genomic diversity in the Mycobacterium tuberculosis complex (MTBC) is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Several systems have been proposed to classify MTBC strains into distinct lineages and families. Here, we investigate single-nucleotide polymorphisms (SNPs) as robust (stable) markers of genetic variation for phylogenetic analysis. We identify ~92k SNPs across a global collection of 1,601 genomes. The SNP-based phylogeny is consistent with the gold-standard regions of difference (RD) classification system. Of the ~7k strain-specific SNPs identified, 62 markers are proposed to discriminate known circulating strains. This SNP-based barcode is the first to cover all main lineages, and classifies a greater number of sublineages than current alternatives. It may be used to classify clinical isolates to evaluate tools to control the disease, including therapeutics and vaccines whose effectiveness may vary by strain type.
Longest Common Prefixes with k-Errors and Applications
Although real-world text datasets, such as DNA sequences, are far from being
uniformly random, average-case string searching algorithms perform
significantly better than worst-case ones in most applications of interest. In
this paper, we study the problem of computing the longest prefix of each suffix
of a given string of length n over a constant-sized alphabet that occurs
elsewhere in the string with k-errors. This problem has already been studied
under the Hamming distance model. Our first result is an improvement upon the
state-of-the-art average-case time complexity for non-constant k and using
only linear space under the Hamming distance model. Notably, we show that our
technique can be extended to the edit distance model with the same time and
space complexities. Specifically, our algorithms attain the improved
average-case running time using O(n) space. We show that our
technique is applicable to several algorithmic problems in computational
biology and elsewhere.
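As an illustration of the problem under the Hamming distance model only, the following brute-force Python sketch computes, for each suffix, the length of its longest prefix that occurs at another position with at most k mismatches. It runs in cubic time in the worst case and is not the average-case technique from the abstract. The function names are illustrative, and the sketch assumes that occurrences may overlap the suffix itself and are truncated at the end of the string.

```python
def lcp_k_mismatches(s: str, i: int, j: int, k: int) -> int:
    """Length of the longest common prefix of s[i:] and s[j:]
    when up to k mismatching positions are tolerated."""
    n = len(s)
    length = 0
    errors = 0
    while i + length < n and j + length < n:
        if s[i + length] != s[j + length]:
            if errors == k:
                break  # a further mismatch would exceed the budget
            errors += 1
        length += 1
    return length


def longest_prefix_with_k_errors_naive(s: str, k: int) -> list[int]:
    """For each suffix s[i:], the longest prefix length that also occurs
    at some other position j != i with at most k mismatches."""
    n = len(s)
    return [
        max((lcp_k_mismatches(s, i, j, k) for j in range(n) if j != i), default=0)
        for i in range(n)
    ]
```

With k = 0 this reduces to the classical longest-repeated-prefix computation, which makes the sketch easy to cross-check against a suffix-array based implementation on small strings.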
Unique cellular organization in the oldest root meristem
Roots and shoots of plant bodies develop from meristems—cell populations that self-renew and produce cells that undergo differentiation—located at the apices of axes [1]. The oldest preserved root apices in which cellular anatomy can be imaged are found in nodules of permineralized fossil soils called coal balls [2], which formed in the Carboniferous coal swamp forests over 300 million years ago [3, 4, 5, 6, 7, 8 and 9]. However, no fossil root apices described to date were actively growing at the time of preservation [3, 4, 5, 6, 7, 8, 9 and 10]. Because the cellular organization of meristems changes when root growth stops, it has been impossible to compare cellular dynamics as stem cells transition to differentiated cells in extinct and extant taxa [11]. We predicted that meristems of actively growing roots would be preserved in coal balls. Here we report the discovery of the first fossilized remains of an actively growing root meristem from permineralized Carboniferous soil with detail of the stem cells and differentiating cells preserved. The cellular organization of the meristem is unique. The position of the Körper-Kappe boundary, discrete root cap, and presence of many anticlinal cell divisions within a broad promeristem distinguish it from all other known root meristems. This discovery is important because it demonstrates that the same general cellular dynamics are conserved between the oldest extinct and extant root meristems. However, its unique cellular organization demonstrates that extant root meristem organization and development represents only a subset of the diversity that has existed since roots first evolved.
Application of bio-based solvents for biocatalysed synthesis of amides with Pseudomonas stutzeri lipase (PSL)
Bio-based solvents were investigated for the biocatalysed amidation reactions of various ester-amine combinations by Pseudomonas stutzeri lipase (PSL). Reactions were undertaken in a range of green and potentially bio-based solvents including terpinolene, p-cymene, limonene, 2-methyl THF, γ-valerolactone, propylene carbonate, dimethyl isosorbide, glycerol triacetate and water. Solvent screenings demonstrated the importance and potential of using non-polar bio-based solvents for favouring aminolysis over hydrolysis; whilst substrate screenings highlighted the unfavourable impact of reactants bearing bulky para- or 4-substituents. Renewable terpene-based solvents (terpinolene, p-cymene, D-limonene) were demonstrated to be suitable bio-based media for PSL amidation reactions. Such solvents could provide a greener and more sustainable alternative to traditional petrochemical derived non-polar solvents. Importantly, once the enzyme (either PSL or CALB) binds with a bulky para-substituted substrate, only small reagents are able to access the active site. This therefore limits the possibility for aminolysis to take place, thereby promoting the hydrolysis. This mechanism of binding supports the widely accepted 'Ping Pong - Bi Bi' mechanism used to describe enzyme kinetics. The work highlights the need to further investigate enzyme activity in relation to para- or 4-substituted substrates. A priority in PSL chemistry remains a methodology to tackle the competing hydrolysis reaction.
The landscape of Neandertal ancestry in present-day humans
Analyses of Neandertal genomes have revealed that Neandertals have contributed genetic variants to modern humans [1–2]. The antiquity of Neandertal gene flow into modern humans means that regions that derive from Neandertals in any one human today are usually less than a hundred kilobases in size. However, Neandertal haplotypes are also distinctive enough that several studies have been able to detect Neandertal ancestry at specific loci [1, 3–8]. Here, we have systematically inferred Neandertal haplotypes in the genomes of 1,004 present-day humans [12]. Regions that harbor a high frequency of Neandertal alleles in modern humans are enriched for genes affecting keratin filaments, suggesting that Neandertal alleles may have helped modern humans adapt to non-African environments. Neandertal alleles also continue to shape human biology, as we identify multiple Neandertal-derived alleles that confer risk for disease. We also identify regions of millions of base pairs that are nearly devoid of Neandertal ancestry and enriched in genes, implying selection to remove genetic material derived from Neandertals. Neandertal ancestry is significantly reduced in genes specifically expressed in testis, and there is an approximately 5-fold reduction of Neandertal ancestry on chromosome X, which is known to harbor a disproportionate fraction of male hybrid sterility genes [20–22]. These results suggest that part of the reduction in Neandertal ancestry near genes is due to Neandertal alleles that reduced fertility in males when moved to a modern human genetic background.