38 research outputs found

Multiple seeds sensitivity using a single seed with threshold

Author: Egidi Lavinia
Manzini Giovanni
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2015

Spaced seeds are a fundamental tool for similarity search in biosequences. The best sensitivity/selectivity trade-offs are obtained using many seeds simultaneously; this is known as the multiple seed approach. Unfortunately, spaced seeds use a large amount of memory, and the available RAM is a practical limit on the number of seeds one can use simultaneously. Inspired by some recent results on lossless seeds, we revisit the approach of using a single spaced seed and considering two regions homologous if the seed hits in at least t sufficiently close positions. We show that by choosing the locations of the don't care symbols in the seed using quadratic residues modulo a prime number, we derive single seeds that, when used with a threshold t > 1, have competitive sensitivity/selectivity trade-offs, indeed close to the best multiple seeds known in the literature. In addition, the choice of the threshold t can be adjusted to modify sensitivity and selectivity a posteriori, thus enabling a more accurate search in the specific instance at issue. The seeds we propose also exhibit robustness and allow flexibility in usage.

Crossref

Archivio della Ricerca - Università di Pisa

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale
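
The idea in the abstract above (a single spaced seed built from quadratic residues, used with a hit threshold t) can be illustrated with a minimal Python sketch. It is not the authors' construction: the convention that position 0 and the nonzero quadratic residues modulo p are the required-match positions is an assumption made here for illustration, the function names are hypothetical, and the "sufficiently close" condition is simplified to counting all hits within the two compared regions.

```python
def quadratic_residue_seed(p):
    """Return a 0/1 mask of length p (p prime): 1 = must match, 0 = don't care.

    Assumed convention (not taken from the paper): position 0 and the
    nonzero quadratic residues modulo p are the required positions.
    """
    residues = {(x * x) % p for x in range(1, p)}
    return [1 if (i == 0 or i in residues) else 0 for i in range(p)]


def seed_hits(mask, a, b):
    """Return every offset i at which a and b agree on all required positions."""
    m = len(mask)
    n = min(len(a), len(b))
    hits = []
    for i in range(n - m + 1):
        if all(a[i + j] == b[i + j] for j in range(m) if mask[j]):
            hits.append(i)
    return hits


def similar(mask, a, b, t):
    """Declare two equal-length regions similar if the seed hits at least t times."""
    return len(seed_hits(mask, a, b)) >= t


mask = quadratic_residue_seed(7)   # residues mod 7 are {1, 2, 4}
a = "ACGTACGTAC"
b = "ACGAACGTAC"                   # one mismatch, at index 3
print(mask)                        # [1, 1, 1, 0, 1, 0, 0]
print(seed_hits(mask, a, b))       # [0]
print(similar(mask, a, b, 1), similar(mask, a, b, 2))  # True False
```

Raising t makes the test stricter without changing the seed, which is the a posteriori tuning the abstract refers to.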

Space efficient merging of de Bruijn graphs and Wheeler graphs

Author: Egidi Lavinia
Louza Felipe A.
Manzini Giovanni
Publication date: 12/07/2021

The merging of succinct data structures is a well-established technique for the space-efficient construction of large succinct indexes. In the first part of the paper we propose a new algorithm for merging succinct representations of de Bruijn graphs. Our algorithm has the same asymptotic cost as the state-of-the-art algorithm for the same problem but uses less than half of its working space. A novel and important feature of our algorithm, not found in any of the existing tools, is that it can compute the Variable Order succinct representation of the union graph within the same asymptotic time/space bounds. In the second part of the paper we consider the more general problem of merging succinct representations of Wheeler graphs, a recently introduced graph family which includes as special cases de Bruijn graphs and many other known succinct indexes based on the BWT or one of its variants. We show that merging Wheeler graphs is in general a much more difficult problem, and we provide a space-efficient algorithm for the slightly simplified problem of determining whether the union graph has an ordering that satisfies the Wheeler conditions.

Comment: 24 pages, 10 figures. arXiv admin note: text overlap with arXiv:1902.0288

arXiv.org e-Print Archive

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

External memory BWT and LCP computation for sequence collections with applications

Author: Egidi Lavinia
Louza Felipe A.
Manzini Giovanni
Telles Guilherme P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)
Publication date: 01/01/2018

We propose an external memory algorithm for the computation of the BWT and LCP array for a collection of sequences. Our algorithm takes the amount of available memory as an input parameter, and tries to make the best use of it by splitting the input collection into subcollections sufficiently small that it can compute their BWT in RAM using an optimal linear time algorithm. Next, it merges the partial BWTs in external memory and in the process it also computes the LCP values. We show that our algorithm performs O(n maxlcp) sequential I/Os, where n is the total length of the collection and maxlcp is the maximum LCP value. The experimental results show that our algorithm outperforms the current best algorithm for collections of sequences with different lengths and when the average LCP of the collection is relatively small compared to the length of the sequences. In the second part of the paper, we show that our algorithm can be modified to output two additional arrays that, combined with the BWT and LCP arrays, provide simple, scan-based, external memory algorithms for three well-known problems in bioinformatics: the computation of the all pairs suffix-prefix overlaps, the computation of maximal repeats, and the construction of succinct de Bruijn graphs.

arXiv.org e-Print Archive

Directory of Open Access Journals

Archivio della Ricerca - Università di Pisa

Dagstuhl Research Online Publication Server

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale
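
To make concrete what the BWT and LCP array of a collection are, here is a naive in-memory Python sketch. It is not the external memory algorithm from the abstract above: it simply sorts all suffixes, which is quadratic or worse and only suitable for tiny inputs. The function name is hypothetical, and two conventions are assumed here for illustration: each sequence ends with its own terminator (printed as `$`, smaller than every letter, with ties broken by sequence index), and terminators never count toward an LCP value. Input sequences must not contain `$`.

```python
def bwt_lcp_collection(seqs):
    """Naive BWT and LCP array for a collection of strings.

    Each suffix is stored as (text, k, i): the suffix of seqs[k] starting at i.
    Python orders a proper prefix before any longer string, which matches a
    terminator smaller than all letters; equal texts are ordered by k.
    """
    suffixes = sorted(
        ((s[i:], k, i) for k, s in enumerate(seqs) for i in range(len(s) + 1)),
        key=lambda x: (x[0], x[1]),
    )
    bwt = []
    lcp = []
    prev = None
    for text, k, i in suffixes:
        # BWT symbol: the character preceding the suffix, or the terminator.
        bwt.append(seqs[k][i - 1] if i > 0 else "$")
        if prev is None:
            lcp.append(0)
        else:
            # Length of the common prefix with the previous suffix (letters only).
            n = 0
            while n < min(len(prev), len(text)) and prev[n] == text[n]:
                n += 1
            lcp.append(n)
        prev = text
    return "".join(bwt), lcp


bwt, lcp = bwt_lcp_collection(["AB", "B"])
print(bwt)       # BB$A$
print(lcp)       # [0, 0, 0, 0, 1]
print(max(lcp))  # 1, the maxlcp value of this tiny collection
```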

A Bayesian Network Approach for the Interpretation of Cyber Attacks to Power Systems

Author: Cerotti Davide
Codetta Raiteri Daniele
Dondossola Giovanna
Egidi Lavinia
Franceschinis Giuliana
Portinale Luigi
Terruggia Roberta
Publication date: 01/01/2019

This paper focuses on analysing the cyber security resilience of the digital infrastructures deployed by power grids, which is internationally recognized as a priority since several recent cyber attacks have targeted energy systems and in particular the power service. In response to the regulatory framework, this paper presents an analysis approach based on the Bayesian Networks formalism and on real world threat scenarios. Our approach enables analyses oriented to the planning of security measures and monitoring, and to the forecasting of adversarial behaviours.

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Analisi e rilevamento intelligente di processi di attacco alle Smart-Grid

Author: Daniele Codetta-Raiteri
Davide Cerotti
Giovanna Dondossola
Giuliana Franceschinis
Lavinia Egidi
Luigi Portinale
Roberta Terruggia
Publication date: 01/01/2019

We propose a methodology based on Bayesian Networks as a support tool for the security analysis of Smart Grids, and in particular for forecasting intrusions and hostile activities.

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

From digital audiobook to secure digital multimedia-book

Author: Bar
Biddle P.
Cheng S.
He S.
Karamitroglou
Lavinia Egidi
Li X.
Li X.
Marco Furini
Mihcak M.
Pfitzmann B.
Pfitzmann B.
Windows Media
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Quantifier Elimination for the Theory of p-adic numbers

Author: Egidi Lavinia
Publication date: 01/01/1998

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

A Quantifier Elimination For The Theory Of p-Adic Numbers

Author: Lavinia Egidi
Publication date: 01/01/1998

This paper presents a detailed analysis of a quantifier elimination algorithm for the first order theory of p-adic numbers based on a p-adic analogue of the cylindric algebraic decomposition. It is believed that such a method should lead to an elementary upper bound for the theory.

CiteSeerX

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Better Spaced Seeds Using Quadratic Residues

Author: Egidi Lavinia
Manzini Giovanni
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

Spaced seeds are used in approximate pattern matching algorithms to quickly discard regions where a match is not likely to occur. We propose a family of lossless spaced seeds based on Quadratic Residues modulo a prime number. Our seeds work with a threshold t>1 in the sense that two regions are considered similar only if the seed hits t times within the regions. We prove that, for any number of errors, our seeds have an exponentially smaller probability of producing false positive matches than any traditional seed using a threshold t=1. To establish our result we introduce a formal notion of selectivity that generalizes the concept of seed weight, and we relate it to the minimum coverage and to a new structural property defined in terms of seed rotations. This groundwork will be useful for further analysis of seeds with a threshold, and we use it to provide improved bounds for approximate matching with 2 or 3 errors. Our results show that the use of a single seed with a threshold t>1 should be considered as a possible alternative to single or multiple seeds with t=1.

Crossref

Archivio della Ricerca - Università di Pisa

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale