49 research outputs found

Sorting suffixes of a text via its Lyndon Factorization

Author: Mantaci Sabrina
Restivo Antonio
Rosone Giovanna
Sciortino Marinella
Publication date: 01/01/2013

Sorting the suffixes of a text plays a fundamental role in text algorithms. It is used, for instance, in the construction of the Burrows-Wheeler transform and of the suffix array, both widely used in several fields of computer science. For this reason, much recent research has been devoted to finding new strategies for performing this sorting effectively. In this paper we introduce a new methodology in which the Lyndon factorization plays an important role: the local suffixes inside the factors detected by this factorization keep their mutual order when extended to suffixes of the whole word. This property suggests a versatile technique that can easily be adapted to different implementation scenarios. Comment: Submitted to the Prague Stringology Conference 2013 (PSC 2013).

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Palermo
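
The order-preservation property stated in the abstract above can be checked on small inputs. The following is a minimal sketch, not the authors' sorting method: it computes the Lyndon factorization with Duval's algorithm (a standard technique swapped in here for illustration) and then verifies, factor by factor, that sorting the local suffixes gives the same order of positions as sorting the corresponding suffixes of the whole word. The function names `lyndon_factorization` and `local_order_is_preserved` are hypothetical.

```python
def lyndon_factorization(s):
    """Duval's algorithm: return half-open (start, end) spans of the Lyndon factors of s."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append((i, i + j - k))
            i += j - k
    return factors


def local_order_is_preserved(s):
    """Check that, inside each factor, local and global suffix orders agree."""
    for start, end in lyndon_factorization(s):
        positions = range(start, end)
        local = sorted(positions, key=lambda p: s[p:end])   # suffixes cut at the factor end
        whole = sorted(positions, key=lambda p: s[p:])      # suffixes of the whole word
        if local != whole:
            return False
    return True


if __name__ == "__main__":
    word = "banana"
    print([word[a:b] for a, b in lyndon_factorization(word)])
    print(local_order_is_preserved(word))
```

For `"banana"` the factors are `b`, `an`, `an`, `a`. The check uses naive quadratic-time sorting, which is only meant to make the property visible, not to be efficient.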

Lightweight LCP Construction for Very Large Collections of Strings

Author: Cox Anthony J.
Garofalo Fabio
Rosone Giovanna
Sciortino Marinella
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

The longest common prefix (LCP) array is a data structure that, combined with the suffix array and the Burrows-Wheeler transform, makes it possible to efficiently compute combinatorial properties of a string that are useful in several applications, especially in biological contexts. Nowadays, the input data for many problems are large collections of strings, for instance the data coming from "next-generation" DNA sequencing (NGS) technologies. In this paper we present the first lightweight algorithm (called extLCP) for the simultaneous computation of the longest common prefix array and the Burrows-Wheeler transform of a very large collection of strings of any length. The computation accesses disk data only via sequential scans, and the total disk space usage never exceeds twice the output size, excluding the disk space required for the input. Moreover, extLCP can also compute the suffix array of the strings of the collection without requiring any further data structure. Finally, we test our algorithm on real data and compare our results with another tool capable of working in external memory on large collections of strings. Comment: This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ The final version of this manuscript is in press in Journal of Discrete Algorithms.

arXiv.org e-Print Archive

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Università di Palermo
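
To make the outputs named in the abstract above concrete, here is a naive in-memory sketch of the BWT and LCP array of a string collection. It is not extLCP: it sorts all suffixes explicitly instead of using sequential disk scans, and it assumes each string carries its own end-marker, with markers ordered by string index and smaller than every ordinary character. The function name `bwt_lcp_collection` and the `"$"` output symbol are hypothetical choices for this illustration.

```python
def bwt_lcp_collection(strings):
    """Return (suffixes, bwt, lcp) for a collection, by explicit suffix sorting."""

    def key(i, p):
        # Ordinary characters are tagged 1, the end-marker of string i is (0, i),
        # so markers sort before all characters and are pairwise distinct.
        return [(1, c) for c in strings[i][p:]] + [(0, i)]

    suffixes = sorted(
        ((i, p) for i, s in enumerate(strings) for p in range(len(s) + 1)),
        key=lambda t: key(*t),
    )
    # BWT symbol: the character preceding the suffix, or "$" at a string start.
    bwt = [strings[i][p - 1] if p > 0 else "$" for i, p in suffixes]

    # lcp[r] is the common prefix length of the suffixes ranked r - 1 and r.
    lcp = [0]
    for (i, p), (j, q) in zip(suffixes, suffixes[1:]):
        a, b = key(i, p), key(j, q)
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        lcp.append(n)
    return suffixes, bwt, lcp


if __name__ == "__main__":
    suffixes, bwt, lcp = bwt_lcp_collection(["ab", "b"])
    print("".join(bwt), lcp)
```

Because the end-markers are distinct, they never contribute to an LCP value. The list of `(string index, position)` pairs returned first plays the role of the generalized suffix array of the collection.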

A New Class of Searchable and Provably Highly Compressible String Transformations

Author: Giancarlo Raffaele
Manzini Giovanni
Rosone Giovanna
Sciortino Marinella
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Publication date: 01/01/2019

The Burrows-Wheeler Transform (BWT) is a string transformation that plays a fundamental role in the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation beyond the domain of strings. However, efforts to find non-trivial alternatives to the original, now 25-year-old, Burrows-Wheeler string transformation have met with limited success. In this paper we bring new life to this area by introducing a whole new family of transformations that have all the "myriad virtues" of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear-time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class also includes the Alternating BWT, another invertible string transform recently introduced in connection with a generalization of Lyndon words.

arXiv.org e-Print Archive

Archivio della Ricerca - Università di Pisa

Dagstuhl Research Online Publication Server

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Archivio istituzionale della ricerca - Università di Palermo
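
As background for the abstract above, the sketch below shows the classic BWT and its inversion, the baseline that the new family of transformations is measured against. It does not implement the transformations introduced in the paper, and it sorts rotations naively rather than in linear time. It assumes the input does not contain the sentinel character `"\0"`; the names `bwt` and `inverse_bwt` are hypothetical.

```python
SENTINEL = "\0"  # assumed absent from the input and smaller than every other character


def bwt(text):
    """Classic BWT: last column of the sorted rotations of text + sentinel."""
    s = text + SENTINEL
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)


def inverse_bwt(last):
    """Invert the BWT by walking the mapping from first-column to last-column rows."""
    # A stable sort of positions by character pairs the k-th character of the
    # first column with the same occurrence in the last column.
    order = sorted(range(len(last)), key=lambda i: last[i])
    out = []
    idx = order[0]  # row 0 starts with the sentinel; order[0] is the row of text + sentinel
    for _ in range(len(last) - 1):
        idx = order[idx]
        out.append(last[idx])
    return "".join(out)


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))
    print(inverse_bwt(transformed))
```

The inversion relies only on the fact that equal characters appear in the same relative order in the first and last columns, which is why Python's stable `sorted` is sufficient here.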

Detecting Mutations by eBWT

Author: Pisanti Nadia
Prezza Nicola
Rosone Giovanna
Sciortino Marinella
Publication venue: place:Leibniz
Publication date: 01/01/2018

In this paper we develop a theory describing how the extended Burrows-Wheeler Transform (EBWT) of a collection of DNA fragments tends to cluster together the copies of nucleotides sequenced from a genome G. Our theory accurately predicts how many copies of any nucleotide are expected inside each such cluster, and how an elegant and precise LCP-array-based procedure can locate these clusters in the EBWT. Our findings are very general and can be applied to a wide range of different problems. In this paper, we consider the case of alignment-free and reference-free SNP discovery in multiple collections of reads. We note that, in accordance with our theoretical results, SNPs are clustered in the EBWT of the reads collection, and we develop a tool that finds SNPs with a simple scan of the EBWT and LCP arrays. Preliminary results show that our method requires much less coverage than state-of-the-art tools while drastically improving precision and sensitivity.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Archivio della Ricerca - Università di Pisa

Dagstuhl Research Online Publication Server

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Archivio istituzionale della ricerca - Università di Palermo
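
The idea of locating clusters with a single scan of the LCP array, mentioned in the abstract above, can be illustrated as follows. This is a simplified sketch under stated assumptions, not the authors' procedure: a cluster is taken to be a maximal run of consecutive suffixes sharing a prefix of at least `k` characters, and a cluster is flagged when its BWT symbols are not all equal. The threshold `k`, the function names, and the flagging rule are all hypothetical, and the statistical model from the paper is not reproduced.

```python
from collections import Counter


def lcp_clusters(lcp, k):
    """Maximal half-open intervals of ranks whose adjacent suffixes share >= k characters.

    lcp[r] is assumed to be the LCP between the suffixes ranked r - 1 and r.
    """
    clusters, start = [], 0
    for r in range(1, len(lcp) + 1):
        if r == len(lcp) or lcp[r] < k:
            if r - start > 1:           # keep only clusters with at least two suffixes
                clusters.append((start, r))
            start = r
    return clusters


def candidate_clusters(bwt, lcp, k):
    """Clusters whose preceding symbols disagree, with the symbol counts of each."""
    out = []
    for a, b in lcp_clusters(lcp, k):
        counts = Counter(bwt[a:b])
        if len(counts) > 1:
            out.append(((a, b), dict(counts)))
    return out


if __name__ == "__main__":
    # Toy arrays chosen by hand for illustration, not derived from real reads.
    print(candidate_clusters("TACAGGT", [0, 0, 3, 3, 1, 4, 0], k=3))
```

In a real setting the counts inside each flagged cluster would be compared against the expected coverage before reporting a variant; that step is omitted here.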

Lightweight Reference-Free Variation Detection using the Burrows-Wheeler Transform

Author: Giovanna Rosone
Marinella Sciortino
Nadia Pisanti
Nicola Prezza
Publication date: 01/01/2019

Archivio della Ricerca - Università di Pisa

Variable-order reference-free variant discovery with the Burrows-Wheeler Transform

Author: Pisanti Nadia
Prezza Nicola
Rosone Giovanna
Sciortino Marinella
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Background: In [Prezza et al., AMB 2019], a new reference-free and alignment-free framework for the detection of SNPs was suggested and tested. The framework, based on the Burrows-Wheeler Transform (BWT), significantly improves the sensitivity and precision of previous tools based on de Bruijn graphs by overcoming several of their limitations, namely: (i) the need to establish a fixed value, usually small, for the order k, (ii) the loss of important information such as k-mer coverage and adjacency of k-mers within the same read, and (iii) bad performance in repeated regions longer than k bases. The preliminary tool, however, was able to identify only SNPs, and it was too slow and memory-consuming due to the use of additional heavy data structures (namely, the Suffix and LCP arrays) besides the BWT. Results: In this paper, we introduce a new algorithm and the corresponding tool ebwt2InDel that (i) extends the framework of [Prezza et al., AMB 2019] to also detect INDELs, and (ii) implements recent algorithmic findings that allow the whole analysis to be performed using just the BWT, thus reducing the working space by one order of magnitude and allowing the analysis of full genomes. Finally, we describe a simple strategy for effectively parallelizing our tool for SNP detection only. On a 24-core machine, the parallel version of our tool is one order of magnitude faster than the sequential one. The tool ebwt2InDel is available at github.com/nicolaprezza/ebwt2InDel. Conclusions: Results on a synthetic dataset covered at 30x (Human chromosome 1) show that our tool is indeed able to find up to 83% of the SNPs and 72% of the existing INDELs. These percentages considerably improve on the 71% of SNPs and 51% of INDELs found by the state-of-the-art tool based on de Bruijn graphs. We furthermore repor

INRIA a CCSD electronic archive server

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Implant replacement and anaplastic large cell lymphoma associated with breast implants: a quantitative analysis

Author: Belluzzo Miriam
Bonaccorso Nicole
Contiero Paolo
Costantino Claudio
Costanza Davide
De Bella Daniele Domenico
Di Napoli Arianna
Fruscione Santo
Immordino Palmira
Mazzola Sergio
Mazzucco Walter
Savatteri Alessandra
Sciortino Martina
Tagliabue Giovanna
Tramuto Fabio
Vitale Francesco
Vittorietti Martina
Publication date: 19/10/2023

Breast implant-associated anaplastic large-cell lymphoma (BIA-ALCL) is a rare form of non-Hodgkin T-cell lymphoma associated with breast reconstruction post-mastectomy or cosmetic-additive mammoplasty. The increasing use of implants for cosmetic purposes is expected to lead to an increase in BIA-ALCL cases. This study investigated the main characteristics of the disease and the factors predicting BIA-ALCL onset in patients with and without an implant replacement.

Archivio istituzionale della ricerca - Università di Palermo

Effects of somatostatin analogues on muscle sympathetic nerve activity in acromegaly

Author: Attanasio Roberto
Carzaniga Chiara
Cavagnini Francesco
Cozzi Renato
Damanti Sarah
Grassi Guido
Mancia Giuseppe
Maria Fatti Letizia
Montini Marcella
Persani Luca
Scacchi Massimo
Sciortino Giovanna
Seravalle Gino
Vitale Giovanni
Publication date: 01/04/2013

Crossref

Open Access Repository