12 research outputs found

Parameterized Complexity of the k-anonymity Problem

Author: A Gionis
A Meyerson
G Aggarwal
G Ausiello
Gianluca Della Vedova
H Alt
H Park
J Blocki
L Sweeney
P Alimonti
P Bonizzoni
P Samarati
PA Evans
Paola Bonizzoni
R Diestel
R Downey
R Niedermeier
RG Downey
Riccardo Dondi
W Du
Yuri Pirola
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/05/2010

The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization that has been recently proposed is k-anonymity. This approach requires that the rows of a table be partitioned into clusters of size at least k and that all the rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is known to be APX-hard even when the record values are over a binary alphabet and k=3, and when the records have length at most 8 and k=4. In this paper we study how the complexity of the problem is influenced by different parameters. We first show that the problem is W[1]-hard when parameterized by the size of the solution (and the value k). Then we exhibit a fixed-parameter algorithm for the problem parameterized by the size of the alphabet and the number of columns. Finally, we investigate the computational (and approximation) complexity of the k-anonymity problem when the instance is restricted to records of length bounded by 3 and k=3. We show that this restricted problem is APX-hard.

Comment: 22 pages, 2 figures
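
The objective described in the abstract above can be illustrated with a minimal sketch. It only evaluates the cost of a partition that is already given; it does not search for an optimal one and is not the paper's algorithm. The function name `suppression_cost`, the string encoding of rows, and the sample table are assumptions made for illustration.

```python
from typing import List, Sequence


def suppression_cost(rows: Sequence[str], clusters: List[List[int]], k: int) -> int:
    """Count the entries that must be suppressed so that all rows in each
    cluster become the same tuple (hypothetical helper, not from the paper).

    rows     : equal-length strings, one symbol per column
    clusters : lists of row indices that together partition range(len(rows))
    k        : minimum allowed cluster size
    """
    # The clusters must cover every row exactly once.
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(rows))):
        raise ValueError("clusters must partition the row indices")

    cost = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError(f"cluster {cluster} has fewer than k={k} rows")
        for col in range(len(rows[0])):
            # If the rows of a cluster disagree on a column, that column
            # has to be suppressed in every row of the cluster.
            if len({rows[i][col] for i in cluster}) > 1:
                cost += len(cluster)
    return cost


# Illustrative binary table with six records of length 4 (made-up data).
table = ["0101", "0111", "0100", "1010", "1011", "1110"]
partition = [[0, 1, 2], [3, 4, 5]]
print(suppression_cost(table, partition, k=3))  # prints 12
```

In this example each cluster disagrees on two columns, so 2 × 3 entries are suppressed per cluster. The optimization problem asks for the partition that minimizes this count, which is the part the abstract reports as APX-hard.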

arXiv.org e-Print Archive

Crossref

MALVA: Genotyping by Mapping-free ALlele Detection of Known VAriants

Author: Bernardini G. (Giulia)
Bonizzoni P. (Paola)
Denti L. (Luca)
Previtali M. (Marco)
Schönhuth A. (Alexander)
Publication venue: 'Elsevier BV'
Publication date: 30/08/2019

The amount of genetic variation discovered in human populations is growing rapidly, leading to challenging computational tasks such as variant calling. Standard methods for addressing this problem include read mapping, a computationally expensive procedure; thus, mapping-free tools have been proposed in recent years. These tools focus on isolated, biallelic SNPs, providing limited support for multi-allelic SNPs and short insertions and deletions of nucleotides (indels). Here we introduce MALVA, a mapping-free method to genotype an individual from a sample of reads. MALVA is the first mapping-free tool able to genotype multi-allelic SNPs and indels, even in high-density genomic regions, and to effectively handle a huge number of variants. MALVA requires one order of magnitude less time to genotype a donor than alignment-based pipelines, while providing similar accuracy. Remarkably, on indels, MALVA provides even better results than the most widely adopted variant discovery tools.

Biological Sciences; Genetics; Genomics; Bioinformatics

CWI's Institutional Repository

Approximating Clustering of Fingerprint Vectors with Missing Values

Author: A. Figueroa
C.H. Papadimitriou
G. Ausiello
Giancarlo Mauri
Gianluca Della Vedova
L. Valinsky
M. Chlebík
P. Alimonti
Paola Bonizzoni
R. Drmanac
Riccardo Dondi
S. Drmanac
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/11/2005

The problem of clustering fingerprint vectors is an interesting problem in Computational Biology that was proposed in (Figueroa et al. 2004). In this paper we narrow the gaps between the known lower and upper bounds on the approximability of some variants of the biological problem. Namely, we prove that the problem is APX-hard even when each fingerprint contains only two unknown positions. Moreover, we study some variants of the original problem and give two 2-approximation algorithms for the IECMV and OECMV problems when the number of unknown entries for each vector is at most a constant.

Comment: 13 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Explaining evolution via constrained persistent perfect phylogeny

Author: A Dress
A Subramanian
Anna Paola Carrieri
C Benham
D Fernández-Baca
D Gusfield
F Pan
Gabriella Trucco
Gianluca Della Vedova
HL Bodlaender
I Peer
J Felsenstein
J Manuch
J Maňuch
J Zheng
P Bonizzoni
Paola Bonizzoni
RG Downey
RR Hudson
RV Satya
S Kannan
SK Kannan
T Przytycka
Z Ding
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Triplet-based similarity score for fully multilabeled trees with poly-occurring labels

Author: Bernardini G. (Giulia)
Bonizzoni P. (Paola)
Ciccolella S. (Simone)
Della Vedova G. (Gianluca)
Denti L. (Luca)
Previtali M. (Marco)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Motivation: The latest advances in cancer sequencing, and the availability of a wide range of methods to infer the evolutionary history of tumors, have made it important to evaluate, reconcile and cluster different tumor phylogenies. Recently, several notions of distance or similarity have been proposed in the literature, but none of them has emerged as the gold standard. Moreover, none of the known similarity measures is able to manage mutations occurring multiple times in the tree, a circumstance that often arises in real cases.

Results: To overcome these limitations, in this article we propose MP3, the first similarity measure for tumor phylogenies able to effectively manage cases where multiple mutations can occur at the same time and mutations can occur multiple times. Moreover, a comparison of MP3 with other measures shows that it correctly classifies similar and dissimilar trees, on both simulated and real data.

Archivio istituzionale della ricerca - Università di Trieste

Crossref

CWI's Institutional Repository