624 research outputs found
Covering Pairs in Directed Acyclic Graphs
The Minimum Path Cover problem on directed acyclic graphs (DAGs) is a
classical problem that provides a clear and simple mathematical formulation for
several applications in different areas and that has an efficient algorithmic
solution. In this paper, we study the computational complexity of two
constrained variants of Minimum Path Cover motivated by the recent introduction
of next-generation sequencing technologies in bioinformatics. The first problem
(MinPCRP), given a DAG and a set of pairs of vertices, asks for a minimum
cardinality set of paths "covering" all the vertices such that both vertices of
each pair belong to the same path. For this problem, we show that, while it is
NP-hard to decide whether there exists a solution consisting of at most three
paths, it is possible to decide in polynomial time whether a solution
consisting of at most two paths exists. The second problem (MaxRPSP), given a
DAG and a set of pairs of vertices, asks for a path containing the maximum
number of the given pairs of vertices. We show its NP-hardness and also its
W[1]-hardness when parametrized by the number of covered pairs. On the positive
side, we give a fixed-parameter algorithm when the parameter is the maximum
overlapping degree, a natural parameter in the bioinformatics applications of
the problem.
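To make the MinPCRP definition concrete, the following Python sketch checks whether a candidate set of paths is a feasible solution: each path must follow edges of the DAG, every vertex must be covered, and both vertices of each required pair must lie on one common path. The function name, the edge-list representation, and the sample graph are illustrative assumptions, not taken from the paper. The sketch only verifies a solution; it does not compute a minimum one.

```python
def is_feasible_cover(vertices, edges, pairs, paths):
    """Check a candidate MinPCRP solution (hypothetical helper, not from the paper)."""
    edge_set = set(edges)
    # Each path must be non-empty and follow edges of the DAG.
    for path in paths:
        if not path:
            return False
        if any((u, v) not in edge_set for u, v in zip(path, path[1:])):
            return False
    # Every vertex must lie on at least one path.
    covered = set().union(*paths)
    if covered != set(vertices):
        return False
    # Both vertices of each pair must lie on one common path.
    path_sets = [set(p) for p in paths]
    return all(any(a in s and b in s for s in path_sets) for a, b in pairs)


# Assumed toy instance: a diamond-shaped DAG a -> {b, c} -> d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
two_paths = [["a", "b", "d"], ["a", "c", "d"]]
print(is_feasible_cover(V, E, [("a", "d")], two_paths))
print(is_feasible_cover(V, E, [("b", "c")], two_paths))
```

In this toy instance the pair (b, c) can never be satisfied, because no path of the diamond visits both b and c.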
Parameterized Complexity of the k-anonymity Problem
The problem of publishing personal data without giving up privacy is becoming
increasingly important. An interesting formalization that has been recently
proposed is the -anonymity. This approach requires that the rows of a table
are partitioned in clusters of size at least and that all the rows in a
cluster become the same tuple, after the suppression of some entries. The
natural optimization problem, where the goal is to minimize the number of
suppressed entries, is known to be APX-hard even when the record values are
over a binary alphabet and , and when the records have length at most 8
and . In this paper we study how the complexity of the problem is
influenced by different parameters. We first show that the problem is
W[1]-hard when parameterized by the size of the solution (and the value k).
Then we exhibit a fixed-parameter
algorithm, when the problem is parameterized by the size of the alphabet and
the number of columns. Finally, we investigate the computational (and
approximation) complexity of the k-anonymity problem, when restricting the
instance to records having length bounded by 3 and . We show that the problem
remains APX-hard under this restriction. Comment: 22 pages, 2 figures
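As a small illustration of the objective described in this abstract, the sketch below counts the suppressed entries induced by a given clustering: within a cluster, a column has to be suppressed in every row as soon as two rows disagree on it. The function name, the string encoding of rows, and the sample table are assumptions made for the example. It evaluates one fixed clustering and makes no attempt to find an optimal one.

```python
def suppression_cost(rows, clusters, k):
    """Count suppressed entries for a clustering (illustrative helper)."""
    # The clusters must partition the row indices.
    if sorted(i for cluster in clusters for i in cluster) != list(range(len(rows))):
        raise ValueError("clusters must partition the rows")
    total = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster needs at least k rows")
        members = [rows[i] for i in cluster]
        # A column is suppressed in all rows of the cluster
        # whenever the rows disagree on it.
        mixed_columns = sum(1 for column in zip(*members) if len(set(column)) > 1)
        total += mixed_columns * len(members)
    return total


# Assumed toy table over a binary alphabet, with k = 2.
table = ["0011", "0010", "1100", "1101"]
print(suppression_cost(table, [[0, 1], [2, 3]], k=2))
print(suppression_cost(table, [[0, 1, 2, 3]], k=2))
```

Grouping similar rows together keeps the cost low, while putting all four rows into one cluster forces every entry to be suppressed.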
The zero exemplar distance problem
Given two genomes with duplicate genes, Zero Exemplar Distance is
the problem of deciding whether the two genomes can be reduced to the same
genome without duplicate genes by deleting all but one copy of each gene in
each genome. Blin, Fertin, Sikora, and Vialette recently proved that
Zero Exemplar Distance for monochromosomal genomes is NP-hard even if
each gene appears at most two times in each genome, thereby settling an
important open question on genome rearrangement in the exemplar model. In this
paper, we give a very simple alternative proof of this result. We also study
the problem Zero Exemplar Distance for multichromosomal genomes
without gene order, and prove the analogous result that it is also NP-hard even
if each gene appears at most two times in each genome. For the positive
direction, we show that both variants of Zero Exemplar Distance admit
polynomial-time algorithms if each gene appears exactly once in one genome and
at least once in the other genome. In addition, we present a polynomial-time
algorithm for the related problem Exemplar Longest Common Subsequence
in the special case that each mandatory symbol appears exactly once in one
input sequence and at least once in the other input sequence. This answers an
open question of Bonizzoni et al. We also show that Zero Exemplar
Distance for multichromosomal genomes without gene order is fixed-parameter
tractable if the parameter is the maximum number of chromosomes in each genome. Comment: Strengthened and reorganized
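The decision problem can be illustrated with a brute-force Python sketch. It assumes a simplified setting that the abstract does not specify: each genome is a single unsigned sequence of gene labels, and two reduced genomes count as "the same" when the sequences are identical. The function names are hypothetical. The enumeration takes exponential time, in line with the NP-hardness result, and is only meant for tiny inputs.

```python
from itertools import product


def exemplars(genome):
    """All sequences obtained by keeping exactly one copy of each gene."""
    positions = {}
    for index, gene in enumerate(genome):
        positions.setdefault(gene, []).append(index)
    results = set()
    # Choose one occurrence per gene, then read the kept genes left to right.
    for choice in product(*positions.values()):
        results.add(tuple(genome[i] for i in sorted(choice)))
    return results


def zero_exemplar_distance(genome1, genome2):
    """True if both genomes can be reduced to a common exemplar sequence."""
    return not exemplars(genome1).isdisjoint(exemplars(genome2))


# Assumed toy genomes: both can be reduced to the sequence a, c, b.
print(zero_exemplar_distance("abcab", "bacb"))
print(zero_exemplar_distance("abcab", "cba"))
```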
On the Complexity of t-Closeness Anonymization and Related Problems
An important issue in releasing individual data is to protect the sensitive
information from being leaked and maliciously utilized. Famous privacy
preserving principles that aim to ensure both data privacy and data integrity,
such as k-anonymity and l-diversity, have been extensively studied both
theoretically and empirically. Nonetheless, these widely-adopted principles are
still insufficient to prevent attribute disclosure if the attacker has partial
knowledge about the overall sensitive data distribution. The t-closeness
principle has been proposed to fix this, which also has the benefit of
supporting numerical sensitive attributes. However, in contrast to
k-anonymity and l-diversity, the theoretical aspect of t-closeness has
not been well investigated.
We initiate the first systematic theoretical study on the t-closeness
principle under the commonly-used attribute suppression model. We prove that
for every constant t such that , it is NP-hard to find an optimal
t-closeness generalization of a given table. The proof consists of several
reductions each of which works for different values of t, which together
cover the full range. To complement this negative result, we also provide exact
and fixed-parameter algorithms. Finally, we answer some open questions
regarding the complexity of k-anonymity and l-diversity left in the
literature. Comment: An extended abstract to appear in DASFAA 201
Approximating Clustering of Fingerprint Vectors with Missing Values
The problem of clustering fingerprint vectors is an interesting problem in
Computational Biology that was proposed in (Figueroa et al. 2004). In
this paper we narrow the gaps between the known
lower bounds and upper bounds on the approximability of some variants of the
biological problem. Namely, we prove that the problem is APX-hard
even when each fingerprint contains only two unknown positions. Moreover, we
study some variants of the original problem, and we give two 2-approximation
algorithms for the IECMV and OECMV problems when the number of unknown entries
for each vector is at most a constant. Comment: 13 pages, 4 figures
Migration and Legal Precarity in the Time of Pandemic : Qualitative Research on the Italian Case
The COVID-19 pandemic has unequally impacted the lives of Italian subjects. The article uses evidence from forty-seven semi-structured interviews with various migrant groups to illuminate how temporalities embedded in Italy’s migration governance shape migrants’ precarious legal status and access to welfare. The authors show that whereas migrants with secure legal status or citizenship have not engaged significantly with Italian bureaucracies, they have no easy access to welfare as it is contingent on their employment and financial status. Migrants with precarious status have been the worst hit by the pandemic’s secondary effects across several fronts. These findings have implications for policy and future research.
Reconciliation Revisited: Handling Multiple Optima when Reconciling with Duplication, Transfer, and Loss
Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While duplication–loss (DL) reconciliation leads to a unique maximum-parsimony solution, duplication-transfer-loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. This problem is further exacerbated by the fact that different event cost assignments yield different sets of optimal reconciliations. Here, we present an effective, efficient, and scalable method for dealing with these fundamental problems in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We show that even gene trees with only a few dozen genes often have millions of optimal reconciliations and present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn^2) time per sample, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mappings and event assignments and to investigate the impact of varying event costs. We apply our method to a biological dataset of approximately 4700 gene trees from 100 taxa and observe that 93% of event assignments and 73% of mappings remain consistent across different multiple optima. Our analysis represents the first systematic investigation of the space of optimal DTL reconciliations and has many important implications for the study of gene family evolution. National Science Foundation (U.S.) (CAREER Award 0644282); National Institutes of Health (U.S.) (Grant RC2 HG005639); National Science Foundation (U.S.). Assembling the Tree of Life (Program) (Grant 0936234)
Mariages mixtes, migration féminine et travail domestique: un regard sur la situation italienne
The article stimulates a reflection on the theme of mixed marriages in Italy, with special reference to the so-called “caregivers’ marriages”. In recent years, these have been subject to an increasing stigmatization in the Italian public debate, contributing to legitimize some recent reforms in the national pension system. Drawing on the narrative of a young domestic worker married to an older Italian man, the article calls for a more complex and multifaceted vision of these processes, which have so far received little attention from the Italian and international research in the field.
Beyond Perfect phylogeny: Multisample Phylogeny reconstruction via ILP
Most evolutionary history reconstruction approaches are based on the infinite sites assumption, which underlies the Perfect Phylogeny model. This is one of the most used models in cancer genomics. Recent results give strong evidence that recurrent and back mutations are present in the evolutionary history of tumors [19], thus showing that more general models than the Perfect Phylogeny are required. To address this problem we propose a framework based on the notion of Incomplete Perfect Phylogeny. Our framework incorporates losing and gaining mutations, hence including the Dollo and the Camin-Sokal models, and is described with an Integer Linear Programming (ILP) formulation. Our approach generalizes the notion of persistent phylogeny [1] and the ILP approach [14, 15] proposed to solve the corresponding phylogeny reconstruction problem on character data. The final goal of our paper is to integrate our approach into an ILP formulation of the problem of reconstructing trees on mixed populations, where the input data consists of the fraction of cells in a set of samples that have a certain mutation. This is a fundamental problem in cancer genomics, where the goal is to study the evolutionary history of a tumor. An experimental analysis shows that our ILP approach is able to explain data that do not fit the perfect phylogeny assumption, thereby allowing (1) multiple losses and gains of mutations, and (2) a number of subpopulations that is smaller than the number of input mutations.
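For context on the Perfect Phylogeny model that this abstract generalizes, the Python sketch below applies the classical pairwise conflict test for a binary character matrix with an all-zero root: a rooted perfect phylogeny exists exactly when no two columns exhibit all three row patterns (0,1), (1,0) and (1,1). This is the textbook test, not the paper's ILP framework. The function name and the sample matrices are assumptions for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(matrix):
    """Pairwise conflict test for a binary matrix (rows: samples, columns: mutations).

    Assumes the root carries no mutation (all-zero ancestral state).
    """
    columns = list(zip(*matrix))
    for first, second in combinations(columns, 2):
        patterns = set(zip(first, second))
        # Two mutations conflict if each occurs without the other
        # and they also occur together.
        if {(0, 1), (1, 0), (1, 1)} <= patterns:
            return False
    return True


# Assumed toy inputs: the first matrix is conflict-free, the second is not.
print(admits_perfect_phylogeny([[1, 0], [1, 1], [0, 0]]))
print(admits_perfect_phylogeny([[1, 0], [0, 1], [1, 1]]))
```

A matrix that fails this test is the kind of input the abstract refers to as data that do not fit the perfect phylogeny assumption, for instance because of recurrent or back mutations.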
Accepting splicing systems with permitting and forbidding words
In this paper we propose a generalization of the accepting splicing systems introduced in Mitrana et al. (Theor Comput Sci 411:2414-2422, 2010). More precisely, the input word is accepted as soon as a permitting word is obtained provided that no forbidding word has been obtained so far, otherwise it is rejected. Note that in the new variant of accepting splicing system the input word is rejected if either no permitting word is ever generated (like in Mitrana et al. in Theor Comput Sci 411:2414-2422, 2010) or a forbidding word has been generated and no permitting word had been generated before. We investigate the computational power of the new variants of accepting splicing systems and the interrelationships among them. We show that the new condition strictly increases the computational power of accepting splicing systems. Although there are regular languages that cannot be accepted by any of the splicing systems considered here, the new variants can accept non-regular and even non-context-free languages, a situation that is not very common in the case of (extended) finite splicing systems without additional restrictions. We also show that the smallest class of languages out of the four classes defined by accepting splicing systems is strictly included in the class of context-free languages. Solutions to a few decidability problems are immediately derived from the proof of this result.
