524 research outputs found
Mariages mixtes, migration féminine et travail domestique: un regard sur la situation italienne
The article stimulates a reflection on the theme of mixed marriages in Italy, with special reference to the so-called “caregivers’ marriages”. In recent years, these have been subject to increasing stigmatization in the Italian public debate, which has helped to legitimize some recent reforms of the national pension system. Drawing on the narrative of a young domestic worker married to an older Italian man, the article calls for a more complex and multifaceted view of these processes, which have so far received little attention from Italian and international research in the field.
Parameterized Complexity of the k-anonymity Problem
The problem of publishing personal data without giving up privacy is becoming
increasingly important. An interesting formalization that has been recently
proposed is k-anonymity. This approach requires that the rows of a table
are partitioned into clusters of size at least k and that all the rows in a
cluster become the same tuple after the suppression of some entries. The
natural optimization problem, where the goal is to minimize the number of
suppressed entries, is known to be APX-hard even when the record values are
over a binary alphabet and k = 3, and when the records have length at most 8
and k = 4. In this paper we study how the complexity of the problem is
influenced by different parameters. Following this direction of
research, we first show that the problem is W[1]-hard when parameterized by the
size of the solution (and the value k). Then we exhibit a fixed-parameter
algorithm for the problem parameterized by the size of the alphabet and
the number of columns. Finally, we investigate the computational (and
approximation) complexity of the k-anonymity problem when the
instance is restricted to records of length bounded by 3 and k = 3. We show
that this restricted problem is APX-hard.
Comment: 22 pages, 2 figures
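To make the objective in the abstract above concrete, the following is a minimal sketch of how the cost of one candidate solution could be evaluated: given a partition of the rows into clusters of size at least k, every column on which the rows of a cluster disagree must be suppressed in all rows of that cluster. The function name `suppression_cost` and the representation of the table as a list of equal-length strings are assumptions made for illustration; they are not taken from the paper, and the sketch only scores a given partition, it does not search for an optimal one.

```python
from typing import List, Sequence


def suppression_cost(rows: Sequence[str], clusters: List[List[int]], k: int) -> int:
    """Count the entries to suppress so that each cluster becomes one tuple.

    rows     -- the table, one equal-length string per record (illustrative format)
    clusters -- a partition of the row indices
    k        -- minimum allowed cluster size
    """
    # Check that the clusters really partition the row indices.
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(rows))):
        raise ValueError("clusters must partition the rows")

    cost = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster must contain at least k rows")
        for col in range(len(rows[0])):
            values = {rows[i][col] for i in cluster}
            # A column with more than one value must be suppressed
            # in every row of the cluster.
            if len(values) > 1:
                cost += len(cluster)
    return cost


table = ["0011", "0010", "1100", "1101"]
good = suppression_cost(table, [[0, 1], [2, 3]], k=2)
bad = suppression_cost(table, [[0, 2], [1, 3]], k=2)
```

In this small example the first partition groups rows that differ only in their last column, so it suppresses far fewer entries than the second one; the optimization problem asks for the partition that minimizes this count.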
Covering Pairs in Directed Acyclic Graphs
The Minimum Path Cover problem on directed acyclic graphs (DAGs) is a
classical problem that provides a clear and simple mathematical formulation for
several applications in different areas and that has an efficient algorithmic
solution. In this paper, we study the computational complexity of two
constrained variants of Minimum Path Cover motivated by the recent introduction
of next-generation sequencing technologies in bioinformatics. The first problem
(MinPCRP), given a DAG and a set of pairs of vertices, asks for a minimum
cardinality set of paths "covering" all the vertices such that both vertices of
each pair belong to the same path. For this problem, we show that, while it is
NP-hard to decide whether there exists a solution consisting of at most three
paths, it is possible to decide in polynomial time whether a solution
consisting of at most two paths exists. The second problem (MaxRPSP), given a
DAG and a set of pairs of vertices, asks for a path containing the maximum
number of the given pairs of vertices. We show its NP-hardness and also its
W[1]-hardness when parametrized by the number of covered pairs. On the positive
side, we give a fixed-parameter algorithm when the parameter is the maximum
overlapping degree, a natural parameter in the bioinformatics applications of
the problem.
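The efficient algorithm for the classical, unconstrained problem mentioned at the start of the abstract above can be sketched as follows. This is a minimal illustration under an explicit assumption: it solves the vertex-disjoint version of Minimum Path Cover, using the standard reduction to maximum bipartite matching (the number of paths equals the number of vertices minus the size of a maximum matching between "tail" and "head" copies of the vertices). It does not handle the pair constraints of MinPCRP or MaxRPSP, and the function name `min_path_cover` is hypothetical.

```python
from typing import List, Tuple


def min_path_cover(n: int, edges: List[Tuple[int, int]]) -> int:
    """Size of a minimum vertex-disjoint path cover of a DAG on vertices 0..n-1."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)

    # match_head[v] = u means edge u -> v is used by some path in the cover.
    match_head = [-1] * n

    def try_augment(u: int, seen: List[bool]) -> bool:
        # Look for an augmenting path starting at the tail copy of u.
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_head[v] == -1 or try_augment(match_head[v], seen):
                    match_head[v] = u
                    return True
        return False

    matching = 0
    for u in range(n):
        if try_augment(u, [False] * n):
            matching += 1

    # Each matched edge merges two paths into one, starting from n trivial paths.
    return n - matching


# Edges 0->1, 1->2, 0->3: one path can cover 0, 1, 2 and vertex 3 needs its own.
branching = min_path_cover(4, [(0, 1), (1, 2), (0, 3)])
chain = min_path_cover(3, [(0, 1), (1, 2)])
isolated = min_path_cover(3, [])
```

The simple augmenting-path matching used here runs in time proportional to the number of vertices times the number of edges, which is enough to illustrate why the unconstrained problem is considered easy in contrast to the constrained variants studied in the paper.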
On the Complexity of t-Closeness Anonymization and Related Problems
An important issue in releasing individual data is to protect the sensitive
information from being leaked and maliciously utilized. Famous privacy
preserving principles that aim to ensure both data privacy and data integrity,
such as k-anonymity and ℓ-diversity, have been extensively studied both
theoretically and empirically. Nonetheless, these widely adopted principles are
still insufficient to prevent attribute disclosure if the attacker has partial
knowledge about the overall sensitive data distribution. The t-closeness
principle has been proposed to fix this, and it also has the benefit of
supporting numerical sensitive attributes. However, in contrast to
k-anonymity and ℓ-diversity, the theoretical aspects of t-closeness have
not been well investigated.
We initiate the first systematic theoretical study of the t-closeness
principle under the commonly used attribute suppression model. We prove that
for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal
t-closeness generalization of a given table. The proof consists of several
reductions, each of which works for different values of t, and which together
cover the full range. To complement this negative result, we also provide exact
and fixed-parameter algorithms. Finally, we answer some open questions
regarding the complexity of k-anonymity and ℓ-diversity left in the
literature.
Comment: An extended abstract to appear in DASFAA 201
The zero exemplar distance problem
Given two genomes with duplicate genes, Zero Exemplar Distance is
the problem of deciding whether the two genomes can be reduced to the same
genome without duplicate genes by deleting all but one copy of each gene in
each genome. Blin, Fertin, Sikora, and Vialette recently proved that
Zero Exemplar Distance for monochromosomal genomes is NP-hard even if
each gene appears at most two times in each genome, thereby settling an
important open question on genome rearrangement in the exemplar model. In this
paper, we give a very simple alternative proof of this result. We also study
the problem Zero Exemplar Distance for multichromosomal genomes
without gene order, and prove the analogous result that it is also NP-hard even
if each gene appears at most two times in each genome. For the positive
direction, we show that both variants of Zero Exemplar Distance admit
polynomial-time algorithms if each gene appears exactly once in one genome and
at least once in the other genome. In addition, we present a polynomial-time
algorithm for the related problem Exemplar Longest Common Subsequence
in the special case that each mandatory symbol appears exactly once in one
input sequence and at least once in the other input sequence. This answers an
open question of Bonizzoni et al. We also show that Zero Exemplar
Distance for multichromosomal genomes without gene order is fixed-parameter
tractable if the parameter is the maximum number of chromosomes in each genome.
Comment: Strengthened and reorganized
Linear splicing and syntactic monoid
Splicing systems were introduced by Head in 1987 as a formal counterpart of a biological mechanism of DNA recombination under the action of restriction and ligase enzymes. Despite intensive study of linear splicing systems, some elementary questions about their computational power are still open. In particular, in this paper we address the problem of characterizing the proper subclass of regular languages that are generated by finite (Paun) linear splicing systems. We introduce here the class of marker languages L, i.e., regular languages of the form L = L_1 [x]_1 L_2, where L_1 and L_2 are regular languages, [x] is a syntactic congruence class satisfying special conditions, and [x]_1 is equal either to [x] or to [x] ∪ {1}, 1 being the empty word. Using classical properties of formal language theory, we give an algorithm that decides whether a regular language is a marker language. Furthermore, for each marker language L we exhibit a finite Paun linear splicing system and prove that this system generates L.
Approximating Clustering of Fingerprint Vectors with Missing Values
The problem of clustering fingerprint vectors is an interesting problem in
Computational Biology that was proposed in (Figueroa et al. 2004). In
this paper we narrow the gaps between the known
lower bounds and upper bounds on the approximability of some variants of the
biological problem. Namely, we prove that the problem is APX-hard
even when each fingerprint contains only two unknown positions. Moreover, we
study some variants of the original problem, and we give two 2-approximation
algorithms for the IECMV and OECMV problems when the number of unknown entries
for each vector is at most a constant.
Comment: 13 pages, 4 figures
MALVA: Genotyping by Mapping-free ALlele Detection of Known VAriants
The amount of genetic variation discovered in human populations is growing rapidly, leading to challenging computational tasks such as variant calling. Standard methods for addressing this problem include read mapping, a computationally expensive procedure; thus, mapping-free tools have been proposed in recent years. These tools focus on isolated, biallelic SNPs, providing limited support for multi-allelic SNPs and short insertions and deletions of nucleotides (indels). Here we introduce MALVA, a mapping-free method to genotype an individual from a sample of reads. MALVA is the first mapping-free tool able to genotype multi-allelic SNPs and indels, even in high-density genomic regions, and to effectively handle a huge number of variants. MALVA requires one order of magnitude less time to genotype a donor than alignment-based pipelines, while providing similar accuracy. Remarkably, on indels, MALVA provides even better results than the most widely adopted variant discovery tools. Biological Sciences; Genetics; Genomics; Bioinformatics
Insights into an unexplored component of the mosquito repeatome: Distribution and variability of viral sequences integrated into the genome of the arboviral vector Aedes albopictus
The Asian tiger mosquito Aedes albopictus is an invasive mosquito and a competent vector for public-health relevant arboviruses such as Chikungunya (Alphavirus), Dengue and Zika (Flavivirus) viruses. Unexpectedly, the sequencing of the genome of this mosquito revealed an unusually high number of integrated sequences with similarities to non-retroviral RNA viruses of the Flavivirus and Rhabdovirus genera. These Non-retroviral Integrated RNA Virus Sequences (NIRVS) are enriched in piRNA clusters and coding sequences and have been proposed to constitute novel mosquito immune factors. However, given the abundance of NIRVS and their variable viral origin, their relative biological roles remain unexplored. Here we used an analytical approach that combines computational, evolutionary and molecular methods to study the genomic landscape of mosquito NIRVS. We demonstrate that NIRVS are differentially distributed across mosquito genomes, with a core set of seemingly the oldest integrations showing similarity to Rhabdoviruses. Additionally, we compare the polymorphisms of NIRVS with those of fast- and slow-evolving genes within the Ae. albopictus genome. Overall, NIRVS appear to be less polymorphic than slow-evolving genes, with differences depending on whether they occur in intergenic regions or in piRNA clusters. Finally, two NIRVS that map within the coding sequences of genes annotated as Rhabdovirus RNA-dependent RNA polymerase and the nucleocapsid-encoding gene, respectively, are highly polymorphic and are expressed, suggesting exaptation, possibly to enhance the mosquito's antiviral responses. These results greatly advance our understanding of the complexity of the mosquito repeatome and the biology of viral integrations in mosquito genomes.
ASPicDB: a database of annotated transcript and protein variants generated by alternative splicing
Alternative splicing is emerging as a major mechanism for the expansion of transcriptome and proteome diversity, particularly in humans and other vertebrates. However, the proportion of alternative transcripts and proteins actually endowed with functional activity is currently highly debated. We present here a new release of ASPicDB, which now provides a unique annotation resource of human protein variants generated by alternative splicing. A total of 256 939 protein variants from 17 191 multi-exon genes have been extensively annotated through state-of-the-art machine learning tools, providing information on the protein type (globular and transmembrane), localization, presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE-tags and polyA signal and/or polyA sites, marking transcription initiation and termination sites, respectively. The retrieval can be carried out at gene, transcript, exon, protein or splice site level, allowing the selection of data sets fulfilling one or more features set by the user. The retrieval interface also enables the selection of protein variants showing specific differences in the annotated features. ASPicDB is available at http://www.caspur.it/ASPicDB/