Rough sets approach to symbolic value partition
In data mining, searching for simple representations of knowledge is a very important issue. Attribute reduction, continuous attribute discretization and symbolic value partition are three preprocessing techniques used for this purpose. This paper investigates the symbolic value partition technique, which divides each attribute domain of a data table into a family of disjoint subsets, and constructs a new data table with fewer attributes and smaller attribute domains. Specifically, we investigate the optimal symbolic value partition (OSVP) problem of supervised data, where the optimality metric is defined by the cardinality sum of the new attribute domains. We propose the concept of partition reducts for this problem. An optimal partition reduct is the solution to the OSVP problem. We develop a greedy algorithm to search for a suboptimal partition reduct, and analyze major properties of the proposed algorithm. Empirical studies on various datasets from the UCI library show that our algorithm effectively reduces the size of attribute domains. Furthermore, it assists in computing smaller rule sets with better coverage compared with the attribute reduction approach.
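The idea of a symbolic value partition can be illustrated with a small sketch. It is not the paper's greedy algorithm: it only applies a given partition to a toy decision table and evaluates the cardinality-sum metric mentioned in the abstract. The table layout, the helper names (`apply_partition`, `is_consistent`, `domain_size_sum`) and the requirement that the new table stay consistent (equal condition values imply equal decisions) are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Tuple

# A row is (condition attribute values, decision value). Hypothetical layout.
Row = Tuple[Tuple[Hashable, ...], Hashable]


def apply_partition(rows: List[Row], partition: List[Dict[Hashable, Hashable]]) -> List[Row]:
    """Replace each value of attribute j by the label of the block containing it."""
    return [
        (tuple(partition[j][v] for j, v in enumerate(cond)), dec)
        for cond, dec in rows
    ]


def is_consistent(rows: List[Row]) -> bool:
    """True if rows with identical condition values always share a decision."""
    seen: Dict[Tuple[Hashable, ...], Hashable] = {}
    for cond, dec in rows:
        if seen.setdefault(cond, dec) != dec:
            return False
    return True


def domain_size_sum(rows: List[Row]) -> int:
    """Sum over attributes of the number of distinct values (the metric to minimize)."""
    n_attrs = len(rows[0][0])
    return sum(len({cond[j] for cond, _ in rows}) for j in range(n_attrs))


# Toy supervised table: (colour, size) -> decision.
rows = [
    (("red", "small"), "yes"),
    (("crimson", "small"), "yes"),
    (("blue", "large"), "no"),
    (("navy", "small"), "no"),
]

# One block label per original value, for each attribute.
good = [
    {"red": "warm", "crimson": "warm", "blue": "cool", "navy": "cool"},
    {"small": "any", "large": "any"},
]
# Merging all colours loses the information needed to predict the decision.
bad = [
    {"red": "x", "crimson": "x", "blue": "x", "navy": "x"},
    {"small": "s", "large": "l"},
]

new_rows = apply_partition(rows, good)
print(domain_size_sum(rows), domain_size_sum(new_rows))  # 6 3
print(is_consistent(new_rows), is_consistent(apply_partition(rows, bad)))  # True False
```

In the `good` partition the size attribute collapses to a single block, so it could be dropped entirely, which matches the abstract's remark that the new table can have fewer attributes.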
On the Complexity of t-Closeness Anonymization and Related Problems
An important issue in releasing individual data is to protect the sensitive
information from being leaked and maliciously utilized. Famous privacy
preserving principles that aim to ensure both data privacy and data integrity,
such as k-anonymity and ℓ-diversity, have been extensively studied both
theoretically and empirically. Nonetheless, these widely-adopted principles are
still insufficient to prevent attribute disclosure if the attacker has partial
knowledge about the overall sensitive data distribution. The t-closeness
principle has been proposed to fix this, which also has the benefit of
supporting numerical sensitive attributes. However, in contrast to
k-anonymity and ℓ-diversity, the theoretical aspect of t-closeness has
not been well investigated.
We initiate the first systematic theoretical study on the t-closeness
principle under the commonly-used attribute suppression model. We prove that
for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal
t-closeness generalization of a given table. The proof consists of several
reductions each of which works for different values of t, which together
cover the full range. To complement this negative result, we also provide exact
and fixed-parameter algorithms. Finally, we answer some open questions
regarding the complexity of k-anonymity and ℓ-diversity left in the
literature.
Comment: An extended abstract to appear in DASFAA 201
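To make the t-closeness condition under attribute suppression concrete, here is a minimal checker sketch. Several choices are assumptions and not taken from the abstract: rows are modelled as (quasi-identifier tuple, sensitive value) pairs, suppression replaces whole columns by `*`, and total variation distance stands in for the distance between distributions (the t-closeness literature typically uses Earth Mover's Distance). The function names are hypothetical.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(rows, suppressed, t):
    """Check every equivalence class against the overall sensitive distribution.

    rows: list of (quasi_identifier_tuple, sensitive_value)
    suppressed: set of column indices whose values are replaced by '*'
    """
    overall = distribution([s for _, s in rows])
    classes = defaultdict(list)
    for qi, s in rows:
        key = tuple("*" if j in suppressed else v for j, v in enumerate(qi))
        classes[key].append(s)
    # Small tolerance guards against floating-point rounding.
    return all(
        total_variation(distribution(vals), overall) <= t + 1e-12
        for vals in classes.values()
    )


rows = [
    ((30, "A"), "flu"),
    ((30, "B"), "cold"),
    ((40, "A"), "flu"),
    ((40, "B"), "cold"),
]

print(satisfies_t_closeness(rows, set(), 0.25))  # False: each row is its own class
print(satisfies_t_closeness(rows, {1}, 0.25))    # True: classes mirror the overall mix
print(satisfies_t_closeness(rows, {0}, 0.25))    # False: classes are all-flu or all-cold
```

The optimization problem studied in the paper asks for a cheapest suppression that passes such a check; this sketch only verifies one candidate and says nothing about how to find an optimal one.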
Complexity of scheduling multiprocessor tasks with prespecified processor allocations
We investigate the computational complexity of scheduling multiprocessor tasks with prespecified processor allocations. We consider two criteria: minimizing schedule length and minimizing the sum of the task completion times. In addition, we investigate the complexity of problems when precedence constraints or release dates are involved.
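The two criteria can be made concrete with a small sketch. It assumes each task is given as a (name, duration, set of required processors) triple and uses a simple list-scheduling rule (each task starts as soon as all of its processors are free, in the given order). This rule is only an illustrative heuristic, not an algorithm from the paper, and `list_schedule` is a hypothetical helper name.

```python
def list_schedule(tasks):
    """Schedule tasks in the given order; return {task name: completion time}.

    tasks: list of (name, duration, processors), where `processors` is the
    prespecified set of processors the task occupies simultaneously.
    """
    free_at = {}      # processor -> time at which it becomes free
    completion = {}
    for name, duration, procs in tasks:
        # The task must wait until every processor it needs is free.
        start = max((free_at.get(p, 0) for p in procs), default=0)
        end = start + duration
        for p in procs:
            free_at[p] = end
        completion[name] = end
    return completion


def schedule_length(completion):
    """First criterion: the makespan."""
    return max(completion.values())


def total_completion_time(completion):
    """Second criterion: the sum of task completion times."""
    return sum(completion.values())


a = ("A", 2, {1, 2})
b = ("B", 3, {2, 3})
c = ("C", 1, {1})
d = ("D", 2, {3})

first = list_schedule([a, b, c, d])
second = list_schedule([a, d, c, b])

print(schedule_length(first), total_completion_time(first))    # 7 17
print(schedule_length(second), total_completion_time(second))  # 5 12
```

The same tasks give different values for both criteria depending on the order, which hints at why choosing the best order is a nontrivial optimization problem.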
Subset feedback vertex set is fixed parameter tractable
The classical Feedback Vertex Set problem asks, for a given undirected graph
G and an integer k, to find a set of at most k vertices that hits all the
cycles in the graph G. Feedback Vertex Set has attracted a large amount of
research in the parameterized setting, and subsequent kernelization and
fixed-parameter algorithms have been a rich source of ideas in the field.
In this paper we consider a more general and difficult version of the
problem, named Subset Feedback Vertex Set (SUBSET-FVS in short) where an
instance comes additionally with a set S ⊆ V of vertices, and we ask for a set
of at most k vertices that hits all simple cycles passing through S. Because of
its applications in circuit testing and genetic linkage analysis SUBSET-FVS was
studied from the approximation algorithms perspective by Even et al.
[SICOMP'00, SIDMA'00].
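The problem definition can be illustrated with a small verifier sketch. It only checks whether a candidate set X hits every simple cycle through S; it is not the fixed-parameter algorithm of the paper. The sketch assumes a simple undirected graph given as a dict mapping each vertex to a set of neighbours, and uses the fact that a vertex s lies on a simple cycle exactly when two distinct neighbours of s are connected in the graph without s. The function name is hypothetical.

```python
def hits_all_cycles_through(adj, S, X):
    """Return True if deleting X leaves no simple cycle through a vertex of S.

    adj: dict vertex -> set of neighbours (simple undirected graph, assumed)
    S:   the distinguished vertex subset
    X:   the candidate solution (vertices to delete)
    """
    removed = set(X)
    for s in set(S) - removed:
        blocked = removed | {s}
        seen = set()  # vertices reached from earlier neighbours of s in G - X - s
        for nb in adj[s]:
            if nb in blocked:
                continue
            if nb in seen:
                # nb is connected to an earlier neighbour without using s,
                # so a simple cycle passes through s.
                return False
            # Depth-first search from nb, avoiding deleted vertices and s.
            stack = [nb]
            seen.add(nb)
            while stack:
                u = stack.pop()
                for w in adj[u]:
                    if w not in blocked and w not in seen:
                        seen.add(w)
                        stack.append(w)
    return True


# Two triangles a-b-c and c-d-e sharing the vertex c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

print(hits_all_cycles_through(adj, {"a"}, {"b"}))       # True
print(hits_all_cycles_through(adj, {"a"}, {"d"}))       # False
print(hits_all_cycles_through(adj, set(adj), {"c"}))    # True
```

Taking S to be the whole vertex set recovers a check for the classical Feedback Vertex Set problem, as in the last call.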
The question whether the SUBSET-FVS problem is fixed-parameter tractable was
posed independently by Kawarabayashi and Saurabh in 2009. We answer this
question affirmatively. We begin by showing that this problem is
fixed-parameter tractable when parametrized by |S|. Next we present an
algorithm which reduces the given instance to 2^k n^O(1) instances with the
size of S bounded by O(k^3), using kernelization techniques such as the
2-Expansion Lemma, Menger's theorem and Gallai's theorem. These two facts allow
us to give a 2^O(k log k) n^O(1) time algorithm solving the Subset Feedback
Vertex Set problem, proving that it is indeed fixed-parameter tractable.
Comment: full version of a paper presented at ICALP'1
On vanishing of Kronecker coefficients
We show that the problem of deciding positivity of Kronecker coefficients is
NP-hard. Previously, this problem was conjectured to be in P, just as for the
Littlewood-Richardson coefficients. Our result establishes in a formal way that
Kronecker coefficients are more difficult than Littlewood-Richardson
coefficients, unless P=NP.
We also show that there exists a #P-formula for a particular subclass of
Kronecker coefficients whose positivity is NP-hard to decide. This is
evidence that, despite the hardness of the positivity problem, there may well
exist a positive combinatorial formula for the Kronecker coefficients. Finding
such a formula is a major open problem in representation theory and algebraic
combinatorics.
Finally, we consider the existence of partition triples $(\lambda, \mu, \pi)$ such that the Kronecker coefficient $k^{\lambda}_{\mu,\pi} = 0$ but the
Kronecker coefficient $k^{l\lambda}_{l\mu,l\pi} > 0$ for some integer
$l > 1$. Such "holes" are of great interest as they witness the failure of the
saturation property for the Kronecker coefficients, which is still poorly
understood. Using insight from computational complexity theory, we turn our
hardness proof into a positive result: We show that not only do there exist
many such triples, but they can also be found efficiently. Specifically, we
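show that, for any $0 < \epsilon \le 1$, there exists $0 < a < 1$ such that, for all
$m$, there exist $\Omega(2^{m^a})$ partition triples $(\lambda, \mu, \mu)$ in the
Kronecker cone such that: (a) the Kronecker coefficient $k^{\lambda}_{\mu,\mu}$
is zero, (b) the height of $\mu$ is $m$, (c) the height of $\lambda$ is $\le m^{\epsilon}$, and (d) $|\lambda| = |\mu| \le m^3$. The proof of the last result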
illustrates the effectiveness of the explicit proof strategy of GCT.
Comment: 43 pages, 1 figure