
    Rough sets approach to symbolic value partition

    In data mining, searching for simple representations of knowledge is a very important issue. Attribute reduction, continuous attribute discretization and symbolic value partition are three preprocessing techniques used for this purpose. This paper investigates the symbolic value partition technique, which divides each attribute domain of a data table into a family of disjoint subsets, and constructs a new data table with fewer attributes and smaller attribute domains. Specifically, we investigate the optimal symbolic value partition (OSVP) problem of supervised data, where the optimality metric is defined by the cardinality sum of the new attribute domains. We propose the concept of partition reducts for this problem. An optimal partition reduct is the solution to the OSVP problem. We develop a greedy algorithm to search for a suboptimal partition reduct, and analyze major properties of the proposed algorithm. Empirical studies on various datasets from the UCI library show that our algorithm effectively reduces the size of attribute domains. Furthermore, it assists in computing smaller rule sets with better coverage compared with the attribute reduction approach.
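The abstract does not spell out the greedy algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: start from the finest partition of every attribute domain, then repeatedly merge two value blocks whenever the merged table stays consistent (no two rows with identical condition values but different decisions). The function names, the merge order, and the consistency criterion are illustrative choices, not the authors' method, and the input table is assumed to be consistent to begin with.

```python
from itertools import combinations


def is_consistent(rows, decisions):
    """True if no two rows share all condition values but differ in decision."""
    seen = {}
    for row, d in zip(rows, decisions):
        if seen.setdefault(tuple(row), d) != d:
            return False
    return True


def greedy_value_partition(rows, decisions):
    """Greedily merge symbolic values per attribute while keeping consistency.

    Returns one dict per attribute, mapping each original value to the label
    of the block it ends up in (hypothetical representation).
    """
    n_attrs = len(rows[0])
    # Finest partition: every value starts in its own block.
    mapping = [{v: v for v in {r[a] for r in rows}} for a in range(n_attrs)]

    def apply(m):
        # Rewrite the table using the block labels in m.
        return [[m[a][r[a]] for a in range(n_attrs)] for r in rows]

    changed = True
    while changed:
        changed = False
        for a in range(n_attrs):
            blocks = sorted(set(mapping[a].values()), key=str)
            for b1, b2 in combinations(blocks, 2):
                # Tentatively merge block b2 into block b1.
                trial = [dict(m) for m in mapping]
                for v, b in list(trial[a].items()):
                    if b == b2:
                        trial[a][v] = b1
                if is_consistent(apply(trial), decisions):
                    mapping = trial
                    changed = True
                    break
            if changed:
                break
    return mapping
```

The cardinality sum mentioned in the abstract corresponds here to `sum(len(set(m.values())) for m in mapping)`; an attribute whose domain collapses to a single block carries no information and could be dropped from the new table.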

    On the Complexity of $t$-Closeness Anonymization and Related Problems

    An important issue in releasing individual data is to protect sensitive information from being leaked and maliciously utilized. Well-known privacy-preserving principles that aim to ensure both data privacy and data integrity, such as $k$-anonymity and $l$-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The $t$-closeness principle was proposed to fix this, and it also has the benefit of supporting numerical sensitive attributes. However, in contrast to $k$-anonymity and $l$-diversity, the theoretical aspect of $t$-closeness has not been well investigated. We initiate the first systematic theoretical study of the $t$-closeness principle under the commonly used attribute suppression model. We prove that for every constant $t$ such that $0 \le t < 1$, it is NP-hard to find an optimal $t$-closeness generalization of a given table. The proof consists of several reductions, each of which works for different values of $t$, which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of $k$-anonymity and $l$-diversity left in the literature. Comment: An extended abstract to appear in DASFAA 201
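To make the $t$-closeness condition under attribute suppression concrete, here is a small sketch that only checks whether a given set of suppressed columns yields a $t$-close table; it does not search for an optimal suppression, which is the NP-hard part. It assumes a categorical sensitive attribute and uses the variational distance between distributions, which is one common choice of distance and not necessarily the one used in the paper. All function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical distribution of a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(rows, sensitive, suppressed, t):
    """Check t-closeness after suppressing the given quasi-identifier columns.

    rows:       list of tuples of quasi-identifier values
    sensitive:  list of sensitive values, parallel to rows
    suppressed: set of column indices that are hidden in the release
    """
    overall = distribution(sensitive)
    # Rows that agree on every unsuppressed column form one equivalence class.
    classes = defaultdict(list)
    for row, s in zip(rows, sensitive):
        key = tuple(v for i, v in enumerate(row) if i not in suppressed)
        classes[key].append(s)
    # Every class must have a sensitive distribution within t of the overall one.
    return all(
        variational_distance(distribution(vals), overall) <= t
        for vals in classes.values()
    )
```

Suppressing every column always passes the check, because the single resulting class has exactly the overall distribution; the optimization problem is to suppress as few attributes as possible while keeping the check true.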

    Complexity of scheduling multiprocessor tasks with prespecified processor allocations

    We investigate the computational complexity of scheduling multiprocessor tasks with prespecified processor allocations. We consider two criteria: minimizing schedule length and minimizing the sum of the task completion times. In addition, we investigate the complexity of problems when precedence constraints or release dates are involved.

    Subset feedback vertex set is fixed parameter tractable

    The classical Feedback Vertex Set problem asks, for a given undirected graph G and an integer k, to find a set of at most k vertices that hits all the cycles in the graph G. Feedback Vertex Set has attracted a large amount of research in the parameterized setting, and subsequent kernelization and fixed-parameter algorithms have been a rich source of ideas in the field. In this paper we consider a more general and difficult version of the problem, named Subset Feedback Vertex Set (SUBSET-FVS for short), where an instance comes additionally with a set S ⊆ V of vertices, and we ask for a set of at most k vertices that hits all simple cycles passing through S. Because of its applications in circuit testing and genetic linkage analysis, SUBSET-FVS was studied from the approximation algorithms perspective by Even et al. [SICOMP'00, SIDMA'00]. The question whether the SUBSET-FVS problem is fixed-parameter tractable was posed independently by Kawarabayashi and Saurabh in 2009. We answer this question affirmatively. We begin by showing that this problem is fixed-parameter tractable when parameterized by |S|. Next we present an algorithm which reduces the given instance to 2^k n^O(1) instances with the size of S bounded by O(k^3), using kernelization techniques such as the 2-Expansion Lemma, Menger's theorem and Gallai's theorem. These two facts allow us to give a 2^O(k log k) n^O(1) time algorithm solving the Subset Feedback Vertex Set problem, proving that it is indeed fixed-parameter tractable. Comment: full version of a paper presented at ICALP'1
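The problem definition above can be illustrated with a small verifier. This is a sketch that only checks a candidate solution; it is not the fixed-parameter algorithm from the paper. It assumes a simple undirected graph given as an edge list (self-loops are ignored and parallel edges collapse), and relies on the observation that a vertex s lies on a simple cycle exactly when two of its neighbours remain connected after s is deleted. The function name and argument layout are hypothetical.

```python
from collections import defaultdict


def is_subset_fvs(edges, terminals, solution):
    """Check that deleting `solution` leaves no simple cycle through a terminal."""
    removed = set(solution)
    adj = defaultdict(set)
    for u, v in edges:
        if u != v and u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)

    for s in set(terminals) - removed:
        # Label the components of the graph with s deleted, starting a
        # search from each neighbour of s in turn.
        component = {}
        for start in adj[s]:
            if start in component:
                # Already reached from an earlier neighbour of s, so the two
                # neighbours plus s close a simple cycle through s.
                return False
            component[start] = start
            stack = [start]
            while stack:
                x = stack.pop()
                for y in adj[x]:
                    if y != s and y not in component:
                        component[y] = start
                        stack.append(y)
    return True
```

With `terminals` equal to the full vertex set, the same check verifies an ordinary feedback vertex set, which matches the abstract's remark that SUBSET-FVS generalizes the classical problem.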

    On vanishing of Kronecker coefficients

    We show that the problem of deciding positivity of Kronecker coefficients is NP-hard. Previously, this problem was conjectured to be in P, just as for the Littlewood-Richardson coefficients. Our result establishes in a formal way that Kronecker coefficients are more difficult than Littlewood-Richardson coefficients, unless P=NP. We also show that there exists a #P-formula for a particular subclass of Kronecker coefficients whose positivity is NP-hard to decide. This is evidence that, despite the hardness of the positivity problem, there may well exist a positive combinatorial formula for the Kronecker coefficients. Finding such a formula is a major open problem in representation theory and algebraic combinatorics. Finally, we consider the existence of partition triples $(\lambda, \mu, \pi)$ such that the Kronecker coefficient $k^\lambda_{\mu, \pi} = 0$ but the Kronecker coefficient $k^{l \lambda}_{l \mu, l \pi} > 0$ for some integer $l > 1$. Such "holes" are of great interest as they witness the failure of the saturation property for the Kronecker coefficients, which is still poorly understood. Using insight from computational complexity theory, we turn our hardness proof into a positive result: we show that not only do there exist many such triples, but they can also be found efficiently. Specifically, we show that, for any $0 < \epsilon \le 1$, there exists $0 < a < 1$ such that, for all $m$, there exist $\Omega(2^{m^a})$ partition triples $(\lambda, \mu, \mu)$ in the Kronecker cone such that: (a) the Kronecker coefficient $k^\lambda_{\mu,\mu}$ is zero, (b) the height of $\mu$ is $m$, (c) the height of $\lambda$ is $\le m^\epsilon$, and (d) $|\lambda| = |\mu| \le m^3$. The proof of the last result illustrates the effectiveness of the explicit proof strategy of GCT. Comment: 43 pages, 1 figure
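For readers unfamiliar with the notation, the standard definition of the Kronecker coefficient can be written as follows. This is the textbook definition rather than anything taken from the paper itself; here $\chi^\lambda$ denotes the irreducible character of the symmetric group $S_n$ indexed by a partition $\lambda$ of $n$, a symbol the abstract does not use.

```latex
\chi^{\mu} \otimes \chi^{\pi} \;=\; \sum_{\lambda \vdash n} k^{\lambda}_{\mu,\pi}\, \chi^{\lambda},
\qquad
k^{\lambda}_{\mu,\pi} \;=\; \frac{1}{n!} \sum_{\sigma \in S_n}
  \chi^{\lambda}(\sigma)\, \chi^{\mu}(\sigma)\, \chi^{\pi}(\sigma).
```

In these terms, a "hole" is a triple with $k^{\lambda}_{\mu,\pi} = 0$ whose stretched version $k^{l\lambda}_{l\mu,l\pi}$ becomes positive for some integer $l > 1$.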