151,080 research outputs found

    Concept-free Causal Disentanglement with Variational Graph Auto-Encoder

    Full text link
    In disentangled representation learning, the goal is to achieve a compact representation that consists of all interpretable generative factors in the observational data. Learning disentangled representations for graphs becomes increasingly important as graph data rapidly grows. Existing approaches often rely on Variational Auto-Encoder (VAE) or its causal structure learning-based refinement, which suffer from sub-optimality in VAEs due to the independence factor assumption and unavailability of concept labels, respectively. In this paper, we propose an unsupervised solution, dubbed concept-free causal disentanglement, built on a theoretically provable tight upper bound approximating the optimal factor. This results in an SCM-like causal structure modeling that directly learns concept structures from data. Based on this idea, we propose Concept-free Causal VGAE (CCVGAE) by incorporating a novel causal disentanglement layer into Variational Graph Auto-Encoder. Furthermore, we prove concept consistency under our concept-free causal disentanglement framework, hence employing it to enhance the meta-learning framework, called concept-free causal Meta-Graph (CC-Meta-Graph). We conduct extensive experiments to demonstrate the superiority of the proposed models: CCVGAE and CC-Meta-Graph, reaching up to 29%29\% and 11%11\% absolute improvements over baselines in terms of AUC, respectively

    User-Entity Differential Privacy in Learning Natural Language Models

    Full text link
    In this paper, we introduce a novel concept of user-entity differential privacy (UeDP) to provide formal privacy protection simultaneously to both sensitive entities in textual data and data owners in learning natural language models (NLMs). To preserve UeDP, we developed a novel algorithm, called UeDP-Alg, optimizing the trade-off between privacy loss and model utility with a tight sensitivity bound derived from seamlessly combining user and sensitive entity sampling processes. An extensive theoretical analysis and evaluation show that our UeDP-Alg outperforms baseline approaches in model utility under the same privacy budget consumption on several NLM tasks, using benchmark datasets.Comment: Accepted at IEEE BigData 202

    Two new results about quantum exact learning

    Get PDF
    We present two new results about exact learning by quantum computers. First, we show how to exactly learn a kk-Fourier-sparse nn-bit Boolean function from O(k1.5(logk)2)O(k^{1.5}(\log k)^2) uniform quantum examples for that function. This improves over the bound of Θ~(kn)\widetilde{\Theta}(kn) uniformly random classical examples (Haviv and Regev, CCC'15). Our main tool is an improvement of Chang's lemma for the special case of sparse functions. Second, we show that if a concept class C\mathcal{C} can be exactly learned using QQ quantum membership queries, then it can also be learned using O(Q2logQlogC)O\left(\frac{Q^2}{\log Q}\log|\mathcal{C}|\right) classical membership queries. This improves the previous-best simulation result (Servedio and Gortler, SICOMP'04) by a logQ\log Q-factor.Comment: v3: 21 pages. Small corrections and clarification

    Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs

    Full text link
    The study of graph products is a major research topic and typically concerns the term f(GH)f(G*H), e.g., to show that f(GH)=f(G)f(H)f(G*H)=f(G)f(H). In this paper, we study graph products in a non-standard form f(R[GH]f(R[G*H] where RR is a "reduction", a transformation of any graph into an instance of an intended optimization problem. We resolve some open problems as applications. (1) A tight n1ϵn^{1-\epsilon}-approximation hardness for the minimum consistent deterministic finite automaton (DFA) problem, where nn is the sample size. Due to Board and Pitt [Theoretical Computer Science 1992], this implies the hardness of properly learning DFAs assuming NPRPNP\neq RP (the weakest possible assumption). (2) A tight n1/2ϵn^{1/2-\epsilon} hardness for the edge-disjoint paths (EDP) problem on directed acyclic graphs (DAGs), where nn denotes the number of vertices. (3) A tight hardness of packing vertex-disjoint kk-cycles for large kk. (4) An alternative (and perhaps simpler) proof for the hardness of properly learning DNF, CNF and intersection of halfspaces [Alekhnovich et al., FOCS 2004 and J. Comput.Syst.Sci. 2008]

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Get PDF
    Polynomial approximations to boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underly algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on Rn\mathbb{R}^n in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to exp(x0.99)\exp(-|x|^{0.99}). Secondly, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only O(log(1/δ))O(\log(1/\delta))-wise independence, for a tail probability of δ\delta. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science.Comment: 22 page
    corecore