LIPIcs, Volume 251, ITCS 2023, Complete Volume
Polynomial Identity Testing and the Ideal Proof System: PIT is in NP if and only if IPS can be p-simulated by a Cook-Reckhow proof system
The Ideal Proof System (IPS) of Grochow & Pitassi (FOCS 2014, J. ACM, 2018)
is an algebraic proof system that uses algebraic circuits to refute the
solvability of unsatisfiable systems of polynomial equations. One potential
drawback of IPS is that verifying an IPS proof is only known to be doable using
Polynomial Identity Testing (PIT), which is solvable by a randomized algorithm,
but whose derandomization, even into NSUBEXP, is equivalent to strong lower
bounds. However, the circuits that are used in IPS proofs are not arbitrary,
and it is conceivable that one could get around general PIT by leveraging some
structure in these circuits. This proposal may be even more tempting when IPS
is used as a proof system for Boolean Unsatisfiability, where the equations
themselves have additional structure.
Our main result is that, on the contrary, one cannot get around PIT as above:
we show that IPS, even as a proof system for Boolean Unsatisfiability, can be
p-simulated by a deterministically verifiable (Cook-Reckhow) proof system if
and only if PIT is in NP. We use our main result to propose a potentially new
approach to derandomizing PIT into NP.
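The PIT bottleneck the abstract describes is concrete: verifying an IPS refutation amounts to checking that an algebraic circuit computes the identically zero polynomial, which is easy with randomness by the Schwartz-Zippel lemma (a nonzero polynomial of degree d vanishes at a uniformly random point of a field of size p with probability at most d/p). Below is a minimal sketch of that randomized verifier core; it is our illustration, not code from the paper, and the circuit format and choice of prime are arbitrary assumptions.

    import random

    # A toy arithmetic circuit: each gate is ('input', i), ('const', c),
    # ('add', g1, g2) or ('mul', g1, g2); the last gate is the output.
    def eval_circuit(gates, point, p):
        vals = []
        for g in gates:
            if g[0] == 'input':
                vals.append(point[g[1]] % p)
            elif g[0] == 'const':
                vals.append(g[1] % p)
            elif g[0] == 'add':
                vals.append((vals[g[1]] + vals[g[2]]) % p)
            else:  # 'mul'
                vals.append((vals[g[1]] * vals[g[2]]) % p)
        return vals[-1]

    def probabilistic_pit(gates, n_vars, trials=20, p=2**61 - 1):
        """Accept iff the circuit is plausibly the zero polynomial.
        By Schwartz-Zippel, a nonzero circuit of degree d survives one
        random evaluation with probability <= d/p, so the error is
        one-sided and tiny."""
        for _ in range(trials):
            point = [random.randrange(p) for _ in range(n_vars)]
            if eval_circuit(gates, point, p) != 0:
                return False  # witness of nonzeroness found
        return True

    # Sanity check: (x0 + x1)^2 - (x0^2 + 2*x0*x1 + x1^2) is identically zero.
    g = [('input', 0), ('input', 1), ('add', 0, 1), ('mul', 2, 2),
         ('mul', 0, 0), ('mul', 1, 1), ('const', 2), ('mul', 6, 0),
         ('mul', 7, 1), ('add', 4, 5), ('add', 9, 8), ('const', -1),
         ('mul', 11, 10), ('add', 3, 12)]
    print(probabilistic_pit(g, 2))  # True (with overwhelming probability)

Derandomizing this check, i.e. replacing the random points by a deterministic procedure, is exactly what the paper shows to be necessary and sufficient for IPS to be p-simulated by a Cook-Reckhow proof system.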
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
Diffusion models have exhibited excellent performance in various domains. The
probability flow ordinary differential equation (ODE) of diffusion models
(i.e., diffusion ODEs) is a particular case of continuous normalizing flows
(CNFs), which enables deterministic inference and exact likelihood evaluation.
However, the likelihood estimation results by diffusion ODEs are still far from
those of the state-of-the-art likelihood-based generative models. In this work,
we propose several improved techniques for maximum likelihood estimation for
diffusion ODEs, including both training and evaluation perspectives. For
training, we propose velocity parameterization and explore variance reduction
techniques for faster convergence. We also derive an error-bounded high-order
flow matching objective for finetuning, which improves the ODE likelihood and
smooths its trajectory. For evaluation, we propose a novel training-free
truncated-normal dequantization to fill the training-evaluation gap commonly
existing in diffusion ODEs. Building upon these techniques, we achieve
state-of-the-art likelihood estimation results on image datasets (2.56 on
CIFAR-10, 3.43/3.69 on ImageNet-32) without variational dequantization or data
augmentation.
Comment: Accepted at ICML 2023.
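The exact-likelihood property this abstract builds on is the continuous change-of-variables formula for flows: along the probability-flow ODE, d log p(x(t))/dt = -div v(x(t), t). The sketch below shows that computation with forward Euler and a Hutchinson trace probe; it is a minimal illustration assuming a learned drift named velocity(x, t) (our name, not the paper's), and real evaluations would use adaptive solvers plus the paper's dequantization.

    import math
    import torch

    def ode_log_likelihood(x, velocity, t0=0.0, t1=1.0, steps=100):
        """Exact-likelihood sketch for a probability-flow ODE: integrate
        d log p / dt = -div(velocity) alongside x(t) with forward Euler,
        estimating the divergence with one Hutchinson probe per step."""
        div_int = torch.zeros(x.shape[0])
        dt = (t1 - t0) / steps
        for i in range(steps):
            t = t0 + i * dt
            x = x.detach().requires_grad_(True)
            v = velocity(x, t)
            eps = torch.randn_like(x)
            # vector-Jacobian product J^T eps; then eps^T (J^T eps)
            # estimates trace(J), i.e. the divergence
            vjp = torch.autograd.grad((v * eps).sum(), x)[0]
            div_int = div_int + dt * (vjp * eps).flatten(1).sum(dim=1)
            x = (x + dt * v).detach()
        d = x[0].numel()
        log_prior = -0.5 * (x ** 2).flatten(1).sum(dim=1) \
                    - 0.5 * d * math.log(2 * math.pi)  # standard normal prior
        # log p_data(x(t0)) = log p_prior(x(t1)) + integral of divergence
        return log_prior + div_int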
Sharp Bounds for Generalized Causal Sensitivity Analysis
Causal inference from observational data is crucial for many disciplines such
as medicine and economics. However, sharp bounds for causal effects under
relaxations of the unconfoundedness assumption (causal sensitivity analysis)
are subject to ongoing research. So far, works with sharp bounds are restricted
to fairly simple settings (e.g., a single binary treatment). In this paper, we
propose a unified framework for causal sensitivity analysis under unobserved
confounding in various settings. For this, we propose a flexible generalization
of the marginal sensitivity model (MSM) and then derive sharp bounds for a
large class of causal effects. This includes (conditional) average treatment
effects, effects for mediation analysis and path analysis, and distributional
effects. Furthermore, our sensitivity model is applicable to discrete,
continuous, and time-varying treatments. It allows us to interpret the partial
identification problem under unobserved confounding as a distribution shift in
the latent confounders while evaluating the causal effect of interest. In the
special case of a single binary treatment, our bounds for (conditional) average
treatment effects coincide with recent optimality results for causal
sensitivity analysis. Finally, we propose a scalable algorithm to estimate our
sharp bounds from observational data.
Comment: Accepted at NeurIPS 2023.
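For orientation, in the single-binary-treatment special case the MSM says the true inverse-propensity weights lie within a factor gamma of the nominal ones, and the sharp upper bound on a mean potential outcome is attained by pushing weights up on large outcomes and down on small ones. The sketch below is a simplified, population-level illustration of that weight-perturbation view (our code and names, not the paper's estimator, which handles the general settings above); the threshold scan is exact for this fractional objective.

    import numpy as np

    def msm_upper_bound(y, e_hat, gamma):
        """Upper bound on E[Y(1)] from treated units under an MSM-style
        restriction: true IPW weights within a factor `gamma` of 1/e_hat.
        Maximizes sum(w*y)/sum(w) with w_i in [lo_i, hi_i]; the optimum
        gives the high weight to the largest outcomes, so scanning the
        sorted thresholds finds it exactly."""
        w = 1.0 / e_hat
        lo, hi = w / gamma, w * gamma
        order = np.argsort(-y)
        y, lo, hi = y[order], lo[order], hi[order]
        best = -np.inf
        for k in range(len(y) + 1):  # top k outcomes get the high weight
            wk = np.concatenate([hi[:k], lo[k:]])
            best = max(best, float(wk @ y / wk.sum()))
        return best

    # gamma = 1 recovers the plain Hajek IPW estimate; larger gamma widens
    # the bound, reflecting more possible unobserved confounding.
    rng = np.random.default_rng(0)
    y = rng.normal(size=200); e = rng.uniform(0.2, 0.8, size=200)
    print(msm_upper_bound(y, e, 1.0), msm_upper_bound(y, e, 2.0))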
Doubly Robust Proximal Causal Learning for Continuous Treatments
Proximal causal learning is a promising framework for identifying the causal
effect under the existence of unmeasured confounders. Within this framework,
the doubly robust (DR) estimator was derived and has shown its effectiveness in
estimation, especially when the model assumption is violated. However, the
current form of the DR estimator is restricted to binary treatments, while the
treatment can be continuous in many real-world applications. The primary
obstacle to continuous treatments resides in the delta function present in the
original DR estimator, making it infeasible in causal effect estimation and
introducing a heavy computational burden in nuisance function estimation. To
address these challenges, we propose a kernel-based DR estimator that can well
handle continuous treatments. Equipped with its smoothness, we show that its
oracle form is a consistent approximation of the influence function. Further,
we propose a new approach to efficiently solve the nuisance functions. We then
provide a comprehensive convergence analysis in terms of the mean square error.
We demonstrate the utility of our estimator on synthetic datasets and
real-world applications.
Comment: Preprint, under review.
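To make the delta-function obstacle concrete: the binary DR estimator contains an indicator 1{A = a}, which for a continuous treatment degenerates to a Dirac delta that no sample ever hits; the kernel-based fix replaces it with a smooth bump of bandwidth h. The sketch below illustrates only that smoothing step in the simpler no-proxies setting, with hypothetical user-supplied nuisances mu(x, a) (outcome regression) and pi(a, x) (conditional treatment density); the paper's actual estimator is the proximal version with bridge functions.

    import numpy as np

    def gaussian_kernel(u, h):
        return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2 * np.pi))

    def kernel_dr_adrf(a0, A, X, Y, mu, pi, h=0.2):
        """DR-style estimate of E[Y(a0)] for a continuous treatment:
        the delta at A = a0 is smoothed by a kernel of bandwidth h,
        so nearby treatment values contribute, weighted by closeness."""
        k = gaussian_kernel(A - a0, h)
        direct = np.mean([mu(x, a0) for x in X])          # plug-in term
        resid = k / np.array([pi(a, x) for a, x in zip(A, X)]) * \
                (Y - np.array([mu(x, a) for a, x in zip(A, X)]))
        return direct + resid.mean()                      # bias correction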
McFIL: Model Counting Functionality-Inherent Leakage
Protecting the confidentiality of private data and using it for useful
collaboration have long been at odds. Modern cryptography is bridging this gap
through rapid growth in secure protocols such as multi-party computation,
fully-homomorphic encryption, and zero-knowledge proofs. However, even with
provable indistinguishability or zero-knowledgeness, confidentiality loss from
leakage inherent to the functionality may partially or even completely
compromise secret values without ever falsifying proofs of security. In this
work, we describe McFIL, an algorithmic approach and accompanying software
implementation which automatically quantifies intrinsic leakage for a given
functionality. Extending and generalizing the Chosen-Ciphertext attack
framework of Beck et al. with a practical heuristic, our approach not only
quantifies but maximizes functionality-inherent leakage using Maximum Model
Counting within a SAT solver. As a result, McFIL automatically derives
approximately-optimal adversary inputs that, when used in secure protocols,
maximize information leakage of private values.
Comment: To appear in USENIX Security 2023.
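The core intuition is that an adversary's input to an ideal functionality partitions the space of possible secrets by output, and a well-chosen input makes that partition as informative as possible. The brute-force toy below (an entropy-based analogue of what McFIL approximates scalably with Maximum Model Counting inside a SAT solver; it is our illustration, not McFIL's algorithm) picks the adversary input with the most balanced output partition.

    from math import log2

    def best_adversary_input(f, secrets, adv_inputs):
        """Pick the adversary input whose output partitions the secret
        space most evenly, i.e. maximizes the expected bits leaked about
        the secret from one run of the (ideal) functionality f."""
        secrets = list(secrets)

        def info_gain(a):
            classes = {}
            for s in secrets:
                classes[f(s, a)] = classes.get(f(s, a), 0) + 1
            n = len(secrets)
            # entropy of the induced partition = expected bits leaked
            return -sum(c / n * log2(c / n) for c in classes.values())

        return max(adv_inputs, key=info_gain)

    # Yao's millionaires' functionality leaks a comparison bit even when
    # computed securely; the best single query is a binary-search midpoint.
    print(best_adversary_input(lambda s, a: s <= a, range(16), range(16)))  # 7

Iterating this choice, conditioning on the responses seen so far, is exactly the adaptive attack pattern that can erode a secret across many protocol runs without any proof of security being falsified.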
Generalising weighted model counting
Given a formula in propositional or (finite-domain) first-order logic and some non-negative weights, weighted model counting (WMC) is a function problem that asks to compute the sum of the weights of the models of the formula. Originally used as a flexible way of performing probabilistic inference on graphical models, WMC has found many applications across artificial intelligence (AI), machine learning, and other domains. Areas of AI that rely on WMC include explainable AI, neural-symbolic AI, probabilistic programming, and statistical relational AI. WMC also has applications in bioinformatics, data mining, natural language processing, prognostics, and robotics.
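The standard literal-weighted definition that this paragraph generalises fits in a few lines: each literal gets a non-negative weight, a model's weight is the product of the weights of the literals it satisfies, and WMC sums this over all models. A brute-force sketch (exponential, for illustration only; practical counters use search with caching or compilation):

    from itertools import product

    def wmc(clauses, weights, n_vars):
        """Literal-weighted model count of a CNF.
        clauses: list of lists of non-zero ints (DIMACS-style literals).
        weights: dict literal -> non-negative weight."""
        total = 0.0
        for bits in product([False, True], repeat=n_vars):
            model = {i + 1: b for i, b in enumerate(bits)}
            if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
                w = 1.0
                for v, b in model.items():
                    w *= weights[v if b else -v]
                total += w
        return total

    # Pr[x1 or x2] for independent Pr[x1] = 0.3, Pr[x2] = 0.6:
    weights = {1: 0.3, -1: 0.7, 2: 0.6, -2: 0.4}
    print(wmc([[1, 2]], weights, 2))  # 0.72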
In this work, we are interested in revisiting the foundations of WMC and considering generalisations of some of the key definitions in the interest of conceptual clarity and practical efficiency. We begin by developing a measure-theoretic perspective on WMC, which suggests a new and more general way of defining the weights of an instance. This new representation can be as succinct as standard WMC but can also expand as needed to represent less-structured probability distributions. We demonstrate the performance benefits of the new format by developing a novel WMC encoding for Bayesian networks. We then show how existing WMC encodings for Bayesian networks can be transformed into this more general format and what conditions ensure that the transformation is correct (i.e., preserves the answer). Combining the strengths of the more flexible representation with the tricks used in existing encodings yields further efficiency improvements in Bayesian network probabilistic inference.
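As a concrete instance of the Bayesian-network-to-WMC transformation discussed above, here is a classical Chavira-Darwiche-style encoding of a two-node network A -> B, shown for orientation only (the thesis proposes a more general weight format than this). It reuses the wmc helper from the sketch above; parameter variables carry the CPT entries and clauses tie them to the indicator variables.

    # Network A -> B with P(A)=0.2, P(B|A)=0.9, P(B|~A)=0.3.
    # Variables: 1 = A indicator, 2 = B indicator,
    #            3 = parameter P(B|A), 4 = parameter P(B|~A).
    weights = {1: 0.2, -1: 0.8, 2: 1.0, -2: 1.0,
               3: 0.9, -3: 0.1, 4: 0.3, -4: 0.7}
    # Clauses encode B <-> (A ? param3 : param4).
    bn = [[-1, -2, 3], [-1, 2, -3], [1, -2, 4], [1, 2, -4]]
    print(wmc(bn, weights, 4))          # 1.0  (total probability mass)
    print(wmc(bn + [[2]], weights, 4))  # P(B=1) = 0.2*0.9 + 0.8*0.3 = 0.42

A correct transformation must preserve exactly this property: adding evidence as unit clauses and counting yields the network's marginals.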
Next, we turn our attention to the first-order setting. Here, we argue that the capabilities of practical model counting algorithms are severely limited by their inability to perform arbitrary recursive computations. To enable arbitrary recursion, we relax the restrictions that typically accompany domain recursion and generalise circuits (used to express a solution to a model counting problem) to graphs that are allowed to have cycles. These improvements enable us to find efficient solutions to counting fundamental structures such as injections and bijections that were previously unsolvable by any available algorithm.
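The injections and bijections mentioned above have exactly the kind of recursive structure that acyclic circuits cannot express but a cyclic-graph (recursive) evaluation can: the number of injections from an m-element set into an n-element set satisfies I(m, n) = n * I(m - 1, n - 1) with I(0, n) = 1. A direct memoized rendering of that recurrence (ours, for illustration):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def injections(m, n):
        """Number of injections from an m-element set into an n-element
        set: the first element picks any of n targets, the rest must
        inject into the remaining n - 1."""
        if m == 0:
            return 1
        if n == 0:
            return 0
        return n * injections(m - 1, n - 1)

    assert injections(3, 5) == 5 * 4 * 3  # falling factorial
    assert injections(4, 4) == 24         # bijections = 4!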
The second strand of this work is concerned with synthetic data generation. Testing algorithms across a wide range of problem instances is crucial to ensure the validity of any claim about one algorithm’s superiority over another. However, benchmarks are often limited and fail to reveal differences among the algorithms. First, we show how random instances of probabilistic logic programs (that typically use WMC algorithms for inference) can be generated using constraint programming. We also introduce a new constraint to control the independence structure of the underlying probability distribution and provide a combinatorial argument for the correctness of the constraint model. This model allows us to, for the first time, experimentally investigate inference algorithms on more than just a handful of instances. Second, we introduce a random model for WMC instances with a parameter that influences primal treewidth, the parameter most commonly used to characterise the difficulty of an instance. We show that the easy-hard-easy pattern with respect to clause density is different for algorithms based on dynamic programming and algebraic decision diagrams than for all other solvers. We also demonstrate that all WMC algorithms scale exponentially with respect to primal treewidth, although at differing rates.
Locally Covert Learning
The goal of a covert learning algorithm is to learn a function f by querying it, while ensuring that an adversary, who sees all queries and their responses, is unable to (efficiently) learn any more about f than they could learn from random input-output pairs. We focus on a relaxation that we call local covertness, in which queries are distributed across k servers and we only limit what is learnable by t colluding servers.
For any constant t, we give a locally covert algorithm for efficiently learning any Fourier-sparse function (technically, our notion of learning is improper, agnostic, and with respect to the uniform distribution). Our result holds unconditionally and for computationally unbounded adversaries. Prior to our work, such an algorithm was known only for the special case of O(log n)-juntas, and only with k = 2 servers (Ishai et al., Crypto 2019).
Our main technical observation is that the original Goldreich-Levin algorithm only utilizes i.i.d. pairs of correlated queries, where each half of every pair is uniformly random. We give a simple generalization of this algorithm in which pairs are replaced by k-tuples in which any t components are jointly uniform. The cost of this generalization is that the number of queries needed grows exponentially with t.
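A small demonstration of the pair-query structure the abstract refers to: for a Boolean function f with Fourier coefficients fhat(S), the identity E[f(x) f(x') chi_S(x XOR x')] = fhat(S)^2 holds for independent uniform x, x', so each half of every query pair is marginally uniform, which is what a covertness simulator can imitate with random input-output pairs. A toy estimator built on that identity (our illustration, not the paper's code):

    import random

    def chi(S, x):
        """Parity character chi_S(x) = (-1)^(sum of x_i for i in S)."""
        return -1 if sum(x[i] for i in S) % 2 else 1

    def est_sq_fourier(f, n, S, samples=20000):
        """Estimate fhat(S)^2 from i.i.d. pairs of queries (x, x'), each
        marginally uniform. The paper generalizes pairs to k-tuples with
        any t coordinates jointly uniform, at a query cost exponential
        in t."""
        acc = 0
        for _ in range(samples):
            x = [random.randint(0, 1) for _ in range(n)]
            xp = [random.randint(0, 1) for _ in range(n)]
            z = [a ^ b for a, b in zip(x, xp)]
            acc += f(x) * f(xp) * chi(S, z)
        return acc / samples

    # f = parity on {0, 2}: fhat({0, 2}) = 1, all other coefficients 0.
    f = lambda x: chi({0, 2}, x)
    print(round(est_sq_fourier(f, 4, {0, 2}), 2))  # ~1.0
    print(round(est_sq_fourier(f, 4, {1}), 2))     # ~0.0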
Bounded-depth Frege complexity of Tseitin formulas for all graphs
We prove that there is a constant K such that Tseitin formulas for a connected graph G require proofs of size 2^(tw(G)^Ω(1/d)) in depth-d Frege systems for d ≤ K·log n / log log n, where tw(G) is the treewidth of G and n is the number of vertices. This extends Håstad's recent lower bound from grid graphs to any graph. Furthermore, we prove tightness of our bound up to a multiplicative constant in the top exponent. Namely, we show that if a Tseitin formula for a graph G has size s, then for all large enough d, it has a depth-d Frege proof of size 2^(tw(G)^O(1/d)) · poly(s). Through this result we settle the question posed by M. Alekhnovich and A. Razborov of showing that the class of Tseitin formulas is quasi-automatizable for resolution.
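For orientation, the Tseitin formula of a graph assigns a variable to each edge and, for each vertex, a parity constraint saying the XOR of its incident edge variables equals the vertex's charge; the formula is unsatisfiable exactly when some connected component has odd total charge. A generator sketch (ours; it uses the standard exponential expansion of each parity constraint into clauses, which is only sensible for bounded-degree graphs):

    from itertools import product

    def tseitin_cnf(edges, charge):
        """Tseitin formula of a graph: one variable per edge (1-indexed),
        one parity constraint per vertex v: XOR of incident edges =
        charge.get(v, 0)."""
        var = {e: i + 1 for i, e in enumerate(edges)}
        incident = {}
        for (u, w) in edges:
            incident.setdefault(u, []).append(var[(u, w)])
            incident.setdefault(w, []).append(var[(u, w)])
        clauses = []
        for v, evars in incident.items():
            want = charge.get(v, 0)
            # Forbid every assignment with the wrong parity:
            # 2^(degree - 1) clauses per vertex.
            for bits in product([0, 1], repeat=len(evars)):
                if sum(bits) % 2 != want:
                    clauses.append([-x if b else x
                                    for x, b in zip(evars, bits)])
        return clauses

    # Triangle with a single odd-charged vertex: a classic unsatisfiable
    # instance (total charge is odd).
    print(tseitin_cnf([(1, 2), (2, 3), (1, 3)], {1: 1}))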