41 research outputs found

    Agnostic Learning of Disjunctions on Symmetric Distributions

    We consider the problem of approximating and learning disjunctions (or equivalently, conjunctions) on symmetric distributions over $\{0,1\}^n$. Symmetric distributions are distributions whose PDF is invariant under any permutation of the variables. We give a simple proof that for every symmetric distribution $\mathcal{D}$, there exists a set of $n^{O(\log(1/\epsilon))}$ functions $\mathcal{S}$, such that for every disjunction $c$, there is a function $p$, expressible as a linear combination of functions in $\mathcal{S}$, such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $\mathcal{D}$, that is, $\mathbf{E}_{x \sim \mathcal{D}}[|c(x)-p(x)|] \leq \epsilon$. This directly gives an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time $n^{O(\log(1/\epsilon))}$. The best known previous bound is $n^{O(1/\epsilon^4)}$ and follows from approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution $\mathcal{D}$ such that the minimum degree of a polynomial that $1/3$-approximates the disjunction of all $n$ variables in $\ell_1$ distance on $\mathcal{D}$ is $\Omega(\sqrt{n})$. Therefore the learning result above cannot be achieved via $\ell_1$-regression with a polynomial basis, the approach used in most other agnostic learning algorithms. Our technique also gives a simple proof that for any product distribution $\mathcal{D}$ and every disjunction $c$, there exists a polynomial $p$ of degree $O(\log(1/\epsilon))$ such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $\mathcal{D}$. This was first proved by Blais et al. (2008) via a more involved argument.
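To make the $\ell_1$-approximation statement concrete, here is a minimal Python sketch for the simplest product distribution, the uniform distribution over $\{0,1\}^n$. It is not the construction from the abstract or from Blais et al.; it uses only the elementary observation that a disjunction of $k$ variables is either short enough to be written exactly as a degree-$k$ polynomial, or long enough that the constant 1 is already within $\epsilon$. The function names (`approximator`, `l1_error_uniform`) are hypothetical and chosen for illustration.

```python
from itertools import product
from math import log2


def disjunction(x, S):
    """Value of the disjunction of the variables indexed by S on input x."""
    return 1.0 if any(x[i] for i in S) else 0.0


def approximator(S, eps):
    """Return (degree, p) with E|c - p| <= eps under the UNIFORM distribution.

    Assumption: uniform inputs only. If k = |S| >= log2(1/eps), then
    P[c(x) = 0] = 2^-k <= eps, so the constant 1 suffices (degree 0).
    Otherwise c itself is the degree-k polynomial 1 - prod_{i in S}(1 - x_i).
    """
    k = len(S)
    if k >= log2(1.0 / eps):
        return 0, (lambda x: 1.0)

    def p(x):
        prod = 1.0
        for i in S:
            prod *= 1 - x[i]
        return 1.0 - prod

    return k, p


def l1_error_uniform(n, S, p):
    """Exact E_x |c(x) - p(x)| over uniform x in {0,1}^n (small n only)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        total += abs(disjunction(x, S) - p(x))
    return total / 2 ** n
```

In both branches the degree is below $\log_2(1/\epsilon)$, which matches the $O(\log(1/\epsilon))$ shape of the bound in this special case. For example, `approximator(range(6), 0.05)` returns the constant 1, and `l1_error_uniform(8, range(6), p)` evaluates the resulting error by brute-force enumeration.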

    CSP-Completeness And Its Applications

    We build on previous ideas used to study both reductions between CSP-refutation problems and improper learning and reductions between CSP-refutation problems themselves, in order to expand some hardness results that depend on the assumption that refuting random CSP instances is hard for certain choices of predicates (like k-SAT). First, we are able to argue the hardness of the fundamental problem of learning conjunctions in a one-sided PAC-esque learning model that has appeared in several forms over the years. In this model we focus on producing a hypothesis that foremost guarantees a small false-positive rate while minimizing the false-negative rate for such hypotheses. Further, we formalize a notion of CSP-refutation reductions and CSP-refutation completeness and use these, along with candidate CSP-refutation-complete predicates, to provide further evidence for the hardness of several problems.

    Does it pay to optimize AUC?

    The Area Under the ROC Curve (AUC) is an important model metric for evaluating binary classifiers, and many algorithms have been proposed to optimize AUC approximately. This raises the question of whether the generally insignificant gains observed by previous studies are due to inherent limitations of the metric or the inadequate quality of optimization. To better understand the value of optimizing for AUC, we present an efficient algorithm, namely AUC-opt, to find the provably optimal AUC linear classifier in $\mathbb{R}^2$, which runs in $\mathcal{O}(n_+ n_- \log(n_+ n_-))$ time, where $n_+$ and $n_-$ are the numbers of positive and negative samples, respectively. Furthermore, it can be naturally extended to $\mathbb{R}^d$ in $\mathcal{O}((n_+ n_-)^{d-1} \log(n_+ n_-))$ by calling AUC-opt in lower-dimensional spaces recursively. We prove the problem is NP-complete when $d$ is not fixed, reducing from the open hemisphere problem. Experiments show that compared with other methods, AUC-opt achieves statistically significant improvements on between 17 and 40 in $\mathbb{R}^2$ and between 4 and 42 in $\mathbb{R}^3$ of 50 t-SNE training datasets. However, generally the gain proves insignificant on most testing datasets compared to the best standard classifiers. Similar observations are found for nonlinear AUC methods under real-world datasets. Comment: 16 pages, AAA
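The following Python sketch illustrates why the $n_+ n_-$ positive-negative pairs govern the problem in $\mathbb{R}^2$. It is a brute-force stand-in, not the AUC-opt algorithm from the abstract: the AUC of a direction $w$ changes only when $w$ becomes perpendicular to some difference vector $p - q$, so it suffices to test one direction inside each arc between consecutive critical angles. The function names are hypothetical, and the running time here is roughly $(n_+ n_-)^2$ rather than the $\mathcal{O}(n_+ n_- \log(n_+ n_-))$ quoted above.

```python
import math


def auc(w, pos, neg):
    """AUC of the scorer x -> w.x: fraction of (pos, neg) pairs ranked
    correctly, with ties counted as half a win."""
    wins = 0.0
    for p in pos:
        sp = w[0] * p[0] + w[1] * p[1]
        for q in neg:
            sq = w[0] * q[0] + w[1] * q[1]
            if sp > sq:
                wins += 1.0
            elif sp == sq:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_direction_2d(pos, neg):
    """Brute-force search for an AUC-maximizing direction in the plane."""
    angles = set()
    for p in pos:
        for q in neg:
            dx, dy = p[0] - q[0], p[1] - q[1]
            if dx == 0 and dy == 0:
                continue  # identical points are tied for every direction
            theta = math.atan2(dy, dx)
            # The sign of w.(p - q) flips where w is perpendicular to p - q.
            angles.add((theta + math.pi / 2) % (2 * math.pi))
            angles.add((theta - math.pi / 2) % (2 * math.pi))
    if not angles:
        return (1.0, 0.0), 0.5
    angles = sorted(angles)
    best_w, best_auc = None, -1.0
    for i, a in enumerate(angles):
        # Next critical angle, wrapping around the circle after the last one.
        b = angles[i + 1] if i + 1 < len(angles) else angles[0] + 2 * math.pi
        mid = (a + b) / 2  # AUC is constant on the open arc (a, b)
        w = (math.cos(mid), math.sin(mid))
        val = auc(w, pos, neg)
        if val > best_auc:
            best_w, best_auc = w, val
    return best_w, best_auc
```

Sorting the critical angles is the step that accounts for the logarithmic factor in the stated bound; an efficient method would update the pair count incrementally while sweeping instead of recomputing `auc` for every arc.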

    Computational Aspects of Game Theory and Microeconomics

    The purpose of this thesis is to study algorithmic questions that arise in the context of game theory and microeconomics. In particular, we investigate the computational complexity of various economic solution concepts by using and advancing methodologies from the fields of combinatorial optimization and approximation algorithms. We first study the problem of allocating a set of indivisible goods to a set of agents, who express preferences over combinations of items through their utility functions. Several objectives have been considered in the economic literature in different contexts. In fair division theory, a desirable outcome is to minimize the envy or the envy-ratio between any pair of players. We use tools from the theory of linear and integer programming as well as combinatorics to derive new approximation algorithms and hardness results for various types of utility functions. A different objective that has been considered in the context of auctions is to find an allocation that maximizes the social welfare, i.e., the total utility derived by the agents. We construct reductions from multi-prover proof systems to obtain inapproximability results, given standard assumptions for the utility functions of the agents. We then consider equilibrium concepts in games. We derive the first subexponential algorithm for computing approximate Nash equilibria in 2-player noncooperative games and extend our result to multi-player games. We further propose a second algorithm based on solving polynomial equations over the reals. Both algorithms improve the previously known upper bounds on the complexity of the problem. Finally, we study game theoretic models that have been introduced recently to address incentive issues in Internet routing. A polynomial time algorithm is obtained for computing equilibria in such games, i.e., routing schemes and payoff allocations from which no subset of agents has an incentive to deviate. Our algorithm is based on linear programming duality theory. We also obtain generalizations when the agents have nonlinear utility functions. Ph.D. Committee Chair: Lipton, Richard; Committee Member: Ding, Yan; Committee Member: Duke, Richard; Committee Member: Randall, Dana; Committee Member: Vazirani, Vija
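The two allocation objectives named in this abstract, envy and social welfare, can be illustrated with a short Python sketch. It assumes additive utilities, which is only one of the utility classes the thesis mentions, and the function names are hypothetical. It evaluates a given allocation; it does not implement any of the approximation algorithms from the thesis.

```python
def bundle_value(utilities, agent, bundle):
    """Value that `agent` assigns to a bundle, assuming additive utilities.

    utilities[agent][good] is the agent's value for a single good.
    """
    return sum(utilities[agent][g] for g in bundle)


def max_envy(utilities, allocation):
    """Largest amount by which some agent values another agent's bundle
    above its own (0 if the allocation is envy-free)."""
    worst = 0
    for i in range(len(allocation)):
        own = bundle_value(utilities, i, allocation[i])
        for j in range(len(allocation)):
            if i != j:
                other = bundle_value(utilities, i, allocation[j])
                worst = max(worst, other - own)
    return worst


def social_welfare(utilities, allocation):
    """Total utility derived by all agents from their own bundles."""
    return sum(
        bundle_value(utilities, i, allocation[i])
        for i in range(len(allocation))
    )
```

With `utilities = [[5, 3, 1], [2, 4, 6]]`, the allocation `[[0, 1], [2]]` is envy-free, while swapping the two bundles leaves the first agent envious and lowers the total welfare.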

    Computational learning theory : new models and algorithms

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1989. Includes bibliographical references (leaves 116-120). by Robert Hal Sloan. Ph.D

    On the computational complexity of ethics: moral tractability for minds and machines

    Why should moral philosophers, moral psychologists, and machine ethicists care about computational complexity? Debates on whether artificial intelligence (AI) can or should be used to solve problems in ethical domains have mainly been driven by what AI can or cannot do in terms of human capacities. In this paper, we tackle the problem from the other end by exploring what kind of moral machines are possible based on what computational systems can or cannot do. To do so, we analyze normative ethics through the lens of computational complexity. First, we introduce computational complexity for the uninitiated reader and discuss how the complexity of ethical problems can be framed within Marr's three levels of analysis. We then study a range of ethical problems based on consequentialism, deontology, and virtue ethics, with the aim of elucidating the complexity associated with the problems themselves (e.g., due to combinatorics, uncertainty, strategic dynamics), the computational methods employed (e.g., probability, logic, learning), and the available resources (e.g., time, knowledge, learning). The results indicate that most problems the normative frameworks pose lead to tractability issues in every category analyzed. Our investigation also provides several insights about the computational nature of normative ethics, including the differences between rule- and outcome-based moral strategies, and the implementation-variance with regard to moral resources. We then discuss the consequences complexity results have for the prospect of moral machines in virtue of the trade-off between optimality and efficiency. Finally, we elucidate how computational complexity can be used to inform both philosophical and cognitive-psychological research on human morality by advancing the moral tractability thesis.

    An Improved Algorithm for Learning to Perform Exception-Tolerant Abduction

    Inference from an observed or hypothesized condition to a plausible cause or explanation for this condition is known as abduction. For many tasks, the acquisition of the necessary knowledge by machine learning has been widely found to be highly effective. However, the semantics of learned knowledge are weaker than the usual classical semantics, and this necessitates new formulations of many tasks. We focus on a recently introduced formulation of the abductive inference task that is thus adapted to the semantics of machine learning. A key problem is that we cannot expect that our causes or explanations will be perfect, and they must tolerate some error due to the world being more complicated than our formalization allows. This is a version of the qualification problem, and in machine learning, this is known as agnostic learning. In the work by Juba that introduced the task of learning to make abductive inferences, an algorithm is given for producing k-DNF explanations that tolerates such exceptions: if the best possible k-DNF explanation fails to justify the condition with probability , then the algorithm is promised to find a k-DNF explanation that fails to justify the condition with probability at most , where n is the number of propositional attributes used to describe the domain. Here, we present an improved algorithm for this task. When the best k-DNF fails with probability , our algorithm finds a k-DNF that fails with probability at most (i.e., suppressing logarithmic factors in n and ). We examine the empirical advantage of this new algorithm over the previous algorithm in two test domains, one of explaining conditions generated by a "noisy" k-DNF rule, and another of explaining conditions that are actually generated by a linear threshold rule. We also apply the algorithm to the real-world application of anomaly explanation. In this work, as opposed to anomaly detection, we are interested in finding possible descriptions of what may be causing anomalies in visual data. We use PCA to perform anomaly detection. The task is to attach semantics drawn from the image meta-data to a portion of the anomalous images from some source such as a webcam. Such a partial description of the anomalous images in terms of the meta-data is useful both because it may help to explain what causes the identified anomalies, and also because it may help to identify the truly unusual images that defy such simple categorization. We find that our approximation algorithm is a good match for this task. Our algorithm successfully finds plausible explanations of the anomalies. It yields a low error rate when the data set is large (>80,000 inputs) and also works well when the data set is not very large (<50,000 examples). It finds small 2-DNFs that are easy to interpret and capture a non-negligible