
    Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

    Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must also obey that second rule; therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss actually correspond to just two variants, which differ in their treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. Finally, we explore an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper.
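    Since support and confidence are the basic parameters here, a tiny worked example may help. The following Python sketch (function names and the toy dataset are hypothetical, not from the paper) treats each transaction as the set of binary variables it makes true, matching the interpretation-based view above.

```python
def support(transactions, itemset):
    """Fraction of transactions (sets of items) containing `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Empirical conditional probability of the consequent given the
    antecedent: supp(antecedent | consequent) / supp(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Toy dataset: each transaction is an interpretation, i.e. the set of
# binary variables it makes true.
data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
print(confidence(data, {"a"}, {"b"}))        # 2/3
print(confidence(data, {"a"}, {"b", "c"}))   # 1/3
# An entailment of the kind studied above: conf(a -> b) >= conf(a -> bc)
# in every dataset, since supp(abc) <= supp(ab).
```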

    Threshold phenomena in random graphs

    In the 1950s, random graphs first appeared in a result of the prolific Hungarian mathematician Pál Erdős. Interest in random graph theory has only grown since then. In its early stages, the foundations of the theory were laid, and random graphs were used mainly in probability theory and combinatorics. With the new century and the boom of technologies like the World Wide Web, however, random graphs have become even more important, since they are extremely useful for handling problems in fields like network and communication theory. Because of this, random graphs are nowadays widely studied by the mathematical community around the world, and promising new results have recently been achieved, showing an exciting future for this field. In this bachelor thesis, we focus our study on threshold phenomena for graph properties within random graphs
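    The sharp-threshold behaviour such a thesis studies is easy to observe numerically. Below is a minimal Monte Carlo sketch, assuming networkx is available (the abstract names no code; the function name is hypothetical), estimating the probability that G(n, p) is connected around the classical connectivity threshold p = ln(n)/n.

```python
import math
import random

import networkx as nx

def connectivity_probability(n, p, trials=100, seed=0):
    """Monte Carlo estimate of P(G(n, p) is connected)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        g = nx.gnp_random_graph(n, p, seed=rng.randrange(2**32))
        hits += nx.is_connected(g)
    return hits / trials

# Connectivity has a sharp threshold at p = ln(n)/n: just below it almost
# every sample is disconnected (isolated vertices survive), just above it
# almost every sample is connected.
n = 500
for c in (0.5, 1.0, 1.5):
    print(c, connectivity_probability(n, c * math.log(n) / n))
```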

    On the cavity method for decimated random constraint satisfaction problems and the analysis of belief propagation guided decimation algorithms

    We introduce a version of the cavity method for diluted mean-field spin models that allows the computation of thermodynamic quantities similar to the Franz-Parisi quenched potential in sparse random graph models. The method is developed for the particular case of partially decimated random constraint satisfaction problems. This allows us to develop a theoretical understanding of a class of algorithms for solving constraint satisfaction problems, in which elementary degrees of freedom are sequentially assigned according to the results of a message passing procedure (belief propagation). We compare this theoretical analysis with the results of extensive numerical simulations. Comment: 32 pages, 24 figures
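    The algorithm class the abstract refers to can be sketched concretely. Below is a minimal, illustrative Python implementation of belief-propagation guided decimation for k-SAT, assuming the standard zero-temperature BP equations over satisfying assignments; it is a sketch under those assumptions, not the authors' exact procedure, and all names and the toy instance are invented for illustration.

```python
import random
from collections import defaultdict

def bp_marginals(clauses, n_vars, iters=100):
    """BP estimate of P(x_i = True) over satisfying assignments of a CNF
    formula (clauses as lists of signed ints, DIMACS style, each variable
    appearing at most once per clause). eta[(a, i)] estimates the chance
    that every variable of clause a other than i fails to satisfy a."""
    occ = defaultdict(list)               # variable -> [(clause index, sign)]
    for a, cl in enumerate(clauses):
        for lit in cl:
            occ[abs(lit)].append((a, lit > 0))
    rng = random.Random(0)
    eta = {(a, abs(l)): rng.random() for a, cl in enumerate(clauses) for l in cl}
    for _ in range(iters):
        new = {}
        for a, cl in enumerate(clauses):
            for lit in cl:
                prod = 1.0
                for lj in cl:
                    if lj == lit:
                        continue
                    j = abs(lj)
                    # Weight for j unsatisfying a (p_u) vs satisfying it (p_s):
                    # clauses where j appears with the same sign as in a are
                    # then also left to their other variables, factor (1-eta).
                    p_u = p_s = 1.0
                    for b, pos in occ[j]:
                        if b == a:
                            continue
                        if pos == (lj > 0):
                            p_u *= 1.0 - eta[(b, j)]
                        else:
                            p_s *= 1.0 - eta[(b, j)]
                    prod *= p_u / (p_u + p_s) if p_u + p_s > 0 else 0.5
                new[(a, abs(lit))] = prod
        eta = new
    marg = {}
    for i in range(1, n_vars + 1):
        p_t = p_f = 1.0
        for b, pos in occ.get(i, []):
            if pos:
                p_f *= 1.0 - eta[(b, i)]  # x_i = False leaves b to the others
            else:
                p_t *= 1.0 - eta[(b, i)]  # x_i = True leaves b to the others
        z = p_t + p_f
        marg[i] = p_t / z if z > 0 else 0.5
    return marg

def bp_guided_decimation(clauses, n_vars):
    """Fix the most biased variable to its preferred value, simplify the
    formula, and repeat; returns an assignment dict or None on contradiction."""
    assignment = {}
    while len(assignment) < n_vars:
        free = [v for v in range(1, n_vars + 1) if v not in assignment]
        if not clauses:                    # all clauses satisfied already
            assignment.update((v, True) for v in free)
            break
        marg = bp_marginals(clauses, n_vars)
        v = max(free, key=lambda i: abs(marg[i] - 0.5))
        val = marg[v] >= 0.5
        assignment[v] = val
        simplified = []
        for cl in clauses:
            if (v if val else -v) in cl:   # clause satisfied: drop it
                continue
            cl = [l for l in cl if abs(l) != v]
            if not cl:                     # empty clause: contradiction
                return None
            simplified.append(cl)
        clauses = simplified
    return assignment

# Toy run on a small random 3-SAT instance at ratio m/n = 3.
rng = random.Random(1)
n, m = 30, 90
formula = [[v if rng.random() < 0.5 else -v
            for v in rng.sample(range(1, n + 1), 3)] for _ in range(m)]
print(bp_guided_decimation(formula, n))
```

    On loopy instances the synchronous updates used here need not converge; real implementations add damping and convergence checks, which this sketch omits for brevity.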

    Scale-Free Random SAT Instances

    We focus on the random generation of SAT instances that have properties similar to real-world instances. It is known that many industrial instances, even with a great number of variables, can be solved by a clever solver in a reasonable amount of time. This is not possible, in general, with classical randomly generated instances. We provide a different generation model of SAT instances, called scale-free random SAT instances. It is based on the use of a non-uniform probability distribution $P(i) \sim i^{-\beta}$ to select variable $i$, where $\beta$ is a parameter of the model. This results in formulas where the number of occurrences $k$ of variables follows a power-law distribution $P(k) \sim k^{-\delta}$, where $\delta = 1 + 1/\beta$. This property has been observed in most real-world SAT instances. For $\beta = 0$, our model extends classical random SAT instances. We prove the existence of a SAT-UNSAT phase transition phenomenon for scale-free random 2-SAT instances with $\beta < 1/2$ when the clause/variable ratio is $m/n = \frac{1-2\beta}{(1-\beta)^2}$. We also prove that scale-free random k-SAT instances are unsatisfiable with high probability when the number of clauses is $\omega(n^{(1-\beta)k})$. The proof of this result suggests that, when $\beta > 1 - 1/k$, the unsatisfiability of most formulas may be due to small cores of clauses. Finally, we show how this model allows us to generate random instances similar to industrial instances, of interest for testing purposes
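    The generation model lends itself to a short sketch. Below is a minimal Python sampler, assuming the natural reading of the abstract: variable $i$ is drawn with probability proportional to $i^{-\beta}$ and each literal is negated with probability 1/2. The function name and the DIMACS-style signed-integer encoding are choices made here, not taken from the paper.

```python
import random

def scale_free_ksat(n, m, k, beta, seed=None):
    """Sample a scale-free random k-SAT formula on n variables with
    m clauses of length k: variable i is chosen with probability
    proportional to i**(-beta); each literal is negated with prob. 1/2."""
    rng = random.Random(seed)
    weights = [i ** -beta for i in range(1, n + 1)]
    clauses = []
    for _ in range(m):
        # draw k distinct variables according to the power-law weights
        cl = set()
        while len(cl) < k:
            cl.add(rng.choices(range(1, n + 1), weights=weights)[0])
        clauses.append([v if rng.random() < 0.5 else -v for v in cl])
    return clauses

print(scale_free_ksat(n=20, m=10, k=3, beta=0.8, seed=0))
```

    Setting beta = 0 makes the weights uniform, recovering classical random k-SAT, in line with the abstract's remark that the model extends the classical one.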

    Solving incomplete markets models by derivative aggregation

    This article presents a novel computational approach to solving models with both uninsurable idiosyncratic and aggregate risk that uses projection methods, simulation and perturbation. The approach is shown to be as efficient and as accurate as existing methods on a model based on Krusell and Smith (1998), for which prior solutions exist. The approach has the advantage of extending straightforwardly, and with reasonable computational cost, to models with a greater range of diversity between agents, which is demonstrated by solving both a model with heterogeneity in discount rates and a lifecycle model with incomplete markets