38 research outputs found

    Generating Random Instances of Weighted Model Counting: An Empirical Analysis with Varying Primal Treewidth

    Semi-Supervised Learning for Neural Keyphrase Generation

    We study the problem of generating keyphrases that summarize the key points for a given document. While sequence-to-sequence (seq2seq) models have achieved remarkable performance on this task (Meng et al., 2017), model training often relies on large amounts of labeled data, which are available only in resource-rich domains. In this paper, we propose semi-supervised keyphrase generation methods by leveraging both labeled data and large-scale unlabeled samples for learning. Two strategies are proposed. First, unlabeled documents are tagged with synthetic keyphrases obtained from unsupervised keyphrase extraction methods or a self-learning algorithm, and then combined with labeled samples for training. Second, we investigate a multi-task learning framework to jointly learn to generate keyphrases as well as the titles of the articles. Experimental results show that our semi-supervised learning-based methods outperform a state-of-the-art model trained with labeled data only. Comment: To appear in EMNLP 2018 (12 pages, 7 figures, 6 tables)

    Context-dependent feature analysis with random forests

    In many cases, feature selection is more complicated than identifying a single subset of input variables that would together explain the output. There may be interactions that depend on contextual information, i.e., variables that turn out to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances framework in order (i) to identify variables whose relevance is context-dependent and (ii) to characterize as precisely as possible the effect of contextual information on these variables. The usage and the relevance of our framework for highlighting context-dependent variables are illustrated on both artificial and real datasets. Comment: Accepted for presentation at UAI 201

    Optimal Planning with State Constraints

    In the classical planning model, state variables are assigned values in the initial state and remain unchanged unless explicitly affected by action effects. However, some properties of states are more naturally modelled not as direct effects of actions but instead as derived, in each state, from the primary variables via a set of rules. We refer to those rules as state constraints. The two types of state constraints that will be discussed here are numeric state constraints and logical rules that we will refer to as axioms. When using state constraints we make a distinction between primary variables, whose values are directly affected by action effects, and secondary variables, whose values are determined by state constraints. While primary variables have finite and discrete domains, as in classical planning, there is no such requirement for secondary variables. For example, using numeric state constraints allows us to have secondary variables whose values are real numbers. We show that state constraints are a construct that lets us combine classical planning methods with specialised solvers developed for other types of problems. For example, introducing numeric state constraints enables us to apply planning techniques in domains involving interconnected physical systems, such as power networks. To solve these types of problems optimally, we adapt commonly used methods from optimal classical planning, namely state-space search guided by admissible heuristics. In heuristics based on monotonic relaxation, the idea is that in a relaxed state each variable assumes a set of values instead of just a single value. With state constraints, the challenge becomes to evaluate the conditions, such as goals and action preconditions, that involve secondary variables. We employ consistency checking tools to evaluate whether these conditions are satisfied in the relaxed state. 
In our work with numerical constraints we use linear programming, while with axioms we use answer set programming and three-valued semantics. This allows us to build a relaxed planning graph and compute constraint-aware versions of heuristics based on monotonic relaxation. We also adapt pattern database heuristics. We notice that an abstract state can be thought of as a state in the monotonic relaxation in which the variables in the pattern hold only one value, while the variables not in the pattern simultaneously hold all the values in their domains. This means that we can apply the same technique for evaluating conditions on secondary variables as we did for the monotonic relaxation and build pattern databases much as is done in classical planning. To make better use of our heuristics, we modify the A* algorithm by combining two techniques that were previously used independently: partial expansion and preferred operators. Our modified algorithm, which we call PrefPEA, is most beneficial in cases where the heuristic is expensive to compute but accurate, and states have many successors.

    Generalising weighted model counting

    Given a formula in propositional or (finite-domain) first-order logic and some non-negative weights, weighted model counting (WMC) is a function problem that asks to compute the sum of the weights of the models of the formula. Originally used as a flexible way of performing probabilistic inference on graphical models, WMC has found many applications across artificial intelligence (AI), machine learning, and other domains. Areas of AI that rely on WMC include explainable AI, neural-symbolic AI, probabilistic programming, and statistical relational AI. WMC also has applications in bioinformatics, data mining, natural language processing, prognostics, and robotics. In this work, we are interested in revisiting the foundations of WMC and considering generalisations of some of the key definitions in the interest of conceptual clarity and practical efficiency. We begin by developing a measure-theoretic perspective on WMC, which suggests a new and more general way of defining the weights of an instance. This new representation can be as succinct as standard WMC but can also expand as needed to represent less-structured probability distributions. We demonstrate the performance benefits of the new format by developing a novel WMC encoding for Bayesian networks. We then show how existing WMC encodings for Bayesian networks can be transformed into this more general format and what conditions ensure that the transformation is correct (i.e., preserves the answer). Combining the strengths of the more flexible representation with the tricks used in existing encodings yields further efficiency improvements in Bayesian network probabilistic inference. Next, we turn our attention to the first-order setting. Here, we argue that the capabilities of practical model counting algorithms are severely limited by their inability to perform arbitrary recursive computations. 
To enable arbitrary recursion, we relax the restrictions that typically accompany domain recursion and generalise circuits (used to express a solution to a model counting problem) to graphs that are allowed to have cycles. These improvements enable us to find efficient solutions to counting fundamental structures such as injections and bijections that were previously unsolvable by any available algorithm. The second strand of this work is concerned with synthetic data generation. Testing algorithms across a wide range of problem instances is crucial to ensure the validity of any claim about one algorithm’s superiority over another. However, benchmarks are often limited and fail to reveal differences among the algorithms. First, we show how random instances of probabilistic logic programs (that typically use WMC algorithms for inference) can be generated using constraint programming. We also introduce a new constraint to control the independence structure of the underlying probability distribution and provide a combinatorial argument for the correctness of the constraint model. This model allows us to, for the first time, experimentally investigate inference algorithms on more than just a handful of instances. Second, we introduce a random model for WMC instances with a parameter that influences primal treewidth, the parameter most commonly used to characterise the difficulty of an instance. We show that the easy-hard-easy pattern with respect to clause density is different for algorithms based on dynamic programming and algebraic decision diagrams than for all other solvers. We also demonstrate that all WMC algorithms scale exponentially with respect to primal treewidth, although at differing rates.
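
To make the WMC definition in the abstract above concrete, here is a minimal brute-force sketch for the propositional case. It assumes the conventional setup in which each literal carries a non-negative weight and the weight of a model is the product of the weights of its literals; the abstract itself argues for a more general way of defining weights, which this sketch does not attempt. The function name `weighted_model_count`, the DIMACS-style integer encoding of literals, and the example weights are illustrative assumptions, not taken from the thesis.

```python
from itertools import product


def weighted_model_count(num_vars, clauses, weight):
    """Brute-force WMC for a CNF formula (illustrative only, exponential time).

    num_vars: number of variables, named 1..num_vars.
    clauses:  list of clauses; each clause is a list of non-zero ints,
              where v means "variable v is true" and -v means "v is false".
    weight:   dict mapping every literal (v and -v) to a non-negative weight.
    """
    total = 0.0
    for bits in product([False, True], repeat=num_vars):
        # A literal holds if its variable's truth value matches its sign.
        def holds(lit):
            return bits[abs(lit) - 1] == (lit > 0)

        # The assignment is a model if every clause has a literal that holds.
        if all(any(holds(lit) for lit in clause) for clause in clauses):
            model_weight = 1.0
            for v in range(1, num_vars + 1):
                model_weight *= weight[v] if bits[v - 1] else weight[-v]
            total += model_weight
    return total


# Example: the formula (x1 OR x2) with hypothetical literal weights.
example_weights = {1: 0.3, -1: 0.7, 2: 0.6, -2: 0.4}
result = weighted_model_count(2, [[1, 2]], example_weights)
```

When the weights of `v` and `-v` sum to 1 for every variable, as in the example, the result can be read as the probability that the formula holds under independent variables. Practical solvers avoid this enumeration, which is why parameters such as primal treewidth matter for their running time.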

    AdaCC: cumulative cost-sensitive boosting for imbalanced classification

    Class imbalance poses a major challenge for machine learning, as most supervised learning models might exhibit bias towards the majority class and under-perform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically via a user-defined fixed misclassification cost matrix provided as input to the learner. Such parameter tuning is a challenging task that requires domain knowledge; moreover, wrong adjustments might lead to a deterioration in overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model’s performance instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free, as it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, exhibiting consistent improvements in different measures, for instance, in the range of [0.3–28.56%] for AUC, [3.4–21.4%] for balanced accuracy, [4.8–45%] for gmean and [7.4–85.5%] for recall.