Subsumption Algorithms for Three-Valued Geometric Resolution
In our implementation of geometric resolution, the most costly operation is
subsumption testing (or matching): one has to decide, for a three-valued
geometric formula, whether the formula is false in a given interpretation. The
formula contains only atoms with variables, equality, and existential
quantifiers. The interpretation contains only atoms with constants. Because the
atoms have no term structure, matching for geometric resolution is hard. We
translate the matching problem into a generalized constraint satisfaction
problem, and discuss several approaches for solving it efficiently: one direct
algorithm and two translations to propositional SAT. After that, we study
filtering techniques based on local consistency checking. Such filtering
techniques can a priori refute a large percentage of generalized constraint
satisfaction problems. Finally, we adapt the matching algorithms in such a way
that they find solutions that use a minimal subset of the interpretation. The
adaptation can be combined with every matching algorithm. The techniques
presented in this paper may have applications in constraint solving independent
of geometric resolution.
Comment: This version was revised on 18.05.201
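The matching problem described above can be illustrated with a small sketch. The following Python code is illustrative only (the names and data layout are not from the paper): it treats matching as a constraint satisfaction problem in which the variables of the formula must be assigned constants of the interpretation so that every formula atom becomes an atom of the interpretation. It uses plain backtracking rather than the paper's direct algorithm or SAT translations, and it assumes every atom argument is a variable.

```python
# Hypothetical sketch: subsumption testing as constraint satisfaction.
# A formula is a set of atoms over variables; an interpretation is a set
# of ground atoms over constants. Matching succeeds if some assignment
# of variables to constants maps every formula atom into the
# interpretation. Assumption: all atom arguments are variables.

def match(formula, interpretation):
    """Backtracking search for a variable-to-constant assignment."""
    constants = {c for atom in interpretation for c in atom[1:]}
    variables = sorted({a for _, *args in formula for a in args})

    def consistent(assignment):
        # An atom whose arguments are all assigned must map to an
        # atom of the interpretation.
        for pred, *args in formula:
            if all(a in assignment for a in args):
                ground = (pred, *(assignment[a] for a in args))
                if ground not in interpretation:
                    return False
        return True

    def search(i, assignment):
        if i == len(variables):
            return assignment if consistent(assignment) else None
        for c in constants:
            assignment[variables[i]] = c
            if consistent(assignment):
                result = search(i + 1, assignment)
                if result is not None:
                    return result
            del assignment[variables[i]]
        return None

    return search(0, {})
```

The per-step `consistent` check is a simple form of the filtering the paper studies: partial assignments that already falsify an atom are refuted before the search goes deeper.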
Scavenger 0.1: A Theorem Prover Based on Conflict Resolution
This paper introduces Scavenger, the first theorem prover for pure
first-order logic without equality based on the new conflict resolution
calculus. Conflict resolution has a restricted resolution inference rule that
resembles (a first-order generalization of) unit propagation, a rule for
assuming decision literals, and a rule for deriving new clauses by (a
first-order generalization of) conflict-driven clause learning.
Comment: Published at CADE 201
Residual Weighted Learning for Estimating Individualized Treatment Rules
Personalized medicine has received increasing attention among statisticians,
computer scientists, and clinical practitioners. A major component of
personalized medicine is the estimation of individualized treatment rules
(ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL)
to construct ITRs that directly optimize the clinical outcome. Although OWL
opens the door to introducing machine learning techniques to optimal treatment
regimes, its finite-sample performance can suffer. In this article, we propose
a general framework, called Residual Weighted Learning (RWL), to improve finite
sample performance. Unlike OWL which weights misclassification errors by
clinical outcomes, RWL weights these errors by residuals of the outcome from a
regression fit on clinical covariates excluding treatment assignment. We
utilize the smoothed ramp loss function in RWL, and provide a difference of
convex (d.c.) algorithm to solve the corresponding non-convex optimization
problem. By estimating residuals with linear models or generalized linear
models, RWL can effectively deal with different types of outcomes, such as
continuous, binary and count outcomes. We also propose variable selection
methods for linear and nonlinear rules, respectively, to further improve the
performance. We show that the resulting estimator of the treatment rule is
consistent. We further obtain a rate of convergence for the difference between
the expected outcome using the estimated ITR and that of the optimal treatment
rule. The performance of the proposed RWL methods is illustrated in simulation
studies and in an analysis of cystic fibrosis clinical trial data.
Comment: 48 pages, 3 figures
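The core weighting idea can be sketched as follows. This Python fragment rests on illustrative assumptions throughout: simulated data, ordinary least squares standing in for a general regression fit, and the paper's smoothed ramp loss and d.c. algorithm omitted. It computes residuals of the outcome from a covariate-only regression and uses them, in place of the raw outcomes of OWL, to weight misclassification of a candidate treatment rule.

```python
import numpy as np

# Illustrative sketch of residual weighting, not the paper's full method.
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))                      # clinical covariates
A = rng.choice([-1, 1], size=n)                  # randomized treatment
Y = X[:, 0] + A * X[:, 1] + rng.normal(size=n)   # observed outcome

# Regress Y on covariates only, excluding the treatment assignment;
# ordinary least squares with an intercept stands in for any linear or
# generalized linear model.
Z = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
residuals = Y - Z @ beta

def rwl_objective(w, X, A, residuals, prop=0.5):
    """Residual-weighted misclassification of the rule d(x) = sign(x @ w).
    Residuals replace the raw outcomes OWL would use; prop is the known
    treatment propensity in a randomized trial."""
    d = np.sign(X @ w)
    return np.mean(residuals * (A != d) / prop)
```

Because the residuals are centered, they down-weight the part of the outcome explained by covariates alone, which is what stabilizes the finite-sample behavior relative to weighting by raw outcomes.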
Oversampling for Imbalanced Learning Based on K-Means and SMOTE
Learning from class-imbalanced data continues to be a common and challenging
problem in supervised learning as standard classification algorithms are
designed to handle balanced class distributions. While different strategies
exist to tackle this problem, methods which generate artificial data to achieve
a balanced class distribution are more versatile than modifications to the
classification algorithm. Such techniques, called oversamplers, modify the
training data, allowing any classifier to be used with class-imbalanced
datasets. Many algorithms have been proposed for this task, but most are
complex and tend to generate unnecessary noise. This work presents a simple and
effective oversampling method based on k-means clustering and SMOTE
oversampling, which avoids the generation of noise and effectively overcomes
imbalances between and within classes. Empirical results of extensive
experiments with 71 datasets show that training data oversampled with the
proposed method improves classification results. Moreover, k-means SMOTE
consistently outperforms other popular oversampling methods. An implementation
is made available in the Python programming language.
Comment: 19 pages, 8 figures
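The method's two stages can be sketched compactly. The following Python function is an illustrative reimplementation of the idea, not the released package: it clusters the data with a few iterations of plain k-means, then generates SMOTE-style interpolated points only inside clusters dominated by the minority class.

```python
import numpy as np

# Illustrative sketch of k-means SMOTE, not the released implementation:
# cluster, then interpolate between minority samples only within
# minority-dominated clusters, so no synthetic points land in
# majority-class regions.

def kmeans_smote(X, y, minority=1, k=3, ratio=0.5, n_new=20, seed=0):
    rng = np.random.default_rng(seed)
    # Plain k-means; a few iterations suffice for a sketch.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(10):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Oversample only inside minority-dominated clusters.
    synthetic = []
    for j in range(k):
        members = y[labels == j]
        if len(members) and (members == minority).mean() > ratio:
            pts = X[(labels == j) & (y == minority)]
            if len(pts) >= 2:
                for _ in range(n_new):
                    a, b = pts[rng.choice(len(pts), size=2, replace=False)]
                    synthetic.append(a + rng.random() * (b - a))
    return np.array(synthetic)
```

Interpolating only between minority samples that share a minority-dominated cluster is what addresses both between-class and within-class imbalance while avoiding the noise generation of plain SMOTE.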