12,633 research outputs found
Testing Identifiable Kernel P Systems Using an X-machine Approach
This paper presents a testing approach for kernel P systems (kP systems),
based on the X-machine testing framework and the concept of cover automaton. The
testing methodology ensures that the implementation conforms the speci cations, under
certain conditions, such as the identi ably concept in the context of kernel P systems
Search Based Software Engineering in Membrane Computing
This paper presents a testing approach for kernel P Systems (kP systems),
based on test data generation for a given scenario. This method uses Genetic Algorithms
to generate the input sets needed to trigger the given computation steps
Distinguishing cause from effect using observational data: methods and benchmarks
The discovery of causal relationships from purely observational data is a
fundamental problem in science. The most elementary form of such a causal
discovery problem is to decide whether X causes Y or, alternatively, Y causes
X, given joint observations of two variables X, Y. An example is to decide
whether altitude causes temperature, or vice versa, given only joint
measurements of both variables. Even under the simplifying assumptions of no
confounding, no feedback loops, and no selection bias, such bivariate causal
discovery problems are challenging. Nevertheless, several approaches for
addressing those problems have been proposed in recent years. We review two
families of such methods: Additive Noise Methods (ANM) and Information
Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs
that consists of data for 100 different cause-effect pairs selected from 37
datasets from various domains (e.g., meteorology, biology, medicine,
engineering, economy, etc.) and motivate our decisions regarding the "ground
truth" causal directions of all pairs. We evaluate the performance of several
bivariate causal discovery methods on these real-world benchmark data and in
addition on artificially simulated data. Our empirical results on real-world
data indicate that certain methods are indeed able to distinguish cause from
effect using only purely observational data, although more benchmark data would
be needed to obtain statistically significant conclusions. One of the best
performing methods overall is the additive-noise method originally proposed by
Hoyer et al. (2009), which obtains an accuracy of 63+-10 % and an AUC of
0.74+-0.05 on the real-world benchmark. As the main theoretical contribution of
this work we prove the consistency of that method.Comment: 101 pages, second revision submitted to Journal of Machine Learning
Researc
Class Proportion Estimation with Application to Multiclass Anomaly Rejection
This work addresses two classification problems that fall under the heading
of domain adaptation, wherein the distributions of training and testing
examples differ. The first problem studied is that of class proportion
estimation, which is the problem of estimating the class proportions in an
unlabeled testing data set given labeled examples of each class. Compared to
previous work on this problem, our approach has the novel feature that it does
not require labeled training data from one of the classes. This property allows
us to address the second domain adaptation problem, namely, multiclass anomaly
rejection. Here, the goal is to design a classifier that has the option of
assigning a "reject" label, indicating that the instance did not arise from a
class present in the training data. We establish consistent learning strategies
for both of these domain adaptation problems, which to our knowledge are the
first of their kind. We also implement the class proportion estimation
technique and demonstrate its performance on several benchmark data sets.Comment: Accepted to AISTATS 2014. 15 pages. 2 figure
Classification with Asymmetric Label Noise: Consistency and Maximal Denoising
In many real-world classification problems, the labels of training examples
are randomly corrupted. Most previous theoretical work on classification with
label noise assumes that the two classes are separable, that the label noise is
independent of the true class label, or that the noise proportions for each
class are known. In this work, we give conditions that are necessary and
sufficient for the true class-conditional distributions to be identifiable.
These conditions are weaker than those analyzed previously, and allow for the
classes to be nonseparable and the noise levels to be asymmetric and unknown.
The conditions essentially state that a majority of the observed labels are
correct and that the true class-conditional distributions are "mutually
irreducible," a concept we introduce that limits the similarity of the two
distributions. For any label noise problem, there is a unique pair of true
class-conditional distributions satisfying the proposed conditions, and we
argue that this pair corresponds in a certain sense to maximal denoising of the
observed distributions.
Our results are facilitated by a connection to "mixture proportion
estimation," which is the problem of estimating the maximal proportion of one
distribution that is present in another. We establish a novel rate of
convergence result for mixture proportion estimation, and apply this to obtain
consistency of a discrimination rule based on surrogate loss minimization.
Experimental results on benchmark data and a nuclear particle classification
problem demonstrate the efficacy of our approach
- …