74,910 research outputs found
Learning and Interpreting Multi-Multi-Instance Learning Networks
We introduce an extension of the multi-instance learning problem where
examples are organized as nested bags of instances (e.g., a document could be
represented as a bag of sentences, which in turn are bags of words). This
framework can be useful in various scenarios, such as text and image
classification, but also supervised learning over graphs. As a further
advantage, multi-multi instance learning enables a particular way of
interpreting predictions and the decision function. Our approach is based on a
special neural network layer, called bag-layer, whose units aggregate bags of
inputs of arbitrary size. We prove theoretically that the associated class of
functions contains all Boolean functions over sets of sets of instances and we
provide empirical evidence that functions of this kind can be actually learned
on semi-synthetic datasets. We finally present experiments on text
classification, on citation graphs, and social graph data, which show that our
model obtains competitive results with respect to accuracy when compared to
other approaches such as convolutional networks on graphs, while at the same
time it supports a general approach to interpret the learnt model, as well as
explain individual predictions.Comment: JML
A study of hierarchical and flat classification of proteins
Automatic classification of proteins using machine learning is an important problem that has received significant attention in the literature. One feature of this problem is that expert-defined hierarchies of protein classes exist and can potentially be exploited to improve classification performance. In this article we investigate empirically whether this is the case for two such hierarchies. We compare multi-class classification techniques that exploit the information in those class hierarchies and those that do not, using logistic regression, decision trees, bagged decision trees, and support vector machines as the underlying base learners. In particular, we compare hierarchical and flat variants of ensembles of nested dichotomies. The latter have been shown to deliver strong classification performance in multi-class settings. We present experimental results for synthetic, fold recognition, enzyme classification, and remote homology detection data. Our results show that exploiting the class hierarchy improves performance on the synthetic data, but not in the case of the protein classification problems. Based on this we recommend that strong flat multi-class methods be used as a baseline to establish the benefit of exploiting class hierarchies in this area
Multilabel Classification with R Package mlr
We implemented several multilabel classification algorithms in the machine
learning package mlr. The implemented methods are binary relevance, classifier
chains, nested stacking, dependent binary relevance and stacking, which can be
used with any base learner that is accessible in mlr. Moreover, there is access
to the multilabel classification versions of randomForestSRC and rFerns. All
these methods can be easily compared by different implemented multilabel
performance measures and resampling methods in the standardized mlr framework.
In a benchmark experiment with several multilabel datasets, the performance of
the different methods is evaluated.Comment: 18 pages, 2 figures, to be published in R Journal; reference
correcte
Representing Conversations for Scalable Overhearing
Open distributed multi-agent systems are gaining interest in the academic
community and in industry. In such open settings, agents are often coordinated
using standardized agent conversation protocols. The representation of such
protocols (for analysis, validation, monitoring, etc) is an important aspect of
multi-agent applications. Recently, Petri nets have been shown to be an
interesting approach to such representation, and radically different approaches
using Petri nets have been proposed. However, their relative strengths and
weaknesses have not been examined. Moreover, their scalability and suitability
for different tasks have not been addressed. This paper addresses both these
challenges. First, we analyze existing Petri net representations in terms of
their scalability and appropriateness for overhearing, an important task in
monitoring open multi-agent systems. Then, building on the insights gained, we
introduce a novel representation using Colored Petri nets that explicitly
represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly
suitable for overhearing. Furthermore, we show that this new representation
offers a comprehensive coverage of all conversation features of FIPA
conversation standards. We also present a procedure for transforming AUML
conversation protocol diagrams (a standard human-readable representation), to
our Colored Petri net representation
- …