38,954 research outputs found
Semantically Consistent Regularization for Zero-Shot Recognition
The role of semantics in zero-shot learning is considered. The effectiveness
of previous approaches is analyzed according to the form of supervision
provided. While some learn semantics independently, others only supervise the
semantic subspace explained by training classes. Thus, the former is able to
constrain the whole space but lacks the ability to model semantic correlations.
The latter addresses this issue but leaves part of the semantic space
unsupervised. This complementarity is exploited in a new convolutional neural
network (CNN) framework, which proposes the use of semantics as constraints for
recognition.Although a CNN trained for classification has no transfer ability,
this can be encouraged by learning an hidden semantic layer together with a
semantic code for classification. Two forms of semantic constraints are then
introduced. The first is a loss-based regularizer that introduces a
generalization constraint on each semantic predictor. The second is a codeword
regularizer that favors semantic-to-class mappings consistent with prior
semantic knowledge while allowing these to be learned from data. Significant
improvements over the state-of-the-art are achieved on several datasets.Comment: Accepted to CVPR 201
ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution
Entity resolution (ER), an important and common data cleaning problem, is
about detecting data duplicate representations for the same external entities,
and merging them into single representations. Relatively recently, declarative
rules called "matching dependencies" (MDs) have been proposed for specifying
similarity conditions under which attribute values in database records are
merged. In this work we show the process and the benefits of integrating four
components of ER: (a) Building a classifier for duplicate/non-duplicate record
pairs built using machine learning (ML) techniques; (b) Use of MDs for
supporting the blocking phase of ML; (c) Record merging on the basis of the
classifier results; and (d) The use of the declarative language "LogiQL" -an
extended form of Datalog supported by the "LogicBlox" platform- for all
activities related to data processing, and the specification and enforcement of
MDs.Comment: Final journal version, with some minor technical corrections.
Extended version of arXiv:1508.0601
Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
informatio
ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution
Entity resolution (ER), an important and common data cleaning problem, is
about detecting data duplicate representations for the same external entities,
and merging them into single representations. Relatively recently, declarative
rules called matching dependencies (MDs) have been proposed for specifying
similarity conditions under which attribute values in database records are
merged. In this work we show the process and the benefits of integrating three
components of ER: (a) Classifiers for duplicate/non-duplicate record pairs
built using machine learning (ML) techniques, (b) MDs for supporting both the
blocking phase of ML and the merge itself; and (c) The use of the declarative
language LogiQL -an extended form of Datalog supported by the LogicBlox
platform- for data processing, and the specification and enforcement of MDs.Comment: To appear in Proc. SUM, 201
Using concept lattices to mine functional dependencies
Concept Lattices have been proved to be a valuable tool to represent
the knowlegde in a database.
In this paper we show how functional dependencies in databases
can be extracted using Concept Lattices, not preprocessing the original
database,
but providing a new closure operator. We also prove that this method
generalizes the previous methods and
closure operators that are being used to find association rules in binary
databases.Postprint (published version
- …