13,072 research outputs found
Numeric Input Relations for Relational Learning with Applications to Community Structure Analysis
Most work in the area of statistical relational learning (SRL) is focussed on
discrete data, even though a few approaches for hybrid SRL models have been
proposed that combine numerical and discrete variables. In this paper we
distinguish numerical random variables for which a probability distribution is
defined by the model from numerical input variables that are only used for
conditioning the distribution of discrete response variables. We show how
numerical input relations can very easily be used in the Relational Bayesian
Network framework, and that existing inference and learning methods need only
minor adjustments to be applied in this generalized setting. The resulting
framework provides natural relational extensions of classical probabilistic
models for categorical data. We demonstrate the usefulness of RBN models with
numeric input relations by several examples.
In particular, we use the augmented RBN framework to define probabilistic
models for multi-relational (social) networks in which the probability of a
link between two nodes depends on numeric latent feature vectors associated
with the nodes. A generic learning procedure can be used to obtain a
maximum-likelihood fit of model parameters and latent feature values for a
variety of models that can be expressed in the high-level RBN representation.
Specifically, we propose a model that allows us to interpret learned latent
feature values as community centrality degrees by which we can identify nodes
that are central for one community, that are hubs between communities, or that
are isolated nodes. In a multi-relational setting, the model also provides a
characterization of how different relations are associated with each community
Multi-task CNN Model for Attribute Prediction
This paper proposes a joint multi-task learning algorithm to better predict
attributes in images using deep convolutional neural networks (CNN). We
consider learning binary semantic attributes through a multi-task CNN model,
where each CNN will predict one binary attribute. The multi-task learning
allows CNN models to simultaneously share visual knowledge among different
attribute categories. Each CNN will generate attribute-specific feature
representations, and then we apply multi-task learning on the features to
predict their attributes. In our multi-task framework, we propose a method to
decompose the overall model's parameters into a latent task matrix and
combination matrix. Furthermore, under-sampled classifiers can leverage shared
statistics from other classifiers to improve their performance. Natural
grouping of attributes is applied such that attributes in the same group are
encouraged to share more knowledge. Meanwhile, attributes in different groups
will generally compete with each other, and consequently share less knowledge.
We show the effectiveness of our method on two popular attribute datasets.Comment: 11 pages, 3 figures, ieee transaction pape
From Data Fusion to Knowledge Fusion
The task of {\em data fusion} is to identify the true values of data items
(eg, the true date of birth for {\em Tom Cruise}) among multiple observed
values drawn from different sources (eg, Web sites) of varying (and unknown)
reliability. A recent survey\cite{LDL+12} has provided a detailed comparison of
various fusion methods on Deep Web data. In this paper, we study the
applicability and limitations of different fusion techniques on a more
challenging problem: {\em knowledge fusion}. Knowledge fusion identifies true
subject-predicate-object triples extracted by multiple information extractors
from multiple information sources. These extractors perform the tasks of entity
linkage and schema alignment, thus introducing an additional source of noise
that is quite different from that traditionally considered in the data fusion
literature, which only focuses on factual errors in the original sources. We
adapt state-of-the-art data fusion techniques and apply them to a knowledge
base with 1.6B unique knowledge triples extracted by 12 extractors from over 1B
Web pages, which is three orders of magnitude larger than the data sets used in
previous data fusion papers. We show great promise of the data fusion
approaches in solving the knowledge fusion problem, and suggest interesting
research directions through a detailed error analysis of the methods.Comment: VLDB'201
- …