Context-based Object Viewpoint Estimation: A 2D Relational Approach
The task of object viewpoint estimation has been a challenge since the early
days of computer vision. To estimate the viewpoint (or pose) of an object,
people have mostly looked at object intrinsic features, such as shape or
appearance. Surprisingly, informative features provided by other, extrinsic
elements in the scene have so far mostly been ignored. At the same time,
contextual cues have been proven to be of great benefit for related tasks such
as object detection or action recognition. In this paper, we explore how
information from other objects in the scene can be exploited for viewpoint
estimation. In particular, we look at object configurations by following a
relational neighbor-based approach for reasoning about object relations. We
show that, starting from noisy object detections and viewpoint estimates,
exploiting the estimated viewpoint and location of other objects in the scene
can lead to improved object viewpoint predictions. Experiments on the KITTI
dataset demonstrate that object configurations can indeed be used as a
complementary cue to appearance-based viewpoint estimation. Our analysis
reveals that the proposed context-based method can improve object viewpoint
estimation by reducing specific types of viewpoint estimation errors commonly
made by methods that only consider local information. Moreover, considering
contextual information produces superior performance in scenes where a high
number of object instances occur. Finally, our results suggest that a cautious
relational neighbor formulation brings improvements over its aggressive
counterpart for the task of object viewpoint estimation.
Comment: Computer Vision and Image Understanding (CVIU)
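The cautious relational-neighbor idea described above can be sketched in a few lines. This is an illustrative toy in NumPy, not the paper's exact formulation: the function name, the confidence threshold `tau`, and the circular-mean voting rule are all assumptions. The "cautious" aspect is that only neighbors whose viewpoint estimate is sufficiently confident get to cast a vote.

```python
import numpy as np

def relational_viewpoint_update(angles, confidences, adjacency, tau=0.7):
    """Cautious relational-neighbor update (illustrative sketch).

    Each object's viewpoint angle (radians) is re-estimated from
    confident neighbours only; objects whose confidence falls below
    the threshold tau do not cast votes (the 'cautious' variant).
    """
    n = len(angles)
    updated = np.array(angles, dtype=float)
    for i in range(n):
        votes = [angles[j] for j in range(n)
                 if adjacency[i, j] and confidences[j] >= tau]
        if votes:
            # circular mean of the neighbours' angle votes
            s = np.mean(np.sin(votes))
            c = np.mean(np.cos(votes))
            updated[i] = np.arctan2(s, c)
    return updated
```

An aggressive counterpart would simply drop the `confidences[j] >= tau` check and let every detection vote, confident or not.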
Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs
A visual-relational knowledge graph (KG) is a multi-relational graph whose
entities are associated with images. We explore novel machine learning
approaches for answering visual-relational queries in web-extracted knowledge
graphs. To this end, we have created ImageGraph, a KG with 1,330 relation
types, 14,870 entities, and 829,931 images crawled from the web. With
visual-relational KGs such as ImageGraph one can introduce novel probabilistic
query types in which images are treated as first-class citizens. Both the
prediction of relations between unseen images as well as multi-relational image
retrieval can be expressed with specific families of visual-relational queries.
We introduce novel combinations of convolutional networks and knowledge graph
embedding methods to answer such queries. We also explore a zero-shot learning
scenario where an image of an entirely new entity is linked with multiple
relations to entities of an existing KG. The resulting multi-relational
grounding of unseen entity images into a knowledge graph serves as a semantic
entity representation. We conduct experiments to demonstrate that the proposed
methods can answer these visual-relational queries efficiently and accurately.
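The core query mechanism — scoring candidate entities with a KG embedding model whose head embedding is produced by an image encoder — can be sketched as below. This is a minimal stand-in, not the paper's architecture: the DistMult scoring function, the toy entities, and the trivial `image_encoder` (which in the real system would be a convolutional network) are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# hypothetical toy KG: entity and relation embeddings
entity_emb = {"Paris": rng.normal(size=d), "France": rng.normal(size=d)}
relation_emb = {"capitalOf": rng.normal(size=d)}

def image_encoder(image):
    # stand-in for a convolutional network mapping an image into
    # the same d-dimensional space as the KG entities
    return entity_emb["Paris"] + 0.01 * rng.normal(size=d)

def distmult_score(h, r, t):
    # DistMult scoring: <h, r, t> = sum_i h_i * r_i * t_i
    return float(np.sum(h * r * t))

# a visual-relational query: which entity does this image
# relate to via 'capitalOf'?
h = image_encoder(None)
scores = {e: distmult_score(h, relation_emb["capitalOf"], v)
          for e, v in entity_emb.items()}
best = max(scores, key=scores.get)
```

Multi-relational image retrieval works the same way with the roles reversed: fix an entity and relation, and rank images by the score of their encoded embeddings.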
Recurrent Collective Classification
We propose a new method for training iterative collective classifiers for
labeling nodes in network data. The iterative classification algorithm (ICA) is
a canonical method for incorporating relational information into
classification. Yet, existing methods for training ICA models rely on the
assumption that relational features reflect the true labels of the nodes. This
unrealistic assumption introduces a bias that is inconsistent with the actual
prediction algorithm. In this paper, we introduce recurrent collective
classification (RCC), a variant of ICA analogous to recurrent neural network
prediction. RCC accommodates any differentiable local classifier and relational
feature functions. We provide gradient-based strategies for optimizing over
model parameters to more directly minimize the loss function. In our
experiments, this direct loss minimization translates to improved accuracy and
robustness on real network data. We demonstrate the robustness of RCC in
settings where local classification is very noisy, settings that are
particularly challenging for ICA.
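The analogy to recurrent networks can be made concrete with a forward pass: unroll the ICA iterations, feeding each node's local features together with relational features built from the previous step's predictions. A minimal sketch, assuming a softmax local classifier and neighbor-averaged label distributions as the relational features (the actual RCC model admits any differentiable choices):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def rcc_forward(X, A, W_local, W_rel, steps=3):
    """Unrolled collective classification, RNN-style.

    At each step the local classifier sees the node features X plus
    relational features: the average predicted label distribution of
    each node's neighbours from the previous step.  Because every
    step is differentiable, the whole unrolled computation can be
    trained end-to-end by backpropagating the loss through all steps,
    rather than assuming neighbours carry their true labels.
    """
    n, k = X.shape[0], W_rel.shape[0]
    P = np.full((n, k), 1.0 / k)          # uniform initial beliefs
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)
    for _ in range(steps):
        rel = (A @ P) / deg               # neighbour label averages
        P = softmax(X @ W_local + rel @ W_rel)
    return P
```

Training ICA against true neighbor labels corresponds to supervising only a single step with idealized relational features; the unrolled version instead optimizes the predictions the model will actually make at inference time.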
Cognitive Database: A Step towards Endowing Relational Databases with Artificial Intelligence Capabilities
We propose Cognitive Databases, an approach for transparently enabling
Artificial Intelligence (AI) capabilities in relational databases. A novel
aspect of our design is to first view the structured data source as meaningful
unstructured text, and then use the text to build an unsupervised neural
network model using a Natural Language Processing (NLP) technique called word
embedding. This model captures the hidden inter-/intra-column relationships
between database tokens of different types. For each database token, the model
includes a vector that encodes contextual semantic relationships. We seamlessly
integrate the word embedding model into existing SQL query infrastructure and
use it to enable a new class of SQL-based analytics queries called cognitive
intelligence (CI) queries. CI queries use the model vectors to enable complex
queries such as semantic matching, inductive reasoning queries such as
analogies, predictive queries using entities not present in a database, and,
more generally, using knowledge from external sources. We demonstrate unique
capabilities of Cognitive Databases using an Apache Spark based prototype to
execute inductive reasoning CI queries over a multi-modal database containing
text and images. We believe our first-of-a-kind system exemplifies using AI
functionality to endow relational databases with capabilities that were
previously very hard to realize in practice.
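The "structured data as unstructured text" step can be illustrated as follows. The sketch serializes each row into a sentence of typed tokens and then builds plain co-occurrence vectors as a simplified stand-in for the trained word-embedding model (the real system uses a neural word-embedding technique); a cognitive semantic-match query then reduces to cosine similarity between token vectors. The table contents and token naming scheme are invented for illustration.

```python
import numpy as np

# toy relational table serialized as token "sentences", one per row
rows = [
    ["cust:alice", "city:paris", "product:wine"],
    ["cust:bob",   "city:paris", "product:cheese"],
    ["cust:carol", "city:tokyo", "product:tea"],
]

# simplified stand-in for word embedding: co-occurrence count vectors
vocab = sorted({t for r in rows for t in r})
idx = {t: i for i, t in enumerate(vocab)}
vec = np.zeros((len(vocab), len(vocab)))
for r in rows:
    for a in r:
        for b in r:
            if a != b:
                vec[idx[a], idx[b]] += 1.0

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# a semantic-match CI query: which customer is most similar to alice?
sim = {t: cosine(vec[idx["cust:alice"]], vec[idx[t]])
       for t in vocab if t != "cust:alice"}
```

In the described system such similarity scores are exposed through user-defined functions inside ordinary SQL, so the query layer stays unchanged while gaining semantic behavior.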
Improving Visual Relation Detection using Depth Maps
State-of-the-art visual relation detection methods mostly rely on object
information extracted from RGB images such as predicted class probabilities, 2D
bounding boxes and feature maps. Depth maps can additionally provide valuable
information on object relations, e.g. helping to detect not only spatial
relations, such as standing behind, but also non-spatial relations, such as
holding. In this work, we study the effect of using different object
information with a focus on depth maps. To enable this study, we release a new
synthetic dataset of depth maps, VG-Depth, as an extension to Visual Genome
(VG). We also note that given the highly imbalanced distribution of relations
in VG, typical evaluation metrics for visual relation detection cannot reveal
improvements on under-represented relations. To address this problem, we
propose an additional metric, which we call Macro Recall@K, and demonstrate
its effectiveness on VG. Finally, our experiments confirm that by
effective utilization of depth maps within a simple, yet competitive framework,
the performance of visual relation detection can be significantly improved.
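The motivation for a macro metric can be made concrete. The abstract does not define Macro Recall@K precisely, so the following is one plausible reading, not the paper's exact definition: compute recall within the top-K predictions separately for each predicate class, then average over classes, so that rare relations weigh as much as frequent ones.

```python
import numpy as np
from collections import defaultdict

def macro_recall_at_k(gt_relations, ranked_predictions, k=2):
    """Sketch of a Macro Recall@K metric.

    gt_relations maps an image id to a set of (subject, predicate,
    object) triples; ranked_predictions maps it to a ranked list of
    predicted triples.  Recall@K is accumulated per predicate class
    and then averaged, so under-represented relations are not
    drowned out by frequent ones (unlike plain micro Recall@K).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for img, gts in gt_relations.items():
        topk = set(ranked_predictions.get(img, [])[:k])
        for trip in gts:
            predicate = trip[1]
            totals[predicate] += 1
            hits[predicate] += int(trip in topk)
    recalls = [hits[p] / totals[p] for p in totals]
    return float(np.mean(recalls))
```

With a long-tailed predicate distribution like Visual Genome's, a model that only ever predicts the few head relations scores well on micro recall but poorly on this macro variant.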
Open-World Visual Recognition Using Knowledge Graphs
In a real-world setting, visual recognition systems can be brought to make
predictions for images belonging to previously unknown class labels. In order
to make semantically meaningful predictions for such inputs, we propose a
two-step approach that utilizes information from knowledge graphs. First, a
knowledge-graph representation is learned to embed a large set of entities into
a semantic space. Second, an image representation is learned to embed images
into the same space. Under this setup, we are able to predict structured
properties in the form of relationship triples for any open-world image. This
is true even when a set of labels has been omitted from the training protocols
of both the knowledge graph and image embeddings. Furthermore, we append this
learning framework with appropriate smoothness constraints and show how prior
knowledge can be incorporated into the model. Both these improvements combined
increase performance for visual recognition by a factor of six compared to our
baseline. Finally, we propose a new, extended dataset which we use for
experiments.
Graph Based Classification Methods Using Inaccurate External Classifier Information
In this paper we consider the problem of collectively classifying entities
where relational information is available across the entities. In practice,
an inaccurate class distribution for each entity is often available from
another (external) classifier. For example, this distribution could come from a
classifier built using content features or a simple dictionary. Given the
relational and inaccurate external classifier information, we consider two
graph based settings in which the problem of collective classification can be
solved. In the first setting the class distribution is used to fix labels to a
subset of nodes and the labels for the remaining nodes are obtained like in a
transductive setting. In the other setting the class distributions of all nodes
are used to define the fitting function part of a graph regularized objective
function. We define a generalized objective function that handles both the
settings. Methods like harmonic Gaussian field and local-global consistency
(LGC) reported in the literature can be seen as special cases. We extend the
LGC and weighted vote relational neighbor classification (WvRN) methods to
support usage of external classifier information. We also propose an efficient
least squares regularization (LSR) based method and relate it to information
regularization methods. All the methods are evaluated on several benchmark and
real world datasets. Considering together speed, robustness and accuracy,
experimental results indicate that the LSR and WvRN-extension methods perform
better than other methods.
Comment: 12 pages
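The second setting above — using the external classifier's distributions as the fitting term of a graph-regularized objective — can be sketched with the local-global consistency (LGC) iteration. This is a minimal illustration under assumed inputs (a symmetric adjacency matrix and per-node external distributions `Y_ext`), not the paper's generalized objective or its LSR method:

```python
import numpy as np

def lgc_with_prior(A, Y_ext, alpha=0.9, iters=50):
    """Local-global consistency label propagation (sketch), with the
    fitting term supplied by an inaccurate external classifier's
    class distributions Y_ext instead of ground-truth labels.

    The iteration F <- alpha*S*F + (1-alpha)*Y_ext converges to the
    solution of (I - alpha*S) F = (1 - alpha) Y_ext, balancing
    smoothness over the graph against fidelity to the external
    classifier.
    """
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ A @ D_inv_sqrt     # symmetrically normalised adjacency
    F = Y_ext.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y_ext
    return F
```

The first setting in the text corresponds to clamping the external distributions as hard labels on a subset of nodes instead of using them as a soft fitting term for all nodes.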
Collective Semi-Supervised Learning for User Profiling in Social Media
The abundance of user-generated data in social media has incentivized the
development of methods to infer the latent attributes of users, which are
crucially useful for personalization, advertising and recommendation. However,
the current user profiling approaches have limited success, due to the lack of
a principled way to integrate different types of social relationships of a
user, and the reliance on scarcely-available labeled data in building a
prediction model. In this paper, we present a novel solution termed Collective
Semi-Supervised Learning (CSL), which provides a principled means to integrate
different types of social relationship and unlabeled data under a unified
computational framework. The joint learning from multiple relationships and
unlabeled data yields a computationally sound and accurate approach to model
user attributes in social media. Extensive experiments using Twitter data have
demonstrated the efficacy of our CSL approach in inferring user attributes such
as account type and marital status. We also show how CSL can be used to
determine important user features, and to make inference on a larger user
population.
A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications
Graph is an important data representation which appears in a wide diversity
of real-world scenarios. Effective graph analytics provides users a deeper
understanding of what is behind the data, and thus can benefit a lot of useful
applications such as node classification, node recommendation, link prediction,
etc. However, most graph analytics methods suffer from high computation and
space costs. Graph embedding is an effective yet efficient way to solve the
graph analytics problem. It converts the graph data into a low dimensional
space in which the graph structural information and graph properties are
maximally preserved. In this survey, we conduct a comprehensive review of the
literature in graph embedding. We first introduce the formal definition of
graph embedding as well as the related concepts. After that, we propose two
taxonomies of graph embedding which correspond to what challenges exist in
different graph embedding problem settings and how the existing work addresses
these challenges in their solutions. Finally, we summarize the applications
that graph embedding enables and suggest four promising future research
directions in terms of computation efficiency, problem settings, techniques and
application scenarios.
Comment: A 20-page comprehensive survey of graph/network embedding covering
over 150 papers up to 2018. It provides a systematic categorization of
problems, techniques and applications. Accepted by IEEE Transactions on
Knowledge and Data Engineering (TKDE). Comments and suggestions are welcome
for continuously improving this survey.
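The core idea the survey formalizes — mapping nodes to a low-dimensional space that preserves graph structure — has a very simple baseline instance: factorize the adjacency matrix so that inner products of node embeddings approximate edge weights. This sketch uses a truncated eigendecomposition and is only the most elementary matrix-factorization member of the taxonomies the survey covers (random-walk and deep-learning approaches work quite differently):

```python
import numpy as np

def embed_graph(A, dim=2):
    """Minimal structure-preserving graph embedding via truncated
    eigendecomposition of a symmetric adjacency matrix A, keeping
    the dim eigenpairs of largest magnitude so that the inner
    product of two node embeddings approximates their edge weight
    (exactly so when the kept eigenvalues are non-negative).
    """
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(-np.abs(vals))[:dim]
    return vecs[:, order] * np.sqrt(np.abs(vals[order]))
```

Downstream tasks from the survey then operate directly on these vectors: node classification trains a classifier on them, and link prediction ranks candidate edges by embedding inner products.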
Improving Information Extraction from Images with Learned Semantic Models
Many applications require an understanding of an image that goes beyond the
simple detection and classification of its objects. In particular, a great deal
of semantic information is carried in the relationships between objects. We
have previously shown that the combination of a visual model and a statistical
semantic prior model can improve on the task of mapping images to their
associated scene description. In this paper, we review the model and compare it
to a novel conditional multi-way model for visual relationship detection, which
does not include an explicitly trained visual prior model. We also discuss
potential relationships between the proposed methods and memory models of the
human brain.