Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval
Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high-level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is impractical given the large number of images in both online and offline collections. In this paper, we present a novel approach to automatically generate intermediate image descriptors by exploiting concept co-occurrence patterns in the pre-labeled training set, making it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns which could provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network with nodes and edges representing concepts and co-occurrence relationships, respectively. A random walk process working on the inferred concept probabilities with the discovered co-occurrence patterns is applied to acquire the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches.
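As a rough illustration of the pipeline this abstract describes, the sketch below builds a concept co-occurrence network from training labels, extracts communities by modularity maximization, and refines initial concept scores with a restarting random walk. The toy data, networkx's greedy modularity algorithm, and the damping factor are assumptions for illustration, not the authors' exact method.

```python
# Minimal sketch, assuming a binary label matrix `labels` (images x concepts)
# from a pre-labeled training set; all names and parameters are illustrative.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
labels = (rng.random((500, 20)) < 0.15).astype(int)  # toy training annotations

# Concept network: nodes are concepts, edge weights count co-occurrences.
cooc = labels.T @ labels
np.fill_diagonal(cooc, 0)
G = nx.from_numpy_array(cooc)

# Discover co-occurrence patterns as communities via modularity maximization.
patterns = [set(c) for c in greedy_modularity_communities(G, weight="weight")]

# Refine per-image concept probabilities with a random walk that restarts at
# the initial detector scores: p_{t+1} = alpha * p_t @ T + (1 - alpha) * p_0.
T = cooc / np.maximum(cooc.sum(axis=1, keepdims=True), 1)  # row-stochastic
def refine(p0, alpha=0.85, iters=50):
    p = p0.copy()
    for _ in range(iters):
        p = alpha * (p @ T) + (1 - alpha) * p0
    return p

initial = rng.random(20)      # stand-in for per-concept detector outputs
signature = refine(initial)   # refined concept signature for one image
```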
Deep View-Sensitive Pedestrian Attribute Inference in an end-to-end Model
Pedestrian attribute inference is a demanding problem in visual surveillance that can facilitate person retrieval, search and indexing. To exploit semantic relations between attributes, recent research treats it as a multi-label image classification task. The visual cues hinting at attributes can be strongly localized, and inference of person attributes such as hair, backpack, shorts, etc. is highly dependent on the acquired view of the pedestrian. In this paper we exploit this dependence in an end-to-end learning framework and show that view-sensitive attribute inference is able to learn better attribute predictions. Our proposed model jointly predicts the coarse pose (view) of the pedestrian and learns specialized view-specific multi-label attribute predictions. We show in an extensive evaluation on three challenging datasets (PETA, RAP and WIDER) that our proposed end-to-end view-aware attribute prediction model provides competitive performance and improves on the published state-of-the-art on these datasets.
Comment: accepted BMVC 201
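One plausible way to wire up such a joint view-and-attribute model is sketched below in PyTorch. The tiny backbone, the four coarse views, the 51 attributes (as in PETA), and the soft view-weighted combination of per-view heads are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of a view-sensitive multi-label attribute model; all sizes
# and the fusion scheme are assumptions made for illustration.
import torch
import torch.nn as nn

class ViewSensitiveAttributeNet(nn.Module):
    def __init__(self, num_views=4, num_attrs=51, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(            # stand-in for a CNN backbone
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.view_head = nn.Linear(feat_dim, num_views)   # coarse pose (view)
        self.attr_heads = nn.ModuleList(                  # one head per view
            [nn.Linear(feat_dim, num_attrs) for _ in range(num_views)]
        )

    def forward(self, x):
        f = self.backbone(x)
        view_logits = self.view_head(f)
        view_probs = view_logits.softmax(dim=1)                          # (B, V)
        per_view = torch.stack([h(f) for h in self.attr_heads], dim=1)   # (B, V, A)
        # Weight the view-specific attribute predictions by the view posterior.
        attr_logits = (view_probs.unsqueeze(-1) * per_view).sum(dim=1)   # (B, A)
        return view_logits, attr_logits

model = ViewSensitiveAttributeNet()
views, attrs = model(torch.randn(2, 3, 256, 128))
# Joint end-to-end training: cross-entropy on the view plus per-attribute BCE.
loss = nn.CrossEntropyLoss()(views, torch.tensor([0, 2])) \
     + nn.BCEWithLogitsLoss()(attrs, torch.zeros(2, 51))
```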
K-Space at TRECVid 2007
In this paper we describe K-Space's participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVM). Finally, we also experimented with both early and late fusion for feature combination.
This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a “shot” based interface, where the results from a query were presented as a ranked list of shots. The second interface was “broadcast” based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features
…
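The early versus late fusion comparison mentioned in this abstract can be sketched as follows. The toy modalities, the classifier choices, and the equal-weight score averaging are assumptions for illustration, not the actual K-Space configuration.

```python
# Minimal sketch of early vs. late fusion over two modalities, assuming toy
# visual and audio feature vectors; all data and weights are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)                    # concept present / absent
visual = rng.random((200, 64)) + y[:, None] * 0.1   # toy visual features
audio = rng.random((200, 16)) + y[:, None] * 0.1    # toy audio features

# Early fusion: concatenate modality features, train a single classifier.
early = LogisticRegression(max_iter=1000).fit(np.hstack([visual, audio]), y)

# Late fusion: train one classifier per modality, then average their scores.
clf_v = SVC(probability=True).fit(visual, y)
clf_a = SVC(probability=True).fit(audio, y)

test_v, test_a = rng.random((50, 64)), rng.random((50, 16))
early_scores = early.predict_proba(np.hstack([test_v, test_a]))[:, 1]
late_scores = 0.5 * clf_v.predict_proba(test_v)[:, 1] \
            + 0.5 * clf_a.predict_proba(test_a)[:, 1]
```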