1,135 research outputs found
Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval.
Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high-level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is impractical because of the large number of images in both online and offline collections. In this paper, we present a novel approach to automatically generate intermediate image descriptors by exploiting concept co-occurrence patterns in the pre-labeled training set, which makes it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns which could provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network whose nodes and edges represent concepts and co-occurrence relationships, respectively. A random walk process operating on the inferred concept probabilities together with the discovered co-occurrence patterns is applied to acquire the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches.
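The refinement step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it skips the modularity-based community discovery entirely and simply runs a random walk with restart over raw pairwise co-occurrence counts. The function names, the `restart` weight, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

def cooccurrence_matrix(annotations, concepts):
    """Count how often each pair of concepts labels the same training image."""
    index = {c: i for i, c in enumerate(concepts)}
    n = len(concepts)
    counts = [[0.0] * n for _ in range(n)]
    for labels in annotations:
        for a, b in combinations(sorted(set(labels)), 2):
            i, j = index[a], index[b]
            counts[i][j] += 1.0
            counts[j][i] += 1.0
    return counts

def row_normalize(matrix):
    """Turn counts into transition probabilities; all-zero rows stay zero."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else list(row))
    return out

def refine_scores(initial, transition, restart=0.5, steps=20):
    """Random walk with restart: blend the initial scores with scores
    propagated along co-occurrence edges (restart weight is hypothetical)."""
    n = len(initial)
    scores = list(initial)
    for _ in range(steps):
        propagated = [sum(scores[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        scores = [restart * initial[j] + (1 - restart) * propagated[j]
                  for j in range(n)]
    return scores

# Toy data (hypothetical): three labeled training images, four concepts.
concepts = ["beach", "sea", "sand", "car"]
annotations = [["beach", "sea", "sand"], ["beach", "sea"], ["car"]]
transition = row_normalize(cooccurrence_matrix(annotations, concepts))

# Initial per-concept probabilities, e.g. from independent classifiers.
initial = [0.9, 0.1, 0.1, 0.1]
refined = refine_scores(initial, transition)
```

In this toy setup, a confident "beach" detection raises the score of "sea", which frequently co-occurs with it, while "car", which has no co-occurrence edges, receives no contextual support.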
Finding media illustrating events
We present a method combining semantic inferencing and visual analysis for automatically finding media (photos and videos) illustrating events. We report on experiments validating our heuristic for mining media sharing platforms and large event directories in order to mutually enrich the descriptions of the content they host. Our overall goal is to design a web-based environment that allows users to explore and select events, to inspect associated media, and to discover meaningful, surprising or entertaining connections between events, media and people participating in events. We present a large dataset composed of semantic descriptions of events, photos and videos interlinked with the larger Linked Open Data cloud and we show the benefits of using semantic web technologies for integrating multimedia metadata.
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval
Where previous reviews on content-based image retrieval emphasize what can
be seen in an image to bridge the semantic gap, this survey considers what
people tag about an image. A comprehensive treatise of three closely linked
problems, i.e., image tag assignment, refinement, and tag-based image retrieval
is presented. While existing works vary in terms of their targeted tasks and
methodology, they rely on the key functionality of tag relevance, i.e.
estimating the relevance of a specific tag with respect to the visual content
of a given image and its social context. By analyzing what information a
specific method exploits to construct its tag relevance function and how such
information is exploited, this paper introduces a taxonomy to structure the
growing literature, understand the ingredients of the main works, clarify their
connections and differences, and recognize their merits and limitations. For a
head-to-head comparison between the state-of-the-art, a new experimental
protocol is presented, with training sets containing 10k, 100k and 1m images
and an evaluation on three test sets, contributed by various research groups.
Eleven representative works are implemented and evaluated. Putting all this
together, the survey aims to provide an overview of the past and foster
progress for the near future.
Comment: to appear in ACM Computing Surveys
A Study of Actor and Action Semantic Retention in Video Supervoxel Segmentation
Existing methods in the semantic computer vision community seem unable to
deal with the explosion and richness of modern, open-source and social video
content. Although sophisticated methods such as object detection or
bag-of-words models have been well studied, they typically operate on low level
features and ultimately suffer from either scalability issues or a lack of
semantic meaning. On the other hand, video supervoxel segmentation has recently
been established and applied to large scale data processing, which potentially
serves as an intermediate representation to high level video semantic
extraction. The supervoxels are rich decompositions of the video content: they
capture object shape and motion well. However, it is not yet known if the
supervoxel segmentation retains the semantics of the underlying video content.
In this paper, we conduct a systematic study of how well the actor and action
semantics are retained in video supervoxel segmentation. Our study has human
observers watching supervoxel segmentation videos and trying to discriminate
both actor (human or animal) and action (one of eight everyday actions). We
gather and analyze a large set of 640 human perceptions over 96 videos in 3
different supervoxel scales. Furthermore, we conduct machine recognition
experiments on a feature defined on supervoxel segmentation, called supervoxel
shape context, which is inspired by the higher order processes in human
perception. Our ultimate findings suggest that a significant amount of
semantics have been well retained in the video supervoxel segmentation and can
be used for further video analysis.
Comment: This article is in review at the International Journal of Semantic
Computin
Social User Mining: User Profiling of Social Media Network Based on Multimedia Data Mining
In recent years, the pervasive use of social media has generated extraordinary amounts of data, which have started to attract increasing attention. Each social media source utilizes different data types, such as textual and visual. For example, Twitter is used to transmit short text messages, whereas Flickr is used to convey images and videos. Moreover, Facebook uses all of these data types. From the social media users’ standpoint, it is highly desirable to find patterns across different data formats. The huge amount of data from different sources and types has provided many opportunities for researchers in the fields of data mining and data analytics. Not only have methods and tools to organize and manage such data become extremely important, but so have methods and tools to discover hidden knowledge from such data, which can be used for a variety of applications. For example, mining a user's profile on social media could help to discover missing information, including the user's location or gender. However, developing such methods and tools is very challenging. Social media data is unstructured and differs from traditional data because of its privacy settings, noise, and large volume. Moreover, combining image features and text information annotated by users reveals interesting properties of social user mining, and serves as a useful tool for discovering unknown information about the users. Minimal research has been conducted on the combination of image and text data for social user mining. To address these challenges and to discover unknown information about users, we proposed a novel mining framework for social user mining that includes: 1) a data assemble module for different media sources, 2) a data integration module, and 3) mining applications.
First, we introduced a data assemble module in order to process both the textual and the visual information from different media sources, and evaluated the appropriate multimedia features for social user mining. Then, we proposed a new data integration method in order to integrate the textual and the visual data. Unlike previous approaches that used a content-based approach to merge multiple types of features, our main approach is based on image semantics obtained through a semi-automatic image tagging system. Lastly, we presented two applications as examples of social user mining: gender classification and user location.
Information trust, inference and transfer in social and information networks
In this thesis, our overarching goal is to aggregate crowdsourced information that is collected from computing systems based on social networks and represented in information networks. Due to the autonomous nature of such a social computing paradigm, the crowdsourced information is often of low quality, contributed by susceptible information sources without a reliable quality control scheme. Thus, to reveal the trustworthiness of the involved information sources, we aim to explore the social dependency behind the social networks, where information contributors are prone to be influenced by each other. We explore the impact of such social dependency between sources on information trust, aggregation and quality in social computing models. In addition, we investigate the structure underlying the information shared by sources to reveal their trustworthiness.
Our study will deepen our understanding of the patterns and behaviors of information sources and their reliability from both social and information aspects. Several closely related problems are investigated in this thesis: (1) the source trustworthiness, which aims to distinguish the untrustworthy sources from the trustworthy ones; (2) social signal processing, which aims to aggregate the multi-source contributed information to recover the true signals behind the problems such as the correct answers to a question and the true labels for an image; (3) the social dependency, which reveals the mutual influences among different sources; and (4) the nature of information structure, such as the information dependency underlying low-rank structure and visual similarities. Our goal is to propose a unified probabilistic model to explain the social and information phenomena behind these problems. In this thesis, we designed several algorithms which are tested in several real social and information network scenarios. Superior performances have been achieved compared with many existing state-of-the-art technologies in the areas
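Problems (1) and (2) above, source trustworthiness and the aggregation of multi-source answers, can be illustrated with a minimal sketch. It is not the unified probabilistic model proposed in the thesis: it is a simple iterative weighted vote, and the function name, the smoothing constants, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def weighted_vote(claims, iterations=10):
    """Alternate between (a) picking each item's label by trust-weighted vote
    and (b) re-scoring each source by how often it agrees with those labels.

    claims: dict mapping source -> dict mapping item -> claimed label.
    Returns (estimated labels per item, trust score per source).
    """
    trust = {source: 1.0 for source in claims}
    truths = {}
    for _ in range(iterations):
        # (a) Trust-weighted vote per item.
        votes = defaultdict(lambda: defaultdict(float))
        for source, answers in claims.items():
            for item, label in answers.items():
                votes[item][label] += trust[source]
        # Ties are broken alphabetically so the result is deterministic.
        truths = {item: max(sorted(tally), key=tally.get)
                  for item, tally in votes.items()}
        # (b) Smoothed agreement rate as the new trust score
        # (the +1 / +2 smoothing is a hypothetical choice).
        for source, answers in claims.items():
            agree = sum(1 for item, label in answers.items()
                        if truths[item] == label)
            trust[source] = (agree + 1) / (len(answers) + 2)
    return truths, trust

# Toy data (hypothetical): three sources labeling three images.
claims = {
    "a": {"img1": "cat", "img2": "dog", "img3": "bird"},
    "b": {"img1": "cat", "img2": "dog", "img3": "bird"},
    "c": {"img1": "dog", "img2": "cat", "img3": "bird"},
}
truths, trust = weighted_vote(claims)
```

This sketch treats sources as independent, so it does not capture the social dependency between sources, problem (3), that the thesis explicitly sets out to model.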